Tatoeba

Note

The data you will find here will NOT be useful unless you are coding a language tool or processing data.

If you simply want sentences that you can use to learn a language, check out the sentence lists. You can build your own, or view the ones that others have created. The lists can be downloaded and printed.

General information about the files

Many of the Japanese and English sentences are from the Tanaka Corpus, which is in the public domain.

Creative Commons

These files are released under CC BY 2.0 FR.

Some of our sentences are also available under CC0 1.0.

License for audio

The license covering an audio file is chosen by the contributor, and is indicated on the page that lists the audio files that they have contributed.

Questions?

If you have questions or requests, feel free to contact us. We usually respond quickly.

Downloads

Custom export

Use this tool to create and download custom exports on demand.

Sentence pairs
Download all sentences in language A that are translated into language B, along with the translations.

Weekly exports

The files provided below are updated every Saturday at 6:30 a.m. (UTC).
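To decide whether a local copy is stale, the weekly schedule can be turned into a timestamp. The following is a minimal sketch; the function name `latest_export_time` is hypothetical, and it assumes an export is usable as soon as the scheduled time has passed, which the page does not guarantee.

```python
from datetime import datetime, timedelta, timezone


def latest_export_time(now: datetime) -> datetime:
    """Return the most recent Saturday 06:30 UTC at or before `now`.

    `now` should be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    # datetime.weekday(): Monday is 0, so Saturday is 5.
    days_since_saturday = (now.weekday() - 5) % 7
    candidate = (now - timedelta(days=days_since_saturday)).replace(
        hour=6, minute=30, second=0, microsecond=0
    )
    if candidate > now:
        # It is Saturday but earlier than 06:30 UTC: use last week's slot.
        candidate -= timedelta(days=7)
    return candidate
```

A local file whose modification time is older than this value was probably downloaded before the latest scheduled update.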

Sentences

Filename

{{sentences | filename}}

All languages
Only sentences in: Abaza, Abkhaz, Adyghe, Afrihili, Afrikaans, Aymara, Ainu, Aklanon, Albanian, Algerian Arabic, Interlingua, Amharic, Ancient Hebrew, Arabic, Aragonese, Azerbaijani, Assamese, Assyrian, Asturian, Awadhi, Avar, Balinese, Baluchi, Bambara, Banjar, Bashkir, Basque, Bavarian, Baybayanon, Bengali, Berber, Berom, Bislama, Bhojpuri, Bosnian, Bodo, Breton, Brithenig, Bulgarian, Buryat, Burmese, Central Bikol, Central Dusun, Central Huasteca Nahuatl, Central Kanuri, Central Kurdish (Soranî), Central Mnong, Chavacano, Cherokee, Chinese Pidgin English, Chinyanja, Chukchi, Coastal Kadazan, Cuyonon, CycL, Danish, Dhivehi, Drents, Dungan, Dutton World Speedwords, Eastern Armenian, Egyptian Arabic, Estonian, Emilian, English, Erromintxela, Erzya, Esperanto, Evenki, Ewe, Extremaduran, Finnish, Fijian, Fiji Hindi, Old East Slavic, Old English, Old French, Ancient Greek, French, Frisian, Friulian, Faroese, Phoenician, Ga, Gagauz, Galician, Gan, Garhwali, Georgian, Gheg Albanian, Gilbertese, Gothic, Greek, Gronings, Greenlandic, Guadeloupean Creole French, Guerrero Nahuatl, Gulf Arabic, Gun, Gujarati, Guarani, Haida, Haitian Creole, Hakka Chinese, Hawaiian, Hausa, Upper Sorbian, Hebrew, Hill Mari, Hindi, Hitchiti, Hiligaynon, Hmong Daw (White), Hmong Njua (Green), Ho, Dutch, Hunsrik, Belarusian, Ilocano, Indonesian, Ingrian, Interglossa, Interlingue, Interslavic, Inuktitut, Iraqi Arabic, Isan, Iban, Ido, Igbo, Irish, Icelandic, Italian, Yakut, Jamaican Patois, Japanese, Javanese, Jewish Babylonian Aramaic, Jewish Palestinian Aramaic, Yiddish, Jin Chinese, Yoruba, Juhuri (Judeo-Tat), K'iche', Kabardian, Kabyle, Cayuga, Kalmyk, Kamba, Khmer, Chamorro, Kannada, Cantonese, Kapampangan, Karakalpak, Karakhanid, Karachay-Balkar, Karelian, Kazakh, Khasi, Qashqai, Kashmiri, Kashubian, Catalan, Cebuano, Kekchi (Q'eqchi'), Kelantan-Pattani Malay, Keningau Murut, Khakas, Khalaj, Kyrgyz, Kinyarwanda, Chinese, Kirundi, Klingon, Komi-Zyrian, Congo Swahili, Konkani (Goan), Cornish, Corsican, Kotava, Komi-Permyak, Korean, Louisiana Creole, Crimean Tatar, Croatian, Kumyk, Kven Finnish, Quechua, Kölsch, Ladin, Ladino, Lakota, Lao, Latgalian, Latin, Laz, Láadan, Lower Sorbian, Latvian, Lezgi, Libyan Arabic, Ligurian, Limburgish, Lingala, Lingua franca nova, Literary Chinese, Lithuanian, Livonian, Lojban, Lombard, Low German (Low Saxon), Luganda, Lushootseed, Luxembourgish, Madurese, Mahasu Pahari, Maithili, Macedonian, Malagasy, Malayalam, Malay, Malay (Vernacular), Maltese, Mambae, Mandar, Manchu, Manx, Maori, Mapuche, Marathi, Marshallese, Mauritian Creole, Meadow Mari, Meitei, Middle Persian (Pahlavi), Middle English, Middle French, Mi'kmaq, Min Nan Chinese, Mingrelian, Mirandese, Minangkabau, Moksha, Mon, Mongolian, Mono (USA), Moroccan Arabic, Mohawk, Muskogee (Creek), Naga (Tangshang), Nahuatl, Nande, Neapolitan, Nauruan, Navajo, Plains Cree, Nepali, Newari, Ngeq, Nigerian Fulfulde, Niuean, Northern Sami, Old Norse, Norwegian Bokmål, North Frisian, North Levantine Arabic, North Moluccan Malay, Northern Kurdish (Kurmancî), Northern Zaza (Kirmanjki), Novial, Nogai, Nuosu, Nuer, Nyungar, Norwegian Nynorsk, O'odham, Odia (Oriya), Ojibwe, Okinawan, Occitan, Old Aramaic, Old Frisian, Old Prussian, Old Saxon, Old Spanish, Old Turkish, Orizaba Nahuatl, Ossetian, Palatine German, Palauan, Pali, Pangasinan, Papiamento, Pashto, Pennsylvania German, Persian, Picard, Piedmontese, Pipil, Portuguese, Polish, Pulaar, Punjabi (Eastern), Punjabi (Western), Quenya, Rapa Nui, Rendille, Rohingya, Romansh, Rusyn, Romanian, Russian, Samogitian, Samoan, Sango, Sanskrit, Santali, Saraiki, Sardinian, Saterland Frisian, Serbian, Setswana, Seychellois Creole, Shanghainese, Shona, Shuswap, Sicilian, Silesian, Sindarin, Sindhi, Sinhala, Romani, Chinook Jargon, Chagatai, Choctaw, Chuvash, Scottish Gaelic, Scots, Slovak, Slovenian, South Levantine Arabic, Southern Subanen, Southern Zaza (Dimli), Somali, Spanish, Sranan Tongo, Standard Moroccan Tamazight, Southern Haida, Southern Altai, Southern Kurdish, Southern Sami, Southern Sotho, Zulu, Sumerian, Sundanese, Swahili, Swiss German, Swabian, Swazi, Sylheti, Syriac, Swedish, Tachawit, Tajik, Tagal Murut, Tagalog, Tahaggart Tamahaq, Tahitian, Thai, Talossan, Talysh, Tamil, Tarifit, Tashelhit, Tatar, Telugu, Temuan, Tetun, Czech, Tibetan, Tigre, Tigrinya, Tokelauan, Toki Pona, Tok Pisin, Tonga (Zambezi), Tongan, Chechen, Tsonga, Tupinambá, Tuvaluan, Tumbuka, Turkmen, Tuvan, Turkish, Ottoman Turkish, Uab Meto, Hungarian, Urhobo, Udmurt, Uyghur, Ukrainian, Umbundu, Urdu, Uzbek, Walloon, Waray, Welsh, Venetian, Veps, Vietnamese, Volapük, Wolof, Võro, Wayuu, West-Central Oromo, Western Armenian, Xhosa, Xiang Chinese, Yucatec Maya, Zaza, Zeelandic, German, Unknown language
File description
Contains all the sentences in the selected language. Each sentence is associated with a unique id and an ISO 639-3 language code.
Fields and structure
Sentence id [tab] Language [tab] Text
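The three-field layout above can be read line by line. This is a minimal sketch, assuming the extracted file is UTF-8 text with one sentence per line; the names `Sentence` and `read_sentences` are hypothetical and the sample rows are made up for illustration.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Sentence(NamedTuple):
    sentence_id: int
    lang: str  # ISO 639-3 code
    text: str


def read_sentences(handle: TextIO) -> Iterator[Sentence]:
    """Yield one Sentence per well-formed line; skip anything else."""
    for line in handle:
        # Split on the first two tabs only, so the text field is kept whole.
        fields = line.rstrip("\n").split("\t", 2)
        if len(fields) != 3:
            continue
        sentence_id, lang, text = fields
        yield Sentence(int(sentence_id), lang, text)


# Illustrative rows, not taken from the real export.
sample = io.StringIO("1\teng\tHello.\n2\tfra\tBonjour.\n")
sentences = list(read_sentences(sample))
```

Streaming the file this way avoids loading the whole export into memory at once.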

Detailed Sentences

Filename

{{sentencesDetailed | filename}}

All languages
Only sentences in: the same languages as listed for the Sentences file.
File description
Contains additional fields for each sentence (owner name, date created/modified).
Fields and structure
Sentence id [tab] Language [tab] Text [tab] Username [tab] Date added [tab] Date last modified

Original and Translated Sentences

Filename
sentences_base.tar.bz2
File description
Each sentence is listed as original or a translation of another. The "base" field can have the following values:
  • zero: The sentence is original, not a translation of another.
  • greater than zero: The id of the sentence from which it was translated.
  • \N: Unknown (rare).
Fields and structure
Sentence id [tab] Base field
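The three cases of the base field map naturally onto an optional integer. The sketch below is one way to interpret a line; the helper names `parse_base` and `describe_base` are hypothetical.

```python
from typing import Optional


def parse_base(value: str) -> Optional[int]:
    # The literal two characters backslash + N mark an unknown origin.
    if value == r"\N":
        return None
    return int(value)


def describe_base(line: str) -> str:
    """Turn one 'sentence id <tab> base' line into a readable summary."""
    sentence_id, base = line.rstrip("\n").split("\t")
    parsed = parse_base(base)
    if parsed is None:
        return f"#{sentence_id}: origin unknown"
    if parsed == 0:
        return f"#{sentence_id}: original sentence"
    return f"#{sentence_id}: translated from #{parsed}"
```

Returning `None` for the unknown case keeps it distinct from zero, which has its own meaning here.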

Sentences (CC0)

Filename

{{sentencesCC0 | filename}}

All languages
Only sentences in: Algerian Arabic, Interlingua, Ancient Hebrew, Arabic, Bengali, Berber, Danish, English, Esperanto, Finnish, Ancient Greek, French, Phoenician, Hebrew, Hindi, Ho, Dutch, Belarusian, Interlingue, Ido, Italian, Japanese, Jewish Babylonian Aramaic, Jewish Palestinian Aramaic, Yiddish, Kabyle, Cantonese, Karelian, Catalan, Chinese, Klingon, Konkani (Goan), Kven Finnish, Ladino, Latin, Láadan, Ligurian, Literary Chinese, Middle English, Old Norse, Norwegian Bokmål, Nyungar, Odia (Oriya), Old Aramaic, Old Frisian, Portuguese, Polish, Russian, Santali, Spanish, Standard Moroccan Tamazight, Sylheti, Swedish, Tachawit, Czech, Toki Pona, Hungarian, Ukrainian, Welsh, Volapük, German, Unknown language
File description
Contains all the sentences available under CC0.
Fields and structure
Sentence id [tab] Language [tab] Text [tab] Date last modified

Links

Filename
links.tar.bz2
File description
Contains the links between the sentences. 1 [tab] 77 means that sentence #77 is the translation of sentence #1. The reciprocal link is also present, so the file will also contain a line that says 77 [tab] 1.
Fields and structure
Sentence id [tab] Translation id
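Because each link appears in both directions, a plain mapping from sentence id to translation ids is enough to look up translations from either side. A minimal sketch, with the hypothetical name `build_translation_map`:

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_translation_map(lines: Iterable[str]) -> Dict[int, Set[int]]:
    """Map each sentence id to the set of ids it is linked to."""
    translations: Dict[int, Set[int]] = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        sentence_id, translation_id = line.split("\t")
        translations[int(sentence_id)].add(int(translation_id))
    return dict(translations)


# The example pair from the description, in both directions.
links = build_translation_map(["1\t77\n", "77\t1\n"])
```

With the reciprocal lines present, there is no need to insert the reverse direction manually.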

Tags

Filename
tags.tar.bz2
File description
Contains the list of tags associated with each sentence. 381279 [tab] proverb means that sentence #381279 has been assigned the "proverb" tag.
Fields and structure
Sentence id [tab] Tag name
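To find all sentences carrying a given tag, the file can be inverted into a tag index. This sketch uses the hypothetical name `group_by_tag` and assumes a tag name may contain spaces but no tab characters.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_tag(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Map each tag name to the sentence ids that carry it, in file order."""
    index: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split once so the tag name is kept whole.
        sentence_id, tag = line.split("\t", 1)
        index[tag].append(int(sentence_id))
    return dict(index)


# The first row is the example from the description; the others are made up.
tags = group_by_tag(["381279\tproverb\n", "12\tproverb\n", "12\told saying\n"])
```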

Lists

Filename
user_lists.tar.bz2
File description
Contains the list of sentence lists.
Fields and structure
List id [tab] Username [tab] Date created [tab] Date last modified [tab] List name [tab] Editable by

Sentences in lists

Filename
sentences_in_lists.tar.bz2
File description
Indicates which sentences are contained in which lists. 13 [tab] 381279 means that sentence #381279 is contained in the list that has an id of 13.
Fields and structure
List id [tab] Sentence id

Japanese indices

Filename
jpn_indices.tar.bz2
File description
Contains the equivalent of the "B lines" in the Tanaka Corpus file distributed by Jim Breen. See this page for the format. Each entry is associated with a pair of Japanese/English sentences. Sentence id refers to the id of the Japanese sentence. Meaning id refers to the id of the English sentence.
Fields and structure
Sentence id [tab] Meaning id [tab] Text

Sentences with audio

Filename
sentences_with_audio.tar.bz2
File description
Contains the ids of the sentences, in all languages, for which audio is available. Other fields indicate who recorded the audio, its license and a URL to attribute the author. If the license field is empty, you may not reuse the audio outside the Tatoeba project.
Downloading audio
A single sentence can have one or more audio recordings, each from a different voice. To download a particular recording, use its audio id to compute the download URL. For example, to download the audio with the id 1234, the URL is https://tatoeba.org/audio/download/1234.
Fields and structure
Sentence id [tab] Audio id [tab] Username [tab] License [tab] Attribution URL
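The URL rule and the license rule above can be combined in a small helper. This is a sketch under assumptions: the names `AudioEntry`, `parse_audio_line`, `download_url` and `reusable_outside_tatoeba` are hypothetical, and treating a literal `\N` as an empty license is a guess modelled on the base field of another file, not something this page states.

```python
from typing import NamedTuple

AUDIO_URL = "https://tatoeba.org/audio/download/{audio_id}"


class AudioEntry(NamedTuple):
    sentence_id: int
    audio_id: int
    username: str
    license: str
    attribution_url: str


def parse_audio_line(line: str) -> AudioEntry:
    # Strip only the newline so trailing empty fields are preserved.
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (5 - len(fields))  # pad short lines
    sentence_id, audio_id, username, license_, url = fields[:5]
    return AudioEntry(int(sentence_id), int(audio_id), username, license_, url)


def download_url(entry: AudioEntry) -> str:
    return AUDIO_URL.format(audio_id=entry.audio_id)


def reusable_outside_tatoeba(entry: AudioEntry) -> bool:
    # Assumption: an empty string or a literal \N both mean "no license".
    return entry.license.strip() not in ("", r"\N")


# Made-up row with audio id 1234 and an empty license field.
entry = parse_audio_line("42\t1234\talice\t\t\n")
```

Filtering on `reusable_outside_tatoeba` before downloading avoids fetching recordings that cannot be redistributed.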

User skill level per language

Filename
user_languages.tar.bz2
File description
Indicates the skill level of members in individual languages.
Fields and structure
Language [tab] Skill level [tab] Username [tab] Details

Users' sentence reviews

Filename
users_sentences.csv
File description
Contains sentences reviewed by users. The value of the review can be -1 (sentence not OK), 0 (undecided or unsure), or 1 (sentence OK). Warning: this data is still experimental.
Fields and structure
Username [tab] Sentence id [tab] Review [tab] Date added [tab] Date last modified
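The three review values can be tallied per sentence to see how reviewers voted. This is a minimal sketch; `REVIEW_LABELS` and `tally_reviews` are hypothetical names, and the date strings in the sample rows are placeholders, since the page does not specify the date format.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable

REVIEW_LABELS = {-1: "not OK", 0: "undecided or unsure", 1: "OK"}


def tally_reviews(lines: Iterable[str]) -> Dict[int, Counter]:
    """Count review outcomes per sentence id."""
    tallies: Dict[int, Counter] = defaultdict(Counter)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        sentence_id, value = int(fields[1]), int(fields[2])
        if value not in REVIEW_LABELS:
            continue  # ignore values outside the documented range
        tallies[sentence_id][REVIEW_LABELS[value]] += 1
    return dict(tallies)


# Made-up rows; the two date columns are placeholders.
reviews = tally_reviews([
    "alice\t5\t1\tDATE\tDATE\n",
    "bob\t5\t-1\tDATE\tDATE\n",
    "carol\t5\t1\tDATE\tDATE\n",
])
```

Given the warning that this data is experimental, any threshold built on these counts should be treated as provisional.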

Transcriptions

Filename

{{transcriptions | filename}}

All languages
Only sentences in: Japanese, Cantonese, Chinese, Uzbek
File description
Contains all transcriptions in auxiliary or alternative scripts. A username associated with a transcription indicates the user who last reviewed and possibly modified it. A transcription without a username has not been marked as reviewed. The script name is defined according to the ISO 15924 standard.
Fields and structure
Sentence id [tab] Language [tab] Script name [tab] Username [tab] Transcription
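Since a missing username means a transcription has not been marked as reviewed, that field can be used as a filter. A minimal sketch follows; the names `Transcription`, `parse_transcription` and `is_reviewed` are hypothetical, the sample rows are made up, and accepting a literal `\N` as "no username" is an assumption rather than documented behaviour.

```python
from typing import NamedTuple


class Transcription(NamedTuple):
    sentence_id: int
    lang: str
    script: str  # ISO 15924 script name, e.g. "Latn"
    username: str
    text: str


def parse_transcription(line: str) -> Transcription:
    # Split on the first four tabs so the transcription text stays whole.
    sentence_id, lang, script, username, text = line.rstrip("\n").split("\t", 4)
    return Transcription(int(sentence_id), lang, script, username, text)


def is_reviewed(entry: Transcription) -> bool:
    # Assumption: an empty string or a literal \N both mean "no reviewer".
    return entry.username not in ("", r"\N")


# Made-up rows for illustration.
reviewed = parse_transcription("10\tcmn\tLatn\talice\tni hao\n")
unreviewed = parse_transcription("11\tcmn\tLatn\t\tzai jian\n")
```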