Tatoeba

Note

The data you will find here will NOT be useful unless you are coding a language tool or processing data.

If you simply want sentences that you can use to learn a language, check out the sentence lists. You can build your own, or view the ones that others have created. The lists can be downloaded and printed.

General information about the files

Many of the Japanese and English sentences are from the Tanaka Corpus, which is in the public domain.

Creative Commons

These files are released under CC BY 2.0 FR.

Some of our sentences are also available under CC0 1.0.

Licenses covering audio

The license covering an audio file is chosen by its contributor and is indicated on the page that lists the audio files that contributor has uploaded.

Questions?

If you have questions or requests, feel free to contact us. In general, we answer quickly.

Downloads

Custom exports

Sentence pairs

Use this tool to generate and download customized exports on demand.

Download all sentences in language A that are translated into language B, along with the translations.

Weekly exports

The files provided below are updated every Saturday at 6:30 a.m. (UTC).

Sentences

Filename

{{sentences | filename}}

All languages
Only sentences in: Abkhaz, Adyghe, Afrihili, Afrikaans, Egyptian Arabic, Ainu, Aklanon, Albanian, Algerian Arabic, Amharic, Aragonese, Ancient Hebrew, Old English, Arabic, Mapudungun (Araucanian), Assamese, Assyrian Neo-Aramaic, Asturian, Avar, Awadhi, Aymara, Azerbaijani, Bashkir, Bavarian, Balinese, Banjar, Basque, Baybayanon, Baluchi, Bengali, Berber, Berom, Bhojpuri, Bislama, Bambara, Tibetan, Bodo, Bosnian, Breton, Brithenig, Bulgarian, Buryat, Cayuga, Chechen, Cebuano, Central Bikol, Central Dusun, Central Huasteca Nahuatl, Central Kanuri, Central Kurdish (Soranî), Central Mnong, Chagatai, Chamorro, Chavacano, Cherokee, Chinese, Chinese Pidgin English, Chinook, Chinyanja, Choctaw, Corsican, Coastal Kadazan, Congo Swahili, Cuyonon, Chuvash, Welsh, CycL, Danish, Dhivehi, Drents, Dungan, Dutton World Speedwords, Eastern Armenian, Ewe, Estonian, Alsatian, Emilian, English, Erromintxela, Erzya, Esperanto, Evenki, Extremaduran, Faroese, Fiji Hindi, Finnish, Fijian, French, Friulian, Ga, Gagauz, Galician, Gan Chinese, Garhwali, Georgian, Gheg Albanian, Gilbertese, Guarani, Gothic, Greek, Greenlandic, Gronings, Gujarati, Guadeloupean Creole French, Guerrero Nahuatl, Gulf Arabic, Manx, Hausa, Haida, Hakka Chinese, Hawaiian, Hebrew, Hiligaynon, Hill Mari, Hindi, Hitchiti, Hmong Daw (White), Hmong Njua (Green), Ho, German, Haitian Creole, Hunsrik, Iban, Ido, Interlingue, Icelandic, Igbo, Ilocano, Indonesian, Ingrian, Interglossa, Interlingua, Inuktitut, Iraqi Arabic, Irish, Isan, Italian, Yakut, Jamaican Patois, Japanese, Jewish Babylonian Aramaic, Jewish Palestinian Aramaic, Yiddish, Jin Chinese, Juhuri (Judeo-Tat), Javanese, K'iche', Kabardian, Kabyle, Kalmyk, Kamba, Cantonese, Kapampangan, Karachay-Balkar, Karakalpak, Karakhanid, Karelian, Kazakh, Kashubian, Catalan, Kekchi (Q'eqchi'), Kelantan-Pattani Malay, Keningau Murut, Khakas, Khalaj, Khasi, Kirundi, Classical Chinese, Klingon, Khmer, Kannada, Kölsch, Komi-Permyak, Komi-Zyrian, Konkani (Goan), Korean, Kotava, Crimean Tatar, Croatian, Kashmiri, Kumyk, Kven Finnish, Cornish, Kyrgyz, Láadan, Ladin, Ladino, Lakota, Latgalian, Latin, Laz, Luxembourgish, Latvian, Libyan Arabic, Ligurian, Limburgish, Lingua Franca Nova, Lithuanian, Livonian, Lingala, Lao, Lojban, Lombard, Louisiana Creole, Low German (Low Saxon), Luganda, Lushootseed, Madurese, Mahasu Pahari, Maithili, Malay, Malay (Vernacular), Malayalam, Mambae, Manchu, Meadow Mari, Meitei, Malagasy, Marshallese, Maori, Micmac, Middle English, Middle French, Middle Persian (Pahlavi), Min Nan Chinese, Minangkabau, Mingrelian, Macedonian, Mohawk, Moksha, Mon, Mongolian, Mono (USA), Morisyen, Moroccan Arabic, Marathi, Maltese, Muskogee (Creek), Burmese, Naga (Tangshang), Nahuatl, Nande, Nauruan, Norwegian Bokmål, Neapolitan, Dutch, Lower Sorbian, Nepali, Newari, Ngeq, Nigerian Fulfulde, Niuean, Norwegian Nynorsk, Nogai, North Frisian, North Levantine Arabic, North Moluccan Malay, Northern Kurdish (Kurmancî), Northern Zaza (Kirmanjki), Novial, Nuer, Nuosu, Navajo, Nyungar, O'odham, Occitan, Odia (Oriya), Ojibwe, Okinawan, Old Aramaic, Old Norse, Old Prussian, Old Spanish, Old Tupi, Old Turkish, Old French, Old Frisian, Old East Slavic, Old Saxon, Ancient Greek, Orizaba Nahuatl, Ossetian, Ottoman Turkish, Palatine German, Palauan, Pali, Pangasinan, Papiamento, Pennsylvania German, Persian, Phoenician, Picard, Piedmontese, Pipil, Plains Cree, Polish, Portuguese, Pashto, Pulaar, Punjabi (Eastern), Punjabi (Western), Qashqai, Quechua, Quenya, Rapa Nui, Romansh, Rendille, Rohingya, Romani, Romanian, Russian, Rusyn, Kinyarwanda, Samogitian, Sanskrit, Santali, Saraiki, Saterland Frisian, Sardinian, Scottish Gaelic, Swabian, Scots, Sindhi, Northern Sami, Serbian, Setswana, Seychellois Creole, Sango, Shanghainese, Shuswap, Sinhala, Silesian, Sindarin, Sicilian, Slovak, Slovenian, Samoan, Shona, Somali, South Levantine Arabic, Southern Altai, Southern Haida, Southern Kurdish, Southern Sami, Southern Subanen, Southern Zaza (Dimli), Spanish, Southern Sotho, Standard Moroccan Tamazight, Sundanese, Swahili, Sumerian, Sranan Tongo, Swazi, Swedish, Sylheti, Syriac, Tamil, Tachawit, Tagal Murut, Tagalog, Tahaggart Tamahaq, Talossan, Talysh, Tarifit, Tashelhit, Tatar, Telugu, Temuan, Tetun, Tajik, Thai, Tigrinya, Tigre, Turkmen, Tongan, Tok Pisin, Tokelauan, Toki Pona, Tonga (Zambezi), Turkish, Tsonga, Czech, Chukchi, Tumbuka, Tuvaluan, Tuvinian, Tahitian, Uab Meto, Udmurt, Uyghur, Ukrainian, Umbundu, Hungarian, Upper Sorbian, Urdu, Urhobo, Uzbek, Venetian, Veps, Vietnamese, Volapük, Võro, Walloon, Waray, Wayuu, Western Armenian, West Frisian, Mirandese, Belarusian, Wolof, Xhosa, Xiang Chinese, Yoruba, Yucatec Maya, Zaza, Zeelandic, Zulu, Unknown language
File description
Contains all the sentences in the selected language. Each sentence is associated with a unique id and an ISO 639-3 language code.
Fields and structure
Sentence id [tab] Lang [tab] Text
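
As a rough illustration of the layout above, the following Python sketch parses lines of the form id, language code, text. The names `Sentence` and `parse_sentences` and the sample rows are made up for this example; it also assumes the archive has already been extracted and that the text field contains no line breaks.

```python
from typing import Iterable, Iterator, NamedTuple


class Sentence(NamedTuple):
    sentence_id: int
    lang: str  # ISO 639-3 language code, e.g. "eng"
    text: str


def parse_sentences(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield one Sentence per tab-separated line (id, lang, text)."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Split on the first two tabs only, so a tab inside the text survives.
        sentence_id, lang, text = line.split("\t", 2)
        yield Sentence(int(sentence_id), lang, text)


# Made-up sample rows in the documented layout (not real Tatoeba data).
sample = ["1\teng\tHello.\n", "2\tfra\tBonjour.\n"]
sentences = list(parse_sentences(sample))
```

Because `parse_sentences` accepts any iterable of strings, an open file handle can be passed in directly and the file is read lazily.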

Detailed Sentences

Filename

{{sentencesDetailed | filename}}

All languages
Only sentences in: any single language from the list given for the Sentences file
File description
Contains additional fields for each sentence (owner name, date created/modified).
Fields and structure
Sentence id [tab] Lang [tab] Text [tab] Username [tab] Date added [tab] Date last modified

Original and Translated Sentences

Filename
sentences_base.tar.bz2
File description
Each sentence is listed as original or a translation of another. The "base" field can have the following values:
  • zero: The sentence is original, not a translation of another.
  • greater than zero: The id of the sentence from which it was translated.
  • \N: Unknown (rare).
Fields and structure
Sentence id [tab] Base field
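
The three possible values of the base field can be handled as in this small Python sketch. The function names `parse_base` and `describe` are hypothetical, and the sketch assumes the unknown marker appears as the two literal characters backslash and N.

```python
from typing import Optional


def parse_base(field: str) -> Optional[int]:
    """Return 0 for an original sentence, the source id for a
    translation, and None when the base is unknown."""
    if field == r"\N":  # assumed: literal backslash followed by N
        return None
    return int(field)


def describe(line: str) -> str:
    """Summarize one 'Sentence id [tab] Base field' line."""
    sentence_id, base = line.rstrip("\r\n").split("\t")
    base_id = parse_base(base)
    if base_id is None:
        return "unknown"
    if base_id == 0:
        return "original"
    return f"translation of #{base_id}"
```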

Sentences (CC0)

Filename

{{sentencesCC0 | filename}}

All languages
Only sentences in: Algerian Arabic, Ancient Hebrew, Arabic, Bengali, Berber, Chinese, Welsh, Danish, English, Esperanto, Finnish, French, Hebrew, Hindi, Ho, German, Ido, Interlingua, Italian, Japanese, Jewish Babylonian Aramaic, Jewish Palestinian Aramaic, Yiddish, Kabyle, Cantonese, Karelian, Catalan, Classical Chinese, Klingon, Kven Finnish, Láadan, Ladino, Latin, Ligurian, Middle English, Norwegian Bokmål, Dutch, Nyungar, Old Aramaic, Old Norse, Old Frisian, Ancient Greek, Phoenician, Polish, Portuguese, Russian, Santali, Spanish, Standard Moroccan Tamazight, Swedish, Sylheti, Tachawit, Toki Pona, Czech, Ukrainian, Hungarian, Volapük, Belarusian, Unknown language
File description
Contains all the sentences available under CC0.
Fields and structure
Sentence id [tab] Lang [tab] Text [tab] Date last modified

Links

Filename
links.tar.bz2
File description
Contains the links between the sentences. 1 [tab] 77 means that sentence #77 is a translation of sentence #1. The reciprocal link is also present, so the file also contains the line 77 [tab] 1.
Fields and structure
Sentence id [tab] Translation id
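
Since every link is stored in both directions, a consumer that wants each translation pair only once has to deduplicate. The Python sketch below shows one possible way to do that; `build_translation_index` and `unique_pairs` are names invented for this example, and the sample ids are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_translation_index(lines: Iterable[str]) -> Dict[int, Set[int]]:
    """Map each sentence id to the set of ids it is linked to."""
    index: Dict[int, Set[int]] = defaultdict(set)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        sentence_id, translation_id = (int(f) for f in line.split("\t"))
        index[sentence_id].add(translation_id)
    return index


def unique_pairs(index: Dict[int, Set[int]]) -> Set[Tuple[int, int]]:
    """Keep each reciprocal pair once, ordered as (smaller id, larger id)."""
    return {(a, b) for a, targets in index.items() for b in targets if a < b}


# Illustrative rows: both directions of each link are present.
sample = ["1\t77\n", "77\t1\n", "1\t2481\n", "2481\t1\n"]
index = build_translation_index(sample)
pairs = unique_pairs(index)
```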

Tags

Filename
tags.tar.bz2
File description
Contains the list of tags associated with each sentence. 381279 [tab] proverb means that sentence #381279 has been assigned the "proverb" tag.
Fields and structure
Sentence id [tab] Tag name
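
To look up all sentences carrying a given tag, the file can be inverted into a tag-to-ids mapping. This is only a sketch: `index_by_tag` is a hypothetical helper, and the "greeting" tag in the sample is made up (only "proverb" comes from the description above).

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def index_by_tag(lines: Iterable[str]) -> Dict[str, Set[int]]:
    """Group sentence ids by tag name."""
    by_tag: Dict[str, Set[int]] = defaultdict(set)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Split once, so a tag name containing spaces stays intact.
        sentence_id, tag = line.split("\t", 1)
        by_tag[tag].add(int(sentence_id))
    return dict(by_tag)


# Illustrative rows; the "greeting" tag is invented for this example.
sample = ["381279\tproverb\n", "1\tproverb\n", "1\tgreeting\n"]
tags = index_by_tag(sample)
```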

Lists

Filename
user_lists.tar.bz2
File description
Contains the list of sentence lists.
Fields and structure
List id [tab] Username [tab] Date created [tab] Date last modified [tab] List name [tab] Editable by

Sentences in lists

Filename
sentences_in_lists.tar.bz2
File description
Indicates which sentences belong to which lists. 13 [tab] 381279 means that sentence #381279 belongs to the list with id 13.
Fields and structure
List id [tab] Sentence id

Japanese indices

Filename
jpn_indices.tar.bz2
File description
Contains the equivalent of the "B lines" in the Tanaka Corpus file distributed by Jim Breen. See this page for the format. Each entry is associated with a pair of Japanese/English sentences. Sentence id refers to the id of the Japanese sentence. Meaning id refers to the id of the English sentence.
Fields and structure
Sentence id [tab] Meaning id [tab] Text

Sentences with audio

Filename
sentences_with_audio.tar.bz2
File description
Contains the ids of the sentences, in all languages, for which audio is available. Other fields indicate who recorded the audio, its license and a URL to attribute the author. If the license field is empty, you may not reuse the audio outside the Tatoeba project.
Downloading audio
A single sentence can have one or more audio recordings, each from a different voice. To download a particular recording, use its audio id to compute the download URL. For example, to download the audio with the id 1234, the URL is https://tatoeba.org/audio/download/1234.
Fields and structure
Sentence id [tab] Audio id [tab] Username [tab] License [tab] Attribution URL
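
The URL rule and the license restriction described above could be applied as in the following Python sketch. The names `AudioRecord`, `parse_audio_line`, `download_url` and `is_reusable` are hypothetical, the sample row is made up, and treating `\N` as an empty license is an assumption borrowed from the base-field convention, not something this page states.

```python
from typing import NamedTuple

AUDIO_DOWNLOAD_BASE = "https://tatoeba.org/audio/download/"
# Assumption: an "empty" license is either an empty string or \N.
EMPTY_MARKERS = {"", r"\N"}


class AudioRecord(NamedTuple):
    sentence_id: int
    audio_id: int
    username: str
    license: str
    attribution_url: str


def parse_audio_line(line: str) -> AudioRecord:
    """Parse one line: sentence id, audio id, username, license, URL."""
    fields = line.rstrip("\r\n").split("\t")
    fields += [""] * (5 - len(fields))  # tolerate missing trailing fields
    sentence_id, audio_id, username, license_, url = fields[:5]
    return AudioRecord(int(sentence_id), int(audio_id), username, license_, url)


def download_url(record: AudioRecord) -> str:
    """Build the download URL from the audio id (not the sentence id)."""
    return f"{AUDIO_DOWNLOAD_BASE}{record.audio_id}"


def is_reusable(record: AudioRecord) -> bool:
    """An empty license means no reuse outside the Tatoeba project."""
    return record.license not in EMPTY_MARKERS


# Made-up row: sentence 77 has a recording with audio id 1234.
record = parse_audio_line("77\t1234\talice\tCC BY 4.0\thttps://example.org/alice\n")
```

Here `download_url(record)` yields the same URL as the example in the text, since the made-up row reuses audio id 1234.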

User skill level per language

Filename
user_languages.tar.bz2
File description
Indicates the self-reported skill levels of members in individual languages.
Fields and structure
Lang [tab] Skill level [tab] Username [tab] Details

Users' sentence reviews

Filename
users_sentences.csv
File description
Contains sentences reviewed by users. The value of the review can be -1 (sentence not OK), 0 (undecided or unsure), or 1 (sentence OK). Warning: this data is still experimental.
Fields and structure
Username [tab] Sentence id [tab] Review [tab] Date added [tab] Date last modified
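
One way to use the review values is to tally them per sentence, as in this Python sketch. `tally_reviews` and `net_score` are invented names, the net score (OK votes minus not-OK votes) is just one possible summary, and the usernames and dates in the sample are made up since the date format is not specified here.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def tally_reviews(lines: Iterable[str]) -> Dict[int, Counter]:
    """Count how often each review value (-1, 0, 1) was given per sentence."""
    tallies: Dict[int, Counter] = defaultdict(Counter)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Only the first three fields are needed; the dates are ignored.
        _username, sentence_id, review = line.split("\t")[:3]
        tallies[int(sentence_id)][int(review)] += 1
    return tallies


def net_score(counts: Counter) -> int:
    """Reviews marked OK minus reviews marked not OK."""
    return counts[1] - counts[-1]


# Made-up rows; the date strings are placeholders.
sample = [
    "alice\t77\t1\t2020-01-01\t2020-01-01\n",
    "bob\t77\t1\t2020-01-02\t2020-01-02\n",
    "carol\t77\t-1\t2020-01-03\t2020-01-03\n",
    "alice\t78\t0\t2020-01-04\t2020-01-04\n",
]
tallies = tally_reviews(sample)
```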

Transcriptions

Filename

{{transcriptions | filename}}

All languages
Only sentences in: Chinese, Japanese, Cantonese, Uzbek
File description
Contains all transcriptions in auxiliary or alternative scripts. A username associated with a transcription indicates the user who last reviewed and possibly modified it. A transcription without a username has not been marked as reviewed. The script name is defined according to the ISO 15924 standard.
Fields and structure
Sentence id [tab] Lang [tab] Script name [tab] Username [tab] Transcription
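
The reviewed/unreviewed distinction can be read off the username field, as this Python sketch suggests. `Transcription`, `parse_transcription` and `is_reviewed` are hypothetical names, the sample rows are made up, and the sketch assumes a missing username is stored as an empty field.

```python
from typing import NamedTuple


class Transcription(NamedTuple):
    sentence_id: int
    lang: str
    script: str  # ISO 15924 script code, e.g. "Latn" or "Hrkt"
    username: str
    text: str


def parse_transcription(line: str) -> Transcription:
    """Parse one line: sentence id, lang, script, username, transcription."""
    sentence_id, lang, script, username, text = (
        line.rstrip("\r\n").split("\t", 4)
    )
    return Transcription(int(sentence_id), lang, script, username, text)


def is_reviewed(transcription: Transcription) -> bool:
    """Assumed: an unreviewed transcription has an empty username field."""
    return bool(transcription.username)


# Made-up rows: one reviewed by "alice", one not yet reviewed.
reviewed = parse_transcription("10\tjpn\tHrkt\talice\tこんにちは\n")
unreviewed = parse_transcription("11\tcmn\tLatn\t\tni3 hao3\n")
```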