Wall (7,326 threads)

kumakyoo, 25 February 2026, 13:29:32 UTC

Recently, on the sentence https://tatoeba.org/de/sentence...68501#comments, the question came up of how to handle capitalization at the beginning of verse lines. It looks as if the question got lost there, so I am raising it again here.

What is this about? At least in Germany, it is possible in poems or song lyrics to capitalize the beginning of every line, like this:

Auf der Lüneburger Heide,
In dem wunderschönen Land,
Ging ich auf und ging ich unter,
Allerlei am Weg ich fand.

But if you write this out as a single sentence (Auf der Lüneburger Heide, In dem wunderschönen Land, Ging ich auf und ging ich unter, Allerlei am Weg ich fand.), the verse structure is lost, and, at least to me, the capitalization of In, Ging, and Allerlei looks rather odd.

So my question is whether there are already ideas on how best to deal with this. If not, my suggestion would be to separate the verses by inserting a slash (/). As far as I know, this is also common elsewhere, for example when one wants to save space. Like this: Auf der Lüneburger Heide, / In dem wunderschönen Land, / Ging ich auf und ging ich unter, / Allerlei am Weg ich fand.

What do you think?

Pfirsichbaeumchen, 25 February 2026, 13:58:12 UTC

Since Tatoeba sentences are written on a single line, I use normal capitalization and separate the verses with slashes, with a non-breaking space before each slash and a normal space after it:

Auf der Lüneburger Heide, / in dem wunderschönen Land, / ging ich auf und ging ich unter, / allerlei am Weg ich fand.
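
The separator convention described in this reply can be sketched in a few lines of Python. This is only an illustration: the function name `join_verses` is hypothetical, and the sketch assumes the verse arrives as a list of lines. It deliberately leaves capitalization alone, because deciding whether a line-initial word is a noun (and must stay capitalized in German) needs a human.

```python
NBSP = "\u00a0"  # non-breaking space, placed before each slash


def join_verses(lines):
    """Join verse lines into one line, separated by NBSP + "/" + space.

    Hypothetical helper: it only inserts separators and does not
    change the capitalization of any word.
    """
    cleaned = [line.strip() for line in lines if line.strip()]
    return (NBSP + "/ ").join(cleaned)


verse = [
    "Auf der Lüneburger Heide,",
    "in dem wunderschönen Land,",
    "ging ich auf und ging ich unter,",
    "allerlei am Weg ich fand.",
]
print(join_verses(verse))
```

The non-breaking space keeps a slash from being wrapped onto the start of a new line, away from the verse it closes.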

maaster, 21 February 2026, 13:53:07 UTC

I'm wondering why Horus links sentences.
On sentence #857032, there was a wrong link made by Horus.
How can that be?

marafon, 21 February 2026, 14:55:33 UTC

It was you who added "Akarod, hogy segítsek?" as a translation of #4697681 (Voulez-vous que je vous aide ?). Horus just merged the duplicate sentences afterwards.

See https://tatoeba.org/en/sentences/show/4699053

Logs
Akarod, hogy segítsek?
added by maaster, November 13, 2015

#4697681
linked by maaster, November 13, 2015

#4697681
unlinked by Horus, December 24, 2016

Akarod, hogy segítsek?
deleted by Horus, December 24, 2016
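
To illustrate why a link can end up attributed to a merge step, here is a rough Python sketch of merging duplicate sentences. It is not Tatoeba's actual implementation: the function `merge_duplicates`, the data layout, and the sentence IDs 1, 2, and 3 are all hypothetical. The assumption is that duplicates (same language, same text) are collapsed onto the lowest ID and that their links are moved to the surviving sentence.

```python
def merge_duplicates(sentences, links):
    """Collapse duplicate sentences and re-point their links.

    sentences: dict mapping id -> (lang, text)
    links: set of (id_a, id_b) pairs
    Returns (kept_sentences, new_links). Hypothetical data model.
    """
    canonical = {}  # (lang, text) -> lowest id carrying that text
    for sid in sorted(sentences):
        canonical.setdefault(sentences[sid], sid)

    remap = {sid: canonical[sentences[sid]] for sid in sentences}
    kept = {sid: val for sid, val in sentences.items() if remap[sid] == sid}

    new_links = set()
    for a, b in links:
        a2, b2 = remap[a], remap[b]
        if a2 != b2:  # drop links that would point a sentence at itself
            new_links.add((a2, b2))
    return kept, new_links


sentences = {
    1: ("hun", "Akarod, hogy segítsek?"),
    2: ("fra", "Voulez-vous que je vous aide ?"),
    3: ("hun", "Akarod, hogy segítsek?"),  # duplicate of 1, linked by a user
}
kept, new_links = merge_duplicates(sentences, {(3, 2)})
print(kept)
print(new_links)
```

In this sketch, the link a user created on the duplicate (3) reappears on the surviving sentence (1), which is how a user-made link can look as if the merging account had created it.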

maaster, 22 February 2026, 13:05:30 UTC

Oh, really :)

CK, 19 February 2026, 09:54:44 UTC

🍎 I generated word frequency lists for each language.

https://aitstudy.com/temp/wordcounts/

These are all TSV files (word + tab + count), compressed with bz2.
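
A minimal sketch of reading such a file in Python, using only the standard library. The function name `read_wordcounts` is hypothetical, and the sketch assumes UTF-8 text with no header row, which the post does not state. The demo writes a tiny stand-in file instead of downloading anything.

```python
import bz2
import os
import tempfile


def read_wordcounts(path):
    """Read a bz2-compressed TSV of `word<TAB>count` lines into a dict.

    Assumes UTF-8 and no header row (both are assumptions).
    """
    counts = {}
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            word, sep, count = line.rstrip("\n").rpartition("\t")
            if not sep:  # skip blank or malformed lines without a tab
                continue
            counts[word] = int(count)
    return counts


# Demo with a tiny stand-in file; the real file names are not assumed here.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "sample_wordcounts.tsv.bz2")
    with bz2.open(sample, "wt", encoding="utf-8") as handle:
        handle.write("the\t120\nof\t75\n")
    print(read_wordcounts(sample))
```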

cafoc64474, 20 February 2026, 07:21:00 UTC (edited 20 February 2026, 07:52:36 UTC)

I see you created a site named AI study.

You could create dictionaries between languages based on how many times words appear in translation pairs.
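
As a rough illustration of this idea, the Python sketch below counts how often each source word appears together with each target word across translation pairs. The function name `cooccurrence_counts` and the three example pairs are made up for the illustration, and the whitespace tokenization is deliberately naive.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(pairs):
    """Count, for every source word, the target words it co-occurs with.

    pairs: iterable of (source_sentence, target_sentence) strings.
    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    table = defaultdict(Counter)
    for source, target in pairs:
        target_words = set(target.lower().split())
        for word in set(source.lower().split()):
            table[word].update(target_words)
    return table


pairs = [
    ("the cat sleeps", "die Katze schläft"),
    ("the cat eats", "die Katze frisst"),
    ("a dog sleeps", "ein Hund schläft"),
]
table = cooccurrence_counts(pairs)
print(table["cat"].most_common(2))
print(table["sleeps"].most_common(1))
```

In this toy data, "die" and "katze" tie as candidates for "cat" with two co-occurrences each. Raw counts favor frequent function words, so a usable dictionary would need some normalization on top of this.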

You could also use English as a pivot language for sentences (although this is not 100% accurate) to build an even bigger list of sentence pairs between languages.
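
Pivoting can be sketched as a join on the shared English sentence. The function name `pivot_pairs` and the example data are hypothetical; this is an illustration of the idea, not a description of any existing tool.

```python
from collections import defaultdict


def pivot_pairs(source_to_pivot, pivot_to_target):
    """Build (source, target) pairs that share the same pivot sentence.

    source_to_pivot: iterable of (source_sentence, pivot_sentence)
    pivot_to_target: iterable of (pivot_sentence, target_sentence)
    """
    by_pivot = defaultdict(set)
    for pivot, target in pivot_to_target:
        by_pivot[pivot].add(target)

    result = set()
    for source, pivot in source_to_pivot:
        for target in by_pivot.get(pivot, ()):
            result.add((source, target))
    return result


german_english = [("Hallo!", "Hello!")]
english_french = [("Hello!", "Bonjour !"), ("Hello!", "Salut !"), ("Bye!", "Au revoir !")]
print(sorted(pivot_pairs(german_english, english_french)))
```

The inaccuracy mentioned in the post comes from ambiguity in the pivot: an English sentence with "you", for example, can be linked to both singular and plural forms in other languages, and the join pairs them all.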

You could also train translation models (for example, MarianMT).

I'm sorry if I bothered you.
Currently, I don't have time to do these types of things myself.

LeviHighway, 21 February 2026, 03:06:35 UTC

I saw you made a proper version for Japanese. I would like to help you make proper versions for Chinese and Korean as well. Can we work together on this?

Alex_M, 9 February 2026, 07:01:11 UTC (edited 9 February 2026, 07:02:13 UTC)

A question about an index list.
I want to see an alphabetical list of, say, German words. Not sentences, just words.
When I notice a word that I do not know and wish to learn, I would click on it and see a German sentence containing it.
I have a book with pairs of sentences, and it includes an index for each of the two languages with the corresponding page numbers. It is very convenient for finding and learning unknown words.
Is there such functionality on the website or via the API?

AlanF_US, 11 February 2026, 13:40:21 UTC

No, there is no such functionality on the website or via the API.

Alex_M, 12 February 2026, 06:43:15 UTC

Thank you for the information.
Meanwhile, I found alphabetical index lists online. For German, for example: https://de.wiktionary.org/wiki/...l:Präfixindex/
But this index is overwhelming as it includes all the words including rare terms.
It would be interesting to have an index list of words which native speakers use in the sentences.

gillux, 14 February 2026, 11:01:12 UTC

I believe your idea would be very useful, but it might be a bit out of the scope of Tatoeba. What I mean is that Tatoeba focuses mainly on building the corpus and making it available to the world, while others can build upon this resource to make something more specific, such as an alphabetical list of German words.

That being said, the search engine that powers tatoeba.org does have word indexes for every language. There is a simple tool to dump these indexes, so we can consider exporting them as text files. However, because the purpose of these indexes is only to allow very fast retrieval of sentences, this data might be a bit too "raw" to be directly usable by language learners. But please let me know if you or anyone else is interested in such dumps.

Alex_M, 14 February 2026, 22:18:53 UTC

Thank you for your input. For me personally a text file would not be useful. What I had in mind is the list of distinct words on an HTML page, where a word could be clicked to see the sentences in which it is used.

And it would be nice to have a possibility to sort it in alphabetical order and in frequency order.
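
The two orderings asked for here are straightforward once the counts exist. A small Python sketch, with hypothetical function names and made-up counts:

```python
def sort_alphabetically(counts):
    """Return the words in case-insensitive alphabetical order."""
    return sorted(counts, key=str.casefold)


def sort_by_frequency(counts):
    """Return the words from most to least frequent, ties alphabetical."""
    return sorted(counts, key=lambda word: (-counts[word], word.casefold()))


counts = {"Zug": 5, "aber": 10, "Haus": 3}  # made-up counts
print(sort_alphabetically(counts))
print(sort_by_frequency(counts))
```

Note that `str.casefold` gives a plain code-point ordering, not a language-aware collation, so words beginning with umlauts sort after "z" in this sketch.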

Certainly, it would be a complex task. For example, there is the German word Zugeständnis, but the plural form of this word is Zugeständnisse. I am not sure whether both variants should be in such a list or only one. And what if a word exists in the database with a spelling error? That is to say, there would certainly be the issue of rawness that you mentioned.

I asked about this index list only because I try to learn new words in a sentence. And it would be easier for me to spot in such a list words which I do not know yet. Especially if it is a word with high frequency.

kumakyoo, 15 February 2026, 09:53:01 UTC

I recently wrote a small PHP script that extracts a list of Greek words from the Tatoeba data dump and records, for each word, how often it occurs in the corpus. (My aim was to sort the sentences in the corpus by "simplicity", on the idea that frequent words are easier than rare ones. That worked quite well, even if it is not perfect.)
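
The idea in this paragraph can be sketched in Python as follows. This is not the PHP script itself: the function names are hypothetical, and the scoring rule (a sentence is as simple as its rarest word) is an assumption, since the post does not say how simplicity was computed.

```python
import re
from collections import Counter

WORD = re.compile(r"\w+")  # Unicode-aware in Python 3, so Greek letters match


def tokenize(text):
    """Split a sentence into lowercase word tokens."""
    return WORD.findall(text.lower())


def count_words(sentences):
    """Count how often each word occurs across all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(tokenize(sentence))
    return counts


def sort_by_simplicity(sentences, counts):
    """Order sentences so that those with only frequent words come first.

    Assumed scoring rule: the score of a sentence is the corpus count
    of its rarest word.
    """
    def score(sentence):
        return min((counts[word] for word in tokenize(sentence)), default=0)

    return sorted(sentences, key=score, reverse=True)


corpus = ["Το σπίτι είναι μεγάλο.", "Το σπίτι είναι εδώ.", "Το σπίτι είναι μεγάλο και παλιό."]
counts = count_words(corpus)
print(sort_by_simplicity(corpus, counts))
```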

For your purpose, the program could be used to create a corresponding word list for the German corpus (possibly limited to the 10,000 most frequent words). The output could be an HTML page containing links to the Tatoeba search, so that one click gives you sentences with that word. My program cannot take different forms of a word (Zugeständnis/Zugeständnisse) into account, so it produces two entries for them. But, as far as I know, the Tatoeba search can do that for some languages, and German was among them.
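
A minimal Python sketch of such an HTML output, under stated assumptions: the function name `render_word_list` is hypothetical, and the search URL pattern and its `from` and `query` parameters are assumed rather than taken from this thread, so they should be checked against the live site.

```python
import html
from urllib.parse import urlencode

# Assumed URL pattern for the Tatoeba search; verify before relying on it.
SEARCH_URL = "https://tatoeba.org/en/sentences/search"


def render_word_list(counts, lang, limit=10000):
    """Render the `limit` most frequent words as an HTML list of search links.

    counts: dict mapping word -> count
    lang: ISO 639-3 language code, e.g. "deu"
    """
    top = sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:limit]
    rows = []
    for word, count in top:
        query = urlencode({"from": lang, "query": word})
        href = html.escape(f"{SEARCH_URL}?{query}", quote=True)
        rows.append(f'<li><a href="{href}">{html.escape(word)}</a> ({count})</li>')
    return "<ol>\n" + "\n".join(rows) + "\n</ol>"


print(render_word_list({"Haus": 2, "und": 5, "Zugeständnis": 1}, "deu", limit=2))
```

`urlencode` percent-encodes non-ASCII characters such as umlauts, and `html.escape` protects the `&` between query parameters inside the attribute value.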

I don't think it would be much work for me to adapt the program accordingly. So I could offer to create such an HTML page and send it to you by e-mail (just send me a private message with your e-mail address and I'll do it). I might also rework the program so that I can make it available to everyone on GitHub; we'll see...

Alex_M, 15 February 2026, 12:33:50 UTC

Thank you very much! Of course you can send me the link by private message. Or perhaps you could post it here, so that other members can look at the list too?

kumakyoo, 15 February 2026, 17:47:31 UTC (edited 15 February 2026, 18:44:57 UTC)

I've uploaded the program to GitHub: https://github.com/kumakyoo42/tatoeba_stuff

The program itself is called "count_words.php". It requires the sentences file from https://tatoeba.org/de/downloads (sentences.tar.bz2), which must be unpacked. As an example, I have also uploaded the top 10,000 German words (top10000_deu.html).

I hope this is roughly what you were looking for.

Alex_M, 16 February 2026, 17:23:27 UTC

It is exactly what I need! Plus, there is a number near each word which shows how many times it was used in the sentences.

It works for every language (the language code to be used is ISO 639-3, i.e. the 3-letter code, not the 2-letter one).

I changed the script a little for myself: I removed the conversion to lowercase. It's easier for me to distinguish nouns this way.

I've already found a word which I did not know, and I could click on it in the list and see some sentences with it. Incredible programming! Thank you!

Alex_M, 17 February 2026, 10:52:58 UTC

I created three HTML pages based on the PHP script from https://github.com/kumakyoo42/tatoeba_stuff

These are frequency lists for German, English, and French languages, twenty thousand words in each:
https://labellechose.ch/frequency-lists/deu.html
https://labellechose.ch/frequency-lists/eng.html
https://labellechose.ch/frequency-lists/fra.html

Click on a word to display the sentences containing it; click on the arrow to display the definition from the TFD dictionary.

These pages were created for personal use. They are hosted on a self-hosted mini-server which is not always online. The links are published for demonstration only, as part of this discussion.

araneo, 17 February 2026, 22:01:59 UTC

Thank you both!

CK, 19 February 2026, 07:58:44 UTC (edited 19 February 2026, 08:01:29 UTC)

Here are English and Japanese TSV files that you can download.

https://aitstudy.com/temp/

eng_wordcounts 2026-02-19.tsv
eng_wordcounts-lemmatized 2026-02-19.tsv

Vocabulary size: 76704
Lemmatized vocabulary size: 67092

jpn_wordcounts 2026-02-19.tsv
jpn_wordcounts-lemmatized 2026-02-19.tsv

Vocabulary size: 47632
Lemmatized vocabulary size: 42192
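
The lemmatized vocabulary is smaller because inflected forms are folded into one entry. The Python sketch below shows the effect on made-up counts. The function `lemmatize_counts` and the hand-written lemma table are hypothetical stand-ins; the post does not say which lemmatizer produced the files above.

```python
from collections import Counter


def lemmatize_counts(counts, lemma_of):
    """Merge word counts that share a lemma.

    counts: dict mapping word -> count
    lemma_of: dict mapping an inflected form -> its lemma (hand-written
    stand-in for a real lemmatizer); unknown words map to themselves.
    """
    merged = Counter()
    for word, count in counts.items():
        merged[lemma_of.get(word, word)] += count
    return merged


counts = {"apple": 4, "apples": 3, "ran": 2, "run": 5}  # made-up counts
lemma_of = {"apples": "apple", "ran": "run"}
merged = lemmatize_counts(counts, lemma_of)
print(len(counts), len(merged))
print(dict(merged))
```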
