Wall (7,147 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

sharptoothed, August 31, 2025 at 2:55:48 PM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

frpzzd, August 18, 2025 at 7:22:04 PM UTC (edited August 18, 2025 at 11:05:19 PM UTC)

Hey all! Wanted to share that I recently wrote some scripts that help find sentences to translate based on vocab lists. Taking as input the Tatoeba corpus, a tiered vocab list (e.g. words ranked by the textbook chapter in which they appear), and some limited grammar concepts (cases, parts of speech, verb forms, etc.), they generate tiered lists of sentences such that the sentences at level N only use the vocab/grammar at level N or lower.

I've been working through a Russian textbook and using the scripts to find sentences here that are appropriate for attempting to translate into English/Spanish based on the chapter I'm on. My scripts use the spaCy Python library to rank sentences by chapter. The library's lemmatization and grammar annotation are sometimes flawed, but so far it has been a huge help for me. (Of course, my translation attempts can also be very flawed, as the Russian natives here know... thank you marafon, Ooneykcall and EugeneGS for your patience! :-) )

As an example, here are some sentences (as of the current data dumps) at chapter 10 level based on my book:

1006495 Да, у меня есть дочь.
1006500 Вы изучаете английский.
10074251 У нас всё свежее.
10101554 Вода слишком холодная.
10110989 Кофе слишком горячий.

And here are some at chapter 20 level:

10031177 Интересно, пойдёт ли снег.
10031179 Интересно, будет ли снег.
10038848 Ничего странного я в этом не вижу.
10043008 Я ушёл домой пешком.
10055872 Он бежал в школу.

Just wanted to share this info here in case any other learners would find this helpful for their own studies. Whether you intend to translate sentences found this way or not, it can be helpful for finding lots of example sentences at your level when working through a textbook. If this sounds useful to you, feel free to reply / send me a message and I'll be happy to try generating ranked translated/untranslated sentence lists for your choice of language too.
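
For readers curious how the tiering rule described above might work, here is a minimal sketch. It is not the author's script: it matches plain lowercase word forms instead of spaCy lemmas, ignores grammar, and all function names and toy data are made up for illustration. A sentence is assigned to the lowest level at which every one of its words has been introduced, and is dropped if any word is missing from the vocab list.

```python
import re


def build_word_levels(tiers):
    """Map each word to the lowest level at which it is introduced."""
    levels = {}
    for level in sorted(tiers):
        for word in tiers[level]:
            levels.setdefault(word.lower(), level)
    return levels


def sentence_level(sentence, word_levels):
    """Return the lowest level covering every word, or None if a word is unknown."""
    words = re.findall(r"\w+", sentence.lower())
    if not words:
        return None
    try:
        return max(word_levels[w] for w in words)
    except KeyError:
        return None


def rank_sentences(sentences, tiers):
    """Group sentence IDs by level; sentences with unknown words are skipped."""
    word_levels = build_word_levels(tiers)
    ranked = {}
    for sentence_id, text in sentences.items():
        level = sentence_level(text, word_levels)
        if level is not None:
            ranked.setdefault(level, []).append(sentence_id)
    return ranked


# Toy data (hypothetical, not taken from any textbook or from the corpus).
tiers = {1: ["i", "have", "a", "cat"], 2: ["the", "water", "is", "cold"]}
sentences = {
    101: "I have a cat.",
    102: "The water is cold.",
    103: "I have the cat.",
    104: "The dog is cold.",  # "dog" is not in the vocab list
}
print(rank_sentences(sentences, tiers))  # {1: [101], 2: [102, 103]}
```

A real version for an inflected language like Russian would need to compare lemmas rather than surface forms, which is presumably where spaCy comes in.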

gillux, August 20, 2025 at 11:03:21 AM UTC

That’s amazing, Franklin, thank you for telling us. As I understand, you are making a curated list of Tatoeba sentences organized by textbook level. Do you plan to make your list available on Tatoeba or elsewhere?

frpzzd, August 20, 2025 at 1:53:16 PM UTC

That is correct!

The sentences are selected programmatically, and I could make them available as lists on Tatoeba (I have the sentence IDs for each list), but I'm not aware of a feature on the site that lets users build lists other than by manually clicking to add each sentence. Right now, the Russian sentences ranked by my textbook number over 100,000, so of course I cannot add them manually this way. As far as I understand, the Tatoeba API is also read-only for the moment. Is there a way to form lists by uploading (long) lists of sentence IDs that I'm not aware of?

In any case, I may post some of the lists corresponding to my particular textbook on GitHub. But I wanted to mention it here in case anyone is similarly working through another textbook for any other language and could also make use of ranked sentences. I am currently only using this for Russian, but it would be easy to do for many other languages, provided a vocab list.

EugeneGS, August 20, 2025 at 5:58:56 PM UTC (edited August 20, 2025 at 9:03:31 PM UTC)

Hello, Franklin! I'm not sure if there's currently a way to upload lists directly on Tatoeba, but I once made a small program for myself that can automatically add sentences to a list by their IDs. If you'd like, you could share your sentence ID lists somewhere (for example on GitHub), and I could help import them into Tatoeba lists. (By the way, the program is rather slow — in my testing it processes about a thousand sentences per hour, so a large list would take some time.)

frpzzd, August 21, 2025 at 3:17:14 PM UTC

Hi Eugene, that would be fantastic! Here's a link to a TSV file in which the first column contains IDs of Tatoeba sentences in Russian, and the second column contains the "rankings", which are numbers 3-29 corresponding to chapters in the book. Ideally each chapter would get its own list.

https://gist.githubusercontent...._sentences.tsv

I haven't added all of the book's vocab to my input data yet, since that is a slow-going process. For that reason, from chapter 24 onwards, ranking is for now based purely on grammar (as estimated by spaCy) and not on vocab.
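
Splitting a file in that two-column layout into one ID list per chapter could look roughly like the sketch below. It assumes tab-separated rows of `sentence_id<TAB>chapter` with integer values and no header row (rows that don't start with a number are skipped); the function name is hypothetical. The sample rows reuse IDs quoted earlier in the thread for chapters 10 and 20.

```python
import csv
import io
from collections import defaultdict


def group_ids_by_chapter(tsv_file):
    """Read (sentence_id, chapter) rows and return {chapter: [sentence_id, ...]}."""
    chapters = defaultdict(list)
    for row in csv.reader(tsv_file, delimiter="\t"):
        # Skip blank lines and anything that does not start with a numeric ID.
        if len(row) < 2 or not row[0].strip().isdigit():
            continue
        chapters[int(row[1])].append(int(row[0]))
    return dict(chapters)


# In-memory stand-in for the TSV file; a real run would open the downloaded file.
sample = io.StringIO("1006495\t10\n1006500\t10\n10031177\t20\n")
lists = group_ids_by_chapter(sample)
for chapter in sorted(lists):
    print(f"Chapter {chapter}: {lists[chapter]}")
# Chapter 10: [1006495, 1006500]
# Chapter 20: [10031177]
```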

EugeneGS, August 21, 2025 at 6:31:10 PM UTC

Okay! I've started it. It will take a few days (around 4–6, maybe a bit more). Here is the link to all the lists:

https://gist.github.com/kilsens...4a27b9658e5e55

I'll update it with new lists as they appear. If you need, I can also rename the lists.
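
As a rough sanity check on that estimate, using only figures mentioned in this thread (a little over 100,000 ranked sentences, about a thousand sentences per hour) and assuming the program runs around the clock, the arithmetic comes out at a bit more than four days. The helper name is made up for illustration.

```python
def estimated_days(n_sentences, sentences_per_hour):
    """Days of continuous running needed to process n_sentences."""
    return n_sentences / sentences_per_hour / 24


print(round(estimated_days(100_000, 1_000), 1))  # 4.2
```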

frpzzd, August 21, 2025 at 6:42:18 PM UTC

Thank you very much, this is fantastic! This will be a huge help. Previously I was copying over the IDs into the URL, which was a very clunky workflow.

sharptoothed, August 17, 2025 at 10:03:27 AM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

felix63, August 15, 2025 at 8:46:45 AM UTC

Happy birthday, Lisa! 🎂

marafon, August 15, 2025 at 12:50:59 PM UTC

Better late than never. Happy birthday, Lisa!

Pfirsichbaeumchen, August 16, 2025 at 5:21:05 PM UTC

Thank you very much! 😊

Happy birthday to you too, Christian! 🎉🎊🎈🎂

marafon, August 16, 2025 at 5:39:04 PM UTC

I join in the congratulations!

August 10, 2025 at 8:59:26 PM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

AmarMecheri, August 14, 2025 at 4:18:17 PM UTC

I don't understand why my message violates the Tatoeba.org charter. Please explain.
Thanks anyway for agreeing to distribute it to the interested party and the administrators.

Je ne comprends pas pourquoi mon message enfreindrait la charte de Tatoeba.org. Il serait bon de nous expliquer.
Merci quand même d'avoir accepté de le distribuer à l'intéressé et aux administrateurs.

Shishir, August 14, 2025 at 7:02:59 PM UTC

Hi Amar, I think you made a mistake: your message was not hidden, it's right under this one. The hidden message you replied to was actually a Vietnamese advertisement.

AmarMecheri, August 14, 2025 at 7:59:27 PM UTC

@Shishir
Thank you so much for your useful explanation.

AmarMecheri, August 13, 2025 at 10:52:06 PM UTC

Gma-tneɣ @Hanafi Michelet yessuter-d deg-i tallalt. Dacu, teẓram-iyi tura ulac maḍi tazmert. Ssarameɣ wiggad uɣur ara d-yesteqsi ad ssikden acu zemren ad tgen akken ad s-fken afus. Tanemmirt-nwen. Afud igerrzen akken ma tellam.

Our friend @Hanafi Michelet asked for my support, and I thank him for it. Unfortunately, my health does not allow me to help him. That is why I would like a member of the Kabyle team to kindly help him should he get in touch.