Aiji, February 17, 2020 at 3:18 AM, edited February 17, 2020 at 3:22 AM

What's New on Tatoeba? - Your weekly recap n°4


※ The implementation of the new design continues, this time with the favorites and adoption features! Thanks to TRANG for her work.

※ Improved cleaning of contributions by removing all Unicode whitespace characters. Contributions contain a variety of Unicode whitespace characters, which cause unnecessary problems for duplicate detection and for other script-based applications. Thanks to TRANG and CK for first reporting this and to AndiPersti (on GitHub) for implementing the cleaning method.
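
To illustrate why stray whitespace characters interfere with duplicate detection, here is a minimal Python sketch. It is not Tatoeba's actual cleaning code, whose details are not given in this recap; the function name `normalize_whitespace` is hypothetical, and the sketch assumes that collapsing every run of Unicode whitespace into a single ASCII space is an acceptable normalization.

```python
def normalize_whitespace(sentence: str) -> str:
    """Collapse every run of Unicode whitespace into one ASCII space.

    Hypothetical helper for illustration only; not Tatoeba's implementation.
    str.split() with no arguments splits on any character for which
    str.isspace() is true, which includes U+00A0 (no-break space) and
    U+3000 (ideographic space), and it drops leading/trailing whitespace.
    """
    return " ".join(sentence.split())


# Two contributions that look identical but use different space characters.
a = "More than twenty people."
b = "More\u00a0than\u3000twenty people.\u2003"

same_before = a == b  # compared as raw strings
same_after = normalize_whitespace(a) == normalize_whitespace(b)
```

Note that zero-width characters such as U+200B are not whitespace in Python's sense, so a real cleaner would have to handle them separately.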

※ Improved consistency of settings and tooltip strings. Thanks to Aiji for the modifications.

※ gillux implemented some optimizations and fixes in Tatoeba's back-end code. Among other things, you may see the homepage load faster than before! Thanks to him for the work.

※ gillux also improved the way furigana are generated and displayed on Japanese sentences.

※ Yorwba wrote a script to have language names displayed in the interface language even if they are not translated on Transifex yet! Thanks to him for the work.


※ Ricardo opened another discussion about the translation of Tatoeba's interface.

※ CK gave various stats and URLs to search for sentences.

※ Thanuir asked for some French sentences about the adjective "borné" in a mathematical context.

※ Ricardo asked for some French sentences about traveling.

※ Ergulis was wondering how a sentence becomes an orphan.


20,218 sentences added this week (from one export to another). You can check daily activity on this page.


If you'd like to help with the development of Tatoeba, report issues, or are just curious, have a look at the GitHub repository.

If you want to help us translate the website into your language, you can join us on Transifex and check this article on the wiki.


Fun fact: In the Oxford English Dictionary, the letter "E" accounts for about 11% of the letters. The least frequent one is "Q".

Last week's recap:
See this recap on the blog:

Ricardo14, February 17, 2020 at 6:40 AM

Thanks a lot for this wonderful Tatoeba recap! :D

CK, February 15, 2020 at 12:22 PM, edited February 15, 2020 at 12:23 PM

I noticed a change in the copy-and-pasting.
I wonder if this was an intentional change.

Before, when we copied translations, we got the 3-letter code for the language.

See this comment for an example.

Now, we don't.

See this comment for an example.

I think having the 3-letter code was useful, especially when multiple languages were involved, and when viewing in the comments stream, rather than on each individual page.

AlanF_US, February 15, 2020 at 1:39 PM

I always found that the three-letter code got in my way.

Aiji, February 15, 2020 at 2:06 PM

And I've never understood why people were copy-pasting like that.

However, I still get this behavior. I'm using Firefox. See:

More than twenty people.
Більш двадцяти людей.
Plus que 20 personnes.
Più di venti persone.

AlanF_US, February 15, 2020 at 2:28 PM

I often copy-paste sentences when I write comments to suggest modifications to a sentence. I use the "copy" button to get the sentence into the clipboard, paste it twice, and then modify the second occurrence (which is easier than typing it all).

When the "copy" button was introduced, I stopped using Firefox's built-in functions to select and copy sentences, so the behavior no longer annoyed me. However, it probably surprises people who don't use that button.

CK, February 17, 2020 at 3:58 AM

It still works the way it used to on my Mac with Firefox, but not with Chrome, Safari, or Opera.

cojiluc, November 16, 2018 at 8:55 AM, edited November 16, 2018 at 8:59 AM

I think it would be useful to have a mechanism by which native speakers of a language could be notified (if they wish) when a vocabulary item (i.e., a wanted sentence) in their language is added.

In my experience, most vocabulary items go unnoticed unless native speakers regularly check the list.

Amastan, November 16, 2018 at 9:05 AM

I have had a look at the vocabulary items that need sentences and most of them are relatively rare words. It's quite hard even for a native speaker to make sentences with such words. Such words are sometimes rare even in the largest online corpora.

soliloquist, November 16, 2018 at 7:31 PM

I find this site useful for example sentences.

Unfortunately, it's monolingual.

tetuan, February 16, 2020 at 6:05 PM, edited February 16, 2020 at 6:08 PM

Hi, similar to sentencedict, I also use , which is very useful for finding sentences used in context. The site also has some exercises to improve academic English vocabulary...

Ricardo14, November 17, 2018 at 9:12 AM

I've opened an issue in the bugtracker

Ergulis, February 16, 2020 at 12:08 PM

Hello friends,

I was wondering how a sentence actually becomes an orphan. Is it because someone deleted their account?

Sorry for the basic question, which may already be answered in the FAQ or somewhere else; I may have overlooked it or not read it at all.

CK, February 16, 2020 at 12:10 PM, edited February 16, 2020 at 12:11 PM

1. People can release their sentences.

A number of members do this when they contribute in a non-native language, so a native speaker can adopt them.

2. Many of the English and Japanese "orphans" are sentences imported from the Tanaka Corpus.

Ergulis, February 16, 2020 at 12:15 PM

Thanks for the explanation.

Aiji, February 16, 2020 at 12:59 PM

For reference:

CK, February 16, 2020 at 12:06 PM

We now have 678,855 sentences with audio, up 16,187 since December 21, 2019.

Slightly over 8% of our sentences have audio.

678,855 / 8,161,223
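
For anyone who wants to check the figures, the arithmetic can be reproduced with a few lines of Python. This is only a sketch using the numbers quoted in this post; the variable names are made up for illustration, and the December count is inferred by subtraction rather than taken from the earlier screenshot.

```python
with_audio = 678_855       # sentences with audio (from the post)
total = 8_161_223          # total sentences (from the post)
added_since_dec = 16_187   # increase since December 21, 2019 (from the post)

# Share of all sentences that have audio, as a percentage.
share_percent = 100 * with_audio / total

# Inferred count on December 21, 2019 (assumes "up 16,187" is a plain difference).
previous_with_audio = with_audio - added_since_dec

print(f"{share_percent:.2f}% of sentences have audio")
print(f"{previous_with_audio:,} sentences had audio on December 21, 2019")
```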

Screenshot (2020-02-16 12:01 UTC)

Previous screenshot (2019-12-21 0:09 UTC)

Previous Wall announcement (2019-12-21)

maaster, February 15, 2020 at 9:19 PM

Mraz, do you actually want to say something, to explain, convey, or communicate something, or is this once again just about dragging down your Hungarian "team" and running to the kindergarten teacher to tell tales? Of course, you have no idea with what; you just want to do harm.

At least try to put a thought into words if you are going to post something on the Wall.

Pfirsichbaeumchen, February 16, 2020 at 1:25 AM

What is this about? You can send me a private message. 🙂

maaster, February 16, 2020 at 5:40 AM

As for me, I don't know any more than you do, because mraz doesn't communicate with us mortals. (The sentence of mine that was quoted on the Wall did not break Tatoeba's rules. In the other quoted one, bandeirante and I were discussing our opinions, on which we disagree. It was simply about the point that sentences on Tatoeba without translations have value.)

mraz, February 15, 2020 at 11:29 AM


Pandaa, February 15, 2020 at 1:09 PM

I keep seeing this from the Hungarian side: we try to figure out the purpose of Tatoeba's main content, or of certain parts of it.
Apparently it isn't working, or we read too much into it and it doesn't deliver.
A few days ago I made the point that we would need a much larger group of contributors, ideally drawn from different age groups, which would then represent the state of the language far better.
Maybe the server couldn't handle it, or maybe that isn't even a goal.

This is a good initiative, since it can include even things you wouldn't write on your own blog and don't come across in the media.
Maybe the initiative will remain just an initiative forever. (Trang will surely give me a slap for that.)

And what if it does? What if hundreds or thousands of sentences lie untranslated until roughly the end of time? What if there is just more and more Tom?

Maybe we read too much into it; I certainly did, right from the first minute.
This is toxic.
Plus, I could pick out a few more from these:
"We do not tolerate

Insults: saying something offensive about someone or some people.
Harassment: bothering someone repeatedly.
Accusations: stating that someone is doing something with bad intentions.
Blaming: saying that a problem is due to someone’s fault.
Provocation: writing something that intentionally makes other people angry.
Retaliation: replying to insults, harassment, accusations, blaming or provocation with something that doesn't help.
Bad faith: lying, deceiving, being dishonest.
Sabotage: intentionally damaging the corpus.

Generally speaking, we do not tolerate any kind of behavior that harms the collaborative and civilized atmosphere of Tatoeba."

Aiji, February 15, 2020 at 2:03 PM

That is a very stupid sentence, for so many reasons that I won't bother listing them here; others will no doubt do it for me. For example, try to "find on Google" more than three sentences in the patois of the Marne, of Savoie, or of Puisaye, and then we'll talk.

But the most obvious point is that for there to be translations, there have to be original sentences. If you are a sad or unimaginative person, you are free to do nothing but translate, but don't try to discourage others from writing them. (Besides, if you think, as has already been said around here, that a sentence is only of interest if it has an English translation, you haven't understood much about languages.)

Finally, taking one's own experience alone as absolute truth never gets one very far, but there I am probably straying a little from the topic.

mraz, February 15, 2020 at 2:40 PM


Pandaa, February 15, 2020 at 3:27 PM

I can point fingers too. #5938461
Does anyone care? No.

TRANG, February 15, 2020 at 4:57 PM

> Does anyone care? No.

Actually I am interested. How did you notice it? I mean, how did you notice that #5938461 was a near-duplicate of #5937698? Was it by coincidence? Or was it because you were browsing sentences by date? Or something else?

Have you noticed a lot of sentences like that, where a sentence is first added with just a pronoun, and then shortly afterwards almost the same sentence is added with "Tom" (or any other name) instead?

Pandaa, February 15, 2020 at 6:38 PM

There are more; I just happened to find this one during a search:

CK likes to copy; that's nothing new.

Thanuir, February 15, 2020 at 6:51 PM

Untranslated sentences are useful too.

1. When you already know a language a little, just seeing a word in the context of a sentence can be enough to tell you what it means and how it works.

2. Figures of speech are often hard to translate, but you can often understand them once you have read them.

3. Nothing prevents other users from translating untranslated sentences later.

4. Machine learning can also make use of untranslated sentences, and Tatoeba can be used as one of its sources.


Archaic language and slang are helpful too, because from them you can learn to read and understand unusual usage. For example, reading dialects written out as they are spoken is challenging, at least for me, but that too can be learned by seeing a bunch of examples. The same goes for archaic and technical vocabulary.

Anyone who knows even a little of the language can fairly easily tell strongly dialectal or archaic expressions apart from standard language. And even if they get it wrong, the mistake is not a big one: a dialect expression or archaic wording is usually still understandable, and someone who uses these by accident is probably clearly a foreigner anyway, so they cause harm only rarely, if ever.

bandeirante, February 2, 2020 at 7:49 PM

Does it make sense to add sentences that cannot be translated, such as puns and plays on words? Is there any "official" position on that issue?
Personally, I don't think they serve any useful purpose, but I may be wrong.

alexmarcelo, February 2, 2020 at 8:31 PM, edited February 2, 2020 at 8:31 PM

In my opinion, it makes complete sense. The good thing about a sentence, even if it has no translations, is that it provides context, and sometimes that's all a student needs in order to learn a new word or phrase.

AlanF_US, February 2, 2020 at 9:15 PM

Also, it's possible to translate sentences with wordplay in a way that makes it possible to understand the original meaning, even if some or all of the wordplay is not preserved in the translation.

alexmarcelo, February 2, 2020 at 11:18 PM, edited February 2, 2020 at 11:18 PM

Definitely. It's also a great opportunity for the translator to be creative. ;-)

Aiji, February 3, 2020 at 12:05 AM

In addition to Alan's message, I think you cannot be sure that a sentence really cannot be translated. That might be the case in the languages you know, but some other language may have the same trick for that particular sentence.

Other than that, Tatoeba is not a translation platform. From the about page, for example, Tatoeba is a "big database of sentences AND translation", "a tool providing examples of how words are used in context".

So I believe the sentences you describe serve a very useful purpose if they have value in themselves. To be honest, I'd like everybody to remember that not everybody uses Tatoeba for second language study and stop being biased by this erroneous idea.

Thanuir, February 3, 2020 at 6:26 AM

All sorts of wordplay, idiomatic expressions, jokes, and other oddities are exactly the kind of content that is good for Tatoeba. They can be hard to find elsewhere, and they are part of the richness of a language, which can change quickly or be very region-specific.

TRANG, February 15, 2020 at 4:22 PM

I saw this recent post:
and was reminded about this thread here.

First, I think every sentence can be translated. Not every sentence can be easily translated into every language, but everything is translatable in some way or another. Translation is not about re-creating with exact precision a sentence from a certain language into another language. It's about finding the closest match in the other language. There is always going to be a closest match.
Also, unless you can speak every language of the world, I don't think you can claim that there are sentences that cannot be translated. Maybe there will be a sentence that you cannot easily translate into the languages that you know, but who says that there isn't another language out there in which the sentence can be easily translated?

Second, translations are not the only information that we attach to sentences. You have comments and tags as well. Even if a sentence cannot be translated, there can still be valuable information someone can discover about the sentence if there is a comment or a tag. Sentences that are hard or almost impossible to translate probably carry very interesting features of a language. It would be a shame to exclude them.

CK, February 13, 2020 at 10:17 AM

** Search for English, showing sentences translated into the most languages first. **

CK, February 14, 2020 at 11:15 AM, edited February 15, 2020 at 9:32 AM

** List 7055 - English sentences translated into more than 9 languages **

Find English sentences translated into more than 9 languages that are not yet translated into your native language.

You can do this with the following link.

Just change the last 3 letters in the URL to your native language's code.

This URL is set to find such sentences not yet translated into Ukrainian.
(Over 16,000 results)

This URL is set to find such sentences not yet translated into Esperanto.
(Over 12,000 results)
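
The "change the last 3 letters" step can also be scripted. Below is a minimal Python sketch of that idea only. The URL in it is a made-up placeholder, not the real Tatoeba search link (which is not reproduced in this post), and the helper name `with_language_code` is hypothetical.

```python
def with_language_code(url: str, lang_code: str) -> str:
    """Return `url` with its final three characters replaced by `lang_code`.

    Hypothetical helper: it assumes, as described in the post, that the
    URL ends with a 3-letter language code.
    """
    if len(lang_code) != 3 or not lang_code.isalpha():
        raise ValueError("expected a 3-letter language code such as 'ukr'")
    return url[:-3] + lang_code.lower()


# Placeholder URL for illustration only; NOT the actual Tatoeba link.
example_url = "https://example.org/search?list=7055&not_translated_into=ukr"

# Switch the placeholder from Ukrainian (ukr) to Esperanto (epo).
esperanto_url = with_language_code(example_url, "epo")
```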

The over 49,500 sentences that I've added to this list are also on List 907, so you can be fairly confident that they are error-free and natural-sounding English sentences.

fjay69, February 14, 2020 at 12:24 PM

Great! Thanks!

CK, February 15, 2020 at 9:35 AM

(Over 4,000 results for sentences on List 7055 without Russian translations)

Ricardo14, February 14, 2020 at 5:46 PM

Could anyone please add some French sentences about traveling (hotels, etc.) to the list? Merci :D

CK, February 14, 2020 at 11:39 PM, edited February 15, 2020 at 6:41 AM

You can find existing sentences with this kind of search.

Search for sentences with these tags, limited to sentences linked to French.

hotel, travel, restaurant, border crossing, immigration

You will need to search for each tag individually since our search engine doesn't use the "or" operator for tags.

Here are a few examples.
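
Because each tag has to be searched on its own, the per-tag results can be merged afterwards to get the effect of an "or". The following Python sketch shows only that merging step. The per-tag search is a stand-in function with made-up sentence IDs, since no real Tatoeba search API is described here, and all names in it are hypothetical.

```python
from typing import Callable, Iterable, Set


def search_any_tag(tags: Iterable[str],
                   search_one_tag: Callable[[str], Set[int]]) -> Set[int]:
    """Emulate an "or" over tags by running one search per tag and merging."""
    merged: Set[int] = set()
    for tag in tags:
        merged |= search_one_tag(tag)  # set union removes duplicates
    return merged


# Stand-in data for illustration; these sentence IDs are made up.
fake_index = {
    "hotel": {101, 102},
    "travel": {102, 203},
    "restaurant": {304},
}


def fake_search(tag: str) -> Set[int]:
    """Pretend per-tag search; unknown tags simply return no results."""
    return fake_index.get(tag, set())


tags = ["hotel", "travel", "restaurant", "border crossing", "immigration"]
result = search_any_tag(tags, fake_search)
```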

Thanuir, February 15, 2020 at 7:03 AM

The travel tag has been applied to two French sentences: . Sentences in other languages may of course have French translations to which the tag would probably apply. You can collect some from there.