CK 8 days ago July 2, 2020 at 3:32 AM

** Sentences with Audio 699,777 (2020-07-02 3:30 UTC) **

We should soon have over 700,000 sentences with audio files.

You can keep your eye on this using this URL.

gillux 17 days ago June 23, 2020 at 10:51 AM

What group of sentences could represent Tatoeba?

I am thinking about what to print on the t-shirt or mug that Kodoeba participants will receive. We’d like an object that represents Tatoeba in general, so that it can start conversations about Tatoeba in real life, with a small "Kodoeba #1" mention added in a corner. In other words, it will be more of a "Tatoeba" object than a "Kodoeba" object.

A mug was created once before [1]. I’d like to do something like that, but instead of a sentence about tea and coffee, I think we could find a sentence that is more inspiring and related to the philosophy of the project.

So, any ideas for the sentence? It should be translated into all the languages of the Kodoeba participants, that is to say, at least English, French, Portuguese, Tagalog, Polish, German, Spanish and Italian (but this can be done later; other languages are obviously welcome). It should also not be too long.

Here are some initial ideas:

#2456781 How do you say that in your language?
#656787 I know your language.
#478567 Sing a song in your language, please!
#1762086 Why don't you use your language?
#704105 It's all about sentences. Not words. Tatoeba: Future Mottos


Ricardo14 9 days ago, edited 9 days ago July 1, 2020 at 12:06 AM, edited July 1, 2020 at 12:43 AM

How about these ones I've just thought of:

→ If you can read this, it means that you speak, at least, one language and so, you're more than welcome to Tatoeba! (#8883615)

→ Tatoeba: A super database of sentences and translations which I proudly work on. (#8883617)

→ I really love languages. Thanks to Tatoeba to make it possible to study and help people to learn them. (#8883621)

→ I really love languages. Thank you, Tatoeba! (#8883622)

→ You might not have realized it yet, but I'm crazy about languages. (#8883624)

→ How about teaching me your language? (#8883704)

I thought of those sentences because they "link" languages to Tatoeba, which might attract more people to join us.

maaster 15 days ago June 25, 2020 at 4:24 PM

I'd like to ask for your remarks on sentence #8870764.

The problem with it: there's no official word "Szögöd"...
(Szeged [pronunciation: "sagad"] is a Hungarian city with a dialect named after the town, in which people say "ö" instead of "e", but not always; it has strict rules that I don't know, e.g. Szeged -> Szöged.
So people say "Szöged" in Szeged, but people in the rest of the country often say "Szögöd" because of the peculiarity of that dialect. Those people aren't dumb, they're just making a joke of it. However, one can find several examples of the written form "Szögöd" on the Internet.)
With this second translation that I added, I just wanted to reflect what is special about this city.
Whether Adelpa intentionally chose that city, I don't know.

I think that problem occurs in several other cases, e.g. kösz(i)ke (= köszönöm), zacsi (= zacskó), szeva (= szevasz). These are colloquial forms that everybody knows (everybody except linguists) and uses.
(Unfortunately they're unable to update the Hungarian language.)

brauchinet 15 days ago June 25, 2020 at 6:21 PM

From now on I will call Szegediner goulash "Szögödönör goulash".

maaster 11 days ago, edited 11 days ago June 29, 2020 at 6:07 PM, edited June 29, 2020 at 6:11 PM

In Szeged, rántott hús is probably called "Wienör Schnitzöl", not only because many Germans live in Szeged, but also because "rántott hús" has no "e" in it (or possibly even "rántott szölöt").

Yorwba 15 days ago, edited 15 days ago June 25, 2020 at 7:34 PM, edited June 25, 2020 at 7:35 PM

> These are colloquial forms that everybody knows (everybody except linguists) and uses. (Unfortunately they're unable to update the Hungarian language.)

I was curious about that, since I'd expect linguists to be interested in colloquial forms. I found an article by a certain Béla Istók, who may or may not be a linguist, who noted the presence of examples of this kind of word formation in Hungarian textbooks published by the Oktatáskutató és Fejlesztő Intézet, so it appears that some people did update (their description of) the Hungarian language with those words.

Thanuir 14 days ago June 26, 2020 at 1:04 PM

Slang and similar expressions have their place on Tatoeba. A suitable tag would be good. Perhaps:

non-standard spelling
derogatory (if it is a mocking nickname)

Perhaps the following tags would also suit the sentence:

place name

Alexs 15 days ago June 25, 2020 at 11:19 AM

Hi everyone,

As a Kodoeba participant, I am currently thinking of a way to structure tags. The question has already been raised here, but I would like to hear more about our needs and expectations.

Tags are a highly valuable feature of Tatoeba, provided that one can easily scan through them. It is currently hard to explore them, because we cannot see all we can do at a glance, hence the idea to organize tags :)

* I was thinking of organizing tags hierarchically, such that each tag can have a parent that is itself another tag. For example, if "animals" is a tag, "cat" could be one of its children. This would let us build trees as deep as we want, but I wonder whether we need that.
--> Do we need several levels of depth or is one level enough?

* As for the parent tags (super-tags), CK has already done titanic work on classifying tags into categories, which can be summarized as follows: language variants, grammar, topics, idioms, register, meta-information (length/quality), pronunciation, source (by ...). Obviously this list does not have to be decided now, and we will be able to move tags across categories, but I think coming up with a few ideas can help answer the first question.
--> What tag categories do we need? How does it help answer the first question?

* Finally, there are duplicate tags, some because of translations, others because of different naming conventions. I do believe organizing tags is a first step toward merging duplicates.
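
To make the "each tag can have a parent" idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the `Tag` class and the `ancestors` and `descendants` helpers are hypothetical names, not anything that exists in Tatoeba, and the "dog" tag is made up for the example ("topics", "animals" and "cat" come from the points above).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Tag:
    name: str
    parent: Optional[str] = None  # name of the parent tag, None for a top-level tag


def ancestors(tags: Dict[str, Tag], name: str) -> List[str]:
    """Return the chain of parent tag names, nearest parent first."""
    chain: List[str] = []
    current = tags[name].parent
    while current is not None:
        if current == name or current in chain:
            raise ValueError(f"cycle detected at tag {current!r}")
        chain.append(current)
        current = tags[current].parent
    return chain


def descendants(tags: Dict[str, Tag], name: str) -> List[str]:
    """Return the names of all tags below `name`, at any depth, sorted."""
    found: List[str] = []
    frontier = [name]
    while frontier:
        parent = frontier.pop()
        for tag in tags.values():
            if tag.parent == parent and tag.name not in found:
                found.append(tag.name)
                frontier.append(tag.name)
    return sorted(found)


# Hypothetical two-level tree: topics > animals > cat, dog
tags = {t.name: t for t in [
    Tag("topics"),
    Tag("animals", parent="topics"),
    Tag("cat", parent="animals"),
    Tag("dog", parent="animals"),
]}

print(ancestors(tags, "cat"))
print(descendants(tags, "topics"))
```

With a single parent pointer per tag, one level or many levels cost the same to store; the depth question is mostly about how people browse and translate the tree.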

Thank you in advance for your feedback!

Thanuir 15 days ago June 25, 2020 at 2:12 PM

I think it is more important to make it possible to declare tags as synonyms, or even as translations of each other.

The problem with building an ontology is that when Tatoeba's tags are eventually made multilingual, that has to be done for the whole ontology. It is by no means clear that an ontology, especially a multi-layered one, would translate smoothly into all languages. If one wants to build some kind of classification system, then better a shallow one than a deep one.

Personally, I would find the following options quite sufficient already:

1. Declare two tags to be synonyms. Both tags would then remain visible on the sentences that have them, but searching for either tag would find the sentences marked with either one. This should be implemented for an arbitrary number of tags, not just two.

2. Remove a tag and redirect all its sentences to another tag. For example animal -> animals, or the other way around. The tag declared for removal would then be replaced by the better one on every sentence where it appears, and whenever someone typed the removed tag on a sentence, it would be replaced by the better one.
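
The two options above can be sketched roughly as follows. This is only an illustration under assumptions: `TagIndex` and its methods are hypothetical names, tags are stored in memory per sentence, and chained redirects or redirecting a tag that is already in a synonym group are not handled.

```python
from typing import Dict, List, Set


class TagIndex:
    """Hypothetical in-memory tag store with synonyms and redirects."""

    def __init__(self) -> None:
        self.sentence_tags: Dict[int, Set[str]] = {}
        self.synonym_group: Dict[str, Set[str]] = {}
        self.redirects: Dict[str, str] = {}

    def add_tag(self, sentence_id: int, tag: str) -> None:
        # Option 2: a removed tag is silently replaced by its target.
        tag = self.redirects.get(tag, tag)
        self.sentence_tags.setdefault(sentence_id, set()).add(tag)

    def declare_synonyms(self, *tags: str) -> None:
        # Option 1: merge the existing groups of any number of tags.
        group: Set[str] = set(tags)
        for tag in tags:
            group |= self.synonym_group.get(tag, {tag})
        for tag in group:
            self.synonym_group[tag] = group

    def redirect(self, old: str, new: str) -> None:
        # Option 2: replace `old` on every sentence that carries it.
        self.redirects[old] = new
        for tags in self.sentence_tags.values():
            if old in tags:
                tags.discard(old)
                tags.add(new)

    def search(self, tag: str) -> List[int]:
        tag = self.redirects.get(tag, tag)
        wanted = self.synonym_group.get(tag, {tag})
        return sorted(sid for sid, tags in self.sentence_tags.items() if tags & wanted)


index = TagIndex()
index.add_tag(1, "colloquial")
index.add_tag(2, "colloquial spelling")
index.add_tag(3, "animal")
index.declare_synonyms("colloquial", "colloquial spelling")
index.redirect("animal", "animals")
index.add_tag(4, "animal")  # stored as "animals"

print(index.search("colloquial"))
print(index.search("animals"))
```

Synonyms leave the tags on each sentence untouched and only widen the search, while a redirect rewrites the stored tags, which matches the difference between the two options.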

I wrote about this earlier in English:

Alexs 12 days ago June 28, 2020 at 3:18 PM

Thank you for your interesting feedback! I believe your idea amounts to creating a one-layer tree and not giving names to the "supertags", which indeed removes the need to translate these "supertags".

Thanuir 11 days ago June 29, 2020 at 1:58 PM


If it turns out that people want broader tags or a hierarchy for them, I suppose one can be put together, but merging tags and handling them as larger units is something immediately useful and usable.

For example, Wikipedia has an extensive tag hierarchy, whereas Stack Exchange sites do not, so one sees both approaches. This project may be closer to a wiki than to a question-and-answer site, but building and maintaining a large tag hierarchy is perhaps still not the most essential thing the site has to offer.

maaster 14 days ago June 26, 2020 at 6:35 PM

As for me, I added the tag "colloquial spelling" (and perhaps also "colloquial").
(We two Hungarians can't agree on this matter.)

Alexs 12 days ago June 28, 2020 at 3:19 PM

Thank you for pointing that out! This shows that tags are somewhat subjective, and I guess clustering tags as Thanuir suggested would allow us to group these two tags :)

maaster 11 days ago June 29, 2020 at 6:01 PM

It's not really about tags.
It's about whether the sentence must be changed or not.

Ricardo14 11 days ago June 29, 2020 at 3:41 AM

Thanks a lot for working on that, Alexs! Tags are really useful for both language learners and translators.

I'd like to point out the following: some tags are only related to one language (sometimes to a specific "dialect" of a language). That said, it'd be good if certain tags could only be used with a particular language.

Some examples:

English - past simple, past continuous, present continuous, phrasal verb.
Portuguese - Brazilian Portuguese, presente do indicativo, Brazilian Spelling (?).
Spanish - Mexican Spanish, Chilean Spanish, voseo.
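
A minimal sketch of what such a restriction could look like, assuming tags are plain strings and sentences carry three-letter language codes such as eng, por and spa. The `ALLOWED_LANGUAGES` table and the `tag_applies` function are hypothetical, not an existing Tatoeba feature, and here a tag with no entry is treated as usable with any language.

```python
from typing import Dict, Set

# Hypothetical table: tag -> languages it may be used with.
# A tag that is missing from the table is allowed everywhere.
ALLOWED_LANGUAGES: Dict[str, Set[str]] = {
    "phrasal verb": {"eng"},
    "past continuous": {"eng"},
    "presente do indicativo": {"por"},
    "Brazilian Portuguese": {"por"},
    "voseo": {"spa"},
    "Mexican Spanish": {"spa"},
}


def tag_applies(tag: str, sentence_lang: str) -> bool:
    """Return True if `tag` may be attached to a sentence in `sentence_lang`."""
    allowed = ALLOWED_LANGUAGES.get(tag)
    return allowed is None or sentence_lang in allowed


print(tag_applies("voseo", "spa"))
print(tag_applies("voseo", "eng"))
```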

sharptoothed 11 days ago June 29, 2020 at 11:56 AM

** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:

Guybrush88 11 days ago June 29, 2020 at 12:09 PM


Ricardo14 11 days ago June 29, 2020 at 2:42 PM

Thank you!

cojiluc 13 days ago June 27, 2020 at 5:36 AM

I have added a vocabulary item for "brassage". But this vocabulary item does not appear in

Am I missing something?

Guybrush88 13 days ago June 27, 2020 at 6:51 AM

probably it's related to this bug:

rumpelstilzchen 13 days ago June 27, 2020 at 4:19 PM

The actual bug is

I've just proposed a fix:

TRANG 13 days ago June 27, 2020 at 11:37 AM

It's a bug, as Guybrush mentioned.

From your vocabulary page:

You see that it detects "1000+ sentences" for the vocabulary when there should be only 3 sentences.

Because the count is 1000+, it won't show up on the "Sentences wanted" page, because that page only lists vocabulary items that have fewer than 10 sentences.
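
Here is a small sketch of why the wrong count hides the item. It only models the filter described above; the `sentences_wanted` function and the dictionaries are hypothetical, the value 1000 stands in for the displayed "1000+", and how the site really computes and stores these counts is not shown here.

```python
from typing import Dict, List

WANTED_THRESHOLD = 10  # only items with fewer than 10 sentences are listed


def sentences_wanted(counts: Dict[str, int]) -> List[str]:
    """Return the vocabulary items that would be listed, sorted by name."""
    return sorted(word for word, n in counts.items() if n < WANTED_THRESHOLD)


buggy_counts = {"brassage": 1000}  # what the bug reports
real_counts = {"brassage": 3}      # what the count should be

print(sentences_wanted(buggy_counts))
print(sentences_wanted(real_counts))
```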

CK 13 days ago June 27, 2020 at 8:02 AM

** Stats - 2020-06-27 - Native Speakers with Contributions **

Find out who the native speakers are and get links to their sentences.

TRANG June 4, 2020 at 4:57 PM, edited June 6, 2020 at 5:01 PM

**Responsive landing page**

I have deployed on the dev website an attempt at making the landing page responsive and will need some feedback. If you have time, please visit the dev website from your smartphone.

Only the landing page (for non-authenticated users only) will adjust to small screens. Other pages will still be displayed as miniatures of the website.

To convert this page into a responsive page, I had to rework some elements so that they would display properly on smartphones. Some of these changes affect desktop users too (the login box and UI language selection being the most notable changes). I'll let you try it out and report anything that you find confusing or inconvenient.

Thank you for testing!

CK June 4, 2020 at 10:45 PM

Note that TRANG is referring to the non-logged-in main page.
You will have to make sure to log out to see what she's talking about.

This is the only problem that I noticed right off.
The interface language icon wasn't intuitive to me, but perhaps it is to others.

TRANG June 6, 2020 at 5:22 PM

> The interface language icon wasn't intuitive to me, but perhaps it is to others.

Would it perhaps be more intuitive with the language code next to the icon?

AlanF_US June 5, 2020 at 2:31 AM

The "show more features"/"show fewer features" icons are displayed, but have no effect, for a user who is not logged in.

orion17 June 5, 2020 at 3:38 AM

It does, but it expands and collapses the hidden sentences instead of showing features...

AlanF_US June 5, 2020 at 7:47 PM

The icons appear in two places: (1) in the top green bar and (2) next to each sentence. I was looking at the one in the top green bar, which didn't seem to do much. But in fact, it does affect sentences that have unreviewed automatically-generated alternate transcriptions: It shows or hides the transcriptions.

TRANG June 6, 2020 at 4:59 PM

The "show more" button in the green toolbar should indeed not be displayed. This bug should be fixed on dev now.

TRANG 20 days ago June 20, 2020 at 6:10 PM

I have implemented additional changes to the top menu. As always, if you have time, please test it on the dev website and let me know if you have any issues.

The main change is the part with the user menu, inbox, log out and language selection.


CK 19 days ago, edited 19 days ago June 21, 2020 at 12:57 AM, edited June 21, 2020 at 1:02 AM

It works, but I don't like the way it takes multiple clicks to get to things I often access.

With the current design (on the main site), I can mouse-over my username, then click "comments on my sentences." This only requires one click.

The new design requires multiple clicks.
1. Click my username in the menu bar.
2. Click "my profile."
3. Click my username again where it appears under the search bar.
4. Click "comments on my sentences."

It's not only this, but it requires this same number of clicks to get to everything else in the drop-down menu that I use.

Changing my language isn't something that I use very much, hardly ever, but I do use the other things, so I'm not sure why the language change part cannot just be included in the "settings."

AlanF_US 19 days ago June 21, 2020 at 12:55 PM

I agree. I rarely change my profile, but I often want to look at my lists, my most recent comments and sentences, and comments on my sentences. It is a pain to have to go through my profile in order to get there, especially on the desktop, where I have plenty of room for a longer drop-down list.

AlanF_US 16 days ago June 24, 2020 at 3:32 PM

In addition to forcing us to use more clicks, moving these items out of the menu has a negative effect on discoverability. If I didn't see them in the menu, I might think that such functionality didn't exist at all, especially if I didn't visit my profile after setting it up as a new user.

AlanF_US 19 days ago June 21, 2020 at 12:58 PM

When I look at my profile, there's a box on the right-hand side with the heading "Settings". Three items are shown: (1) whether e-mail notifications are enabled, (2) whether the profile is public, and (3) the list of languages for translations I want to display. The problem with item (3) is that there's no explanation, just a bare list:

>> deu, eng, epo, fra, heb, ita, jpn, por, rus, ukr

It would be better if there were a short introductory phrase:

>> Translations displayed in: deu, eng, epo, fra, heb, ita, jpn, por, rus, ukr

moxy 17 days ago June 23, 2020 at 8:23 AM


I read the wiki and it says that you currently don't provide an API for interfacing with the website. Is it even something that you are considering? I would be interested since you seem to have a pretty large corpus for Bulgarian that doesn't contain a lot of archaisms and would be useful for statistical analysis.


Yorwba 17 days ago, edited 17 days ago June 23, 2020 at 10:06 AM, edited June 23, 2020 at 10:06 AM

For statistics, it'd probably be better to download the complete set of Bulgarian sentences from the downloads page:

There are plans to develop a read-only API as part of the currently ongoing Kodoeba event. It's currently in the design phase, but it's likely that it will allow programmatic access to the search engine, since that's one thing many people have been asking for.

moxy 16 days ago June 24, 2020 at 1:20 PM

Okay, thank you very much. The corpus is actually really small, with only 24k entries, so it would've been faster anyway to just index it locally instead of going through the website.
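
For what it's worth, a local pass over a downloaded file could look roughly like this. It is a sketch under assumptions: it assumes a tab-separated export with three columns (sentence id, language code, text) and "bul" as the Bulgarian code, the `word_frequencies` function is a hypothetical helper, and the three sample lines are made up for illustration rather than taken from the corpus.

```python
import csv
import io
import re
from collections import Counter
from typing import TextIO


def word_frequencies(rows: TextIO, lang: str) -> Counter:
    """Count lowercase word tokens in sentences of one language."""
    counts: Counter = Counter()
    # QUOTE_NONE so that quotation marks inside sentences are kept as text.
    reader = csv.reader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip malformed lines
        _sentence_id, row_lang, text = row
        if row_lang == lang:
            counts.update(re.findall(r"\w+", text.lower()))
    return counts


# Made-up sample standing in for the downloaded file.
sample = io.StringIO(
    "1\tbul\tТова е книга.\n"
    "2\teng\tThis is a book.\n"
    "3\tbul\tТова е котка.\n"
)

freq = word_frequencies(sample, "bul")
print(freq.most_common(3))
```

For a real file, `open(path, encoding="utf-8", newline="")` can be passed in place of the `io.StringIO` sample.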

sacredceltic 19 days ago June 21, 2020 at 4:55 PM

Apparently, the French user interface of the site has been changed without my being notified, even though, historically, I am the one who translated a large part of it. I find that rude.
What's more, the changes are not justified, which is the last straw. That is called a diktat, but I don't know who the new dictator of the French language on Tatoeba is. It would be nice if they had the decency to introduce themselves.
For example, the new name for "placards" (texts that one posts on a wall) is "message". A message, in my vocabulary and my dictionary, requires a recipient; that is the principle of messaging and mail. So a "message" and a "placard" are not the same thing. A "placard" does not require a recipient. It addresses everyone who passes by and reads it. If we start naming everything with a single catch-all term, we might as well abolish language and go back to grunting...

gillux 18 days ago June 22, 2020 at 9:56 AM

In any collaborative project, it happens that what one does gets partially undone by others. I understand your frustration, but it is a small price to pay for achieving what would be impossible for us to accomplish alone.

As a developer, I too sometimes grumble when I see others undoing code that I wrote… On the other hand, without these other people, Tatoeba simply would not exist. And I am very glad that Tatoeba stays active when I am not around, again thanks to these same people, who take up the torch. So I invite you to be lenient toward other people's blunders, and to keep in mind that every contribution starts from a good intention.

I have myself kept the French translation of the site up to date these past months, but I don't remember changing any "placards" into "messages".

Here is how to find and trace back the history of a given string. Open the main French file on Transifex:
From there, you can search for a particular string, either in English with the "text" filter or in French with the "translated_text" filter. Then click on a string, and at the top right you will see who last modified it and when. There is also the History tab at the bottom right.

sacredceltic 18 days ago June 22, 2020 at 2:54 PM

Yeah well, good intentions, meh: Lenin was able to create the gulag with the best intentions of building a better society...
Everyone always claims the best intentions, and we are constantly harangued about the supposed universal benevolence that would have us living on L'Île aux enfants. Except that good intentions, applied to language, end up as Gloubiboulga, where "e-mail" sometimes means an electronic message, sometimes an electronic address, sometimes the mass sending of electronic messages, and so on; nobody knows how to interpret anything correctly anymore... in short, guaranteed chaos!
So I mostly stick to results, to the effectiveness of the action with regard to the clarity of the resulting communication. And what I observe is, in the name of youth worship, a continuous erosion of that clarity and of the meaning of words in general.
There is a permanent loss of vocabulary.