Wall (7,128 threads)

✹✹ Tatoeba Year 2024 Graphs ✹✹
https://tatoeba.j-langtools.com...24/graphs.html
Previous years:
https://tatoeba.j-langtools.com...23/graphs.html
https://tatoeba.j-langtools.com...22/graphs.html
https://tatoeba.j-langtools.com...21/graphs.html
https://tatoeba.j-langtools.com...20/graphs.html
https://tatoeba.j-langtools.com...19/graphs.html
https://tatoeba.j-langtools.com...18/graphs.html
https://tatoeba.j-langtools.com...17/graphs.html
https://tatoeba.j-langtools.com...16/graphs.html

🎁 🎉 Happy holidays to each and every one of you! 🔔 🎅

✹✹ Stats & Graphs ✹✹
Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

I was wondering if I could change the name of a tag. I would like to rename a tag that I created, but I don't know how to do it, or whether it is even possible.

You can remove the old tag from a sentence and add a new tag with a different name instead. If you want to change all occurrences of a tag, it's going to be a lot of work, of course. (Unless you know how to automate it. That's how I added "quote" tags to a bunch of sentences tagged "by <Somebody>")
And you cannot change tags that someone else added, but you can add the new tag next to the old tag.
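For anyone curious about the automation part, here is a rough sketch of the first step only: working out which sentences still need the new tag. It assumes you have tag data as tab-separated lines of "sentence id, tag name" (the layout I believe the downloadable tags export uses, but please check). The function name `sentences_needing_tag` is made up for this example, and the script does not add or remove any tags on the site itself.

```python
def sentences_needing_tag(lines, old_prefix, new_tag):
    """Return sorted ids of sentences that carry a tag starting with
    old_prefix but do not yet carry new_tag.

    Assumption: each line looks like "<sentence id>\t<tag name>".
    Lines that do not match this shape are skipped.
    """
    tags_by_sentence = {}
    for line in lines:
        parts = line.rstrip("\r\n").split("\t", 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # malformed line, ignore it
        sentence_id, tag = int(parts[0]), parts[1]
        tags_by_sentence.setdefault(sentence_id, set()).add(tag)

    return sorted(
        sentence_id
        for sentence_id, tags in tags_by_sentence.items()
        if new_tag not in tags
        and any(tag.startswith(old_prefix) for tag in tags)
    )


if __name__ == "__main__":
    # Hypothetical sample rows, not real Tatoeba data.
    sample = [
        "1\tby Mark Twain",
        "1\tquote",
        "2\tby Oscar Wilde",
        "3\tproverb",
    ]
    print(sentences_needing_tag(sample, "by ", "quote"))
```

The resulting id list is what you would then work through, by hand or otherwise, to add the new tag next to the old one.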

Thanks for the reply. Personally I think that only admins can change the names of tags.

I think the Cyrillic/Latin transliterator for Uzbek is redundant, since the language officially transitioned to the Latin alphabet in 2023, and some languages that use both alphabets (for example, Serbian) don't have that feature anyway.
The transliteration feature has been there since Uzbek was first added to Tatoeba, back in 2010.
I remember Georgian having the transliteration feature, but that was removed since it was redundant for a phonemic language.

It's not completely redundant, as there are a bunch of Uzbek sentences in the database using Cyrillic script. And the transliteration feature makes it possible to find them even when you're using Latin script to search: https://tatoeba.org/en/sentence...ry=dushmanning
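To illustrate the idea (this is only a simplified sketch, not how Tatoeba actually implements it): if both the stored sentence and the query are reduced to the same Latin form, a Latin-script query can match a Cyrillic-script sentence. The mapping table and the function names `uzbek_cyrillic_to_latin` and `search_key` are my own assumptions, and the rules are deliberately rough (for instance, "ц" always becomes "ts" and "е" becomes "ye" only at the start of a word).

```python
# Simplified Uzbek Cyrillic -> Latin table (an assumption for illustration).
# "\u02bb" is the modifier letter used in "oʻ" and "gʻ".
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "yo", "ж": "j", "з": "z", "и": "i", "й": "y", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "ts",
    "ч": "ch", "ш": "sh", "ъ": "\u02bc", "ь": "", "э": "e",
    "ю": "yu", "я": "ya", "ў": "o\u02bb", "қ": "q", "ғ": "g\u02bb",
    "ҳ": "h",
}


def uzbek_cyrillic_to_latin(text):
    """Transliterate Cyrillic letters; leave every other character as is."""
    out = []
    for i, ch in enumerate(text):
        lower = ch.lower()
        if lower == "е" and (i == 0 or not text[i - 1].isalpha()):
            latin = "ye"  # word-initial "е" is written "ye"
        else:
            latin = CYR_TO_LAT.get(lower)
        if latin is None:
            out.append(ch)  # not a mapped Cyrillic letter
            continue
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)


def search_key(text):
    """Script-independent key: transliterate, then lowercase."""
    return uzbek_cyrillic_to_latin(text).lower()


if __name__ == "__main__":
    print(uzbek_cyrillic_to_latin("душманнинг"))
    print(search_key("Душманнинг") == search_key("dushmanning"))
```

With keys like these, the Cyrillic sentences stay findable no matter which script the searcher types in.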

I wasn't aware of this side of the transliteration feature until now. Thank you for letting me know about it!

Is there an open-source English sentence database similar to Tatoeba?

Mozilla's Common Voice is similar in collecting sentences and recordings thereof. It does not have the translation aspect of Tatoeba.
See https://commonvoice.mozilla.org/

If you just need English sentences, there are a few. However, I have looked myself, and found Tatoeba to be of the best quality, especially for English.
English-only:
• English Penn Treebank (University of Pennsylvania)
... is not something I know much about.
• English Web Treebank (Universal Dependencies)
... is mostly composed of biased sentence picks, but each has a grammatical breakdown. Stanford's NLP project Stanza uses it.
• Common Voice (Mozilla Foundation)
... as Augustus said!
With translation:
• OpenSubtitles2018 Corpus (OpenSubtitles)
... isn't very good for high-fidelity translation, but is rather natural, apart from its dramatizations.
Honorable mentions:
• Google Books Ngram Dataset (Google)
... only has a few languages. For example, their Japanese dataset is old and can only be accessed via purchase in yen.
• Wikipedia and Wiktionary (Wikimedia Foundation)
• Any other English (meta)corpora out there
https://www.google.com/search?q...s"%7C"dataset"
It really depends on your intentions and usage, as all corpora have their biases, unfortunately.

🍎 Random Esperanto Sentences with Audio by PaulP
https://bit.ly/rndepoaudio

Interesting link, CK. Thanks!

✹✹ Stats & Graphs ✹✹
Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

🍎 Top 5 Languages by number of sentences with audio
Esperanto advanced to 3rd place today.
English (849,032)
Spanish (118,277)
Esperanto (53,135)
Kabyle (53,056)
German (32,943)
Since last December, PaulP has contributed over 48,300 audio files for Esperanto sentences.
You can listen to his most-recently added audio files at the top of this list.
https://tatoeba.org/en/sentence...how/171975/und
This link will also show all linked translations and will show the "add a translation" icon.

Hi, your website seems to be down. The front page works, but every attempt to search returns the message "Tatoeba is currently unavailable. We are sorry for the inconvenience. You can check our blog or Twitter for more information."

What are you searching for? When I try searching for "all" or "thing" in English, I get hits. However, when I leave the word field blank and search in English (which normally gives me every English sentence), I get a message that an internal error occurred.