Wall (7,118 threads)
Tips
Before asking a question, please check the frequently asked questions (FAQ).
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

Greetings to all participants! I would like to immediately suggest that the leaders of this project add the missing Siberian Tatar language to the database. Its code is (sty). Sincerely, TATAR1.

Hi and welcome to Tatoeba. This is what should be done to have a new language added: https://en.wiki.tatoeba.org/art...guage-request#

Thank you!

Feliĉan novan jaron!

HAPPY NEW YEAR to all.

BONNE ANNÉE à tous.

✹✹ Tatoeba Year 2024 Graphs ✹✹
https://tatoeba.j-langtools.com...24/graphs.html
Previous years:
https://tatoeba.j-langtools.com...23/graphs.html
https://tatoeba.j-langtools.com...22/graphs.html
https://tatoeba.j-langtools.com...21/graphs.html
https://tatoeba.j-langtools.com...20/graphs.html
https://tatoeba.j-langtools.com...19/graphs.html
https://tatoeba.j-langtools.com...18/graphs.html
https://tatoeba.j-langtools.com...17/graphs.html
https://tatoeba.j-langtools.com...16/graphs.html

🎁 🎉 Happy holidays to each and every one of you! 🔔 🎅

✹✹ Stats & Graphs ✹✹
Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

I was wondering if I could change the name of a tag. I would like to modify a tag created by me, but I don't know how to do it, if it is even possible.

You can remove the old tag from a sentence and add a new tag with a different name instead. If you want to change all occurrences of a tag, it's going to be a lot of work, of course. (Unless you know how to automate it. That's how I added "quote" tags to a bunch of sentences tagged "by <Somebody>")
And you cannot change tags that someone else added, but you can add the new tag next to the old tag.

Thanks for the reply. Personally I think that only admins can change the names of tags.

I think the Cyrillic/Latin transliterator for Uzbek is redundant, since the language officially transitioned to the Latin alphabet in 2023, and some languages that use both alphabets (for example, Serbian) do not have that feature.
The transliteration feature has been there since Uzbek was first added to Tatoeba, back in 2010.
I remember Georgian having the transliteration feature too, but it was removed because it was redundant for a language with phonemic spelling.

It's not completely redundant, as there are a bunch of Uzbek sentences in the database using Cyrillic script. And the transliteration feature makes it possible to find them even when you're using Latin script to search: https://tatoeba.org/en/sentence...ry=dushmanning
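To illustrate the idea, here is a minimal Python sketch of a script-insensitive search: every sentence and the query are normalized to Latin before comparison, so a Latin query also matches Cyrillic text. This is not how Tatoeba actually implements it. The mapping table is a simplified assumption (it ignores word-initial "е" becoming "ye" and uses a plain apostrophe in "o'" and "g'"), and the function names and sample sentences are made up for the example.

```python
# Simplified Uzbek Cyrillic-to-Latin table (illustrative assumption, not exhaustive).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "j", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "ts", "ч": "ch", "ш": "sh", "ъ": "'", "ь": "",
    "э": "e", "ю": "yu", "я": "ya", "ў": "o'", "қ": "q", "ғ": "g'", "ҳ": "h",
}


def to_latin(text):
    """Lowercase the text and replace Cyrillic letters; other characters pass through."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text.lower())


def search(query, sentences):
    """Return the sentences that contain the query once both are normalized to Latin."""
    key = to_latin(query)
    return [s for s in sentences if key in to_latin(s)]


# Made-up sample sentences, one in each script.
SAMPLES = ["Бу душманнинг уйи.", "Bu kitob yaxshi."]

print(search("dushmanning", SAMPLES))  # the Cyrillic sentence is found
```

Because both sides go through the same normalization, the same function also works when the query itself is typed in Cyrillic.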

I wasn't aware of this side of the transliteration feature until now. Thank you for letting me know about it!

Is there an open-source English sentence database similar to Tatoeba?

Mozilla's Common Voice is similar in collecting sentences and recordings thereof. It does not have the translation aspect of Tatoeba.
See https://commonvoice.mozilla.org/

If you just need English sentences, there are a few. However, I have looked myself, and found Tatoeba to be of the best quality, especially for English.
English-only:
• English Penn Treebank (University of Pennsylvania)
... is not something I know much about.
• English Web Treebank (Universal Dependencies)
... is mostly composed of biased sentence picks, but each has a grammatical breakdown. Stanford's NLP project Stanza uses it.
• Common Voice (Mozilla Foundation)
... as Augustus said!
With translation:
• OpenSubtitles2018 Corpus (OpenSubtitles)
... isn't very good for high-fidelity translation, but is rather natural, apart from its dramatizations.
Honorable mentions:
• Google Books Ngram Dataset (Google)
... only has a few languages. For example, their Japanese dataset is old and can only be accessed via purchase in yen.
• Wikipedia and Wiktionary (Wikimedia Foundation)
• Any other English (meta)corpora out there
https://www.google.com/search?q...s"%7C"dataset"
It really depends on your intentions and usage, as all corpora have their biases, unfortunately.

🍎 Random Esperanto Sentences with Audio by PaulP
https://bit.ly/rndepoaudio

Interesting link, CK. Thanks!