Wall (7118 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

TATAR1, January 9, 2025 at 11:35:58 AM UTC

Greetings to all participants! I would like to immediately suggest that the leaders of this project add the missing Siberian Tatar language to the database. Its code is (sty). Sincerely, TATAR1.

Guybrush88, January 9, 2025 at 12:28:30 PM UTC

Hi and welcome to Tatoeba. This is what should be done to have a new language added: https://en.wiki.tatoeba.org/art...guage-request#

TATAR1, January 9, 2025 at 12:52:04 PM UTC

Thank you!

mraz, January 1, 2025 at 9:46:40 AM UTC

Happy New Year! mraz

Ergulis, December 31, 2024 at 11:22:29 PM UTC

HAPPY NEW YEAR to all.

lbdx, January 1, 2025 at 7:45:29 AM UTC

HAPPY NEW YEAR to all.

sharptoothed, December 31, 2024 at 9:52:41 AM UTC

✹✹ Tatoeba Year 2024 Graphs ✹✹

https://tatoeba.j-langtools.com...24/graphs.html

Previous years:
https://tatoeba.j-langtools.com...23/graphs.html
https://tatoeba.j-langtools.com...22/graphs.html
https://tatoeba.j-langtools.com...21/graphs.html
https://tatoeba.j-langtools.com...20/graphs.html
https://tatoeba.j-langtools.com...19/graphs.html
https://tatoeba.j-langtools.com...18/graphs.html
https://tatoeba.j-langtools.com...17/graphs.html
https://tatoeba.j-langtools.com...16/graphs.html

felix63, December 25, 2024 at 11:29:55 AM UTC

🎁 🎉 Happy holidays to each and every one of you! 🔔 🎅

sharptoothed, December 22, 2024 at 7:18:17 AM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

Ergulis, December 21, 2024 at 9:09:52 AM UTC

I was wondering if I could change the name of a tag. I would like to modify a tag created by me, but I don't know how to do it, if it is even possible.

Yorwba, December 21, 2024 at 2:03:42 PM UTC

You can remove the old tag from a sentence and add a new tag with a different name instead. If you want to change all occurrences of a tag, it's going to be a lot of work, of course. (Unless you know how to automate it. That's how I added "quote" tags to a bunch of sentences tagged "by <Somebody>".)

And you cannot change tags that someone else added, but you can add the new tag next to the old tag.

Ergulis, December 21, 2024 at 2:52:19 PM UTC

Thanks for the reply. Personally I think that only admins can change the names of tags.

Borbie, December 12, 2024 at 1:26:47 AM UTC, edited December 12, 2024 at 1:32:21 AM UTC

I think the Cyrillic/Latin transliterator for Uzbek is redundant, since the language officially transitioned to the Latin alphabet in 2023, and some languages that use both alphabets (for example, Serbian) already lack that feature.

The transliteration feature has been there since Uzbek was first added to Tatoeba, back in 2010.

I remember Georgian having the transliteration feature, but that was removed since it was redundant for a phonemic language.

Yorwba, December 14, 2024 at 2:53:55 PM UTC

It's not completely redundant, as there are a bunch of Uzbek sentences in the database using Cyrillic script. And the transliteration feature makes it possible to find them even when you're using Latin script to search: https://tatoeba.org/en/sentence...ry=dushmanning
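As a rough sketch of why that works, assume the search normalizes both the query and each stored sentence to Latin script before comparing them. The mapping table below is partial and only illustrative (it leaves out context-dependent letters such as е, ю and я), and the names `CYRILLIC_TO_LATIN`, `to_latin` and `search` are hypothetical; this is not Tatoeba's actual code.

```python
# Partial, illustrative Uzbek Cyrillic -> Latin table (an assumption for
# this sketch, not Tatoeba's real transliteration rules).
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "з": "z", "и": "i",
    "й": "y", "к": "k", "л": "l", "м": "m", "н": "n", "о": "o", "п": "p",
    "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x", "ч": "ch",
    "ш": "sh", "қ": "q", "ҳ": "h", "ў": "oʻ", "ғ": "gʻ",
}


def to_latin(text: str) -> str:
    """Lowercase the text and map each known Cyrillic letter to Latin.

    Characters that are not in the table (including Latin letters)
    pass through unchanged.
    """
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text.lower())


def search(query: str, sentences: list[str]) -> list[str]:
    """Return the sentences whose Latin-normalized form contains the query."""
    needle = to_latin(query)
    return [s for s in sentences if needle in to_latin(s)]
```

With this, `search("dushmanning", ["душманнинг", "Salom"])` would return the Cyrillic sentence, because both sides are compared in Latin form.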

Borbie, December 15, 2024 at 1:14:19 AM UTC

I wasn't aware of this side of the transliteration feature until now. Thank you for letting me know about it!

coinxee, July 17, 2024 at 2:52:34 AM UTC

Is there an open-source English sentence database similar to Tatoeba?

Augustus, July 18, 2024 at 8:53:16 PM UTC

Mozilla's Common Voice is similar in collecting sentences and recordings thereof. It does not have the translation aspect of Tatoeba.

See https://commonvoice.mozilla.org/

urro, July 20, 2024 at 1:23:22 AM UTC, edited July 20, 2024 at 1:27:38 AM UTC

If you just need English sentences, there are a few. However, I have looked myself, and found Tatoeba to be of the best quality, especially for English.

English-only:
• English Penn Treebank (University of Pennsylvania)
... is not something I know much about.
• English Web Treebank (Universal Dependencies)
... is mostly composed of biased sentence picks, but each has a grammatical breakdown. Stanford's NLP project Stanza uses it.
• Common Voice (Mozilla Foundation)
... as Augustus said!

With translation:
• OpenSubtitles2018 Corpus (OpenSubtitles)
... isn't very good for high-fidelity translation, but is rather natural, apart from its dramatizations.

Honorable mentions:
• Google Books Ngram Dataset (Google)
... only has a few languages. For example, their Japanese dataset is old and can only be accessed via purchase in yen.
• Wikipedia and Wiktionary (Wikimedia Foundation)

• Any other English (meta)corpora out there

https://www.google.com/search?q...s"%7C"dataset"

It really depends on your intentions and usage, as all corpora have their biases, unfortunately.

CK, December 7, 2024 at 10:11:37 AM UTC

🍎 Random Esperanto Sentences with Audio by PaulP

https://bit.ly/rndepoaudio

PaulP, December 8, 2024 at 6:43:31 AM UTC

Interesting link, CK. Thanks!