Wall (6,960 threads)

Tips

Before you ask, make sure you have read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

11164880, 12 March 2024, 2:06:10 AM UTC

Hello, I would like to download the complete multilingual audio dataset. Is there a way to download it?

DJ_Saidez, 12 March 2024, 3:47:35 AM UTC

Here is the link: https://tatoeba.org/zh-cn/downloads
If you have any questions, please let us know!

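DaoSeng, 12 March 2024, 7:49:37 PM UTC

Downloading from the page DJ_Saidez linked seems to require some programming skills. A few years ago, a user put the audio from this site into Anki decks: https://ankiweb.net/shared/by-author/604511069 . If you have Anki ( https://apps.ankiweb.net ), you can download those decks first and then export the audio from Anki.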

DJ_Saidez, 16 March 2024, 5:29:23 AM UTC

Thanks for sharing the Anki links, a lot easier!

16 March 2024, 5:22:16 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

11 March 2024, 9:40:53 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

sharptoothed, 10 March 2024, 10:18:34 AM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

8 March 2024, 2:49:27 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

CK, 3 March 2024, 10:51:51 PM UTC

🍎 Bilingual Audio Pairs

Get random selections of 1,000 sentences using these links.

Spanish with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...a&trans_to=eng
67,376 occurrences

German with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...u&trans_to=eng
29,316 occurrences

Kabyle with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...b&trans_to=eng
28,267 occurrences

Esperanto with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...o&trans_to=eng
15,063 occurrences

Portuguese with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...r&trans_to=eng
14,069 occurrences

French with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...a&trans_to=eng
8,536 occurrences

Dutch with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...d&trans_to=eng
7,188 occurrences

Hungarian with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...n&trans_to=eng
6,185 occurrences

Russian with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...s&trans_to=eng
5,893 occurrences

Japanese with Audio Linked to English with Audio
https://tatoeba.org/en/sentence...n&trans_to=eng
2,136 occurrences

If you prefer to see all translations, and not only English translations, change the last 3 letters of the above URLs to "und". (&trans_to=eng => &trans_to=und)
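
For anyone who would rather script the change than edit each link by hand, here is a minimal sketch in Python. It assumes only what the tip above states, namely that the links carry a `trans_to` query parameter. The function name `with_all_translations` and the `example.org` URL are made up for illustration, since the real links are shortened in this post.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_all_translations(url: str) -> str:
    """Return `url` with its trans_to query parameter set to "und"."""
    parts = urlsplit(url)
    # keep_blank_values=True preserves parameters that have an empty value.
    params = parse_qsl(parts.query, keep_blank_values=True)
    params = [(key, "und" if key == "trans_to" else value) for key, value in params]
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical URL, used only to show the shape of the change.
example = "https://example.org/search?from=spa&trans_to=eng"
print(with_all_translations(example))
```

A URL without a `trans_to` parameter is returned with its parameters unchanged, so the helper can be applied to a whole list of saved links.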

These are only a few of the possibilities.

Try other possibilities, starting with this "pre-filled advanced search form" set for Spanish-German pairs. (7,513 occurrences)

https://tatoeba.org/en/sentence...rd_count_min=1

CK, 24 February 2024, 2:24:01 PM UTC (edited 24 February 2024, 2:26:17 PM UTC)

🍎 Are you looking for English sentences to translate into your own language?

Here are some sentences with audio that do not yet have translations into any language.

ddnktr's Sentences (732)

https://tatoeba.org/en/sentence...rd_count_min=1

shekitten's Sentences (328)

https://tatoeba.org/en/sentence...rd_count_min=1

Miktsoanit's Sentences (140)

https://tatoeba.org/en/sentence...rd_count_min=1

AlanF_US's Sentences (30)

https://tatoeba.org/en/sentence...rd_count_min=1

sundown's Sentences (8)

https://tatoeba.org/en/sentence...rd_count_min=1

sacredceltic, 28 February 2024, 9:35:18 PM UTC

These links aren’t functional…

sharptoothed, 25 February 2024, 5:12:19 PM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

lbdx, 3 February 2024, 9:41:48 AM UTC (edited 3 February 2024, 9:54:18 AM UTC)

** Pruned/Rebalanced Lists **

Rebalanced lists are lexical filters that provide a more varied and balanced view of the Tatoeba Corpus. They prohibit a word from occurring more than 10 times as often as in a reference corpus. Long sentences of more than 15 words have little success with translators and are therefore systematically pruned. The most recent sentences are pruned before older ones. The words targeted are usually pervasive named entities that are used extensively by a few Tatoebans, and not relevant across languages.
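
To make the three rules concrete, here is a rough sketch in Python of one possible reading of them. It is not the actual code behind the lists. Everything in it is an assumption made for illustration: the `Sentence` type, the naive tokenizer, treating a higher id as "more recent", the `reference_freq` mapping (word to share of all tokens in the reference corpus), and the simplification that word budgets are computed once, before frequency pruning starts.

```python
from collections import Counter
from typing import NamedTuple


class Sentence(NamedTuple):
    sentence_id: int  # assumption: a higher id means a more recent sentence
    text: str


MAX_WORDS = 15    # sentences longer than this are always pruned
MAX_RATIO = 10.0  # allowed over-representation relative to the reference


def tokenize(text):
    # Naive tokenizer: lowercase, split on whitespace, strip edge punctuation.
    stripped = (word.strip(".,!?\"'") for word in text.lower().split())
    return [word for word in stripped if word]


def rebalance(sentences, reference_freq):
    """Return the sentences that survive pruning, oldest first.

    Words that are missing from reference_freq are not capped.
    """
    # Rule 1: drop long sentences unconditionally.
    short = [s for s in sentences if len(s.text.split()) <= MAX_WORDS]

    # Rule 2: turn each reference frequency into an absolute token budget.
    total_tokens = sum(len(tokenize(s.text)) for s in short)
    budget = {w: MAX_RATIO * f * total_tokens for w, f in reference_freq.items()}

    # Rule 3: visit oldest first, so the newest sentences are the ones cut.
    used = Counter()
    kept = []
    for s in sorted(short, key=lambda s: s.sentence_id):
        words = Counter(tokenize(s.text))
        if all(used[w] + n <= budget[w] for w, n in words.items() if w in budget):
            used.update(words)
            kept.append(s)
    return kept
```

The greedy oldest-first pass is just the simplest way to honour "the most recent sentences are pruned before older ones"; a real implementation would likely recompute frequencies as the corpus shrinks.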

10 major languages on Tatoeba are currently supported:
- English: https://tatoeba.org/en/sentence...=1&orphans=any
- French: https://tatoeba.org/en/sentence...=1&orphans=any
- German: https://tatoeba.org/en/sentence...=1&orphans=any
- Italian: https://tatoeba.org/en/sentence...=1&orphans=any
- Japanese: https://tatoeba.org/en/sentence...=1&orphans=any
- Mandarin Chinese: https://tatoeba.org/en/sentence...=1&orphans=any
- Portuguese: https://tatoeba.org/en/sentence...=1&orphans=any
- Russian: https://tatoeba.org/en/sentence...=1&orphans=any
- Spanish: https://tatoeba.org/en/sentence...=1&orphans=any
- Turkish: https://tatoeba.org/en/sentence...=1&orphans=any

All rebalanced lists are updated automatically every Saturday.

sundown, 4 February 2024, 9:00:09 AM UTC (edited 4 February 2024, 9:00:42 AM UTC)

@lbdx Reading your description of the list in your profile, it's interesting to me that you date the imbalance of the English corpus to 2017: that's when I joined. Sharptooth's graphs show a massive increase of English sentences at that time. Until 2020 (or thereabouts), I myself had only added about 1,500 sentences.

lbdx, 4 February 2024, 10:46:11 AM UTC

The years 2017 and 2018 were years in which Tatoeba's main English-speaking contributor added hundreds of thousands of sentences in bulk. These sentences were mostly built according to syntactic patterns and used wildcards to avoid creating paraphrases that differ only in their named entities. These massive additions have greatly reduced the lexical diversity of the English corpus and increased the proportion of sentences containing pervasive words from 20% to 40%. This sudden change coincides with a sharp drop in the number of active contributors to Tatoeba.

The introduction of rate limits for sentence additions would prevent such a flood from happening again.

sundown, 11 February 2024, 8:07:40 AM UTC

Thanks, @lbdx. It's good to have some numbers to back up what should be obvious to anyone who cares to look a bit at the English corpus and who has shaped it. Could you give us some more detail about the sharp drop in the number of active contributors? For example, has it been across all languages and countries?

lbdx, 14 February 2024, 5:56:52 PM UTC

The number of monthly sentence owners fell from 350-400 between 2012 and 2016 to 250-300 between 2017 and 2023. I don't have the details by language.

TRANG, 14 February 2024, 2:04:32 PM UTC

Do you have any recommendation on what might be a good rate limit?

I did find a post where you suggested 3000 original sentences per month in one language:
https://tatoeba.org/en/wall/sho...#message_39818
Just wondering if you still think it's a good rate limit, or do you perhaps have another opinion now?
Any particular reason why you suggested a monthly rather than a daily or weekly cap?

If I may share some additional insight: some people may not know this, but Tatoeba used to have a mass import feature. Only admins could access it, but you could send a list of sentences to an admin and ask for them to be imported. The feature was disabled in January 2019 because we migrated CakePHP from v2 to v3, and we didn't feel it was urgent to migrate the mass import feature. It was for the best, I guess. I would say this feature was the main cause of the reduced lexical diversity of the English corpus.

lbdx, 14 February 2024, 5:48:48 PM UTC

Trang, thank you for reopening the debate on this important issue.

My view on this has evolved slightly. I now think it would be simpler and more understandable to also include derived sentences in this rate limit of 3,000 sentences per language per month. Sentence counts would be reset at the beginning of each month. Once the limit has been reached, the user would not be allowed to add any more sentences until the following month. I prefer a monthly rate limit because it doesn't penalise users who don't contribute every day or every week.
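
As an illustration of how such a cap could behave, here is a minimal in-memory sketch in Python. It is not Tatoeba code: the class name `MonthlySentenceQuota` and its method `try_add` are hypothetical, and a real implementation would presumably count rows in the database instead of keeping a dictionary.

```python
from collections import defaultdict
from datetime import date

MONTHLY_LIMIT = 3000  # the figure proposed in this thread


class MonthlySentenceQuota:
    """Counts added sentences per (user, language, calendar month)."""

    def __init__(self, limit=MONTHLY_LIMIT):
        self.limit = limit
        self._counts = defaultdict(int)

    def try_add(self, user, lang, today, n=1):
        """Record n new sentences if they still fit in this month's quota."""
        # Keying on (year, month) means the count starts from zero again
        # at the beginning of each month, with no explicit reset step.
        key = (user, lang, today.year, today.month)
        if self._counts[key] + n > self.limit:
            return False
        self._counts[key] += n
        return True


quota = MonthlySentenceQuota()
print(quota.try_add("some_user", "eng", date(2024, 2, 14), n=3000))  # fits exactly
print(quota.try_add("some_user", "eng", date(2024, 2, 15)))          # over the cap
print(quota.try_add("some_user", "eng", date(2024, 3, 1)))           # new month
```

Because the quota is per language, the same user could still add sentences in another language during the month in which the English cap was reached.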

Note that I'm not against the occasional import of other corpora into Tatoeba as long as they are lexically balanced and composed of sentences that are useful for language learners.

morbrorper, 15 February 2024, 9:07:39 AM UTC

I don't see the problem with a native speaker contributing useful sentences in great volumes; I'd rather have that than non-native speakers contributing questionable sentences, even in small volumes. But then, I don't see a big problem with Tom and Mary either.

I used to be a proponent of limits, but the more I think about it, I think any workable limits would have to be set so high as to make them more or less meaningless.

Miktsoanit, 15 February 2024, 5:52:29 PM UTC (edited 15 February 2024, 5:52:43 PM UTC)

> I don't see the problem with a native speaker contributing useful sentences in great volumes; I'd rather have that than non-native speakers contributing questionable sentences, even in small volumes.

This is too simple. A native speaker contributing 1000 auto-generated sentences isn't necessarily more valuable than a non-native speaker contributing one correct sentence.

morbrorper, 16 February 2024, 6:05:13 PM UTC

Obviously, we don't want any auto-generated content at all. But even with limits per hour, day, week and month, or a combination thereof, it's trivial to adjust a script to conform with that, and still be able to upload thousands of sentences we don't want.

Yorwba, 17 February 2024, 9:17:09 AM UTC

Yes, an automated script can be slowed down arbitrarily to conform with any given limit, until it no longer has a quantitative advantage over someone adding sentences manually.

Uploading thousands of sentences we don't want is only possible if you're able to upload thousands of sentences in the first place.

16 February 2024, 10:03:14 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

cojiluc, 16 November 2023, 10:33:01 PM UTC (edited 16 November 2023, 10:36:54 PM UTC)

Is there a more user-friendly way to perform an advanced search, restricted to a list?

In the advanced search at https://tatoeba.org/en/sentences/advanced_search, there is a drop-down menu "Belong to List" from which one can choose a specific list. But this drop-down menu is very cumbersome, and choosing a specific list is very difficult.
For example, to restrict the advanced search to the list "Spread by Tatoebans", one has to scroll through the menu hundreds of times to reach "Spread by Tatoebans".

(It would be easier if this menu could be partially searched to find a specific list more quickly, for example typing "Spread" lists all lists containing the word "Spread" and then choosing the proper one.)
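
The matching behind such a search box could be as simple as a case-insensitive substring filter. Here is a small sketch in Python. The function name `filter_lists` is made up, and apart from "Spread by Tatoebans" the list names are invented for the example.

```python
def filter_lists(list_names, typed):
    """Return the list names that contain `typed`, ignoring case."""
    needle = typed.casefold()
    return [name for name in list_names if needle in name.casefold()]


# Hypothetical list names, apart from "Spread by Tatoebans".
names = ["Spread by Tatoebans", "Short sentences", "Widespread idioms"]
print(filter_lists(names, "Spread"))
```

Matching anywhere in the name, not only at the start, is what makes this different from the type-to-jump behaviour of a plain select menu.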

CK, 16 November 2023, 11:31:06 PM UTC (edited 16 November 2023, 11:54:31 PM UTC)

For me, using Google Chrome on a Mac, I can just click the select option and start typing and it jumps to that list.

Remember that you can also save a template, then bookmark it.

Here is a template with that list selected.

https://tatoeba.org/en/sentence...rd_count_min=1

You can make additional presets for your searches, too, in templates.

For example,
Language: English
Has audio: Yes
List: Spread by Tatoebans
Sort: random

https://tatoeba.org/en/sentence...rd_count_min=1

Without entering any search query, you can just click the "Search" button to get a random selection of sentences.

If you are looking for English sentences to translate into Persian with the above criteria, add the "Exclude sentences already translated into Persian" part.

Here is that template already created for you.

https://tatoeba.org/en/sentence...rd_count_min=1

cojiluc, 17 November 2023, 5:43:32 AM UTC

Thank you for the templates. You are right: on a PC, typing just works. On touch-screen devices, however, the keyboard does not open, and the only way seems to be scrolling through the menu.

16 February 2024, 10:04:48 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.