Wall (5241 threads)
1. Download all the sentences.
http://downloads.tatoeba.org/ex...tences.tar.bz2
2. Download the sentence numbers with CC0 license.
https://downloads.tatoeba.org/e...es_CC0.tar.bz2
3. Grab all the sentences with these numbers from the sentences.csv file.
These are the counts from last week's exported data.
kab (7079 sentences)
eng (253 sentences)
fra (108 sentences)
ukr (94 sentences)
spa (12 sentences)
por (6 sentences)
rus (3 sentences)
pol (3 sentences)
deu (3 sentences)
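For anyone who wants to reproduce step 3 and the per-language counts, here is a minimal Python sketch. It assumes both archives have already been downloaded and extracted, that sentences.csv is tab-separated with the columns id, language code and text, and that the CC0 file carries the sentence number in its first tab-separated column. The function names and the file names in the usage note are only illustrative.

```python
from collections import Counter
from typing import Iterable, Iterator, Set, Tuple


def load_ids(lines: Iterable[str]) -> Set[str]:
    """Collect sentence numbers from the first tab-separated column.

    Assumption: the CC0 export has the sentence number in column 1.
    Blank lines are skipped.
    """
    return {line.split("\t", 1)[0].strip() for line in lines if line.strip()}


def filter_sentences(
    lines: Iterable[str], wanted_ids: Set[str]
) -> Iterator[Tuple[str, str, str]]:
    """Yield (id, lang, text) for every sentence whose id is wanted.

    Assumption: each line of sentences.csv is "id<TAB>lang<TAB>text".
    Lines that do not have three fields are ignored.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t", 2)
        if len(fields) == 3 and fields[0] in wanted_ids:
            yield fields[0], fields[1], fields[2]


def count_by_language(rows: Iterable[Tuple[str, str, str]]) -> Counter:
    """Count the selected sentences per language code."""
    return Counter(lang for _, lang, _ in rows)
```

To use it, open the two extracted files (for example `open("sentences_CC0.csv", encoding="utf-8")` and `open("sentences.csv", encoding="utf-8")`, both names being assumptions), pass the first to `load_ids`, feed the second through `filter_sentences`, and call `count_by_language(...).most_common()` to get a list like the one above.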
** Audio Stats **
https://prnt.sc/mmgmtz
Compare stats from these 3 days.
2019-02-16
2018-02-08
2016-03-22
8 days ago
I have problems adding Ottoman Turkish sentences written in Latin script. The punctuation order gets corrupted and it looks weird. The system only works properly when using Arabic script with Ottoman Turkish.
Some languages are written in more than one script, like Azerbaijani (Latin, Arabic and Cyrillic), Kurdish (Latin and Arabic) or Serbian (Latin and Cyrillic).
I guess there are Serbian sentences written in both Latin and Cyrillic scripts on Tatoeba. It isn't a problem when both scripts run in the same direction, as in Serbian, but if the directions differ, it becomes difficult to use any script other than the 'default' one.
Can't something be done about it?
7 days ago
#7771502
I've noticed that the 'other language' flag is direction-neutral. It allows both left-to-right and right-to-left scripts. So I think it should be possible to apply this to other languages that can be written in multiple scripts.
6 days ago
Also, it would be really helpful if simplified and traditional Chinese could somehow be separated. A lot of otherwise simple sentences take me a while to translate because I have to convert them to simplified Chinese on Google Translate first. Maybe some kind of tag in advanced search, or a separate category?
5 days ago
That's a different issue, but I agree. It would be useful. When studying a language that can be written in multiple scripts, one may need to view sentences written only in a particular script like Chinese sentences written in traditional Chinese or Berber sentences written in the Tifinagh script. Currently, it's not possible to separate/filter sentences in such a way.
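Until such a filter exists, exported sentences can be sorted by script offline. The sketch below is only an illustration, not a Tatoeba feature: it guesses the script of a sentence from the first word of each letter's Unicode character name (for example "TIFINAGH LETTER YA" or "LATIN SMALL LETTER A"). The function name is made up, and this trick cannot tell simplified from traditional Chinese, since both use names starting with "CJK".

```python
import unicodedata
from collections import Counter
from typing import Optional


def dominant_script(sentence: str) -> Optional[str]:
    """Return the most common script among the letters of a sentence.

    The script is approximated by the first word of the Unicode
    character name. Punctuation, digits and spaces are ignored.
    Returns None when the sentence contains no letters.
    """
    counts: Counter = Counter()
    for ch in sentence:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split(" ", 1)[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Keeping only the rows where `dominant_script(text) == "TIFINAGH"` would, for instance, give the Berber sentences written in Tifinagh.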
What exactly does the "AB" icon do?
I thought it converted between the 2 and added in Pinyin.
Actually we used to automatically generate the sentences in the other script for Chinese.
Like this sentence for instance: https://tatoeba.org/eng/sentences/show/7776103
If the sentence is in simplified, it would have the traditional version in grey. If it was in traditional, it would have the simplified version in grey.
I don't think we've ever decided to remove this feature, so it must have broken at some point...
18 hours ago
Hm, when I look up sentences it appears for a couple but not for most. Like maybe one or two out of every ten sentences?
As you pointed out, the current implementation assumes that Ottoman Turkish is written right-to-left using Arabic script.
I had a look at the English Wikipedia article about the Ottoman Turkish language, and I am a bit confused because it says that this language switched to the Latin script as it evolved into modern Turkish. Can you elaborate about the contemporary use of Arabic vs. Latin to write Ottoman Turkish?
One way to quickly solve the display problem is to set the direction of Ottoman Turkish to "auto". Another, much more complex way is to implement support for multiple scripts and auto-conversion between them, but only if it's worth it: that is, if there are actually native speakers using both Latin and Arabic scripts, if we want to be able to find sentences written in Arabic script by searching in Latin script and vice versa, if the conversion can be partly or fully automated, etc.
As you found out, the direction of sentences in the "unknown" language is set to automatic. That said, this is not a reason to set the language of your Ottoman Turkish sentences written in Latin script to "unknown" just because they look better. I strongly discourage you from doing this because then these sentences are excluded from the Ottoman Turkish corpus, they won't show up in searches and statistics, which prevents contributors and learners of Ottoman Turkish from finding them. What's worse, since *only you* know their actual language, if for some reason you forget about them or stop contributing, these sentences will never be assigned to the correct language and will be permanently lost.
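To illustrate what an "auto" direction typically means: in HTML, dir="auto" takes the direction from the first strongly directional character of the text and falls back to left-to-right when there is none. The Python sketch below approximates that rule with the standard unicodedata module. It is not Tatoeba's code, and the function name is hypothetical.

```python
import unicodedata


def first_strong_direction(text: str) -> str:
    """Guess the text direction from the first strong character.

    Bidi class "L" means a left-to-right letter; "R" and "AL" mean
    right-to-left (Hebrew-like and Arabic letters). Digits, spaces and
    punctuation are not strong and are skipped.
    Falls back to "ltr" when no strong character is found.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "ltr"
```

With this rule, an Ottoman Turkish sentence in Latin script resolves to left-to-right and one in Arabic script to right-to-left, so the punctuation ends up on the expected side in both cases.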
9 hours ago
>I had a look at the English Wikipedia article about the Ottoman Turkish language, and I am a bit confused because it says that this language switched to the Latin script as it evolved into modern Turkish. Can you elaborate about the contemporary use of Arabic vs. Latin to write Ottoman Turkish?
Thanks for your reply, gillux. Have you seen the GitHub issue? I tried to explain this there. Also, some other languages are affected by this issue.
https://github.com/Tatoeba/tato...ment-463754887
The Turkish language reform consists of a script reform and the replacement of loanwords. They are different things. Allowing Ottoman Turkish sentences in the Latin script will increase contributions in the old language and improve its readability. Currently, almost all 'Ottoman Turkish' sentences on Tatoeba are simply transliterations of modern Turkish into the Arabic script. They're not wrong, but they don't truly reflect the old language. If one looked here to compare Ottoman Turkish and modern Turkish, they would assume that the only difference is the alphabet.
>One way to quickly solve the display problem is to set the direction of Ottoman Turkish to "auto".
This sounds good to me. If doing it would display sentences in both Arabic and Latin scripts correctly and wouldn't cause any unintended consequences, why not?
> I strongly discourage you from doing this because then these sentences are excluded from the Ottoman Turkish corpus, they won't show up in searches and statistics
I created only one pair set as 'unknown' for demonstration. I'm adding romanized Ottoman Turkish sentences to the Turkish corpus for now. I will change them back to Ottoman Turkish once a solution is found.
In https://tatoeba.org/deu/tags/view_all/cebuano there's a string that hasn't been translated into German.
http://prntscr.com/mmc1pr
It displays "Tags containing"
If I have marked a sentence and it has since been removed, the sentence is shown with its mark flagged as outdated; for example https://tatoeba.org/eng/sentences/show/7711907.
This is okay, but the problem is that such sentences are still visible in my collection of outdated markings. As the collection is a good place to check if one should change those ratings, it would be good for the ratings of removed sentences to not be there.
Deleted sentences could also be removed from the other lists of ratings, but this is not as important.
Advanced search:
English, display translations in French
Limit to sentences having French indirect translations.
Only English and French are displayed.
Link one French sentence => All translations are now displayed.
Is this the intended behavior?
(I do not think so, and that is quite a bother when doing some linking work)
I couldn't duplicate the problem with these 2 searches.
Are these similar to what you were trying to do?
https://tatoeba.org/eng/sentenc...io=&sort=words
Keyword: Australia
From: English
To: French
Show Translations In: French
Limit To: French
Link: Indirect
https://tatoeba.org/eng/sentenc...io=&sort=words
Keyword: Australia
From: English
To: French
Show Translations In: French
Limit To: French
Link: Direct
Yes.
If I link one French sentence, then all translations will appear.
The problem will not happen if you restrict the languages to be displayed in your profile. That's usually how I do it, but this time I forgot, so I ran into this situation (maybe I, or somebody else, already mentioned it in the past).
4 days ago
It's his birthday, our Ricardo (Ricardo14)! Happy birthday!
Happy birthday, Ricardo! 😊
** History of the Tatoeba Project **
http://bit.ly/tatoebahistory
I've re-uploaded this. It hasn't been updated in a while, though.
Perhaps some new members would be interested in this.
Previously posted here.
https://tatoeba.org/eng/wall/sh...#message_28844
5 days ago
It was an interesting read, thanks!
In particular, I think the community activities/challenges were motivating. I mean the days when members had to create lists, adopt orphan sentences, etc. Does anyone know why they were stopped?
I wasn't here then, but the default answer to these kinds of questions is that someone stopped doing it and nobody continued. Supposing there was no big drama around it, there is very little stopping an interested person, such as you, from restarting an activity you found useful.
I hope someone can tell us if there was a more specific problem that caused them to stop.
10 English Words Per Day
http://goo.gl/H8TRSp
I set this up a while ago for anyone who may be interested in focusing on certain English words each day.
You will get a choice of 10 words each day, roughly sorted in the order of frequency of use.
If you like this idea, you can bookmark the page and return to it every day.
Awesome! I've bookmarked it and I'll post it on my profile, so it'll be even easier to use every day.