
I have problems adding Ottoman Turkish sentences written in Latin script. The punctuation order gets corrupted and it looks weird. The system only works properly when using Arabic script with Ottoman Turkish.
Some languages are written in more than one script, such as Azerbaijani (Latin, Arabic and Cyrillic), Kurdish (Latin and Arabic) or Serbian (Latin and Cyrillic).
I guess there are Serbian sentences written in both Latin and Cyrillic scripts on Tatoeba. It isn't a problem when both scripts run in the same direction, as in Serbian, but if the directions differ, it becomes difficult to use any script other than the 'default' one.
Can't something be done about it?
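To illustrate what I mean, here is a minimal sketch of one possible approach: pick the text direction from the characters of the sentence itself instead of from the language flag, then wrap the sentence in Unicode directional isolates. I don't know how Tatoeba actually chooses the direction, so this is only an assumption about the cause, and the function names `dominant_direction` and `isolate` are made up for the example.

```python
import unicodedata

LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def dominant_direction(text: str) -> str:
    """Return 'rtl' or 'ltr' by counting strongly directional characters."""
    rtl = ltr = 0
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-like and Arabic letters
            rtl += 1
        elif bidi_class == "L":  # Latin, Cyrillic and other LTR letters
            ltr += 1
    return "rtl" if rtl > ltr else "ltr"


def isolate(text: str) -> str:
    """Wrap text in the isolate that matches its own script (hypothetical helper)."""
    opener = RLI if dominant_direction(text) == "rtl" else LRI
    return f"{opener}{text}{PDI}"
```

With something like this, an Ottoman Turkish sentence in Latin script would be laid out left to right, so its final punctuation would stay at the end, while the same language in Arabic script would still be laid out right to left.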

Not long ago, I noticed a Turkish advanced contributor unlinking sentences of some users (including mine) just because they were not translated literally or had some errors, without being kind enough to leave a comment. I sent a PM, but didn't get a response. Unfortunately, it's not easy to track down one's unlinked sentences. I wish we could easily see the editing, unlinking and tagging activity on our sentences, just as we can see comments.
I don't know what exactly the problem within the Hungarian community is, but I guess such behavior isn't limited to the Turkish community. Still, I think you shouldn't leave just for that reason.

Sentence owners are not shown on comments.
https://tatoeba.org/eng/sentence_comments/index
Previously, that was only the case with orphan sentences.

Thank you.

I think there's a bug with the add-to-list function on pages with numerous sentences (like list pages or "translate user's sentences" pages). On such pages, sentence selection goes wrong if a previously selected sentence is selected again: the add-to-list command is erroneously sent to the sentence selected just before it.
It might sound unclear so I've created a few example sentences to demonstrate.
https://dev.tatoeba.org/eng/act...es_of/testuser
Just add those sentences to some lists. Do it in order, not randomly. After finishing the 3rd one, select the 1st or 2nd one again and try to add it to another list. You'll reproduce the bug.
This is present on both the main and the dev site.

You can create a Transifex account and translate those untranslated strings yourself.
https://www.transifex.com/tatoeba/tatoeba_website/

This was requested before.
https://github.com/Tatoeba/tatoeba2/issues/797
I also requested a similar feature.
https://tatoeba.org/eng/wall/sh...#message_29459
Having both of them would be very useful. This way, typing 'A, B' into the owner field would give results only from the users A and B, while typing '-A, -B' would exclude results from them.
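A rough sketch of how such an owner field could be interpreted, purely as an illustration. The names `parse_owner_filter` and `owner_matches` are made up for the example and are not part of Tatoeba's code.

```python
def parse_owner_filter(raw: str) -> tuple[set[str], set[str]]:
    """Split an owner field like 'A, -B' into (include, exclude) sets."""
    include: set[str] = set()
    exclude: set[str] = set()
    for token in raw.split(","):
        name = token.strip()
        if not name:
            continue  # ignore empty entries such as a trailing comma
        if name.startswith("-"):
            name = name[1:].strip()
            if name:
                exclude.add(name)
        else:
            include.add(name)
    return include, exclude


def owner_matches(owner: str, include: set[str], exclude: set[str]) -> bool:
    """An excluded owner never matches; an empty include set means 'anyone'."""
    if owner in exclude:
        return False
    return not include or owner in include
```

For example, `parse_owner_filter("A, B")` yields an include set of A and B, while `parse_owner_filter("-A, -B")` yields an exclude set of A and B. One open question with this syntax is how to handle usernames that themselves begin with a hyphen.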

If your goal is to study other languages using it, have a look at this.
http://www.manythings.org/anki/

The following are the numbers of comments per 1000 sentences for the top 10 languages.
- English: 67
- Russian: 82
- Turkish: 24
- Italian: 14
- Esperanto: 180
- German: 177
- French: 143
- Portuguese: 60
- Spanish: 137
- Hungarian: 88
I know this is far from a precise or reliable basis for judgment: comments that report wrong-flag errors or give annotations, and other factors such as a low number of active speakers, reduce its usefulness. Still, it might give a rough idea of how actively and effectively sentences in these languages are checked and maintained.

> However, even without translation native speakers quite often produce awkwardly sounding sentences that they can either correct themselves, or leave as is.
> So I thought you meant your "Partly unnatural sentences" were in that category.
The difference may seem unclear, but I'd rather call them 'non-standard'. Other than that, we're on the same page, I think.

Thank you for your remarks, deniko.
> I think most of Turkish sentences are linked to English ones, but what if you come across something linked to say Ukrainian or Marathi, would you check the translation as well?
Yes, most of them are directly linked to English and most of the rest are indirectly linked to English, so I didn't have much difficulty checking them. Sometimes I consulted Google Translate and Glosbe for sentences in languages other than English to back up my decisions. If I encountered an isolated, unusual pair like Marathi-Turkish, I would probably skip it.
> Imagine a native speaker of English trying to determine which sentences are "good", and also check whether all 100 translations into 50 languages of each sentence is correct.
I think English is a special case. Many of the English sentences are original. I don't think it would be very fair to evaluate the quality of a corpus by the translations linked to it later, but I accept that it's a logical dilemma. A cooperative and multilingual effort is needed here. As Goethe said, let everyone sweep in front of his own door and the whole world will be clean.
> From my point of view, and according to my low standards, I'd say 94% of Turkish sentences are good sentences. This is the only category of "bad" sentences, IMHO:
> Completely unnatural sentences (literal translations, very strange word choices/orders etc. - sentences that a native speaker would never say) : 61 (6.1%)
I beg to differ. Some of those 'partly unnatural' sentences might be acceptable in language books to show or stress some aspect of the language, but from a purely native standpoint, they're not much different from the 'completely unnatural' sentences. In a translated novel, for instance, such sentences would irritate readers. From my point of view, it should be impossible to tell whether a good sentence is an original or a translation.

I have finished checking 1000 random Turkish sentences. The results are as follows.
- Sentences in good condition (no errors & natural-sounding): 764 (76.4%)
- Sentences owned by non-native speakers: 3 (0.3%)
- Sentences with spelling or punctuation errors (accented letters are not taken into account): 43 (4.3%)
- Sentences with other grammatical errors: 18 (1.8%)
- Sentences with translation errors (mistranslated words, tense errors, pronoun errors etc.): 37 (3.7%)
- Partly unnatural sentences (excessive pronoun usage, improvable word choices etc.; sentences that don't sound smooth): 121 (12.1%)
- Completely unnatural sentences (literal translations, very strange word choices/orders etc.; sentences that a native speaker would never say): 61 (6.1%)
They add up to more than 1000 (100%) because some sentences fall into more than one category.
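For anyone who wants to check the arithmetic, here is a small sketch that tallies the figures above. The counts are the ones from my list; the shortened labels and variable names are just for the example, and the reading of the surplus assumes every sampled sentence was put into at least one category.

```python
SAMPLE_SIZE = 1000

# Counts copied from the list above (labels shortened for the example).
counts = {
    "good condition": 764,
    "non-native owner": 3,
    "spelling or punctuation errors": 43,
    "other grammatical errors": 18,
    "translation errors": 37,
    "partly unnatural": 121,
    "completely unnatural": 61,
}

total_assignments = sum(counts.values())

# Assignments beyond one per sentence; each comes from a sentence that was
# counted in more than one category.
extra_assignments = total_assignments - SAMPLE_SIZE

for label, n in counts.items():
    print(f"{label}: {n} ({n / SAMPLE_SIZE:.1%})")
print(f"total assignments: {total_assignments}, extra: {extra_assignments}")
```

The categories sum to 1047, so there are 47 extra assignments. That means at most 47 sentences were counted more than once (fewer if some fall into three or more categories).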

Hi, Ramin.
Tatoeba doesn't work like HiNative or Lang-8. However, you can contribute by translating sentences into Persian or by creating original Persian sentences.
Here are the English sentences that have not been translated into Persian yet.
https://tatoeba.org/eng/sentenc...o=&sort=random
I see you set Turkey as your country on your profile. If you're living in Turkey, you can also try translating Turkish sentences into Persian. For two neighboring countries, we have very few linked Turkish-Persian sentences here.
https://tatoeba.org/eng/sentenc...io=&sort=words

You're right, but sentences without any translations are likely to be original.
Is there a better way to search for such sentences, regardless of whether they have translations or not?

> Also, could it be possible to implement a system where people can be temporarily suspended from the wall, but not from contributing?
+1
Suspending users from contributing sentences really isn't necessary.

> Is it possible to perform a search within (or out of) all such sentences?
You can easily do this using the advanced search. Just be sure to select the 'exclude' and 'any language' options on the translations pane.
This link, for example, shows the original German sentences that contain the word 'Tom'.
https://tatoeba.org/eng/sentenc...io=&sort=words
And this one shows all original Persian sentences (no word limitation).
https://tatoeba.org/eng/sentenc...io=&sort=words

This is not only overkill, but also a bit unfair, I believe.
If you check the recent threads that ended in debates, you can see that it was the Kabyle team who came (as a group) and started debates under Amastan's valid threads.
As you can see in the latest statistics, Amastan is one of the most active contributors here. He's a 6-year member of Tatoeba and a corpus maintainer. Without him, the development and maintenance of the Berber corpus would be crippled. The Kabyle team, on the other hand, has other active contributors. They won't suffer a lot from this 'suicide attack', so to speak. Their suspended members are normal contributors anyway. They can simply create new accounts and continue to contribute without losing any privileges. But Amastan is a corpus maintainer. It's not that easy for him.
I think this decision is tipping the balance in favor of the Kabyle team. There should be a better way to solve this more fairly. I hope you unsuspend them all and give them one more chance. This time, they will probably restrain themselves from future debates.
I also agree with sabretou's suggestion. Such an improvement would eliminate many Wall-related problems like this. A limited suspension that only prohibits users from posting on the Wall for a period of time would be useful for this purpose, too.

Good idea; it would be very useful for those interested in the subject. But if possible, let's link such non-standard sentences not to sentences in foreign languages but to their equivalents in standard Turkish. In other words, the links should be Turkish-Turkish.
Also, if we create a list for each dialect and collect the sentences in those lists, the result would be a tidier work that can serve as a reference.

I don't think that's very likely (it's a 3-year-old request), but the admins would probably announce it on the Wall if such an important improvement got implemented.
For proofreading, you can use the rating marks for now. Unlike with tags, you don't have to open each sentence's page to rate it. They're much more practical.