sabretou, October 30, 2017 at 6:50:54 AM UTC

Something curious has happened recently: http://www.aljazeera.com/news/2...013156380.html

By 2025, Kazakhstan will officially use the Latin script for the Kazakh language, in place of the currently used Cyrillic.

We presently have 2536 Kazakh sentences, and most, if not all, of them appear to be in Cyrillic script.

How will this change in Kazakhstan affect Tatoeba?

Aiji, October 30, 2017 at 9:52:47 AM UTC

I guess that once the change is fully in effect, several solutions are possible.
Someone will probably develop tools to convert from one alphabet to the other, so such a tool could be integrated, like the one that displays furigana for Japanese.
Another solution is simply to use a tag indicating that the sentence uses Cyrillic. Either that or two separate flags, although it would depend on the official choice (whether Cyrillic is kept alongside Latin for several years, etc.).
At least we have time to think about the problem! ^^

In this kind of situation, I like to think that Tatoeba is kind of a keeper of the languages. Even if Cyrillic is replaced, a trace will remain here on Tatoeba.
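
For the tagging idea, a script tag could probably be assigned automatically rather than by hand. Below is a rough Python sketch, not anything Tatoeba actually runs: the function name `detect_script` and the labels it returns are made up for illustration. It relies only on the standard `unicodedata` module, whose character names begin with the script name (for example "CYRILLIC SMALL LETTER A").

```python
import unicodedata


def detect_script(sentence: str) -> str:
    """Guess the script of a sentence from its letters.

    Returns "Cyrillic", "Latin", "Other", "Mixed", or "Unknown"
    (the last when the sentence contains no letters at all).
    These labels are hypothetical tag names, not existing Tatoeba tags.
    """
    scripts = set()
    for ch in sentence:
        if not ch.isalpha():
            continue  # skip digits, punctuation, and whitespace
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            scripts.add("Cyrillic")
        elif name.startswith("LATIN"):
            scripts.add("Latin")
        else:
            scripts.add("Other")
    if not scripts:
        return "Unknown"
    if len(scripts) == 1:
        return scripts.pop()
    return "Mixed"


print(detect_script("Сәлем, әлем!"))   # Cyrillic
print(detect_script("Hello, world!"))  # Latin
```

Sentences that come back as "Mixed" would probably need a human to look at them, since they may contain a foreign name or a typo made with a look-alike letter.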

Selena777, October 30, 2017 at 7:00:40 PM UTC

Actually, we have a similar situation with Serbian, which currently uses both the Cyrillic and Latin alphabets. The Serbian Tatoeba corpus contains sentences in both. Conversion can be fully automated only in the Cyrillic-to-Latin direction; automatic Latin-to-Cyrillic conversion will sometimes give wrong results.
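
To make the asymmetry concrete, here is a minimal Python sketch. It is only an illustration, not the converter discussed for Tatoeba: the names `CYR_TO_LAT`, `cyr_to_lat`, and `naive_lat_to_cyr` are hypothetical. Each of the 30 Serbian Cyrillic letters has exactly one Latin counterpart, so that direction is a plain lookup. Going back, the Latin digraphs "lj", "nj", and "dž" may stand for one Cyrillic letter or for two separate ones, and a greedy converter has no way to tell which.

```python
# Serbian Cyrillic -> Latin: one fixed Latin equivalent per Cyrillic letter.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}

# Reverse table for the naive Latin -> Cyrillic direction (lowercase only).
LAT_TO_CYR = {lat: cyr for cyr, lat in CYR_TO_LAT.items()}


def cyr_to_lat(text: str) -> str:
    """Transliterate Serbian Cyrillic to Latin, letter by letter."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in CYR_TO_LAT:
            lat = CYR_TO_LAT[lower]
            # Simplification: an uppercase digraph becomes "Lj", never "LJ",
            # so words written in all caps would need extra handling.
            out.append(lat.capitalize() if ch != lower else lat)
        else:
            out.append(ch)  # punctuation, digits, spaces pass through
    return "".join(out)


def naive_lat_to_cyr(text: str) -> str:
    """Greedy Latin -> Cyrillic for lowercase text: digraphs are matched first."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in LAT_TO_CYR:
            out.append(LAT_TO_CYR[pair])
            i += 2
        else:
            out.append(LAT_TO_CYR.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(cyr_to_lat("инјекција"))        # injekcija
print(naive_lat_to_cyr("injekcija"))  # ињекција (wrong: should be инјекција)
```

Here the "nj" in "injekcija" is really н followed by ј, but the greedy pass reads it as the single letter њ. Handling such words correctly needs a list of exceptions, which is where the manual work comes in.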

astru, October 31, 2017 at 9:51:19 PM UTC

Unfortunately, the Serbian alphabet conversion tool has gone unimplemented for a long time.
https://github.com/Tatoeba/tatoeba2/issues/1456

Selena777, November 1, 2017 at 7:05:00 AM UTC

I agree with what you said on GitHub, and I don't see a real necessity for "native speaker verification" before starting work on the converter, because the transliteration rules are all in common textbooks and the exceptional roots can easily be found in dictionaries. It's work that an intermediate-level speaker can do. Of course, any native or professional Serbian speaker is very welcome to check and complete the list, but that can be done while the work is in progress.
I also agree that Cyrillic is the preferable script for Serbian contributions, but in fact many Serbian speakers prefer Latin in their everyday written communication, so an obligation to use only Cyrillic might become a burden for them.

astru, October 31, 2017 at 10:06:52 PM UTC

The Kazakh language is officially recognized in Russia at the regional level, and the alphabet there will not change to Latin even if the reform in Kazakhstan succeeds. So Cyrillic Kazakh will remain.
E.g. the Azeri language switched to Latin in Azerbaijan, but in Russia Azeri is official in Dagestan in its Cyrillic form. (Cyrillic Azeri is not used on Tatoeba, though.)

The Cyrillic Azeri newspaper "Derbent", October 2017:
https://i.mycdn.me/image?id=861...9Jhh0rmoKbqRgk

Selena777, November 1, 2017 at 7:08:06 AM UTC

In which regions of Russia is the Kazakh language officially recognized as a regional language?

astru, November 1, 2017 at 7:38:34 PM UTC

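Altai Republic
http://zakon.scli.ru/ru/legal_t...7-3057d6dbed48

"Казахский язык используется в официальных сферах общения в местах компактного проживания его носителей." ("The Kazakh language is used in official spheres of communication in areas where its speakers are concentrated.")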

Selena777, November 1, 2017 at 7:42:08 PM UTC

Thanks.

TRANG, October 30, 2017 at 10:51:31 AM UTC

If the conversion between Latin and Cyrillic can be automated, we could use the same mechanism we have for Mandarin Chinese, where we allow users to enter sentences in both simplified and traditional characters.

astru, October 31, 2017 at 9:47:46 PM UTC

There is a good chance the alphabet reform will never be completed, at least in this variant. The Uzbek scenario is the most probable.