JimBreen, 21 March 2010, 06:11:23 (UTC)

Traditional and Simplified Chinese

I saw the comment about converting hanzi on-the-fly. Be very cautious about that, as there are many cases where it simply doesn't work. Proper Traditional<->Simplified conversion needs to work at the lexeme level and in some cases needs some context for disambiguation.
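To illustrate the point, here is a minimal sketch of why a one-to-one character table is not enough. The Simplified character 发 corresponds to two Traditional characters, 發 (as in 發展, "develop") and 髮 (as in 頭髮, "hair"). The two tables and the function names below are hypothetical and hold only a handful of entries; they are not taken from any real converter.

```python
# Hypothetical one-to-one table: each Simplified character gets exactly one
# Traditional form, so 发 can only ever become 發.
CHAR_TABLE = {"头": "頭", "发": "發", "展": "展"}

# Hypothetical lexeme-level table: whole words carry the correct form.
WORD_TABLE = {"头发": "頭髮", "发展": "發展"}


def convert_by_char(text):
    """Convert character by character, ignoring word boundaries."""
    return "".join(CHAR_TABLE.get(ch, ch) for ch in text)


def convert_by_word(word):
    """Look the whole word up first; fall back to characters if unknown."""
    return WORD_TABLE.get(word, convert_by_char(word))


print(convert_by_char("发展"))  # 發展 (correct)
print(convert_by_char("头发"))  # 頭發 (wrong, should be 頭髮)
print(convert_by_word("头发"))  # 頭髮 (correct)
```

Swapping the character table to map 发 to 髮 would only move the error to 发展, which is why the lookup has to happen at the word level.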

Jack Halpern wrote a very good paper about this about 10 years ago:
http://www.cjk.org/cjk/c2c/c2cbasis.htm

PS: how do I make a comment on another posting?

JimBreen, 21 March 2010, 06:40:28 (UTC)

OK, I worked out how to do a follow-on. I'd clicked "reply" but it hadn't worked. Now it does.

sysko, 21 March 2010, 11:10:11 (UTC)

The Traditional-to-Simplified Chinese conversion is not done character by character; the converter first tries to decompose the sentence into words (you can see how a sentence has been segmented by looking at its pinyin).
As I've said, I'm in contact with the developer of the converter, so don't hesitate to report any bad segmentations and I will pass them on to him.
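
For readers wondering what "decompose the sentence" can look like, below is a rough sketch using greedy longest-match segmentation followed by a word-level lookup. It is only an assumption about the general approach, not the code of the converter Tatoeba actually uses, and the tiny `WORDS` and `CHARS` tables are made up for the example. The character 乾 shows the need for it: it becomes 干 in 乾燥 ("dry") but stays 乾 in 乾坤.

```python
# Hypothetical word table: Traditional word -> Simplified word.
WORDS = {"乾燥": "干燥", "乾坤": "乾坤"}

# Hypothetical fallback table for characters outside any known word.
CHARS = {"乾": "干", "氣": "气"}

MAX_LEN = max(len(word) for word in WORDS)


def segment(text):
    """Greedy longest-match segmentation against WORDS.

    Characters that start no known word become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in WORDS:
                tokens.append(piece)
                i += size
                break
    return tokens


def to_simplified(text):
    """Convert each token as a word if known, else character by character."""
    out = []
    for token in segment(text):
        if token in WORDS:
            out.append(WORDS[token])
        else:
            out.append("".join(CHARS.get(ch, ch) for ch in token))
    return "".join(out)


print(segment("氣候乾燥"))        # ['氣', '候', '乾燥']
print(to_simplified("氣候乾燥"))  # 气候干燥
print(to_simplified("乾坤"))      # 乾坤
```

A bad segmentation, where the word boundary is placed wrongly, leads directly to the wrong character being chosen, which is why reports of bad segmentations are useful.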