Tatoeba
JimBreen, 21 March 2010, 06:11:23 UTC

Traditional and Simplified Chinese

I saw the comment about converting hanzi on-the-fly. Be very cautious about that, as there are many cases where it simply doesn't work. Proper Traditional<->Simplified conversion needs to work at the lexeme level and in some cases needs some context for disambiguation.

Jack Halpern wrote a very good paper on this about 10 years ago:
http://www.cjk.org/cjk/c2c/c2cbasis.htm

PS: how do I make a comment on another posting?

JimBreen, 21 March 2010, 06:40:28 UTC

OK, I worked out how to do a follow-on. I'd clicked "reply" but it hadn't worked. Now it does.

sysko, 21 March 2010, 11:10:11 UTC

The Traditional-to-Simplified Chinese conversion is not done character by character; it tries to segment the sentence first (you can see how the sentence has been segmented by looking at the pinyin).
As I've said, I'm in contact with the guy who develops it, so don't hesitate to report any bad segmentations and I will pass them on to him.
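
To make the point of this thread concrete, here is a minimal sketch of why word-level lookup matters. It is not the converter Tatoeba uses. The two tables and both function names are hypothetical and hold only a handful of entries; a real converter needs a full lexicon and, as noted above, sometimes context as well. The example uses the Simplified character 发, which corresponds to 發 in 发展 (development) but to 髮 in 头发 (hair).

```python
# Hypothetical toy tables for illustration only, not a real lexicon.
# Character table: one fixed Traditional guess per Simplified character.
CHAR_TABLE = {"头": "頭", "发": "發", "展": "展"}
# Word table: lexeme-level entries that override the per-character guess.
WORD_TABLE = {"头发": "頭髮", "发展": "發展"}


def convert_by_char(text):
    """Convert Simplified to Traditional one character at a time."""
    return "".join(CHAR_TABLE.get(ch, ch) for ch in text)


def convert_by_word(text, max_len=4):
    """Greedy longest-match conversion: try known words first,
    then fall back to the character table for whatever is left."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate word starting at position i.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in WORD_TABLE:
                out.append(WORD_TABLE[chunk])
                i += n
                break
        else:
            # No word matched: convert a single character.
            out.append(CHAR_TABLE.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(convert_by_char("头发"))  # per-character guess picks 發, which is wrong here
print(convert_by_word("头发"))  # word lookup picks 髮
print(convert_by_word("发展"))  # word lookup keeps 發
```

Greedy longest-match is only one simple way to segment. It still fails when the same string needs different conversions depending on the surrounding sentence, which is where context-based disambiguation comes in.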