Tatoeba
JimBreen, 21 March 2010 at 06:11:23 UTC

Traditional and Simplified Chinese

I saw the comment about converting hanzi on-the-fly. Be very cautious about that, as there are many cases where it simply doesn't work. Proper Traditional<->Simplified conversion needs to work at the lexeme level and in some cases needs some context for disambiguation.

Jack Halpern wrote a very good paper on this roughly 10 years ago:
http://www.cjk.org/cjk/c2c/c2cbasis.htm
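
To make the point concrete, here is a rough sketch (not how any real converter is implemented) of why per-character mapping breaks down. The tables `CHAR_MAP` and `WORD_MAP` and both function names are made up for illustration, and the lookup is a naive greedy longest-match that stands in for proper segmentation. The ambiguous character is Simplified 发, which corresponds to Traditional 發 in 发现 (to discover) but to 髮 in 头发 (hair).

```python
# Hypothetical toy tables for illustration only; a real converter needs a
# full lexicon and, in some cases, context beyond the word itself.
CHAR_MAP = {"头": "頭", "发": "發", "现": "現"}  # one Traditional form per character
WORD_MAP = {"头发": "頭髮", "发现": "發現"}      # word entries decide 发 -> 髮 or 發


def convert_by_char(text):
    """Map each character independently, ignoring the word it belongs to."""
    return "".join(CHAR_MAP.get(ch, ch) for ch in text)


def convert_by_word(text, max_len=4):
    """Greedy longest-match against WORD_MAP, falling back to CHAR_MAP."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate word first, down to two characters.
        for n in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + n]
            if chunk in WORD_MAP:
                out.append(WORD_MAP[chunk])
                i += n
                break
        else:
            # No word entry starts here: convert a single character.
            out.append(CHAR_MAP.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(convert_by_char("头发"))  # character level picks 發, which is wrong for "hair"
print(convert_by_word("头发"))  # word level picks 髮
```

Even the word-level version only helps when the word is in the lexicon and the segmentation is right, which is where context comes in.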

PS: how do I make a comment on another posting?

JimBreen, 21 March 2010 at 06:40:28 UTC

OK, I worked out how to do a follow-on. I'd clicked "reply" but it hadn't worked. Now it does.

sysko, 21 March 2010 at 11:10:11 UTC

The Traditional-to-Simplified Chinese conversion is not done character by character; it tries to segment the sentence first (you can see how the sentence has been segmented by looking at the pinyin).
As I've said, I'm in contact with the guy who develops it, so don't hesitate to report any bad segmentations and I will pass them on to him.