Hi! Does anyone know what Tatoeba uses for its Pinyin converter? Is it sinoparserd? Does it also provide a tokenizing function? Thank you!
Hello Quielin, welcome to Tatoeba. As you guessed, the Pinyin converter currently used on Tatoeba is sinoparserd. I don’t understand Chinese, but it seems the generated Pinyin is tokenized: Pinyin “words” are separated by spaces.
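If the output really is space-separated, getting the tokens back is just a whitespace split. A minimal sketch in Python, assuming that format; the sample string is made up for illustration and is not actual sinoparserd output, and `pinyin_tokens` is a hypothetical helper name:

```python
def pinyin_tokens(pinyin: str) -> list[str]:
    """Split a space-separated Pinyin string into its "word" tokens."""
    # str.split() with no argument splits on any run of whitespace
    # and ignores leading/trailing whitespace.
    return pinyin.split()


# Hypothetical sample, NOT real converter output.
sample = "wǒ xǐhuan xuéxí zhōngwén"
print(pinyin_tokens(sample))
```

This only recovers whatever word boundaries the converter already chose; it does no segmentation of its own.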
Thank you! I didn't notice that, you are absolutely right. I will try to install it. The information on git is so limited that I wasn't sure that it provides that functionality.