Tatoeba

sysko's Wall messages (1397 in total)

sysko, January 7, 2011 at 5:45:48 AM UTC

done

sysko, January 5, 2011 at 9:43:02 PM UTC

The software works word by word, not character by character, and given the number of words I use (around 100k), I think it's really reliable. It's the same technique Wikipedia uses (and their word list is far shorter than mine), and most of the possible ambiguity disappears once you treat the text as words (otherwise it would have been ambiguous for a human reader too).
That said, some errors are still possible, in the rare case of single-character words and because of errors in the word list itself. So by correcting the word list, and by allowing manual editing of the generated "other script" version (which will be possible in the near future), we should easily be able to reach 99.9% accuracy. (I think we're at around 95~98% for the moment.)
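
To illustrate the word-by-word idea in the message above, here is a minimal sketch in Python. It is not the actual Tatoeba software: the tiny `WORD_LIST`, the greedy longest-match strategy, and the function names are all assumptions made for the example. It shows why looking at whole words resolves a character such as 发, which maps to 發 in 发现 but to 髮 in 头发.

```python
# Hypothetical miniature word list: simplified word -> traditional word.
# A real list would hold on the order of 100k entries.
WORD_LIST = {
    "头发": "頭髮",  # "hair": 发 becomes 髮
    "发现": "發現",  # "discover": 发 becomes 發
    "我": "我",
    "了": "了",
}

MAX_WORD_LEN = max(len(word) for word in WORD_LIST)


def segment(text):
    """Greedy longest-match segmentation.

    Characters that start no known word become one-character words.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to one character.
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in WORD_LIST:
                words.append(candidate)
                i += size
                break
    return words


def to_traditional(text):
    """Convert word by word; words missing from the list are left unchanged."""
    return "".join(WORD_LIST.get(word, word) for word in segment(text))


print(segment("我发现了头发"))
print(to_traditional("我发现了头发"))
```

A lone 发 that is not part of any listed word is left unconverted in this sketch, which mirrors the single-character case the message mentions as a remaining source of errors.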

sysko, January 5, 2011 at 6:05:03 PM UTC

Chinese sentences have a little icon in the icon bar that says whether they're in simplified or traditional script.

sysko, January 5, 2011 at 5:36:05 AM UTC

Oh yep, sure. In fact the software I've made only segments words; the words themselves and their pinyin are already stored, so it's just a per-entry correction to make :) I will do that. If you see other entries that are missing this correction, can you put them in this thread? :)

sysko, January 5, 2011 at 5:31:19 AM UTC

You can still contact the author to ask whether he is the author of these sentences, or at least holds the rights to them, so that he can give you a copy of them relicensed under CC-BY or a compatible licence. To be honest, the GPL was not made for text, books and the like; there are more suitable licences for this kind of data (such as CC-BY / CC-BY-SA, or the GNU FDL if you want to stay with GNU's licences), so I think the guy chose the GPL more to say "my data are open" than for the exact terms of the licence. So it costs nothing to ask :) (1000 Shanghainese sentences come from a copyrighted book whose author gave me explicit authorization to use them under CC-BY.)

sysko, January 5, 2011 at 5:15:26 AM UTC

But it's still not a different language, and it can still be written in both scripts. If a sentence is specific to a part of China, it will be tagged accordingly.

sysko, January 5, 2011 at 5:01:45 AM UTC

Hi, glad to see a new Chinese user here.
If you take a look, you will see that each Chinese sentence comes in both scripts, and when you add a sentence, the other script version is generated automatically. Moreover, as said, these are 2 different scripts, not 2 different languages.

sysko, January 4, 2011 at 9:20:35 PM UTC

The technical limit is 500 characters (due to the way we store sentences). As for the "guideline" part, I agree with what Pahramp said.

sysko, January 3, 2011 at 3:34:22 PM UTC

It depends on what you mean by an "annotation system" :)

sysko, January 3, 2011 at 9:33:52 AM UTC

I will be in Macao on the 16th.

sysko, January 3, 2011 at 4:56:48 AM UTC

For Chinese I've made a sentence analyser, so it should be possible too :) And anyway, if we can do that automatically for 90% of the languages we support, it's already far better than nothing :)

sysko, January 3, 2011 at 4:54:38 AM UTC

=your sentence will only search for an exact match of "your", plus sentence, not for the exact phrase "your sentence"
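
The difference between the two `=` query forms discussed in these messages can be sketched with a small parser. This is only an illustration of the described behaviour, not Tatoeba's search engine: the `parse_query` function, its regular expression, and the `(text, exact, phrase)` tuples are hypothetical names chosen for the example.

```python
import re

# Either an optionally "="-prefixed quoted phrase, or an optionally
# "="-prefixed bare word.
TOKEN = re.compile(r'(=?)"([^"]*)"|(=?)(\S+)')


def parse_query(query):
    """Split a query into (text, exact, phrase) terms.

    exact  -- True when the term was prefixed with "="
    phrase -- True when the term was enclosed in double quotes
    """
    terms = []
    for match in TOKEN.finditer(query):
        if match.group(2) is not None:
            # Quoted phrase: the "=" applies to the whole phrase.
            terms.append((match.group(2), match.group(1) == "=", True))
        else:
            # Bare word: the "=" applies to this single word only.
            terms.append((match.group(4), match.group(3) == "=", False))
    return terms


print(parse_query('=your sentence'))
print(parse_query('="your sentence"'))
```

In the first query only `your` is marked exact and `sentence` remains an ordinary term; in the second, the quoted phrase is a single exact term.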

sysko, January 3, 2011 at 4:53:55 AM UTC

="your sentence starting with an equals sign and enclosed in quotes; it's case-insensitive"

sysko, January 1, 2011 at 9:28:39 AM UTC

It should work again :)

sysko, December 31, 2010 at 8:34:23 AM UTC

http://yoyodyne.cc/tatoeba/
\o/
Thanks to everyone, this year with you was wonderful
See you in 2011

sysko, December 31, 2010 at 1:33:44 AM UTC

(Ah, about that, you need to fix one small thing: for people like sacredceltic and me who had already put in non-breaking spaces, our sentences now have two spaces, a narrow non-breaking one and a regular non-breaking one.)

sysko, December 30, 2010 at 12:20:11 AM UTC

My bad, an error while manipulating the server; it should be back now ^^
Thanks for letting me know :)

sysko, December 26, 2010 at 5:03:38 AM UTC

有 :p

sysko, December 22, 2010 at 1:52:11 PM UTC

In the sentence comment, yep, but not in the sentence text itself. Because the data of Tatoeba can be reused for any purpose, it is important to keep the sentence as pure as possible. But as I plan, with one of my friends, to focus on tools for sentence analysis once we have finished a first release of Tatoeba, I think I will add a field somewhere to hold such information in a more specific place than the "all purpose" comments.

sysko, December 19, 2010 at 6:36:36 AM UTC

Thanks to you too; I think you've spent more time adding all the Cantonese sentences (in addition to all the Mandarin sentences) than I have coding this software :)

For the list, a .txt file with the following format would be perfect:

word[tab]jyutping
word2[tab]jyutping2

:)
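
As an illustration of the requested format, the following minimal sketch reads `word[tab]jyutping` lines into a dictionary. The function name `parse_word_list` and the two sample entries are assumptions made for this example; the message only specifies one word and its jyutping per line, separated by a tab.

```python
def parse_word_list(text):
    """Parse lines of the form word<TAB>jyutping into a dict."""
    entries = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        parts = line.split("\t")
        if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
            raise ValueError(f"line {line_number}: expected word<TAB>jyutping")
        word, jyutping = parts
        entries[word.strip()] = jyutping.strip()
    return entries


# Hypothetical sample content of such a .txt file.
sample = "你好\tnei5 hou2\n多謝\tdo1 ze6\n"
print(parse_word_list(sample))
```

Rejecting malformed lines with their line number, rather than skipping them silently, makes it easier to send corrections back to whoever compiled the list.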