sysko's Wall messages

sysko January 7, 2011 at 5:45:48 AM UTC

sysko January 5, 2011 at 9:43:02 PM UTC

The software works "word" by "word", not character by character, and given the number of "words" I use (around 100k), I think it's really reliable. It's the same technique Wikipedia uses anyway (and their word list is far shorter than mine), and most of the possible ambiguity disappears if you treat the text as "words" (otherwise it would have been ambiguous for a human reader too).
That said, some "errors" are still possible, in the rare case of single-character "words" and when there are errors in the word list itself. So by correcting the word list, and by permitting manual edits of the generated "other script" version (which will be possible in the near future), we should easily be able to reach 99.9% accuracy. (I think we're at around 95~98% for the moment.)
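
To illustrate the idea of converting "word" by "word" rather than character by character, here is a minimal sketch. It is not the actual Tatoeba code: the function name `convert_by_words`, the `max_len` parameter, and the four-entry `WORD_MAP` are all hypothetical, and greedy longest-match is only one simple way to pick the "words".

```python
def convert_by_words(text, word_map, max_len=4):
    """Convert text using greedy longest-match lookup in word_map.

    word_map maps a source-script word to its other-script form.
    Characters that match no entry are copied unchanged.
    """
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so multi-character words win
        # over the single characters they contain.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in word_map:
                out.append(word_map[chunk])
                i += size
                break
        else:
            # No entry matched at this position: keep the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical toy word list (simplified -> traditional).
# The single character 发 is ambiguous on its own (發 or 髮); the
# two-character entries remove that ambiguity.
WORD_MAP = {
    "头发": "頭髮",
    "发现": "發現",
    "发": "發",
    "头": "頭",
}

print(convert_by_words("我发现头发", WORD_MAP))
```

The single-character entry is the weak spot mentioned above: when 发 appears outside any known word, the sketch has to guess one of its two traditional forms.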

sysko January 5, 2011 at 6:05:03 PM UTC

Chinese sentences have a little icon in the icon bar that says whether they're in simplified or traditional script.

sysko January 5, 2011 at 5:36:05 AM UTC

Oh yep, sure. In fact the software I've made only segments words; the words themselves and their pinyin are already stored, so it's just a per-entry correction to make :) I will do that. If you see other entries that are missing this correction, can you put them in this thread? :)

sysko January 5, 2011 at 5:31:19 AM UTC

You can still contact the author to ask him whether he's the author of these sentences, or at least whether he holds the rights to them, so that he can give you a copy of them relicensed under CC-BY or a compatible licence. To be honest, the GPL was not made for text, books and so on; there are more suitable licences for this kind of data (such as CC-BY / CC-BY-SA, or the GNU FDL if you want to stay with GNU's licences), so I do think the guy chose the GPL more to say "my data are open" than for the exact terms of the licence. So it costs nothing to ask :) (1000 Shanghainese sentences come from a copyrighted book whose author gave me explicit authorization to use them under CC-BY.)

sysko January 5, 2011 at 5:15:26 AM UTC

But it's still not a different language, and it can still be written in both scripts. If a sentence is specific to a part of China, then it will be tagged accordingly.

sysko January 5, 2011 at 5:01:45 AM UTC

Hi, glad to see a new Chinese user here.
If you take a look, you will see that each Chinese sentence comes in both scripts, and when you add a sentence, the other script version is automatically generated. Moreover, as said, they are 2 different scripts, not 2 different languages.

sysko January 4, 2011 at 9:20:35 PM UTC

The technical limit is 500 characters (due to the way we store sentences). As for the "guideline" part, I agree with what Pahramp said.

sysko January 3, 2011 at 3:34:22 PM UTC

It depends on what you mean by an "annotation system" :)

sysko January 3, 2011 at 9:33:52 AM UTC

I will be in Macao on the 16th.

sysko January 3, 2011 at 4:56:48 AM UTC

For Chinese I've made a sentence analyser, so it should be possible too :) And anyway, if we can do that automatically for 90% of the languages we support, it's already far better than nothing :)

sysko January 3, 2011 at 4:54:38 AM UTC

=your sentence will only search for an exact match of "your", plus sentence, not for an exact match of "your sentence"

sysko January 3, 2011 at 4:53:55 AM UTC

="your sentence starting with an equals sign and enclosed in quotes; it's case-insensitive"

sysko January 1, 2011 at 9:28:39 AM UTC

It should work again :)

sysko December 31, 2010 at 8:34:23 AM UTC

http://yoyodyne.cc/tatoeba/
\o/
Thanks to everyone, this year with you was wonderful
See you in 2011

sysko December 31, 2010 at 1:33:44 AM UTC

(By the way, you need to fix one small thing: for people like sacredceltic and me who had already put in non-breaking spaces, our sentences now have two spaces, a narrow non-breaking one and a regular non-breaking one.)
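
A doubled space like this could be cleaned up with something along these lines. This is only a sketch under assumptions: the helper name `collapse_nbsp` is made up, and keeping the narrow non-breaking space (U+202F) rather than the regular one (U+00A0) is a guess at what the site wants, not something confirmed here.

```python
import re

NBSP = "\u00a0"    # NO-BREAK SPACE
NNBSP = "\u202f"   # NARROW NO-BREAK SPACE

# Matches any run of two or more non-breaking spaces of either kind.
_DOUBLED = re.compile(f"[{NBSP}{NNBSP}]{{2,}}")


def collapse_nbsp(sentence):
    """Replace each run of 2+ non-breaking spaces with one narrow one.

    Assumption: the narrow form is the one to keep. A single
    non-breaking space of either kind is left untouched.
    """
    return _DOUBLED.sub(NNBSP, sentence)


fixed = collapse_nbsp("Bonjour\u202f\u00a0!")
print(repr(fixed))
```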

sysko December 30, 2010 at 12:20:11 AM UTC

My bad, an error while I was working on the server; it should be back now ^^
Thanks for letting me know :)

sysko December 26, 2010 at 5:03:38 AM UTC

有 :p

sysko December 22, 2010 at 1:52:11 PM UTC

In the sentence comments, yep, but not in the sentence text itself. Because the data of Tatoeba can be reused for any purpose, it is important to keep the sentences as pure as possible. But as I plan, with one of my friends, to focus on tools for sentence analysis once we have finished a first release of Tatoeba, I think I will add a field somewhere to hold such information in a more specific place than the "all-purpose" comments.

sysko December 19, 2010 at 6:36:36 AM UTC

Thanks to you too. I think you've spent more time adding all the Cantonese sentences (in addition to all the Mandarin sentences) than I did coding this software :)

For the list, a .txt file with the following format would be perfect:

word[tab]jyutping
word2[tab]jyutping2

:)
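
For reference, a file in that word[tab]jyutping format could be read with a few lines of Python. This is just a sketch: the function name `parse_word_list` and the two sample entries are made up for illustration, and raising an error on malformed lines is an assumption about how strict the import should be.

```python
def parse_word_list(lines):
    """Parse 'word<TAB>jyutping' lines into a dict of word -> jyutping.

    Blank lines are skipped; any other line that does not have exactly
    two non-empty tab-separated fields raises ValueError.
    """
    entries = {}
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        parts = line.split("\t")
        if len(parts) != 2 or not parts[0] or not parts[1]:
            raise ValueError(
                f"line {number}: expected word<TAB>jyutping, got {line!r}"
            )
        word, jyutping = parts
        entries[word] = jyutping
    return entries


# Hypothetical sample content of such a .txt file.
sample = ["你好\tnei5 hou2\n", "\n", "多謝\tdo1 ze6\n"]
print(parse_word_list(sample))
```

With a real file, the same function can be fed an open file object, for example `parse_word_list(open("list.txt", encoding="utf-8"))`, where `list.txt` is a placeholder name.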
