Wall messages by sysko

Down again. I'm finalizing things; if this works, it will mean some increase in performance and in indexing speed :)

It's back now. The tests are promising.

Down again, I'm continuing my tests. Sorry for the inconvenience.

http://ubuntuforums.org/ for questions about Ubuntu :)

It seems that neither Trang nor I get paid to do this, which means we're doing it in our free time. If we are busy in our private lives, or busy fixing bugs, then we have less time to implement things. Nothing more to add to this.

Quechua and Azerbaijani will be added when we have enough sentences which DO NOT come from copyrighted sources. The others will be added once I have fixed something in the database. Also, in the future, could you avoid adding sentences in languages you do not speak? Tatoeba's goal is to offer natural sentences, added by native speakers or by people whom natives would consider native. The goal is not to have millions of sentences in hundreds of languages if that means "machine translation" quality, because then Tatoeba would be less useful than Google Translate or other automatic translation websites.

The "forbidden" error is due to a problem which occurred yesterday, soon after I left home for a dinner (Murphy's law, I hate you).

done

I've stopped the search engine for a few minutes; I'm doing some optimization tests. It will be back in 10 minutes.

In a specific language? Normally we have a script that we run from time to time to merge duplicates, since for resource reasons we can't do a real-time check. But I don't think there are more than a hundred duplicates among the 500,000 sentences.
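To illustrate what such a periodic merge pass could look like, here is a minimal sketch. It is not the actual Tatoeba script: the data layout (tuples of id, language, text, plus a set of translation links) and the function name `merge_duplicates` are assumptions made for the example only.

```python
def merge_duplicates(sentences, links):
    """Batch-merge sentences that share the same language and text.

    sentences: iterable of (id, lang, text) tuples   (hypothetical layout)
    links:     iterable of (id_a, id_b) translation links
    Returns (kept_sentences, remapped_links).
    """
    ordered = sorted(sentences)
    canonical = {}  # (lang, text) -> lowest id seen for that pair
    remap = {}      # every sentence id -> id it is merged into

    for sid, lang, text in ordered:
        key = (lang, text.strip())
        remap[sid] = canonical.setdefault(key, sid)

    # Keep only the sentence each duplicate group was merged into.
    kept = [s for s in ordered if remap[s[0]] == s[0]]

    # Re-point links at the surviving ids and drop self-links.
    new_links = set()
    for a, b in links:
        a2, b2 = remap[a], remap[b]
        if a2 != b2:
            new_links.add((min(a2, b2), max(a2, b2)))
    return kept, new_links


sentences = [
    (1, "eng", "Hello."),
    (2, "fra", "Bonjour."),
    (3, "eng", "Hello."),
    (4, "eng", "Hello. "),
]
links = {(3, 2), (1, 2)}
kept, new_links = merge_duplicates(sentences, links)
```

Running this offline in one pass over all sentences is cheaper than checking every new contribution against the whole corpus at insert time, which is the trade-off described above.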


In fact, as I will (I hope) have some free time, I will take the time to meet native speakers of minority languages in China (mainly Shanghainese and other Wu dialects, as I will be in that region).
But I agree: rather than an "absolute" adding rate, my "hope" is not to reach a million or 10 million sentences, but to boost contributions in our current minority languages and to add new ones, especially endangered ones.

But I agree with your analysis. I don't think it will become exponential, for a few reasons:
1 - like sacredceltic, a lot of contributors have come in a burst, and we're still in the "high motivation" phase. Most of the time it lasts a month or two, and after that their contribution rate drops. So growth supposes not only that newcomers will come to "replace" them, but that additional new ones will come too.
2 - I don't think Tatoeba is currently able to support 10,000 contributions a day ^^ (but I may be wrong, and I hope I am)

Spoiler: this year I will have to take care of hundreds of Chinese students in China, and one of my duties will be to teach them French and to help the French students there learn Chinese, so guess what I will use...


I will discuss it with Trang and we will get back to you, but personally (not an official statement), I would say they should be considered different languages. As for delimiting what is old and what is "modern", we can rely on existing work on this (as it's sometimes taught at university, they should already have defined this).

No, because even for single sentences, at school we studied the following case:
A tourist guide had a paragraph about the Champagne region (where I'm from, really nice place, you should come ^^), talking about wines, etc. A Champagne producer took one of these sentences and used it on his bottles, but the author of the guide soon sued. The judge ruled that this sentence, even a single one, could be "copyrighted", because it contained enough "style" to be considered something unique that required work.
So yep, "I eat an apple" doesn't fall under this case, but I'm sure that people wanting to include a quote are more likely to pick a sentence "with style".
Moreover, the equivalent of "fair use" in French law is really precise and strict: it applies only to educational/illustrative purposes, and only if we locate the quote in the original book, name the author, etc.
So here I see 2 problems:
1 - Will every single user know about this, and even if they do, will they take the time to add quotes the way the law says they should? Do moderators have time to check that?
2 - Tatoeba content is licensed under the CC-BY license, which means that commercial use is allowed, so does that break the "educational/illustrative" purpose?
To be honest, I really don't have the time to check all of this, and there aren't enough of us moderators to cover everything. And even if we had time, do we really need quotes from books? For sure some are interesting, etc., but is it worth taking the time to check the law AND to watch how contributors add quotes (in case it's legal)?

In fact I've figured out there's a problem in the sorting: it should normally propose "proverb" as the first suggestion when you type "p", but for some strange reason it's not working. So I think that may be a first reason why it's not so practical?
As for the Enter key selecting the first suggestion rather than what you've typed, it was just a "test" anyway. I don't know if you know, but instead of typing as fast as you can, you can just press Escape. If other people also prefer to press the down arrow or click on the first suggestion and then press Enter, I will change it, because I only added it to make users' lives easier ^^
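The intended behaviour can be sketched roughly as follows. This is only an illustration, not the site's code: ranking matches by how often a tag is used is an assumption (the post only says "proverb" should come first for "p"), and the names `suggest`, `resolve`, and `tag_counts` are hypothetical.

```python
def suggest(prefix, tag_counts, limit=5):
    """Return tags starting with `prefix`, most used first.

    tag_counts: dict mapping tag name -> number of uses (assumed ranking signal).
    """
    prefix = prefix.lower()
    matches = [t for t in tag_counts if t.lower().startswith(prefix)]
    # Sort by descending use count, then alphabetically for stable ties.
    matches.sort(key=lambda t: (-tag_counts[t], t.lower()))
    return matches[:limit]


def resolve(typed, suggestions, key):
    """Decide which tag is submitted for a given key press.

    "enter" takes the first suggestion when there is one;
    any other key (for example "escape") keeps the typed text.
    """
    if key == "enter" and suggestions:
        return suggestions[0]
    return typed


tag_counts = {"proverb": 120, "poetry": 15, "politics": 40, "question": 300}
options = suggest("p", tag_counts)
chosen_with_enter = resolve("pro", options, "enter")
chosen_with_escape = resolve("pro", options, "escape")
```

If people prefer the other behaviour, only `resolve` would need to change, so that Enter submits the typed text unless a suggestion was explicitly highlighted first.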

If you see content that may have been taken from a recent book, movie, or song lyrics, or even if you're not sure, please tag it as "possible copyright violation", thanks :)

Wrong transliteration:
I will soon try to make it possible for moderators / trusted users to edit them through a "special" page. I haven't checked yet, but maybe we can agree on a tag such as "wrong transliteration"; this way, when the feature is ready, we will be able to edit them.