Comments on sharptoothed's sentences

グレイト・マインドス・シンック・エライック・ユ・ノウ ^^ [English spelled in katakana: "Great minds think alike, you know"]

エンド・ヲッタ・エバウット・カタカナ・ラッシャン? ;-) [English spelled in katakana: "And what about katakana Russian?"]

For some reason, the Tatoeba search engine ignores a plain "why" in queries. Try searching for "=why".

Another problem is that Cyrillic and Latin letters look different in this font (at least on my PC), and this is just awful. :'(

Hi! Nice to see you here.
Welcome to Tatoeba! ^^

Nice song and great idea as well. Thanks, Alex! :-)

It would also be nice if the tag auto-suggestion drop-down list started working again.

Congrats on the promotion! :-)

It seems that the search engine is completely non-functional at the moment.

The New Year came to Japan a few minutes ago. High time to start celebrating! :-)
Happy New Year! ^^

Judging by the file size changes, some nice person still cares about the Japanese indices but, by all appearances, not very intensively.

Welcome to Tatoeba! :-)
> What worries me first of all is one thing: won't I get flattened into a sheet of paper...
The previous speaker put it all with utmost precision: nobody is going to flatten you into anything. :-) Just keep in mind that many members are so protective of their "kids" (sentences, that is) that they will defend them tooth and claw, however awful their offspring may be. You have to be ready for that. :-)

> I strongly believe: linking sentences is a right of
> advanced contributors, not an obligation.
Indeed, being an AC or a CM is not an obligation. On the other hand, nobody compels a member to become an AC, to say nothing of a CM. Doesn't a higher position imply a bit more responsibility and a bit less freedom as well? Of course, in a voluntary community people do what they want (within the limits of the rules, though), but if one doesn't want to use one's extra rights, why bother gaining them?

I think we just need a reasonable compromise between corpus needs and computational needs. Regular corpus cleaning and de-duplication are among the essentials, I believe. In any case, it's up to us not to make things worse. As we say, it's clean not where they clean up but where they don't litter. :-)

Indeed, a newly added sentence has a better chance of being translated than an older one, and, yes, the new translation may be better than the old one. But to rethink the existing version, we have to at least be aware of its existence. That is, overall, searching the corpus is more effective than adding (potential) duplicates.
Besides, in the majority of cases duplicates are either basic/simple sentences (greetings, for example) or well-known quotations (the Bible, proverbs, etc.), and I don't think the benefit outweighs the burden in this situation.

> I think that time someone proved how they can become beneficial to the corpus.
I'd really like to learn his arguments, since I don't understand how the waste of storage and CPU time, the excessive complexity of the graph structure, and non-optimal search results, not to mention the wasted effort of members translating duplicates time and time again, can be of any benefit to the corpus.

Sorry, I should have added more smileys to my comment. :-)
Joking apart, duplicates are not harmful provided that de-duplication is performed regularly. Sorry to say, I don't remember when it was last done.
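
To make "de-duplication" a bit more concrete, here is a minimal sketch of what such a pass could look like. It is not Tatoeba's actual script: the data layout (a dict of sentences keyed by id, links as id pairs) and the function name `deduplicate` are assumptions made up for the illustration. Sentences with the same language and text are merged into the one with the lowest id, and links are re-pointed to the survivor.

```python
def deduplicate(sentences, links):
    """Merge exact duplicates (same language and text).

    sentences: dict mapping id -> (lang, text)   (hypothetical layout)
    links:     set of (id_a, id_b) translation links
    Returns (kept_sentences, new_links).
    """
    canonical = {}  # (lang, text) -> lowest id carrying that text
    remap = {}      # every id -> id of the sentence that survives

    # Walk ids in ascending order so the oldest (lowest) id is kept.
    for sid in sorted(sentences):
        key = sentences[sid]
        remap[sid] = canonical.setdefault(key, sid)

    kept = {sid: sentences[sid] for sid in set(remap.values())}

    # Re-point links at the survivors; drop self-links and repeated links.
    new_links = set()
    for a, b in links:
        a, b = remap[a], remap[b]
        if a != b:
            new_links.add((min(a, b), max(a, b)))
    return kept, new_links


sentences = {
    1: ("eng", "Hello!"),
    2: ("rus", "Привет!"),
    3: ("eng", "Hello!"),   # duplicate of sentence 1
    4: ("fra", "Salut !"),
}
links = {(1, 2), (3, 2), (3, 4)}

kept, new_links = deduplicate(sentences, links)
```

In this toy corpus sentence 3 disappears, and its translations end up attached to sentence 1, so nothing that was translated twice is lost.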

> ... should I add the same sentence as a new direct translation?
No, you shouldn't, since duplicates are not welcome on Tatoeba and you have to avoid adding duplicate sentences tooth and nail. :-) Advanced contributors and corpus maintainers can link any pair of sentences, so if you feel that sentence #12345 can be linked to sentence #67890, just add an appropriate comment and someone will take care of that pair.

I think you (and I, too) understand it right. :-) Sphinx performs stemming before searching (not for every language, though), so the results will include all word forms that share the same stem.

You can read about the search syntax starting here: http://sphinxsearch.com/docs/cu...boolean-syntax
In your example, the following search string should do what you need: "=venire =per"
'=' tells the search engine to use the exact word form, and the quotation marks make the engine search for that particular phrase.
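
The difference between a plain term, an "=" term, and a quoted phrase can be imitated with a toy matcher. This is only an illustration of the behaviour described above, not Sphinx code or its API: `toy_stem` is a deliberately crude stand-in for a real stemmer, and `phrase_matches` is a hypothetical helper that takes the phrase without its surrounding quotation marks.

```python
def toy_stem(word: str) -> str:
    """Crude stand-in for a real stemmer: strip a few English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_matches(term: str, word: str) -> bool:
    """'=term' must match the exact word form; a plain term matches by stem."""
    if term.startswith("="):
        return term[1:] == word
    return toy_stem(term) == toy_stem(word)


def phrase_matches(phrase: str, sentence: str) -> bool:
    """True if the terms of the phrase match consecutive words of the sentence."""
    terms = phrase.lower().split()
    if not terms:
        return False
    words = [w.strip(".,!?;:\"'") for w in sentence.lower().split()]
    n = len(terms)
    return any(
        all(term_matches(t, w) for t, w in zip(terms, words[i:i + n]))
        for i in range(len(words) - n + 1)
    )


stemmed_hit = phrase_matches("walk", "She walked home.")      # matches via the stem
exact_miss = phrase_matches("=walk", "She walked home.")      # exact form required
exact_hit = phrase_matches("=walk =home", "I walk home every day.")
```

A plain "walk" finds "walked" because both reduce to the same stem, while "=walk" only accepts the word exactly as typed.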