Wall messages by sharptoothed

Congrats on the promotion! :-)

It seems that the search engine is completely non-functional at the moment.

The New Year came to Japan a few minutes ago. High time to start celebrating! :-)
明けましておめでとうございます! (Happy New Year!) ^^

Judging by the file size changes, some kind person still looks after the Japanese indices, though apparently not very intensively.

Welcome to Tatoeba! :-)
> One thing worries me above all: will I get flattened into a sheet of paper...
The previous speaker put it quite precisely: nobody is going to flatten you into anything. :-) Just keep in mind that many members are so protective of their "babies" (their sentences, that is) that they will defend them tooth and claw, however awful their offspring may be. You have to be ready for that. :-)

> I strongly believe: linking sentences is a right of
> advanced contributors, not an obligation.
Indeed, being an AC or a CM is not an obligation. On the other hand, nobody compels a member to become an AC, to say nothing of a CM. Doesn't a higher position imply a bit more responsibility and a bit less freedom as well? Of course, in a voluntary community people do what they want (within the limits of the rules, though), but if one doesn't want to use one's extra rights, why bother gaining them?

I think we just need a reasonable compromise between corpus needs and computational needs. Regular corpus cleaning and de-duplication are among the essentials, I believe. In any case, it's up to us not to make things worse. As we say, a place is clean not where people clean up but where they don't litter. :-)

Indeed, a newly added sentence is more likely to be translated than an older one, and, yes, the new translation may be better than the old one. But to rethink the existing version we have to at least be aware of its existence. That is, overall, searching the corpus first is more effective than adding (potential) duplicates.
Besides, in the majority of cases duplicates are either basic/simple sentences (greetings, for example) or well-known quotations (Bible, proverbs, etc.), and I don't think the benefit outweighs the burden in those cases.

> I think that time someone proved how they can become beneficial to the corpus.
I'd really like to hear those arguments, since I don't understand how the waste of storage and CPU time, the excessive complexity of the graph structure and non-optimal search results, not to mention the wasted effort of members translating duplicates time and time again, can be of any benefit to the corpus.

Sorry, I should have added more smileys to my comment. :-)
Joking aside, duplicates are not harmful provided that de-duplication is performed regularly. Sorry to say, I don't remember when it was last done.
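To make "de-duplication" a bit more concrete, here is a minimal sketch in Python. It is not the actual Tatoeba script, which I haven't seen. It assumes that two sentences are duplicates when they have the same language and the same text after collapsing whitespace, and that the oldest sentence (lowest id) is the one to keep. The names find_duplicates and merge_plan are made up for illustration.

```python
from collections import defaultdict


def find_duplicates(sentences):
    """Group sentence ids by (language, whitespace-normalized text).

    `sentences` is an iterable of (sentence_id, language, text) tuples.
    Returns only the groups that contain more than one id.
    """
    groups = defaultdict(list)
    for sent_id, lang, text in sentences:
        # Assumption: only runs of whitespace are normalized; case and
        # punctuation differences are treated as distinct sentences.
        key = (lang, " ".join(text.split()))
        groups[key].append(sent_id)
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


def merge_plan(duplicates):
    """Map every duplicate id to the id that should be kept (the lowest)."""
    plan = {}
    for ids in duplicates.values():
        keep = ids[0]
        for dup in ids[1:]:
            plan[dup] = keep
    return plan


if __name__ == "__main__":
    corpus = [
        (1, "eng", "Hello!"),
        (2, "eng", "Hello!"),
        (3, "fra", "Hello!"),
        (4, "eng", "Hello!  "),
    ]
    print(merge_plan(find_duplicates(corpus)))
```

A real merge would also have to move links, comments and tags from the removed sentence to the kept one, which this sketch leaves out.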

> ... should I add the same sentence as a new direct translation?
No, you shouldn't, since duplicates are not welcome on Tatoeba and you should do your utmost to avoid adding them. :-) Advanced users and corpus maintainers can link any pair of sentences, so if you feel that sentence #12345 can be linked to sentence #67890, just add an appropriate comment and someone will take care of that pair.

I think you (and I, too) understand it correctly. :-) Sphinx performs stemming before searching (not for every language, though), so the results will include all word forms that share the same stem.
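A toy illustration of why stemming widens the results, written in Python. This is not the stemmer Sphinx uses; the suffix list and the function names toy_stem and matches are invented for the example and only handle a few English endings.

```python
# Hypothetical, deliberately naive suffix list (longest endings first).
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip one known suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query_word, sentence):
    """True if any whitespace-separated token shares the query word's stem."""
    target = toy_stem(query_word)
    return any(toy_stem(token) == target for token in sentence.split())


if __name__ == "__main__":
    for form in ("search", "searches", "searched", "searching"):
        print(form, "->", toy_stem(form))
    print(matches("search", "She searches the corpus"))
```

Because the query and the indexed words are both reduced to the stem before comparison, a search for one form finds the others as well.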

You can read about the search syntax starting here: http://sphinxsearch.com/docs/cu...boolean-syntax
In your example, the following search string should do what you need: "=venire =per"
The '=' tells the search engine to match the exact word form, and the quotation marks make it search for that particular phrase.
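If you build such search strings often, a tiny helper can do it. This is only a sketch in Python: the function name exact_phrase_query is made up, and it assumes the words contain no whitespace, quotation marks or other characters that the search syntax treats as operators. As far as I know, the '=' operator also depends on how the index is configured, so it may not work for every language.

```python
def exact_phrase_query(words):
    """Build a phrase query in which every word must match its exact form.

    Each word is prefixed with '=' (exact form) and the whole sequence is
    wrapped in double quotes (phrase search).
    """
    if not words:
        raise ValueError("at least one word is required")
    for word in words:
        # Assumption: plain words only; reject anything that could be
        # read as part of the query syntax.
        if not word or any(ch.isspace() or ch in '"=' for ch in word):
            raise ValueError(f"unsupported word: {word!r}")
    return '"' + " ".join("=" + word for word in words) + '"'


if __name__ == "__main__":
    print(exact_phrase_query(["venire", "per"]))
```

For the example above, exact_phrase_query(["venire", "per"]) gives the string "=venire =per", quotation marks included.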

Generally, the "OK" tag means that a sentence was reviewed by a native speaker who confirmed that it is natural and grammatical. Clearly, the "OK" and "@needs native check" tags are mutually exclusive, so when you ask for the "OK" tag you are automatically requesting removal of the "@needs native check" tag.

Welcome to the Tatoeba Project!
The links below should help you to decide what to do. :-)
http://en.wiki.tatoeba.org/arti...ow/quick-start
http://tatoeba.org/faq
http://tatoeba.org/help

This is a quotation from the Terms of Use of Biblica (the NIV owner):
----
The NIV®, TNIV®, NIrV® may be quoted in any form (written, visual, electronic or audio) up to and inclusive of five hundred (500) verses without the express written permission of the publisher, providing the verses quoted do not amount to a complete book of the Bible nor do the verses quoted account for more than 25 percent (25%) or more of the total text of the work in which they are quoted. For additional rights and permission usage on the NIV®, NIrV® and the TNIV® Bible please contact Biblica.
http://www.biblica.com/biblica-.../terms-of-use/
----
So it seems we're not in trouble yet (maybe), but I think we should avoid further use of NIV quotations. There are plenty of public-domain Bible versions, including ones written in modern English (the World English Bible, for example).

"Egyptian Arabic" is recognized as a separate language in the ISO 639-3 standard.
http://www-01.sil.org/iso639-3/...ion.asp?id=arz

I hope this will help: http://tatoeba.org/help

AlanF_US recently became a CM, so he can delete all those sentences himself, I believe.

It seems we're experiencing the same problem again. Isn't it high time to implement some kind of connections-per-second limit on the Tatoeba web server?
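To show what I mean by such a limit, here is a minimal sketch in Python of a sliding-window limiter that allows each client a fixed number of requests per window. It is an illustration only, not a proposal for how the Tatoeba server should implement it. The class name PerClientRateLimiter, the use of an IP address as the client id and the numbers in the example are all assumptions.

```python
import time
from collections import defaultdict, deque


class PerClientRateLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client."""

    def __init__(self, max_requests, window_seconds=1.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # client id -> recent request times

    def allow(self, client_id):
        """Record the request and return True, or return False if over the limit."""
        now = self.clock()
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = PerClientRateLimiter(max_requests=5, window_seconds=1.0)
    results = [limiter.allow("203.0.113.7") for _ in range(7)]
    print(results)
```

In practice a limit like this is usually configured in the web server or a reverse proxy in front of it rather than written in application code, but the logic is the same.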