Wall messages by sharptoothed

sharptoothed, 2016-02-01 08:50:02 UTC

** Sentences & Translations Stats **

These stats have been updated.

http://tatoeba.j-langtools.com/transtop/

Various graphs have been updated, too:

http://tatoeba.j-langtools.com/graphs.html

sharptoothed, 2016-01-06 16:30:01 UTC (edited 2016-01-06 16:34:24 UTC)

It seems that the RCVD_REMOVED check was added to the SpamAssassin rules on December 16th, and December 21st may be the date your provider updated its SpamAssassin installation. So, if your provider's SMTP server removes 'Received:' headers, this simply went unnoticed until the update.

Judging by the mail headers, Tatoeba doesn't have its own MTA. It sends mail via a Google SMTP server.

sharptoothed, 2016-01-06 16:01:57 UTC (edited 2016-01-06 16:07:19 UTC)

The Tatoeba e-mail notification system has nothing to do with this issue, as far as I can judge.
'Received:' headers are inserted by SMTP servers at every hop (i.e., each SMTP server that relays or delivers a message adds its own 'Received:' header for tracking purposes).
In SpamAssassin v3.4.1, the RCVD_REMOVED check fails in only three cases:
- the message was received via a mailing list running ezmlm software configured to remove all 'Received:' headers except the top one;
- all 'Received:' headers were removed from the message (normal SMTP servers and mail programs never do this);
- the message was received via MSN Groups (this service removes the 'Received:' headers added by the SMTP relays the message has passed through).

To nail the problem down, please check the headers of the messages you receive from Tatoeba.
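
For anyone who wants to inspect the headers programmatically, here is a minimal Python sketch. It only lists the 'Received:' headers of a raw message using the standard `email` module; it does not reproduce SpamAssassin's actual RCVD_REMOVED logic. The function name `received_hops` and the sample message (hosts, addresses, dates) are made up for illustration.

```python
from email import message_from_string


def received_hops(raw_message: str) -> list[str]:
    """Return the values of all 'Received:' headers, topmost first."""
    msg = message_from_string(raw_message)
    # get_all() returns the fallback (an empty list here) when the header is absent.
    return msg.get_all("Received", [])


# Hypothetical message that passed through two SMTP hops.
sample = (
    "Received: from mx.example.net by mail.example.org; Wed, 6 Jan 2016 16:00:00 +0000\n"
    "Received: from sender.example.com by mx.example.net; Wed, 6 Jan 2016 15:59:58 +0000\n"
    "From: notifications@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

hops = received_hops(sample)
print(f"{len(hops)} 'Received:' header(s) found")
for hop in hops:
    print(" ", hop)
```

If a message saved from your mailbox yields an empty list or only a single hop, that would suggest some server on the way is stripping these headers.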

sharptoothed, 2016-01-03 13:40:20 UTC

It would be nice to have a mass deletion function in the personal messaging system (in all folders).

sharptoothed, 2016-01-01 09:18:56 UTC

С Новым Годом!
Happy New Year!
明けましておめでとうございます!
:-)

sharptoothed, 2015-12-28 11:34:18 UTC

** Sentences & Translations Stats **

These stats have been updated. This was the last update of the year. :-)

http://tatoeba.j-langtools.com/transtop/

Various graphs are now available on one page:

http://tatoeba.j-langtools.com/graphs.html

sharptoothed, 2015-12-17 09:56:16 UTC

It works like a charm now. Thanks, gillux!

sharptoothed, 2015-12-17 08:40:05 UTC

It seems that the database used by the search engine needs to be re-indexed to reflect recent changes. Judging from my example, the lag is at least two days. I don't know how often re-indexing is carried out, but even a one-day interval is too long, since it makes the advanced search feature almost unusable in certain scenarios.

sharptoothed, 2015-12-16 19:09:09 UTC

Apparently, indexing is not being performed for some reason. If a translation was made recently, the search engine does not exclude it from the results.

sharptoothed, 2015-12-16 18:56:10 UTC (edited 2015-12-16 18:56:37 UTC)

If your result is different, then the search must dislike me personally. :-)
Different browser, different computer, different OS, different provider, and the result is the same:
https://dl.dropboxusercontent.c...advsearch2.png

sharptoothed, 2015-12-16 18:48:36 UTC

Have you tried reproducing the search conditions shown in the picture? :-)

sharptoothed, 2015-12-16 13:22:26 UTC

** Advanced search feature is slightly broken? **

Recently I've noticed that advanced search sometimes fails to filter unwanted translations out of the results. Please take a look:
https://dl.dropboxusercontent.c.../advsearch.png
According to the conditions, direct Russian translations should be excluded from the results, but they still show up there.

sharptoothed, 2015-11-23 08:15:28 UTC

** Sentences & Translations Stats **

These stats have been updated.

http://tatoeba.j-langtools.com/transtop/

Various graphs are now available on one page:

http://tatoeba.j-langtools.com/graphs.html

sharptoothed, 2015-11-22 18:04:58 UTC (edited 2015-11-22 18:55:20 UTC)

> whether adding romanization for “any other language that's not "romanized" (that don't use the Roman script)” is a good thing...

I think that for different languages we could add different data that is relevant to the particular language. For Russian, for example, we could provide a version of a sentence with stress marks. Romanization is just one of the options.

sharptoothed, 2015-11-16 12:34:50 UTC

** Sentences & Translations Stats **

These stats and the underlying software have been updated.

http://tatoeba.j-langtools.com/transtop/

The following graphs have been updated, too:

Top 13 Tatoeba languages monthly dynamics:
http://tatoeba.j-langtools.com/langs.png

English sentences distribution by word count:
http://tatoeba.j-langtools.com/dist-eng-wc.png

English sentences distribution by translation count:
http://tatoeba.j-langtools.com/dist-eng-tc.png

sharptoothed, 2015-11-13 09:08:55 UTC (edited 2015-11-13 09:14:47 UTC)

Looks like a localization bug. I hope @sadhen, the Chinese UI translator, or some other authorized person will eventually fix it.

sharptoothed, 2015-11-12 19:45:08 UTC

It all depends on how, with what, and for what purpose you open the downloaded file.

sharptoothed, 2015-11-09 09:00:21 UTC (edited 2015-11-09 15:06:26 UTC)

** Sentences & Translations Stats **

These stats have been updated.

http://tatoeba.j-langtools.com/transtop/

The following graphs are now available:

Top 13 Tatoeba languages monthly dynamics (updated):
http://tatoeba.j-langtools.com/langs.png

English sentences distribution by word count (updated):
http://tatoeba.j-langtools.com/dist-eng-wc.png

English sentences distribution by translation count (new):
http://tatoeba.j-langtools.com/dist-eng-tc.png

sharptoothed, 2015-11-05 10:12:12 UTC

> It might be useful to have the selection default to a member's registered native language when the script has trouble detecting a language

I think this really can narrow the problem down a bit.
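
A rough Python sketch of that idea, under the assumption that the detector can report a score per candidate language. The function `choose_language`, the score format, and the `min_margin` threshold are all hypothetical and not part of Tatoeba's code.

```python
def choose_language(scores: dict[str, float], native_language: str,
                    min_margin: float = 0.1) -> str:
    """Return the detector's top guess, or the member's native language
    when the top guess is not clearly ahead of the runner-up."""
    if not scores:
        # The detector produced nothing at all: use the registered language.
        return native_language
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_lang, best_score = ranked[0]
    runner_up_score = ranked[1][1] if len(ranked) > 1 else 0.0
    if best_score - runner_up_score < min_margin:
        # Too close to call: trust the member's profile instead.
        return native_language
    return best_lang


print(choose_language({"rus": 0.51, "ukr": 0.49}, native_language="ukr"))
print(choose_language({"eng": 0.90, "deu": 0.10}, native_language="rus"))
```

The fallback only kicks in for ambiguous cases, so a member writing in a foreign language would still get the detected language whenever the detector is confident.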

sharptoothed, 2015-11-05 08:10:12 UTC

The short answer is: the Tatoeba language autodetection engine is not perfect.

The algorithm it is built upon (character n-gram language models, as far as I know) is probabilistic and thus inexact by design, especially for languages that use similar alphabets and (in the worst case) belong to the same language group. The accuracy can be increased (but not radically at the moment, I'm afraid) by regular n-gram database training and proper n-gram weighting. The problem is that this requires a rather big corpus for each language, and the Tatoeba corpus is still not big enough for the majority of its languages.
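
To give a feel for how a character n-gram model guesses a language, here is a toy Python sketch. It is not the Tatodetect implementation: the trigram size, the add-one smoothing, the function names, and the tiny training corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def ngrams(text: str, n: int = 3) -> list[str]:
    """Split text into overlapping character n-grams, padded with spaces."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(corpus: dict[str, list[str]], n: int = 3) -> dict[str, Counter]:
    """Count n-grams per language from example sentences."""
    return {
        lang: Counter(g for sentence in sentences for g in ngrams(sentence, n))
        for lang, sentences in corpus.items()
    }


def detect(text: str, models: dict[str, Counter], n: int = 3) -> str:
    """Return the language whose n-gram counts make the text most likely."""
    vocabulary_size = len(set().union(*models.values()))

    def log_probability(counts: Counter) -> float:
        total = sum(counts.values())
        # Add-one smoothing keeps unseen n-grams from zeroing the score.
        return sum(
            math.log((counts[g] + 1) / (total + vocabulary_size))
            for g in ngrams(text, n)
        )

    return max(models, key=lambda lang: log_probability(models[lang]))


# Toy corpus; a real model needs far more sentences per language.
corpus = {
    "eng": ["the cat is on the table", "this is the house"],
    "deu": ["die katze ist auf dem tisch", "das ist das haus"],
}
models = train(corpus)
print(detect("the house is here", models))
print(detect("das ist die katze", models))
```

With so little training data, any input that shares n-grams with both languages can easily tip the wrong way, which is the corpus-size problem in miniature.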

Another approach is to augment the existing algorithm with auxiliary heuristics for each language, or at least for the most problematic ones. This can increase the accuracy dramatically, but it requires the collaboration of members who know those languages well and have enough time and knowledge to assist Allan Simon, the author of the Tatodetect engine, provided that he himself has enough time and motivation.
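
As an illustration of what such a heuristic could look like, here is a small Python sketch that overrides a statistical guess when the sentence contains letters specific to one Cyrillic-script language. The table `DISTINCTIVE`, the function `refine_guess`, and the language codes are made up for this example and are not part of Tatodetect; the letter sets are deliberately minimal.

```python
# Letters found in one of these languages but not in modern Russian.
# A deliberately small, illustrative table.
DISTINCTIVE = {
    "ukr": set("їєґ"),
    "bel": set("ў"),
}


def refine_guess(text: str, statistical_guess: str) -> str:
    """Override the statistical guess when exactly one language's
    distinctive letters occur in the text; otherwise keep the guess."""
    letters = set(text.lower())
    matches = [lang for lang, chars in DISTINCTIVE.items() if letters & chars]
    if len(matches) == 1:
        return matches[0]
    return statistical_guess


print(refine_guess("Це моя їжа.", "rus"))
print(refine_guess("Это мой дом.", "rus"))
```

Rules like this are cheap and precise when they fire, but each one has to be written by someone who knows the language, which is exactly the collaboration cost mentioned above.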