Silja, January 24, 2015 at 10:57:58 PM UTC

Lately the language detection has initially marked quite a few of my Finnish sentences as English. Here are some of them:

Tom justiinsa teki niin. http://tatoeba.org/fin/sentences/show/3785040
Nouse ylös ja taistele. http://tatoeba.org/fin/sentences/show/3785043
Tom hikoili. http://tatoeba.org/fin/sentences/show/3785774
No onko edes siistiä? http://tatoeba.org/fin/sentences/show/3786584
Onko Tom vielä hereillä? http://tatoeba.org/fin/sentences/show/3789015
Onko Tom edelleen hereillä? http://tatoeba.org/fin/sentences/show/3789014
Puhuvatko he ranskaa? http://tatoeba.org/fin/sentences/show/3790314
Emme me puhu ranskaa. http://tatoeba.org/eng/sentences/show/3793029
Ole varovainen. Älä heitä pois noita papereita. http://tatoeba.org/eng/sentences/show/3794756

Yes, some words in those sentences could also be English ("no", "me"), but otherwise I really can't understand why these are detected as English. If I remember correctly, the language detector needs to be updated from time to time, so that it "learns" better which combinations of letters should be detected as which language. Has this update been made recently?
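To illustrate what "learning combinations of letters" could mean, here is a minimal sketch of a character-trigram detector. It is only an assumption about the general technique, not the tool Tatoeba actually runs; the function names (`char_ngrams`, `train`, `detect`) and the tiny `SAMPLES` training set are made up for the example.

```python
from collections import Counter

# Hypothetical training data: a few sentences per language.
# A real detector would be trained on many thousands of sentences.
SAMPLES = {
    "fin": ["Onko Tom vielä hereillä?", "Puhuvatko he ranskaa?", "Emme me puhu ranskaa."],
    "eng": ["Is Tom still awake?", "Do they speak French?", "We do not speak French."],
}

def char_ngrams(text, n=3):
    """Return the character n-grams of a lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def train(samples):
    """Build one n-gram frequency profile (a Counter) per language."""
    return {
        lang: Counter(g for sentence in sentences for g in char_ngrams(sentence))
        for lang, sentences in samples.items()
    }

def detect(text, profiles):
    """Pick the language whose profile gives the text's n-grams the most weight."""
    scores = {}
    for lang, profile in profiles.items():
        total = sum(profile.values())
        # Each n-gram contributes its relative frequency in this language's profile.
        scores[lang] = sum(profile[g] / total for g in char_ngrams(text))
    return max(scores, key=scores.get)

profiles = train(SAMPLES)
print(detect("Onko Tom edelleen hereillä?", profiles))
print(detect("They speak French.", profiles))
```

With a scheme like this, a short sentence has only a handful of trigrams, so a name such as "Tom" or a word such as "me" that is frequent in the English profile can carry a lot of weight. Retraining on more sentences changes the profiles, which would be one way an "update" improves the guesses.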

I'm not complaining, because only something like 1 out of 100 sentences is detected wrongly and it's no big deal to correct them manually, but I'm just curious. :)

gillux, January 24, 2015 at 11:46:36 PM UTC

> Has this update been made recently?
Yes, on the 17th of November, 2014.

I’m not familiar with the language detection tool so I can’t tell you much about its weaknesses.