Tatoeba
Wezel, December 18, 2016 at 12:27:24 AM UTC, edited December 18, 2016 at 1:41:57 AM UTC

Bug (?)

Searching for "påse" in "any language" yields, besides Swedish, some unexpected results in Turkish.
https://tatoeba.org/eng/sentenc...rom=und&to=und
http://i.imgur.com/mWYmHkM.png
http://i.imgur.com/GtXMpy1.png

AlanF_US, December 18, 2016 at 7:15:48 PM UTC, edited December 18, 2016 at 7:16:28 PM UTC

That's interesting behavior, but it shouldn't get in anyone's way. People can get reasonable results by restricting the search as follows:

- From: Swedish; Search: påse
- From: Any language; Search: *påse*
- From: Any language; Search: =påse

People searching in Turkish would presumably not use the å character.

Apparently, the Turkish stemmer (software that reduces the different forms of a word to a common stem) converts the "på" part of the string into something that matches "bunu", "bindim", and so on, and the "se" part into something that matches "senden", "senin", and so on. Note that the stemmer was not developed by us, or even by the developers of Sphinx, the search engine that we use. It was contributed by someone named Evren Çilden to a project called Snowball, which Sphinx uses.

In case anyone is curious, here's a link to the stemmer:

https://raw.githubusercontent.c...em_Unicode.sbl
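
For anyone who wants to see the general idea in code, here is a rough Python sketch of why a plain search can match unexpected words while the "=" and "*...*" forms do not. It is only a toy: the suffix list, `toy_stem`, and `matches` are all made up for illustration, and none of this is the actual Snowball stemmer or Sphinx's matching logic.

```python
# Toy illustration only: NOT the Snowball Turkish stemmer and NOT Sphinx.
# The suffix list below is hypothetical and chosen just to make the point.
TOY_SUFFIXES = ("den", "dim", "in", "u", "e")


def toy_stem(word: str) -> str:
    """Strip the longest matching suffix, keeping at least two characters."""
    word = word.lower()
    for suffix in sorted(TOY_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


def matches(query: str, word: str) -> bool:
    """Mimic three query styles: =exact, *substring*, and plain (stemmed)."""
    if query.startswith("="):
        # Exact form: compare the surface forms, no stemming.
        return word.lower() == query[1:].lower()
    if query.startswith("*") and query.endswith("*") and len(query) > 2:
        # Wildcard form: look for the literal text inside the word.
        return query[1:-1].lower() in word.lower()
    # Plain form: both sides are stemmed, so different words can collide.
    return toy_stem(query) == toy_stem(word)


if __name__ == "__main__":
    for query in ("senin", "=senin", "*påse*"):
        for word in ("senin", "senden", "påsen"):
            print(f"{query!r} vs {word!r}: {matches(query, word)}")
```

In this toy, a plain search for "senin" also matches "senden" because both are reduced to the same stem, whereas "=senin" only matches "senin" itself. The real stemmer is far more elaborate, but the collision effect is the same kind of thing.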

Wezel, December 18, 2016 at 8:21:28 PM UTC

Thank you, that’s interesting.