DostKaplan, June 14, 2019 at 10:18 PM, edited June 14, 2019 at 10:19 PM

The asterisk notation used in searches has become less useful, in my opinion. When searching for "bir daha *mak" from Turkish to English, I get results like:

Bir daha orada yaşamak istemiyor.
Onu bir daha hiç yapmak istemiyorum.

If I had wanted an intervening word, I would have searched for "bir daha * *mak"! But that's not what I want. I specifically want results, if any, like:

...bir daha almak...
...bir daha yapmak...

https://tatoeba.org/eng/sentenc...rom=tur&to=eng

AlanF_US, June 15, 2019 at 12:43 PM, edited June 15, 2019 at 1:03 PM

You can achieve what you want by using the proximity search operator, ~. The following search:

"bir daha *mak"~1

will search for the phrase "bir daha" separated from a word ending with "mak" by up to, but not including, 1 word. (In other words, there must be 0 words separating them.) See this link:

https://docs.manticoresearch.co...ry_syntax.html

I also updated our wiki page:

https://en.wiki.tatoeba.org/art...ow/text-search
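
To make the "0 words separating them" behaviour concrete, here is a minimal Python sketch that emulates it on plain strings. It is not how Sphinx/Manticore implements proximity search; the helper name `matches_adjacent`, the whitespace tokenization, and the use of `fnmatch` for the `*` wildcard are all assumptions made for illustration.

```python
import fnmatch


def matches_adjacent(sentence, pattern_words):
    """Return True if pattern_words match consecutive words of sentence.

    Hypothetical helper: emulates a phrase search with zero intervening
    words. Each pattern word may use * as a wildcard.
    """
    # Naive tokenization: split on whitespace, drop edge punctuation, lowercase.
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    n = len(pattern_words)
    # Slide a window of n words across the sentence.
    for start in range(len(words) - n + 1):
        window = words[start:start + n]
        if all(fnmatch.fnmatchcase(w, p) for w, p in zip(window, pattern_words)):
            return True
    return False


pattern = ["bir", "daha", "*mak"]
# The two unwanted results quoted earlier in the thread:
print(matches_adjacent("Bir daha orada yaşamak istemiyor.", pattern))
print(matches_adjacent("Onu bir daha hiç yapmak istemiyorum.", pattern))
# The kind of fragment that was actually wanted:
print(matches_adjacent("bir daha yapmak", pattern))
```

The first two sentences are rejected because a word sits between "bir daha" and the "-mak" word; only the last fragment is accepted.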

DostKaplan, June 15, 2019 at 3:34 PM

Wow. I guess this is a new feature. Thanks.

AlanF_US, June 15, 2019 at 6:08 PM

You're welcome. You could also do this search:

bir NEAR/1 daha NEAR/1 *mak

I'm not sure it makes sense to describe the NEAR operator on the wiki page, but you can find a description here:

https://docs.manticoresearch.co...ry_syntax.html

DostKaplan, June 15, 2019 at 7:11 PM

It's pretty unintuitive, no? A more regexp-like notation would have been better. Like...

"bir daha *mak" (no intervening words)
Matches: "bir daha yapmak"

"bir* daha *mak" (no intervening words)
Matches: "biraz daha çalışmak"

"bir daha *{2} *mak" (up to two intervening words)
Matches:
"bir daha yapmak"
"bir daha araba kullanmak"
"bir daha yeni araba kullanmak"

"bir daha *{1} *m[a|e]k" (up to one intervening word and matching *mak or *mek)
Matches:
"bir daha yapmak"
"bir daha söylemek"
"bir daha araba kullanmak"
"bir daha araba sürmek"

AlanF_US, June 16, 2019 at 2:12 PM

That syntax would work fine for me, since I'm familiar with regular expressions, but perhaps the Sphinx/Manticore team decided it would be difficult for people who weren't. Or perhaps they had other constraints.
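
For readers who want to experiment with the regexp-like notation proposed above, the sketch below shows one way the four example patterns could be written as ordinary Python regular expressions. The proposed notation is not something Tatoeba or Manticore supports; the translations, the `PROPOSED` table, and the assumption that words are separated by single spaces are illustrative only.

```python
import re

# Proposed notation (from the thread) -> a hand-written Python regex.
# (?:\w+ ){0,N} allows up to N intervening words; \w* stands in for *.
PROPOSED = {
    '"bir daha *mak"': r"\bbir daha \w*mak\b",
    '"bir* daha *mak"': r"\bbir\w* daha \w*mak\b",
    '"bir daha *{2} *mak"': r"\bbir daha (?:\w+ ){0,2}\w*mak\b",
    '"bir daha *{1} *m[a|e]k"': r"\bbir daha (?:\w+ ){0,1}\w*m[ae]k\b",
}

# Example phrases taken from the thread, grouped by the pattern they
# are supposed to match.
EXAMPLES = {
    '"bir daha *mak"': ["bir daha yapmak"],
    '"bir* daha *mak"': ["biraz daha çalışmak"],
    '"bir daha *{2} *mak"': [
        "bir daha yapmak",
        "bir daha araba kullanmak",
        "bir daha yeni araba kullanmak",
    ],
    '"bir daha *{1} *m[a|e]k"': [
        "bir daha yapmak",
        "bir daha söylemek",
        "bir daha araba kullanmak",
        "bir daha araba sürmek",
    ],
}

for notation, regex in PROPOSED.items():
    compiled = re.compile(regex)
    for phrase in EXAMPLES[notation]:
        found = compiled.search(phrase) is not None
        print(f"{notation:28} {phrase!r:36} matched={found}")
```

Note that in real regular expressions the character class is written `[ae]`; `[a|e]` would also accept a literal `|`.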