DostKaplan
2019-06-14 22:18 - 2019-06-14 22:19
The asterisk notation used in searches has become less useful, in my opinion. When searching for "bir daha *mak" from Turkish to English, I get results like:

Bir daha orada yaşamak istemiyor.
Onu bir daha hiç yapmak istemiyorum.

If I had wanted an intervening word, I would have searched for "bir daha * *mak"! But that's not what I want. Specifically, I want results, if any, like:

...bir daha almak...
...bir daha yapmak...

https://tatoeba.org/eng/sentenc...rom=tur&to=eng
AlanF_US
2019-06-15 12:43 - 2019-06-15 13:03
You can achieve what you want by using the proximity search operator, ~. The following search:

"bir daha *mak"~1

will search for the phrase "bir daha" separated from a word ending with "mak" by up to, but not including, 1 word. (In other words, there must be 0 words separating them.) See this link:

https://docs.manticoresearch.co...ry_syntax.html

I also updated our wiki page:

https://en.wiki.tatoeba.org/art...ow/text-search
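
To make the proximity rule above concrete, here is a rough Python sketch of how a query like "bir daha *mak"~1 could be evaluated. It assumes the span-based reading of the operator (all k keywords must fall inside a window of fewer than k + N words, in any order). The function name proximity_match, the regex tokenizer, and the use of fnmatch for the asterisk are illustrative assumptions, not how Manticore actually tokenizes or matches.

```python
import re
from fnmatch import fnmatchcase


def proximity_match(sentence: str, query: str, distance: int) -> bool:
    """Approximate a proximity search: "query"~distance (hypothetical helper).

    Assumption: every keyword must occur inside one window that is
    shorter than len(keywords) + distance words. Word order is ignored.
    """
    words = re.findall(r"\w+", sentence.lower())   # naive tokenizer, drops punctuation
    patterns = query.lower().split()               # "*mak" is kept as a wildcard pattern
    max_span = len(patterns) + distance - 1        # "fewer than k + N words"
    for start in range(len(words)):
        window = words[start:start + max_span]
        # Each keyword pattern must be satisfied by some word in the window.
        if all(any(fnmatchcase(w, p) for w in window) for p in patterns):
            return True
    return False


print(proximity_match("Bir daha orada yaşamak istemiyor.", "bir daha *mak", 1))
print(proximity_match("Onu bir daha yapmak istemiyorum.", "bir daha *mak", 1))
```

With a distance of 1 the window is exactly three words wide, so the first sentence (with "orada" in between) is rejected and the second is accepted. Raising the distance to 2 widens the window to four words and lets the first sentence through.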
DostKaplan
2019-06-15 15:34
Wow. I guess this is a new feature. Thanks.
AlanF_US
2019-06-15 18:08
You're welcome. You could also do this search:

bir NEAR/1 daha NEAR/1 *mak

I'm not sure it makes sense to describe the NEAR operator on the wiki page, but you can find a description here:

https://docs.manticoresearch.co...ry_syntax.html
DostKaplan
2019-06-15 19:11
It's pretty unintuitive, no? A more regexp-like notation would have been better. Like...

"bir daha *mak" (no intervening words)
Matches: "bir daha yapmak"

"bir* daha *mak" (no intervening words)
Matches: "biraz daha çalışmak"

"bir daha *{2} *mak" (up to two intervening words)
Matches:
"bir daha yapmak"
"bir daha araba kullanmak"
"bir daha yeni araba kullanmak"

"bir daha *{1} *m[a|e]k" (up to one intervening word and matching *mak or *mek)
Matches:
"bir daha yapmak"
"bir daha söylemek"
"bir daha araba kullanmak"
"bir daha araba sürmek"
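
For comparison, the notation proposed above maps fairly directly onto ordinary regular expressions. The Python sketch below is only an illustration of that idea, run against plain strings; it is not something the Tatoeba search accepts. The helper name gap_pattern and its parameters are made up for this example, and the suffix alternation is written [ae] because in standard regex syntax [a|e] would also match a literal "|".

```python
import re


def gap_pattern(lead: str, max_gap: int, ending: str) -> re.Pattern:
    """Build a regex for: lead phrase, up to max_gap extra words, then a word with the given ending.

    Hypothetical helper: `lead` is taken literally, `ending` is a regex
    fragment such as "mak" or "m[ae]k".
    """
    return re.compile(
        rf"\b{re.escape(lead)}(?: \w+){{0,{max_gap}}} \w*{ending}\b",
        re.IGNORECASE,
    )


strict = gap_pattern("bir daha", 0, "mak")      # like "bir daha *mak"
loose = gap_pattern("bir daha", 1, "m[ae]k")    # like "bir daha *{1} *m[a|e]k"

for text in [
    "bir daha yapmak",
    "bir daha araba sürmek",
    "Onu bir daha hiç yapmak istemiyorum.",
    "bir daha yeni araba kullanmak",
]:
    print(text, bool(strict.search(text)), bool(loose.search(text)))
```

Unlike the proximity sketch of the search engine's behaviour, this pattern fixes the word order and only tolerates single spaces between words, which is one reason a real search engine may prefer its own operators.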
AlanF_US
2019-06-16 14:12
That syntax would work fine for me, since I'm familiar with regular expressions, but perhaps the Sphinx/Manticore team decided it would be difficult for people who weren't. Or perhaps they had other constraints.