I know question marks are not allowed in a search. It will always return no results. So why not programmatically strip it and other offending characters off the input search string behind the scenes before hitting the database? It is very natural for someone to search for something like "Do you know why?" with the trailing question mark. Just strip it off in the code.
Question marks are allowed in a search. They just have a special function. As our help link says:
:::
The question mark (?) as part of a word is a one-letter wildcard.
The following will find sentences with either "whenever" or "wherever".
whe?ever
The following will find sentences with 6-letter words that have 2 letters, then "eve", and then one more letter, such as "clever", "eleven", "peeves", "uneven", ...
??eve?
:::
As you can see, stripping the question mark would prevent people from using this functionality. The same is true for other punctuation marks.
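To make the one-letter wildcard concrete, here is a minimal Python sketch of how a pattern like `whe?ever` could be matched against the words of a sentence. This is only an illustration, not the site's actual search engine; the function names `wildcard_to_regex` and `matching_words` are made up for this example.

```python
import re

def wildcard_to_regex(word):
    """Translate a search word in which '?' stands for exactly one character.

    Assumption: r"\w" is used as a stand-in for "one letter"; it also
    matches digits and underscores, so this is only an approximation.
    """
    parts = [r"\w" if ch == "?" else re.escape(ch) for ch in word]
    # \b on both sides so the pattern has to cover a whole word.
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)

def matching_words(pattern, sentence):
    """Return the words in `sentence` that match the wildcard pattern."""
    return wildcard_to_regex(pattern).findall(sentence)

print(matching_words("whe?ever", "Come whenever or wherever you like."))
print(matching_words("??eve?", "A clever player scored eleven uneven points."))
```

Each `?` consumes exactly one character, so `??eve?` can only match 6-letter words.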
[not needed anymore- removed by CK]
In that case, I would do this: if there are results, we're done; otherwise, strip the offending "special" characters, perhaps iteratively one at a time, until some results are returned, if possible. An accompanying message might also be displayed, such as: "Your search request yields no results, so 'Do you know why' was used instead."
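The fallback described above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: `search` is a hypothetical callable that takes a query string and returns a list of results, and `SPECIAL_CHARS` is a guessed set of characters, not the site's real list.

```python
# Assumed set of "special" characters; the real list would depend on the search engine.
SPECIAL_CHARS = "?!.,;:\"'"

def search_with_fallback(query, search):
    """Try the query as typed, then drop special characters one at a time.

    `search` is a hypothetical function: query string -> list of results.
    Returns (query_used, results, message); message is None when no
    fallback was needed or when nothing could be found at all.
    """
    results = search(query)
    if results:
        return query, results, None

    candidate = query
    # Walk from the end so that a trailing "?" is the first thing removed.
    # Only characters to the right of i have been removed so far, so index i
    # still points at the same character in `candidate` as in `query`.
    for i in range(len(query) - 1, -1, -1):
        if query[i] in SPECIAL_CHARS:
            candidate = candidate[:i] + candidate[i + 1:]
            results = search(candidate)
            if results:
                message = (
                    "Your search request yields no results, "
                    f"so '{candidate}' was used instead."
                )
                return candidate, results, message

    return query, [], None
```

Because the original query is always tried first, a search such as `whe?ever` that already returns results would keep its wildcard behaviour.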
> since people often search for sentences to make sure they don't exist before contributing new ones
I know you’re always doing this, but I don’t think it is true for the majority of people using Tatoeba (including people without an account).
DostKaplan has a point though. I think people who type a question mark in the search box more often mean the literal character than a wildcard. However, it's not possible to search for any punctuation character in the first place (for instance, you can't search for commas). I personally disagree with the previous suggestions (I find them too intrusive). It's not an easy problem, but maybe we could at least provide a more verbose message than “No results found for: <query>”, such as some hints about common pitfalls that prevent results from being returned: check selected language, wildcards, diacritics…
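As a rough sketch of what such hints could look like, here is a small Python function that picks hints based on the query. The function name `no_results_hints`, the `language` parameter, and the wording of the hints are all hypothetical.

```python
import unicodedata

def no_results_hints(query, language=None):
    """Return a list of hints to show under "No results found for: <query>"."""
    hints = []
    if language is not None:
        hints.append(
            f"Check that the selected language ({language}) is the one you intended."
        )
    if "?" in query:
        hints.append(
            "A question mark that is part of a word is a one-letter wildcard; "
            "try the search without it."
        )
    # Decompose accented letters (NFD) and look for combining marks.
    decomposed = unicodedata.normalize("NFD", query)
    if any(unicodedata.combining(ch) for ch in decomposed):
        hints.append("Check the diacritics in your query.")
    return hints

for hint in no_results_hints("café?", language="French"):
    print("-", hint)
```

This does not change the query at all, so it stays on the non-intrusive side: the user only gets extra information next to the "no results" message.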
Below the message that says "No results found for: XXX", we could display a link to the wiki help page about searching:
http://en.wiki.tatoeba.org/arti...w/text-search#
That page already says that punctuation should be omitted when searching.
[not needed anymore- removed by CK]
Ehm, is there no way to escape "?"?
[not needed anymore- removed by CK]
You can also consider creating a Horus-like script for this purpose: a script that regularly scans the database for questions, exclamations, and dialogues, and then tags them accordingly. Since the advanced search section includes tags, limiting search results to those types of sentences without using punctuation marks would be possible.
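A very simplified Python sketch of the tagging step might look like this. The tag names and the function `suggest_tags` are made up for illustration, the rule only looks at the final punctuation mark, and dialogue detection is left out because I don't know of a reliable rule for it.

```python
# Assumed closing quote characters to ignore at the end of a sentence.
CLOSING_QUOTES = "\"'”’»"

def suggest_tags(sentence):
    """Suggest hypothetical tags based on the sentence's final punctuation."""
    text = sentence.strip().rstrip(CLOSING_QUOTES)
    tags = []
    # Includes the full-width marks used in some languages.
    if text.endswith(("?", "？")):
        tags.append("question")
    if text.endswith(("!", "！")):
        tags.append("exclamation")
    return tags

for s in ["Do you know why?", "Watch out!", "I know."]:
    print(s, "->", suggest_tags(s))
```

A scheduled script could run a function like this over new sentences and add the tags, so that the advanced search could then filter by tag instead of by punctuation.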