EDIT: temporarily disabled to test editable transcriptions instead.
I’m adding additional criteria to the search feature. You can test this ongoing work on https://dev.tatoeba.org/
Perform a regular search, and then you’ll see additional criteria on the right: sentence owner and orphan sentences for the moment. I made orphan sentences hidden by default. This way, they are hidden from top bar searches, but can be displayed by checking the additional criterion, lowering their visibility to newcomers.
What do you think?
I found an issue with accents. First query: https://dev.tatoeba.org/ita/sen...rom=ita&to=und
It becomes this query when I search for my sentences corresponding to that query: https://dev.tatoeba.org/ita/sen...ser=Guybrush88
As you can see, no results are shown because the accent is changed by the query, whereas the sentences I own are shown when I don't specify my username.
Problem solved, thank you.
thanks for the fix, gillux. everything seems to be perfectly working for me now
By the way, is there a way to bring the native speaker factor into the search, e.g. arrange for 'sentences in language X by native speakers' and, conversely, 'non-native speakers / undefined'?
Yes. That’s a good idea, I’ll definitely add this criterion. Though I’m not sure about how to organize the form since we’d have 3 exclusive filters for users: unowned, owned by a given user, owned by a native. It’s already a bit confusing because one can check “Show orphan sentences” while specifying a username (in which case the checkbox is ignored). Adding a third exclusive filter will make things worse.
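One way to picture the three exclusive filters is as a single-choice value rather than a checkbox plus a text field, so no combination ever has to be silently ignored. The following is only a rough sketch under that assumption; `OwnerFilter`, `Sentence`, `owner_is_native` and `matches` are hypothetical names for illustration, not Tatoeba's actual code or database schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OwnerFilter(Enum):
    """Hypothetical single-choice owner filter: exactly one applies at a time."""
    ORPHANS = "orphans"  # sentences with no owner
    USER = "user"        # sentences owned by one given user
    NATIVE = "native"    # sentences owned by a self-proclaimed native speaker


@dataclass
class Sentence:
    # Hypothetical, simplified record; not the real schema.
    text: str
    owner: Optional[str] = None    # None means the sentence is an orphan
    owner_is_native: bool = False  # self-proclaimed, as stored in the profile


def matches(sentence: Sentence, owner_filter: OwnerFilter,
            username: Optional[str] = None) -> bool:
    """Return True if the sentence passes the selected owner filter."""
    if owner_filter is OwnerFilter.ORPHANS:
        return sentence.owner is None
    if owner_filter is OwnerFilter.USER:
        if not username:
            raise ValueError("the USER filter needs a username")
        return sentence.owner == username
    if owner_filter is OwnerFilter.NATIVE:
        return sentence.owner is not None and sentence.owner_is_native
    raise ValueError(f"unknown filter: {owner_filter}")
```

With a single selector like this, "orphans" and "owned by a given user" cannot both be chosen, which is the confusing case described above.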
[not needed anymore- removed by CK]
I like ck's ideas #1 and #3, and I don't mind #2, either.
an automatic 'native speakers' filter would probably be cool, too, but I also very much agree with sacredceltic's caveat below; you just never know who claims to be native. having an individual list as in ck's suggestion #1 would be a good way to cope with this problem.
I don't think, however, that hiding orphans should be the default in the way that you have to check "show orphans" every single time you submit a search query. I think this would lead to a decrease in orphans being adopted and amended. let's rather have it so that you can check "show orphans" and it stays like that until you manually uncheck it again.
it's great seeing this site improving constantly!
I see your points about native speakers. However, I don’t think this problem should be solved by changing the search criterion, but rather by changing the way we identify native speakers in the first place. The search criterion could only be “limit to sentences by self-proclaimed natives” because that’s the only information we have in our database so far.
I don’t really like the idea of providing a comma-separated list instead of filtering by self-proclaimed natives. First, because it becomes impractical to use as the list grows. Second, because it restricts the ability to filter by native speakers to a handful of long-time contributors who have their own ideas on the matter. I’m worried about newcomers (who obviously won’t express themselves in this thread) being unable to use the search as efficiently as you guys would. That would be unfair. The current lack of native speaker identification and of a proper review mechanism to sort out “bad” sentences should be solved first, rather than worked around by that kind of “feature”. I can already see members providing ready-to-use search links in their profiles that filter users from their list. That said, filtering by multiple users itself (regardless of the motivation) seems legit, and is easy to implement.
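To give an idea of what filtering by multiple users could look like, here is a minimal sketch. It assumes sentences are plain dictionaries with an `owner` key and that usernames are compared exactly; both are assumptions made for illustration, and `parse_usernames` and `filter_by_owners` are hypothetical helpers, not part of Tatoeba.

```python
from typing import Dict, List, Optional, Set


def parse_usernames(raw: str) -> Set[str]:
    """Split a comma-separated list of usernames, ignoring blanks and spaces."""
    return {name.strip() for name in raw.split(",") if name.strip()}


def filter_by_owners(sentences: List[Dict[str, Optional[str]]],
                     raw_usernames: str) -> List[Dict[str, Optional[str]]]:
    """Keep only the sentences owned by one of the listed users."""
    wanted = parse_usernames(raw_usernames)
    # Orphans have owner None, which is never in the set, so they are dropped.
    return [s for s in sentences if s["owner"] in wanted]
```

An empty list of usernames matches nothing in this sketch; a real form would more likely treat it as "no owner filter".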
I agree with what you said about orphan visibility. I initially wanted to limit the visibility of orphans because they are a major problem in some languages like Japanese, where more than half of the corpus consists of orphans that are mostly wrong. But that’s another problem.
[not needed anymore- removed by CK]
I like this idea, too, but I'd hate typing lots of usernames each time because I'm sure I'd use the same sets of usernames many times. It would be nice if we could make lists of usernames that we can use anytime for search. We could also provide some default lists of self-proclaimed native speakers of each language.
> 2. People could use all the native speakers listed on http://bit.ly/nativespeakers rather than just the few that are listed using the new system on tatoeba.org. We have a lot of sentences written by native speakers that are never likely to come back and change the setting in their profiles.
How about incorporating the information on this page into the official system? Would anyone object to it?
+1 to all CK's suggestions.
> Would it be possible to allow us to also limit searches to only sentences with audio?
Yes. It won’t be testable on dev.tatoeba.org until the next update though.
"Native speakers", by Tatoeba's definition, are anybody who self-proclaims to be such: Russians claiming to be French or Turks claiming to be British, just for the challenge... Teenagers have such oversized egos, and Tatoeba often ends up being their egos' grave, which makes them so much more aggressive and bitter as a result...
would it also be possible to search for given words/expressions that are not translated in a given language? for example: I want to search for "once in a blue moon" (or any other expression in any other language) and I want to see all the sentences containing that expression that are not translated in Italian (or any other language). I would also find it useful if i could see all the sentences with a given expression/word that are translated in a given language. for example: i search for "apple pie" and i want to see only the sentences containing "apple pie" that have translations in Italian
+1. I'd also like to have "Show translations in", "Not directly translated into" and "Not translated into" sorting options.
> would it also be possible to search for given words/expressions that are not translated in a given language?
Yes. I’ll implement this.
> I would also find it useful if i could see all the sentences with a given expression/word that are translated in a given language. for example: i search for "apple pie" and i want to see only the sentences containing "apple pie" that have translations in Italian
You mean https://tatoeba.org/sentences/s...rom=eng&to=ita ?
> You mean https://tatoeba.org/sentences/s...eng&to=ita ?
yes, actually
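The two requests above (sentences containing an expression that do, or do not, have a translation in a given language) can be sketched as one filter with a flag. This is only an illustration: the dictionary layout, the `translation_langs` key and `search_by_translation` are hypothetical, and it only looks at the set of languages a sentence is translated into, without distinguishing direct from indirect translations.

```python
from typing import Any, Dict, List


def search_by_translation(sentences: List[Dict[str, Any]], expression: str,
                          lang: str, translated: bool) -> List[Dict[str, Any]]:
    """Return sentences containing `expression`.

    With translated=True, keep only those that have a translation in `lang`;
    with translated=False, keep only those that do not.
    """
    needle = expression.lower()
    results = []
    for sentence in sentences:
        if needle not in sentence["text"].lower():
            continue  # the expression itself has to appear in the sentence
        has_translation = lang in sentence["translation_langs"]
        if has_translation == translated:
            results.append(sentence)
    return results
```

For example, calling it with `"apple pie"`, `"ita"` and `translated=False` would list the "apple pie" sentences still missing an Italian translation.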
I find it pretty difficult to remember the syntax we need to use when we want to search for exact phrases, sentences beginning with a certain word etc. I basically have to go to the wiki article every time to check which characters mean what in the search (http://en.wiki.tatoeba.org/arti...text-search#).
Many online dictionaries I use have a drop-down list where you can choose what kind of search you want to make. For example, this Japanese dictionary http://dictionary.goo.ne.jp/ has the options "begins with", "exact match" and "ends with", and you can narrow your search with those.
I would also like to see something like that in Tatoeba. Next to the search field there would be another drop-down list with options to choose from, e.g.
- vague matches (e.g. "live in boston" or "live") <-- this would be the default. I'm assuming the quotation marks don't do anything if you are searching with only one word, i.e. the search "live" returns the same results as plain live, right?
- exact matches (e.g. "=live =in =boston" or "=live") (though this wouldn't work when searching phrases in languages without spaces, I guess)
- begins with (e.g. "^live in boston" or "^live")
- ends with (e.g. "live in boston$" or "live$")
+ maybe something else, like "begins and ends with" (e.g. "^live in boston$" or "^live$")
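The drop-down idea above boils down to translating a chosen mode into the operators listed in the examples. Here is a minimal sketch of that mapping. The mode names and `build_query` are made up for illustration, and the operators (`=`, `^`, `$`) are taken only from the examples in this post, so the real search syntax may behave differently.

```python
def build_query(text: str, mode: str = "vague") -> str:
    """Turn plain search text plus a drop-down mode into a query string."""
    words = text.split()
    if not words:
        return ""
    phrase = " ".join(words)
    if mode == "vague":
        return phrase  # default: no operators added
    if mode == "exact":
        # Prefix every word with "=", as in "=live =in =boston".
        return " ".join("=" + word for word in words)
    if mode == "begins":
        return "^" + phrase
    if mode == "ends":
        return phrase + "$"
    if mode == "begins_and_ends":
        return "^" + phrase + "$"
    raise ValueError(f"unknown search mode: {mode}")
```

Because the exact mode splits on whitespace, a phrase in a language written without spaces ends up as a single "=" term, which is the limitation mentioned in the list.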
+1, i would find it better to have the opportunity of making exact searches instead of using "=word" each time i want to see the exact occurrences of something
These criteria seem to limit only the sentences of the "from" language, but we're sometimes rather interested in the "to" language. For example, when I want to know how to say something in French and type a Japanese phrase, I don't mind seeing orphan Japanese sentences but I don't want orphan French sentences. I wonder how we could work this out.
That’s a very relevant point. I’d like to be able to perform such searches too. Either that, or I’d like to be able to distinguish orphans from non-orphans directly within a list of translations. I’ll keep that in mind.