TRANG, August 8, 2014 at 10:11:14 PM UTC

** Tatoeba update (August 8, 2014) **

http://blog.tatoeba.org/2014/08...st-8-2014.html


# Better language selector

We know that the list of languages has gotten pretty long, which can make selecting a language impractical, so we're introducing a better language selector. The new selector has a search field that filters the list to show only the languages matching the characters you have entered.

This feature is only available for registered members at the moment. It requires you to activate it from your settings[1] (Options > Advanced language selector). We did not want to make it globally available (not yet at least) since we know that it may not work on tablets. If you have a tablet, please let us know if/how it works for you.

# Other bug fixes

* We fixed a bug where the logs did not record the user who added a sentence/translation.
* We fixed the pagination for the "contributions"[2] page when a language is specified.


We hope that you've been enjoying hanging out on Tatoeba now that it's hosted on a new server :) We have received 3 donations since the migration[3], so I'd like to thank (again) William, Gary and Shayne for their donations.

-----

[1] http://tatoeba.org/user/settings
[2] http://tatoeba.org/contributions/index
[3] http://blog.tatoeba.org/2014/08...st-2-2014.html

Silja, August 8, 2014 at 11:57:59 PM UTC, edited August 9, 2014 at 12:01:35 AM UTC

Uh huh, I wrote a long message but it just disappeared after I pressed Send...

Anyway, here's a somewhat shorter version: the advanced language search works fine on iPad. This is a great change! ☺

Is it possible to sort the results differently? Now if I want to find, for example, "Japanese" and I type in "ja", I get several results that are alphabetically before "ja" but just contain "ja" somewhere in the middle (like Chinyanja, Gujarati). I don't think anyone who wants to find Chinyanja types in "ja", but rather "chsomething". So, is it possible to get first in the filtered list the ones that actually begin with the search string? This is not a big issue, because I can always type in "jap" and get only "Japanese" as a result, but there might be language names (in some languages) that require you to type in almost the whole name of the language.

By the way, when I activated the advanced language selector in settings, I got only a blank page after saving. I guess I should be redirected to my profile or something? (This happened on both laptop and iPad.)

Edit. Oh, and please add the new texts in settings to Launchpad so that we can translate them into other languages.

TRANG, August 10, 2014 at 11:11:53 PM UTC

> Is it possible to sort the results differently?

It should be possible, but I personally find it better that it returns results with languages containing (if I take your example) "ja" rather than beginning with "ja". You have cases like Chinese or Arabic where the user may want to type "chi" to find "Literary Chinese", or "ara" to find "Egyptian Arabic".

> Edit. Oh, and please add the new texts in settings to Launchpad so that we can translate them into other languages.

Yes, we will :)

Silja, August 11, 2014 at 7:24:02 PM UTC

I totally agree, but I still think it would be better to list first the results that begin with the search string, and only then the other ones.
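
To make the idea concrete, here is a minimal sketch of that ordering in Python. It is not how Chosen works internally and it does not use Chosen's API; the function name `filter_languages` and the sample language list are made up for illustration. It keeps every name that contains the search string, but puts the names that begin with it first.

```python
def filter_languages(names, query):
    """Return names containing `query`, with prefix matches listed first.

    Hypothetical helper for illustration only; not part of Chosen or Tatoeba.
    """
    q = query.casefold()
    # Keep every name that contains the query anywhere (the current behaviour).
    matches = [name for name in names if q in name.casefold()]
    # Sort key: False (prefix match) sorts before True (match in the middle),
    # then alphabetically within each group.
    return sorted(
        matches,
        key=lambda name: (not name.casefold().startswith(q), name.casefold()),
    )


if __name__ == "__main__":
    languages = ["Chinyanja", "English", "Gujarati", "Japanese", "Javanese"]
    print(filter_languages(languages, "ja"))
```

With this ordering, typing "ja" would still show Chinyanja, and typing "chi" would still show "Literary Chinese", only lower in the list.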

TRANG, August 11, 2014 at 10:21:41 PM UTC

Ah right, sorry, I misread what you were saying about the sort. Well, I'm not sure how easy it would be to customize the search results as you described, but this is the plugin we used to implement the feature: http://harvesthq.github.io/chosen/, in case anyone has time to look into it.

The translations on Launchpad are updated now.

tommy_san, November 13, 2014 at 2:35:44 PM UTC

+1

Old Prussian gets in the way whenever I want Russian.

tommy_san, January 19, 2015 at 1:04:10 PM UTC, edited January 19, 2015 at 1:11:29 PM UTC

I noticed that we can solve this problem by NOT using the advanced language selector.

For example, type the words, press Tab, type "fr", press Tab, type "ru", press Tab, and press Enter. This is actually much more comfortable.

Nero, January 19, 2015 at 8:10:43 PM UTC

This is definitely the superior option.

gillux, August 12, 2014 at 6:16:48 PM UTC, edited August 12, 2014 at 6:21:27 PM UTC

> Is it possible to sort the results differently? Now if I want to find, for example, "Japanese" and I type in "ja", I get several results that are alphabetically before "ja" but just contain "ja" somewhere in the middle (like Chinyanja, Gujarati). I don't think anyone who wants to find Chinyanja types in "ja", but rather "chsomething". So, is it possible to get first in the filtered list the ones that actually begin with the search string?

Edit: so just like Trang, I misread your comment, sorry. So yes, I think it’s possible, with a bit of hacking into Chosen.

sabretou, November 15, 2014 at 7:12:36 AM UTC

I've had a couple ideas for the language selector for a while.

Most users on Tatoeba use only a few languages, perhaps 3-5 at most. Perhaps these languages can be selected in the settings, and then only these languages will be displayed in the selector, along with a "More languages..." option that expands into the entire list. I am already using a Greasemonkey script that does something similar: it pushes certain selected languages to the top of the list, making it very convenient to switch between a few languages.

Even more conveniently, if there are up to 5 selected languages, they can be represented as separate buttons ("Submit in English", "Submit in Hindi", "Submit in German", etc.) and the user can directly click on the button. This would be way, way more convenient for people who add sentences in pairs of languages, as right now they often have to deal with incorrect flags (that may sometimes be neglected and left uncorrected for months).
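
As a rough sketch of what I have in mind (in Python, just to show the logic; the function name `build_selector`, the button limit parameter and the sample data are all hypothetical, and this is neither my Greasemonkey script nor Tatoeba code):

```python
def build_selector(all_languages, preferred, button_limit=5):
    """Split the language list into a short preferred part and the rest.

    Returns (top, rest, buttons):
      top     - preferred languages, in the user's own order
      rest    - everything else, shown behind a "More languages..." entry
      buttons - "Submit in X" labels, only if there are few enough languages
    """
    available = set(all_languages)
    # Ignore preferred entries that are not in the full list.
    top = [lang for lang in preferred if lang in available]
    chosen = set(top)
    rest = [lang for lang in all_languages if lang not in chosen]
    # Buttons only make sense for a handful of languages.
    buttons = [f"Submit in {lang}" for lang in top] if len(top) <= button_limit else []
    return top, rest, buttons


if __name__ == "__main__":
    everything = ["English", "French", "German", "Hindi", "Japanese", "Marathi"]
    top, rest, buttons = build_selector(everything, ["Marathi", "Hindi", "English"])
    print(top + ["More languages..."])
    print(buttons)
```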

TRANG, November 15, 2014 at 10:13:27 AM UTC

I've created an issue for this: https://github.com/Tatoeba/tatoeba2/issues/497.

sacredceltic, November 15, 2014 at 12:22:50 PM UTC

>Perhaps these languages can be selected in the settings

They already are.

gillux, November 15, 2014 at 9:15:19 PM UTC

>> Perhaps these languages can be selected in the settings
> They already are.

This option hides translations in languages other than those specified, while sabretou was talking about the language selector. It actually makes much more sense to me to restrict or reorder the language selector. I personally find the current preferred languages option unusable because it causes more trouble than anything else. I often need to correlate comments with logs to understand how the local graph of a given translation group has evolved, even if it includes languages I’m not interested in. I think the problem of sentences with many, many translations, like #1, could be addressed in a much better way, for instance by showing the preferred languages first and, if the list is really long, hiding the end but allowing it to be displayed by clicking on a button that says “see the x other translations”. I’d like to hear people’s opinions about this.
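
Roughly, the ordering I am suggesting could look like the following Python sketch. The function name `split_translations`, the `visible_limit` parameter and the sample data are assumptions for illustration, not existing Tatoeba code.

```python
def split_translations(translations, preferred, visible_limit=10):
    """Order translations with preferred languages first and fold the tail.

    `translations` is a list of (language_code, text) pairs.
    Returns (visible, hidden, label), where `label` is the button text,
    or an empty string when nothing is hidden.
    """
    rank = {code: position for position, code in enumerate(preferred)}
    # Preferred languages keep the user's order; all others share the same
    # rank and, because sorted() is stable, keep their original order.
    ordered = sorted(translations, key=lambda pair: rank.get(pair[0], len(rank)))
    visible = ordered[:visible_limit]
    hidden = ordered[visible_limit:]
    label = f"see the {len(hidden)} other translations" if hidden else ""
    return visible, hidden, label


if __name__ == "__main__":
    sample = [("deu", "Hallo"), ("fra", "Salut"), ("jpn", "こんにちは"),
              ("eng", "Hi"), ("spa", "Hola")]
    visible, hidden, label = split_translations(sample, ["eng", "jpn"], visible_limit=3)
    print(visible)
    print(label)
```

Nothing is filtered out here: the other languages are still one click away, which is what I need when reading logs.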

sacredceltic, November 15, 2014 at 9:46:59 PM UTC

To me, the way languages are filtered in the settings should be the basis for filtering the selector. Why use 2 different settings?
I don't understand the design behind displaying unfiltered translations.
What's the point of seeing translations of sentences in 100+ languages?
Who needs it and what for? Apart from straining the servers!

gillux, November 15, 2014 at 11:57:29 PM UTC

> What's the point of seeing translations of sentences in 100+ languages? Who needs it and what for? Apart from straining the servers!

Sometimes, I just like to absent-mindedly look at a huge list of translations, play some random audio if any, saying to myself “so many translations in so many languages…” while contemplating unintelligible forms of communication, for no reason.

sacredceltic, November 16, 2014 at 12:29:08 AM UTC

Then I now know who's to blame for the service's slowness...

sabretou, November 16, 2014 at 6:18:25 AM UTC

I do this too! :D I'd never turn off seeing all languages either.

@sacredceltic: Yes, the language selector at present is what I was thinking of as well. It can be combined with what I'm suggesting. Speaking of which, that selector could be made more user friendly as well, with drop downs and stuff.

Also, it doesn't work for me at all. I'm putting combinations like "mar,hin,jpn" and it does nothing, I still see all languages, everywhere.

sacredceltic, November 16, 2014 at 12:58:47 PM UTC

It has been working for me for years and still does.
Make sure your ISO codes are all correct and match the language codes on Tatoeba. No spaces between codes, just commas. Try deleting your cache and reloading.
It doesn't filter the languages of the random sentence on the front page, but it does filter all the others.

It's a pity that so many people don't use these settings. They waste precious time, clutter their screen space and STRAIN THE SERVERS!
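
For what it's worth, here is a small Python sketch of the kind of check I mean. It is only an illustration of validating a value like "mar,hin,jpn"; the function name `parse_language_setting` and the set of known codes are hypothetical, and unlike the real setting this sketch tolerates stray spaces and capitals.

```python
def parse_language_setting(raw, known_codes):
    """Split a comma-separated list of language codes and flag unknown ones.

    Returns (valid, unknown). Hypothetical helper, not Tatoeba code.
    """
    # Normalise each entry: trim spaces, lowercase, drop empty pieces.
    codes = [piece.strip().lower() for piece in raw.split(",") if piece.strip()]
    valid = [code for code in codes if code in known_codes]
    unknown = [code for code in codes if code not in known_codes]
    return valid, unknown


if __name__ == "__main__":
    known = {"mar", "hin", "jpn", "fra", "eng"}
    print(parse_language_setting("mar, hin,JPN,xx", known))
```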

sabretou, November 16, 2014 at 1:49:34 PM UTC

Cache refresh worked, thanks!

wallebot, November 15, 2014 at 12:52:51 PM UTC

+1

I have thought this many times.

gillux, November 15, 2014 at 9:16:49 PM UTC

> Even more conveniently, if there are up to 5 selected languages, they can be represented as separate buttons ("Submit in English", "Submit in Hindi", "Submit in German", etc.) and the user can directly click on the button.

Isn’t the automatic language detection already providing a convenient enough solution for this problem?

sabretou, November 15, 2014 at 9:41:10 PM UTC

Unfortunately, no, more often than not it's a nuisance for me. To test, I just added a sentence in Hindi and it was detected as Marathi. The converse used to happen back when I started adding sentences in Marathi. Even apart from all the times I have to correct the auto-detect, it forces me to constantly be vigilant about what language it picked, or I risk an incorrectly flagged sentence.

sacredceltic, November 16, 2014 at 12:34:29 AM UTC

The algorithm is based on the specificity of the corpus itself, so the more sentences you create, and the more diverse they are in terms of basic groups of letters, the better the recognition will work for the languages you're working on.
Currently, my experience with French is that it is wrongly identified in under 1% of cases.
It used to be much higher...

sabretou, November 16, 2014 at 6:12:43 AM UTC

I'd suspected as much (I was initially under the impression it was handled by Google Translate's API). It goes to show that this system is only really reliable for the major languages: perhaps the top 20 or so. Languages with fewer sentences are sort of left in the lurch.

sacredceltic, November 16, 2014 at 1:06:50 PM UTC

Don't despair. I think this algorithm, designed, I believe, by sysko, is a piece of genius and works far better than Google at identifying languages.
The only thing is, it depends on the breadth of the corpus.
So it can only improve over time.
If you contribute in languages that still have a limited corpus and that share a lot of hashes (sequences of letters within words and sentences), the results are poor. But if you contribute more sentences that differentiate these hashes between your languages, the recognition will improve.
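
To illustrate the general principle (and only the principle: I don't know the details of the actual algorithm, so everything below is an assumption), here is a toy Python detector that counts sequences of three letters per language and scores a new sentence by how strongly each sequence points to one language. The names `ngrams`, `train` and `detect` and the tiny corpus are invented for the example.

```python
from collections import Counter, defaultdict


def ngrams(text, n=3):
    """Return all overlapping sequences of n characters in the text."""
    text = text.casefold()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(corpus):
    """Build one n-gram count profile per language from (lang, sentence) pairs."""
    profiles = defaultdict(Counter)
    for lang, sentence in corpus:
        profiles[lang].update(ngrams(sentence))
    return profiles


def detect(profiles, sentence):
    """Return the language whose profile best explains the sentence, or None."""
    scores = Counter()
    for gram in ngrams(sentence):
        total = sum(profile[gram] for profile in profiles.values())
        if total == 0:
            continue  # sequence never seen in any language
        for lang, profile in profiles.items():
            if profile[gram]:
                # A sequence shared by many languages contributes little to
                # each; a sequence seen in only one language counts fully.
                scores[lang] += profile[gram] / total
    return scores.most_common(1)[0][0] if scores else None


if __name__ == "__main__":
    corpus = [
        ("eng", "the cat is on the table"),
        ("eng", "this is the house"),
        ("fra", "le chat est sur la table"),
        ("fra", "c'est la maison"),
    ]
    profiles = train(corpus)
    print(detect(profiles, "the house is here"))
    print(detect(profiles, "la maison est ici"))
```

In a toy like this, two languages that share most of their letter sequences split the score almost evenly, which is why adding sentences that tell them apart helps.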

gillux, November 17, 2014 at 8:35:28 AM UTC

Which reminds me we should update that. The algorithm is currently based on an outdated version of the corpus, which may explain why it fails for Hindi.

sacredceltic, November 17, 2014 at 11:52:28 AM UTC

Would it be difficult to automate?
Maybe you could update each language based on their quantitative progression, so languages that didn't grow don't need updating...
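
Something along these lines, sketched in Python. The function name `languages_to_update`, the 5% threshold and the sentence counts are made up for the example; I have no idea how the detection data is actually stored or refreshed.

```python
def languages_to_update(previous_counts, current_counts, min_growth=0.05):
    """List languages whose corpus grew enough since the last data refresh.

    Both arguments map a language code to its number of sentences.
    `min_growth` is the relative growth (0.05 = 5%) that triggers an update.
    """
    stale = []
    for lang, now in current_counts.items():
        before = previous_counts.get(lang, 0)
        if before == 0:
            # A language that is new since the last refresh always qualifies.
            if now > 0:
                stale.append(lang)
        elif (now - before) / before >= min_growth:
            stale.append(lang)
    return sorted(stale)


if __name__ == "__main__":
    previous = {"eng": 1000, "hin": 200, "mar": 100}
    current = {"eng": 1010, "hin": 300, "mar": 100, "tlh": 5}
    print(languages_to_update(previous, current))
```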

gillux, November 17, 2014 at 10:20:53 AM UTC

sabretou, I updated the data used by the language detection algorithm. Could you try again and tell us if you feel Hindi is better autodetected now?

sabretou, November 17, 2014 at 1:00:15 PM UTC

Much better now, gillux! I saw about 1 error per 10 sentences, but it's very good now, at least for Hindi and Marathi.