Wall (3810 threads)

gillux
56 minutes ago
We are about to put the advanced search feature on tatoeba.org. By the way, Trang redesigned the search bar: https://dev.tatoeba.org/eng/sentences/search

But first, we’d like UI translators to translate the new parts of the site on Transifex. For those who would like to help but are unfamiliar with the process, please read this page: http://en.wiki.tatoeba.org/arti...ce-translation

Previous thread: https://tatoeba.org/wall/show_message/23234
123xyz
34 minutes ago
Lovely :) I like the advanced search options. I'm not happy that "tatoeba.org" has been removed from the right corner, but that's negligible.
123xyz
20 hours ago
When Azeri sentences are displayed on the homepage in the "latest contributions" list, the letter "ə" is taller than the other letters. It appears to be in a different font, or something such. Either way, it is not supposed to be this way. When one opens Azeri sentences individually though, the display is fine.
gillux
6 hours ago
I can see it too, but can you provide a screenshot just in case? This problem is likely due to font substitution. We tell browsers to use a certain font to display text on the site (which happens to be Trebuchet MS), but this font may not include glyphs used in Azeri (and many other languages), so the browser falls back on a different font just to display "ə". Such a mix of fonts can result in height differences, especially at small sizes, and in style differences. One way to work around this problem is to use webfonts. We talked about that some weeks ago: https://github.com/Tatoeba/tatoeba2/issues/684
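The per-character fallback described above can be modelled in a few lines. This is only a sketch, not browser code: the coverage sets are invented for illustration (whether Trebuchet MS really lacks "ə" would have to be checked against the font file), and `FallbackSans` is a made-up font name.

```python
# Hypothetical glyph coverage; real fonts must be inspected to know this.
FONT_COVERAGE = {
    "Trebuchet MS": set("abcdefghijklmnopqrstuvwxyz "),         # assumed to lack "ə"
    "FallbackSans": set("abcdefghijklmnopqrstuvwxyzəğışöüç "),  # made-up font
}

def pick_fonts(text, font_stack):
    """Return (character, font) pairs, taking for each character the first
    font in the stack that covers it (None if no font does)."""
    result = []
    for ch in text:
        font = next((f for f in font_stack if ch.lower() in FONT_COVERAGE[f]), None)
        result.append((ch, font))
    return result

def fonts_used(text, font_stack):
    """Set of fonts that end up rendering the text."""
    return {font for _, font in pick_fonts(text, font_stack)}
```

With the stack `["Trebuchet MS", "FallbackSans"]`, a word containing "ə" ends up rendered by two fonts, which is where the height mismatch comes from. A webfont that covers every needed glyph would keep the whole word in one font.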
123xyz
5 hours ago
How do I post the screenshot, when there's no "upload image" here? Do I have to upload it online and then post a link?

Anyway, if font substitution is the problem, please find another more "standard" font that includes all glyphs of all languages. I've noticed such problems in WinWord documents myself, but when I select something like Arial or Tahoma, all the glyphs become normal. I haven't tried Trebuchet MS.
gillux
5 hours ago
You can use other sites like imgur.com for example.
123xyz
2 hours ago
http://i.imgur.com/MRrTp52.png

Thank you for the suggestion. Here is a screenshot showing a couple of Azerbaijani sentences with a less than elegant display.
hakkeb
8 hours ago
So many great languages! Can you add Vlaams? In English it is known as Flemish. You have Walloon, which is the French of Belgium, so it would be nice to see Vlaams or Flemish, which is the Dutch of Belgium. Thanks!
PaulP
7 hours ago
Walloon is not the French of Belgium, Hakkeb, but a different language. See e.g. https://en.wikipedia.org/wiki/Walloon_language. It has its own ISO code: wln

Flemish sentences are listed as Dutch. If in some cases a Dutch sentence is typically North Dutch, then the tag Dutch - Netherlands is added (see https://tatoeba.org/epo/tags/sh..._with_tag/5719 )

If it is typically South Dutch and probably not understood in the North, then the tag Dutch - Belgium is added. See https://tatoeba.org/epo/tags/sh..._with_tag/5718



Impersonator
7 hours ago
Do you mean this language: https://en.wikipedia.org/wiki/West_Flemish ?

It has its own ISO code, so it can be added.

Please see this article about how to add your language to Tatoeba: http://en.wiki.tatoeba.org/arti...nguage-request
AlanF_US
4 days ago
** Tatoeba update (June 29th, 2015) **

* Languages in drop-down lists (except the ones for adding a new sentence and adding a translation, which have not yet been changed) are now displayed in two parts: the user-specific languages at the top, and the other languages below. If a user is logged in, the user-specific languages are the ones in his or her profile. Otherwise, they are the ones most recently selected.
* Searching for sentences not translated into a particular language is now faster.
* The "has audio" icon is now displayed correctly for translations in lists.
* Previously, when an attempt was made to auto-detect the language for a sentence, and auto-detection failed, the sentence was not added. This has been fixed.
* Searching for a user's sentences in a given language used to cause an incorrect message to be displayed when no results were found. This has been fixed.
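The two-part drop-down described in the first item can be sketched as follows. This is a minimal illustration, not Tatoeba's actual code: the function name `build_language_options` and its parameters are hypothetical.

```python
def build_language_options(all_languages, profile_languages=None, recent_languages=None):
    """Split a language list into (top, rest).

    profile_languages: languages from the user's profile, or None if the
    user is not logged in, in which case recent_languages is used instead.
    """
    if profile_languages is not None:
        preferred = profile_languages
    else:
        preferred = recent_languages or []

    # Keep the user's order, dropping duplicates and unknown languages.
    seen = set()
    top = []
    for lang in preferred:
        if lang in all_languages and lang not in seen:
            seen.add(lang)
            top.append(lang)

    # Everything else follows, sorted alphabetically.
    rest = sorted(lang for lang in all_languages if lang not in seen)
    return top, rest
```

For example, `build_language_options(["eng", "fra", "jpn", "mkd"], profile_languages=["mkd", "eng"])` puts `mkd` and `eng` at the top and leaves `fra` and `jpn` in the lower part.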
pullnosemans
2 days ago
awesome improvement on the drop-down lists there.
AlanF_US
2 days ago
I agree. We can thank gillux for that.
sabretou
2 days ago
Going to second that, very good change.
herrsilen
4 days ago
Is there a wish list somewhere, where people can put words that they want example sentences for?
123xyz
4 days ago
That's a very good idea.
herrsilen
4 days ago
It'd be a good source of inspiration for new sentences as well. As a contributor I’d gladly add sentences with Swedish words that people want in the corpus.
TRANG
4 days ago
There isn't, but the idea has already been mentioned.
Cf. https://tatoeba.org/eng/wall/sh...#message_21418

If you are interested in having such a feature in Tatoeba, please fill out this form:
https://docs.google.com/forms/d...LbK_8/viewform
herrsilen
4 days ago - edited 3 days ago
Great! I've filled it out!

Go fill it out too, 123xyz, if you're interested. :)
123xyz
3 days ago - edited 3 days ago
I have filled out the form. Since I see that the form is not very new, how many forms do you wish to collect before implementing the feature, i.e. establishing whether it's worth it?

Also, how come this potential feature has had a form created for it, whereas so many new features have been introduced thus far without any forms being involved? Is it because a wishlist is especially difficult to design?
TRANG
3 days ago
Past 20 responses, it will enter my attention zone. But considering that there are other things that I really want to work on, and that I feel are higher priority, I probably won't be working on this feature until it reaches 40 or 50 responses. Perhaps gillux will want to start something about this earlier than me.

You can see a summary of the responses here:
https://docs.google.com/forms/d.../viewanalytics

This feature has a form for it because it was requested a while ago, and it keeps getting suggested. It was however not something I could find time to implement, so I created the form in order to keep track more precisely of the demands, and evaluate at which point it should become a priority.

Other features didn't have forms because they weren't requested as often. It's also very rare that users take the time to write down their ideas as precisely as possible, like Silja did, which is another reason why this feature gets a special treatment.
Impersonator
4 days ago
I can’t see the list of latest contributions in Toki Pona:
https://tatoeba.org/rus/contributions/latest/toki
gillux
4 days ago
I recorded that issue: https://github.com/Tatoeba/tatoeba2/issues/700.
Thank you for reporting!
123xyz
4 days ago - edited 4 days ago
Why doesn't the auto-detect recognise Macedonian? Whenever I try to use it for a Macedonian sentence, it classifies it as Bulgarian. Now, how, pray tell, could a sentence containing the letters љ, њ, ј, ќ, ѓ and/or џ be considered Bulgarian, when those letters are entirely absent from Bulgarian? Isn't the basic step of language recognition looking at its alphabet, before going into the words themselves?

This doesn't really matter to me, as I don't use auto-detect - I just select Macedonian manually after logging in, and subsequently, it stays as my default language until I log out, which I generally don't do for entire weeks. However, the fact that the auto-detect feature is endowed with such an absurd imperfection makes me uneasy.
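The alphabet check asked about here can be sketched as follows. It is only an illustration of the idea, not Tatoeba's auto-detection code, and the function name `guess_by_alphabet` is hypothetical. It uses the letters that one of the two alphabets has and the other lacks.

```python
# Letters found in the Macedonian alphabet but not in the Bulgarian one.
MACEDONIAN_ONLY = set("ѓѕјљњќџ")
# Letters found in the Bulgarian alphabet but not in the Macedonian one.
BULGARIAN_ONLY = set("йщъьюя")

def guess_by_alphabet(sentence):
    """Return "mkd", "bul", or None when the letters alone do not decide."""
    letters = set(sentence.lower())
    has_mk = bool(letters & MACEDONIAN_ONLY)
    has_bg = bool(letters & BULGARIAN_ONLY)
    if has_mk and not has_bg:
        return "mkd"
    if has_bg and not has_mk:
        return "bul"
    return None  # no distinctive letter, or a contradictory mix
```

A check like this only helps when a distinctive letter is present; a sentence such as "Да." contains none, so a detector still needs another signal for those cases.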
Impersonator
4 days ago
I believe auto-detect is learning from the existing sentences, but it’s not doing that in real time: it needs to re-learn to take the new sentences into account. Probably when the auto-detect data was last generated, we had many more Bulgarian sentences than Macedonian ones.

> Isn't the basic step of language
> recognition looking at its alphabet,
> before going into the words themselves?

Not necessarily. There are many algorithms available.
gillux
4 days ago - edited 4 days ago
> Why doesn't the auto-detect recognise Macedonian?

Because the autodetection algorithm is based on the Tatoeba corpus itself, and we still need to update it manually once in a while so that it takes new sentences into account. Three months ago, there were fewer than 200 sentences in Macedonian, which was not enough for the algorithm to work. Now that you have added about 50 000 Macedonian sentences, it will certainly work better once we update it. I’ll let you know when it’s done.
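To illustrate why a corpus-trained detector needs enough sentences and a retraining step, here is a toy sketch. The thread does not say which algorithm Tatoeba uses, so the character-trigram overlap below is an assumption chosen for simplicity, and all names are hypothetical.

```python
from collections import Counter

def ngrams(text, n=3):
    """Character n-grams of the lowercased text, padded with spaces."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def train(corpus):
    """corpus maps a language code to a list of sentences.
    Returns one trigram profile (a Counter) per language."""
    return {
        lang: Counter(g for sentence in sentences for g in ngrams(sentence))
        for lang, sentences in corpus.items()
    }

def detect(sentence, model):
    """Return the language whose profile contains the most of the
    sentence's trigrams."""
    def score(lang):
        return sum(1 for g in ngrams(sentence) if g in model[lang])
    return max(model, key=score)

# A model trained before any Macedonian sentences existed...
old_model = train({"bul": ["Аз съм ученик.", "Той е тук."], "mkd": []})
# ...and one retrained after Macedonian sentences were added.
new_model = train({"bul": ["Аз съм ученик.", "Той е тук."],
                   "mkd": ["Јас сум ученик.", "Тој е тука."]})
```

With `old_model`, `detect("Тој е тука.", old_model)` can only fall back on the closest language it knows; the new sentences have no effect until `train` is run again.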
gillux
4 days ago
It should work better now.
Guybrush88
4 days ago
Actually, it seems that it uses the language that is used the most by the user. I just added #4322143 and #4322166 to test it, and it first recognized those sentences as Italian.
123xyz
4 days ago
But I've never posted anything in Bulgarian, so that's clearly not the only factor.
123xyz
4 days ago
Thank you.
CK
4 days ago
** Stats **

http://goo.gl/QPYHnp = number-of-sentences-by-language-with-no-translations-2015-06-27

http://goo.gl/DexpGV = number-of-sentences-by-language-and-owner-with-no-translations-2015-06-27
pchamorro
5 days ago
For new contributors, it would be a good idea to include a link for this page: http://blog.tatoeba.org/2010/02...n-tatoeba.html
Impersonator
26 days ago - edited 5 days ago
*** Limited language list is confusing for new users ***

Currently, a user can only choose the languages they have chosen in their profile. This means that if they made a mistake when choosing a language in their profile, they will carry this mistake over to all the sentences they add.

Here’s an example of a user who chose Cherokee instead of Chinese in the profile, and then added Chinese sentences tagged as Cherokee:
http://tatoeba.org/eng/user/profile/mng

If they had a full list of languages, they probably would have noticed they’re adding something wrong.

I have no idea how to deal with this. :/
gillux
25 days ago
I think we should restore the complete list of languages and put the profile languages at the top, the same way personal lists appear at the top when you click on the “Add to list” icon of a sentence.
sharptoothed
25 days ago
I like the idea.
123xyz
5 days ago
I see that it's been implemented now.
gillux
5 days ago
No, it has been implemented everywhere but on the “add sentence” and “add translation” forms.
Impersonator
6 days ago - edited 6 days ago
TRANG
6 days ago
I personally don't think the problem is due to the restriction of the list but due to the fact that the rules and the form are not very clear for new users.

The form is missing a label and instructions to explain to new users that if a language is not in the list, they should add it to their profile so that they can select the correct language for the sentences they add.

This wasn't much of a problem because there was the auto-detect option. But now, users who have only 1 language in their profile don't have the auto-detect option.

I can bet that if we had the full list of languages, without the auto-detect option, a lot of sentences would be wrongly added as Abkhaz, or whatever the first language in the list is.
gillux
5 days ago
> I personally don't think the problem is due to the restriction of the list but due to the fact that the rules and the form are not very clear for new users.

I think it’s both. If you look at cindycute’s profile, there is only English, although he or she is a native Chinese speaker. When people are learning a language and are asked to add the languages they are interested in to their profile, it’s normal that they don’t add their native language. Who would know? Even when you’re adding a new language to your profile because the “add sentence” or “add translation” form told you to do so, it’s not clear that we mean your native language rather than the language you want to add sentences in or translate into.

We could solve this problem by having the following process:
• Bring back the full list of languages, with autodetection selected by default.
• Autodetect the language *before* submitting the sentence, so that the user is aware of which language he or she is about to submit in.
• If the selected language isn’t in the user’s profile, display an error message “Sounds like you’re trying to add a sentence in X, but you didn’t add this language to your profile.” And if X was autodetected: “If the sentence was mistakenly detected as X, please select the correct language.”
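The process above can be sketched as follows. This is a minimal illustration under assumptions, not Tatoeba's implementation: `check_submission` is a hypothetical name, `detect` stands for whatever autodetection function is available, and `selected_language=None` is used here to mean that autodetection is selected.

```python
def check_submission(sentence, selected_language, profile_languages, detect):
    """Return (language, message); message is None when the sentence can be
    submitted as is.

    selected_language: a language code, or None for autodetection.
    detect: hypothetical callable mapping a sentence to a language code.
    """
    autodetected = selected_language is None
    # Detect before submitting, so the user sees the language in advance.
    language = detect(sentence) if autodetected else selected_language

    if language in profile_languages:
        return language, None

    message = (f"Sounds like you’re trying to add a sentence in {language}, "
               "but you didn’t add this language to your profile.")
    if autodetected:
        message += (f" If the sentence was mistakenly detected as {language}, "
                    "please select the correct language.")
    return language, message
```

Returning the message instead of raising an error leaves it to the form to decide how to show it, for instance next to the language drop-down.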