Tips

Here you can ask general questions, such as how to use Tatoeba or how to report bugs or strange behavior, or simply socialize with the rest of the community.

Before asking a question, make sure you have read the FAQ.

Wall (3811 threads)

sharptoothed
an hour ago - edited an hour ago
** Suggestion: Limiting the sentences displayed in "Random sentence" section to the languages defined in member's profile

It seems that a lot of new members are having problems understanding how the Tatoeba "Random sentence" feature works. If they see a sentence in a language they understand among the translations of the random sentence, they simply click the "Translate" button without noticing that they are actually translating the sentence at the top of the list, not the one they may want to translate. The fact that all sentences but the one on top disappear after clicking the "Translate" button does not seem to stop many new members from adding their translations. As a consequence, we have a certain number of wrongly linked sentences, and most of these incorrect links won't be found and corrected soon enough.

I propose to limit the sentences that the "Random sentence" feature displays to the languages defined in the member's profile.
odexed
an hour ago
+1
AlanF_US
5 days ago
** Tatoeba update (June 29th, 2015) **

* Languages in drop-down lists (except the ones for adding a new sentence and adding a translation, which have not yet been changed) are now displayed in two parts: the user-specific languages at the top, and the other languages below. If a user is logged in, the user-specific languages are the ones in his or her profile. Otherwise, they are the ones most recently selected.
* Searching for sentences not translated into a particular language is now faster.
* The "has audio" icon is now displayed correctly for translations in lists.
* Previously, when an attempt was made to auto-detect the language for a sentence, and auto-detection failed, the sentence was not added. This has been fixed.
* Searching for a user's sentences in a given language used to cause an incorrect message to be displayed when no results were found. This has been fixed.
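
The two-part drop-down ordering described in the first item could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Tatoeba code: the function names `order_languages` and `preferred_languages`, and the alphabetical sorting within each group, are hypothetical.

```python
def order_languages(all_langs, preferred):
    """Split languages into (top, rest): preferred ones first, then the others.

    Sorting each group alphabetically is an assumption made for this sketch.
    """
    preferred_set = set(preferred)
    top = sorted(lang for lang in all_langs if lang in preferred_set)
    rest = sorted(lang for lang in all_langs if lang not in preferred_set)
    return top, rest


def preferred_languages(profile_langs, recently_selected, logged_in):
    # Logged-in users get their profile languages; everyone else gets
    # the languages they selected most recently.
    return profile_langs if logged_in else recently_selected


all_langs = ["Dutch", "English", "Esperanto", "Finnish", "Macedonian"]
top, rest = order_languages(
    all_langs,
    preferred_languages(["Finnish", "English"], ["Dutch"], logged_in=True),
)
print(top)   # ['English', 'Finnish']
print(rest)  # ['Dutch', 'Esperanto', 'Macedonian']
```

Keeping the "other languages" group complete, rather than hiding it, means every language stays selectable even when it is not in the profile.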
pullnosemans
3 days ago
awesome improvement on the drop-down lists there.
AlanF_US
3 days ago
I agree. We can thank gillux for that.
sabretou
2 days ago
Going to second that, very good change.
sabretou
3 hours ago
Can we have this feature rolled out to all input fields? Specifically Add Sentence and Change Language (for a sentence). In both, I am restricted to the languages in my profile, and this is causing problems.

For Add Sentence, it can lead to situations like this: http://tatoeba.org/eng/sentence...38210#comments

For Change Language, I find that even as a Corpus Maintainer, I cannot fix the flag of some languages because they are not in my profile.

I like the idea of grouping languages according to profile, but I think the current system is restrictive to a fault.
123xyz
yesterday
When Azeri sentences are displayed on the homepage in the "latest contributions" list, the letter "ə" is taller than the other letters. It appears to be in a different font, or something such. Either way, it is not supposed to be this way. When one opens Azeri sentences individually though, the display is fine.
gillux
yesterday
I can see it too, but can you provide a screenshot just in case? This problem is likely due to font substitution. We tell browsers to use a certain font to display text on the site (which happens to be Trebuchet MS), but this font may not include the glyphs used in Azeri (and many other languages), so the browser falls back on a different font just to display "ə". Such a font mix can result in height differences, especially at small sizes, and in style differences. One way to work around this problem is to use webfonts. We talked about that some weeks ago: https://github.com/Tatoeba/tatoeba2/issues/684
123xyz
23 hours ago
How do I post the screenshot, when there's no "upload image" here? Do I have to upload it online and then post a link?

Anyway, if font substitution is the problem, please find another more "standard" font that includes all glyphs of all languages. I've noticed such problems in WinWord documents myself, but when I select something like Arial or Tahoma, all the glyphs become normal. I haven't tried Trebuchet MS.
gillux
23 hours ago
You can use other sites like imgur.com for example.
123xyz
20 hours ago
http://i.imgur.com/MRrTp52.png

Thank you for the suggestion. Here is a screenshot showing a couple of Azerbaijani sentences with a less than elegant display.
sabretou
5 hours ago - edited 5 hours ago
This seems to be a problem with DirectWrite rendering, as I see this problem on Google Chrome and Internet Explorer. I don't use DirectWrite on my Firefox, and it renders just fine: http://i.imgur.com/geoHM96.jpg (I use MacType for font smoothing)

Edit: Yep, re-enabled Direct2D on Firefox and the text changed to the larger ə. If I'm not mistaken, font hinting is at play here. I wonder if there are any settings for DirectWrite's font rendering anywhere.
gillux
18 hours ago - edited 12 hours ago
We are about to put the advanced search feature on tatoeba.org.

EDIT: Trang redesigned the search bar and I added a separate advanced search page: https://dev.tatoeba.org/sentences/advanced_search

But before that, we’d like UI translators to translate the new parts of the site on Transifex. For those who would like to help but are unfamiliar with the process, please read this page: http://en.wiki.tatoeba.org/arti...ce-translation

Previous thread: https://tatoeba.org/wall/show_message/23234
123xyz
18 hours ago
Lovely :) I like the advanced search options. I'm not happy that "tatoeba.org" has been removed from the right corner, but that's negligible.
Silja
15 hours ago - edited 15 hours ago
I have trouble translating this string: "{action} sentences having translations that match all the following criteria." I need a different case on the word "sentences" depending on which word comes before it, "limit to" or "exclude".

How can I translate this properly?

Edit. I guess I managed to translate it now in a way that makes sense.
gillux
15 hours ago
You could include the word “sentence” in the action, like this:

• Limit to sentences
• Exclude sentences
• {action} having translations that match all the following criteria.

Or you could reformulate the whole sentence to something that avoids the problem. You don’t need to stick to the original as much as when translating the Tatoeba corpora. As long as the search option is understandable.
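
To illustrate the first option, here is a minimal sketch of how the strings would combine once the noun is moved into the action. The catalog layout and the names `ACTIONS`, `TEMPLATE` and `render` are hypothetical and only mimic the `{action}` placeholder; they are not how Transifex or the site stores strings.

```python
# Hypothetical message catalog for one target language. Because the noun
# lives inside each action string, a translator can inflect it per action.
ACTIONS = {
    "limit": "Limit to sentences",
    "exclude": "Exclude sentences",
}
TEMPLATE = "{action} having translations that match all the following criteria."


def render(action_key):
    """Substitute the chosen action into the template."""
    return TEMPLATE.format(action=ACTIONS[action_key])


print(render("limit"))
print(render("exclude"))
```

In a language with grammatical case, each entry in `ACTIONS` would simply carry the correctly inflected form of "sentences".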
Silja
15 hours ago - edited 14 hours ago
Another translation trouble with the first line of this error message:

"Translate a Esperanto sentence into unknown

There is no result for this search (yet) but you can help us by feeding the corpus with new vocabulary!

Feel free to submit a sentence with the words you were searching."

First of all, I think it's somewhat misleading: I chose the option "none" from the "show translations in" drop-down. So the translations are not unknown, I just didn't define the language of the translation. Also, I think the current message sounds strange in English without the word "language", not to mention in Finnish. It would be better if we didn't use Transifex string 454 in this case but rather had a new string with the words *undefined language* in it.
TRANG
14 hours ago
I was a bit confused about how you ended up with this message but I found out:

https://dev.tatoeba.org/eng/sen...amp;sort=words

(I also selected "is orphan" = yes)

We indeed need to review these messages.

In my specific case, I think it doesn't make sense to display a message that encourages the user to add new sentences/translations, since the search was restricted to orphan sentences, and if I didn't restrict to orphan sentences, I would have gotten results.
There are certainly other situations where the message doesn't fit the context.
hakkeb
yesterday
So many great languages! Can you add Vlaams? In English it is known as Flemish. You have Walloon, which is the French of Belgium, so it would be nice to see Vlaams or Flemish, which is the Dutch of Belgium. Thanks!
PaulP
yesterday
Walloon is not the French of Belgium, Hakkeb, but a different language. See e.g. https://en.wikipedia.org/wiki/Walloon_language. It has its own ISO code: wln

Flemish sentences are listed as Dutch. If in some cases a Dutch sentence is typically North Dutch, then the tag Dutch - Netherlands is added (see https://tatoeba.org/epo/tags/sh..._with_tag/5719 )

If it is typically South Dutch and probably not understood in the North, then the tag Dutch - Belgium is added. See https://tatoeba.org/epo/tags/sh..._with_tag/5718



hakkeb
15 hours ago
I see. Thanks.
Impersonator
yesterday
Do you mean this language: https://en.wikipedia.org/wiki/West_Flemish ?

It has its own ISO code, so it can be added.

Please see this article about how to add your language to Tatoeba: http://en.wiki.tatoeba.org/arti...nguage-request
hakkeb
15 hours ago
Yes, this is what I mean. Thank you.
herrsilen
4 days ago
Is there a wish list somewhere, where people can put words that they want example sentences for?
123xyz
4 days ago
That's a very good idea.
herrsilen
4 days ago
It'd be a good source of inspiration for new sentences as well. As a contributor I’d gladly add sentences with Swedish words that people want in the corpus.
TRANG
4 days ago
There isn't, but the idea has already been mentioned.
Cf. https://tatoeba.org/eng/wall/sh...#message_21418

If you are interested in having such a feature in Tatoeba, please fill out this form:
https://docs.google.com/forms/d...LbK_8/viewform
herrsilen
4 days ago - edited 4 days ago
Great! I've filled it out!

Go fill it out too, 123xyz, if you're interested. :)
123xyz
4 days ago - edited 4 days ago
I have filled out the form. Since I see that the form is not very new, how many forms do you wish to collect before implementing the feature, i.e. establishing whether it's worth it?

Also, how come this potential feature has had a form created for it, whereas so many new features have been introduced thus far without any forms being involved? Is it because a wishlist is especially difficult to design?
TRANG
3 days ago
Past 20 responses, it will enter my attention zone. But considering that there are other things that I really want to work on, and that I feel are higher priority, I probably won't be working on this feature until it reaches 40 or 50 responses. Perhaps gillux will want to start something about this earlier than me.

You can see a summary of the responses here:
https://docs.google.com/forms/d.../viewanalytics

This feature has a form for it because it was requested a while ago, and it keeps getting suggested. It was however not something I could find time to implement, so I created the form in order to keep track more precisely of the demands, and evaluate at which point it should become a priority.

Other features didn't have forms because they weren't requested as often. It's also very rare that users take the time to write down their ideas as precisely as possible, like Silja did, which is another reason why this feature gets a special treatment.
Impersonator
4 days ago
I can’t see the list of latest contributions in Toki Pona:
https://tatoeba.org/rus/contributions/latest/toki
gillux
4 days ago
I recorded that issue: https://github.com/Tatoeba/tatoeba2/issues/700.
Thank you for reporting!
123xyz
5 days ago - edited 5 days ago
Why doesn't the auto-detect recognise Macedonian? Whenever I try to use it for a Macedonian sentence, it classifies it as Bulgarian. Now, how, pray tell, could a sentence containing the љ, њ, ј, ќ, ѓ and/or џ be considered Bulgarian, when those letters are fully absent in Bulgarian? Isn't the basic step of language recognition looking at its alphabet, before going into the words themselves?

This doesn't really matter to me, as I don't use auto-detect - I just select Macedonian manually after logging in, and subsequently, it stays as my default language until I log out, which I generally don't do for entire weeks. However, the fact that the auto-detect feature is endowed with such an absurd imperfection makes me uneasy.
Impersonator
5 days ago
I believe auto-detect learns from the existing sentences, but it doesn't do that in real time: it needs to re-learn to take the new sentences into account. Probably when the auto-detect data was last generated, we had many more Bulgarian sentences than Macedonian ones.

> Isn't the basic step of language
> recognition looking at its alphabet,
> before going into the words themselves?

Not necessarily. There are many algorithms available.
gillux
5 days ago - edited 5 days ago
> Why doesn't the auto-detect recognise Macedonian?

Because the autodetection algorithm is based on the Tatoeba corpus itself, and we still need to update it manually once in a while so that it takes new sentences into account. Three months ago, there were fewer than 200 sentences in Macedonian, which was not enough for the algorithm to work. Now that you have added about 50 000 Macedonian sentences, it will certainly work better once we update it. I’ll let you know when it’s done.
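
To give a rough idea of what "based on the corpus" can mean, here is a minimal sketch of a detector trained on example sentences. It is not the algorithm Tatoeba actually uses (that is not described here); it swaps in a simple character-trigram model with add-one smoothing, and the function names and toy corpus are made up for illustration.

```python
from collections import Counter
import math


def trigrams(text):
    """Return the character trigrams of a lowercased, space-padded sentence."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(corpus):
    """Build per-language trigram counts from {language: [sentences]}.

    This is the step that has to be re-run for new sentences to count.
    """
    return {
        lang: Counter(t for sentence in sentences for t in trigrams(sentence))
        for lang, sentences in corpus.items()
    }


def detect(model, sentence):
    """Pick the language whose trigram counts best explain the sentence."""
    best_lang, best_score = None, -math.inf
    for lang, counts in model.items():
        total = sum(counts.values())
        vocab = len(counts) + 1  # +1 leaves room for unseen trigrams
        # Add-one smoothed log-likelihood of the sentence under this language.
        score = sum(
            math.log((counts[t] + 1) / (total + vocab)) for t in trigrams(sentence)
        )
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang


# Toy corpus, far too small for real use.
corpus = {
    "eng": ["the cat is on the table", "this is the house"],
    "fin": ["kissa on pöydällä", "tämä on talo"],
}
model = train(corpus)
print(detect(model, "the house is on the table"))  # eng
print(detect(model, "kissa on talo"))              # fin
```

With only a handful of sentences for a language, the counts cover very few of its trigrams, so the scores are unreliable. And since `model` is a snapshot, sentences added after `train` ran have no effect until it is rebuilt.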
gillux
5 days ago
It should work better now.
Guybrush88
5 days ago
Actually, it seems that it uses the language that is used the most by users. I just added #4322143 and #4322166 to test it, and it first recognized those sentences as Italian.
123xyz
4 days ago
But I've never posted anything in Bulgarian, so that's clearly not the only factor.
123xyz
4 days ago
Thank you.
CK
5 days ago
** Stats **

http://goo.gl/QPYHnp = number-of-sentences-by-language-with-no-translations-2015-06-27

http://goo.gl/DexpGV = number-of-sentences-by-language-and-owner-with-no-translations-2015-06-27
pchamorro
6 days ago
For new contributors, it would be a good idea to include a link for this page: http://blog.tatoeba.org/2010/02...n-tatoeba.html