FeuDRenais's wall messages (401 in total)

FeuDRenais, 17 February 2013 at 17:09:45 UTC

Has anyone been able to set up Sbgobin's script to block certain users' posts? If so, is there a simple way to do it? I can't seem to get it to work...

FeuDRenais, 15 February 2013 at 16:47:59 UTC

Very excellent points.

FeuDRenais, 15 February 2013 at 15:57:59 UTC

0) Not a reply to Demetrius's questions but a general remark - this is another reason why a rating system would be so nice. The reliance on and reliability of @needs native check tags would be greatly diminished.

1) Who can put the tag?

Should be the owner of the sentence, so as to indicate that they are experimenting with their linguistic abilities and would like a confirmation from a native speaker. A third party using it doesn't make much sense - if the third party is a native, then they should just do the check, otherwise, they are not a native and aren't "fit" to judge the sentence any more than the author.

Regardless of whether or not there are ratings in the future, this tag is a neat little thing and should probably be kept as an option. I would argue that it deserves a special button for itself that any user could take advantage of.

2) When should it be used?

(see above)

3) Who can check?

This is tricky. A user may not be native in a language but be as good as a native when it comes to simple beginner sentences like "Hello, how are you?" or "I have two apples." In such a case, it's silly not to accept the user's check and instead wait for a native to come and shine light on such a simple sentence. Here, ratings would probably be a much better way of ensuring quality.

The automatic authority given to a native also bothers me a bit, as a foreign speaker can, with enough hard work, achieve the proficiency of a native without ever becoming one. There are also tricky issues like immigration. For example, SC accuses me of being native Russian and not native English, which is true technically but not linguistically, since my Russian sentences will sometimes have mistakes but my English sentences rarely will (it being my strongest language). Never mind that it's impossible to prove, online, just what you are a native of and what your qualifications are (again, I'm sneakily making a case for a rating system :-)

4) Difference between check and native check?

IMO, "check" is more general and can refer to any number of things (e.g. perhaps the user really meant to type "gimme" instead of "give me" because they wanted to demonstrate a colloquial slang variant, or maybe they just made a mistake...) It doesn't have to do with the uncertainty of a non-native speaker. In many cases, it would probably refer to a need of input from the owner, so as to have the owner themselves confirm that there's no mistake in their sentence.

FeuDRenais, 11 January 2013 at 18:46:52 UTC

Probably starting to bore people with these posts (let's just continue in PMs if you want to, as this is starting to go into details... but what the heck, I'll post one last one here!)

"bad": I agree, that's tough. But then the sentence is natural, and everything is okay (in a strange way). So, you've achieved what you set out to achieve. It might get deleted later, but the quality of the language is at least there.

"diversity": I meant, if all of your good ratings come from the same 1 or 2 users.

"more accounts": Put a limit on account creation and make it more difficult to create an account. This is a problem already, if I'm not wrong. There's not even a confirmation e-mail for Tatoeba.

"robust system": I agree completely.

FeuDRenais, 11 January 2013 at 18:33:09 UTC

Your code is right, by the way.

Also interesting to note:

If the good user downrates a rogue's sentence once for every five positive ratings the rogue gives himself, the rogue's rating reaches a maximum at 80.
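Under the weighted scheme being discussed (rating = 100 * positive / total, each vote weighted by the voter's own rating), this maximum can be checked numerically. The starting conditions below are assumptions: the good user holds a constant rating of 100, and the rogue begins with a small earned rating (a rogue starting from exactly 0 would stay at 0, since his self-votes would carry no weight).

```python
# Rogue-vs-good-user sketch (assumed setup, not actual Tatoeba code):
# the rogue gives himself five positive ratings (each weighted by his own
# current rating) for every one downvote from a good user rated 100.
positive, total = 10.0, 100.0  # assumed head start: a rating of 10

for _ in range(10_000):
    r = 100.0 * positive / total   # rogue's current rating
    positive += 5 * r              # five positive self-ratings
    total += 5 * r
    total += 100.0                 # one downvote from the good user

rating = 100.0 * positive / total
# The rating climbs toward, but never passes, the fixed point of
# r = 100 * 5r / (5r + 100), which solves to r = 80.
```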

FeuDRenais, 11 January 2013 at 18:11:25 UTC

Fine, but same question. How do you stop users who should be contributing to Section 2 from contributing to Section 1?

I agree that it would work if you assume that every new user will read the guide beforehand, understand it, and follow it. But so would the current Tatoeba.

FeuDRenais, 11 January 2013 at 18:07:09 UTC

"Flaky answer": Yes, he could, but "just" 25 bad sentences would likely be noticed by good users and downvoted during that time.

"More rigorous answer": I would employ additional safeguards in the formula (e.g. penalizing for a lack of diversity in ratings, i.e. when all of a user's positive votes come from the same one or two accounts). I would also put an upper limit on rating frequency (e.g. one vote per minute), thereby giving non-rogue users time to undo the rogue's positive self-ratings and limiting scripted voting. Finally, I would assume that few people would go to such lengths to break a rating system on a language site like this one.

FeuDRenais, 11 January 2013 at 17:15:38 UTC

One question: How do you keep people who should use Tatoeba 2 from using Tatoeba 1? And, if Tatoeba 1 is built on the ideal of the original Tatoeba, won't it suffer the same fate as the original Tatoeba (i.e. be split in two, if this is done)?

FeuDRenais, 11 January 2013 at 16:44:08 UTC

While I agree with you on almost everything, I will emphasize that rating *sentences* is probably a hopeless task, as there are simply too many of them, especially sentences with specific questions that require thought.

The point of the rate-up/down is to "statistically" (I use the word very loosely) rate users. This way, someone browsing the site can see that a certain sentence was written by someone who has a rating of 90 (i.e. is fairly reliable), and that the translation of the sentence to another language was done by someone who has a rating of 20 (so, the translation may have issues). The idea is not to rate sentences one-by-one. And, depending on how elegant you make the statistics, it can certainly be made scientific.

I do agree that all rating systems can be cheated, but it's much harder to cheat a weighted system. Even if you were to create 100 accounts and to rate yourself up this way, your votes would count for very little compared to those of a reliable user (i.e. a single bad vote from someone with a 100 rating could undo your 100 good votes).

Blind votes... I was thinking about this, and although it's appealing "scientifically", I think it kills a large community aspect, which is, IMO, one of Tatoeba's good points.

And yes, without programmers, talking may be pointless, but at least it sets up a reference for programmers to go back to when they actually appear :-)

FeuDRenais, 11 January 2013 at 14:27:44 UTC

I like worst-case scenarios since then no one can blame you if what you propose crashes the site :-)

I would let people with real database management experience comment on how realistic it is to handle such a thing, but I guess a better way to put it would be that the data would scale linearly with the number of users and quadratically with the number of languages. So... O(mn^2)?

FeuDRenais, 11 January 2013 at 14:17:20 UTC

The regional differences present a challenge, but my counterargument would be:

If the differences are so significant that half of the sentences of a Spanish speaker in Mexico are judged as unnatural by someone in Spain, then Tatoeba should split the two languages into Spanish (Mexico) and Spanish (Spain).

If, however, only 1 out of 20 sentences is judged unnatural, the user from Mexico will still receive enough good ratings on the other 19 to be deemed trustworthy. It should average out in the end - that's the power of the sagesse des foules (the wisdom of crowds), which, though not always ideal (I agree with SC there), can still be quite useful when it comes to simple tasks.

The most difficult cases, I guess, would be the borderline ones - i.e. the ones where you don't know if a language needs splitting.

FeuDRenais, 11 January 2013 at 12:26:04 UTC

---- Proposal for a Tatoeba Rating System ----

I apologize for the wall of text that is to come, but I’ve wanted for some time to write something semi-formal about quality control on Tatoeba, which, in my opinion, is present but not to the extent necessary for a site that now has over 2 million sentences in over 100 languages. The basic idea is a rating system, which I will present by first arguing for why Tatoeba needs one, then giving some concrete ideas about how one could be implemented, and finishing with the immediate benefits of such a system as well as its potential drawbacks.

So…

1) Does Tatoeba need ratings for its sentences?

I think the answer has been “yes” since the inception of Tatoeba: I remember the original guide by Trang saying that “owning” sentences was a good way to ensure quality but that ultimately ratings would be needed, although this was never implemented. Ratings have been discussed several times since on the Tatoeba wall and in sentence comments, but no definitive conclusion has ever been reached and, again, nothing implementable was proposed. Current practices for ensuring good sentences and translations include:

- OK tags (as originally started by CK for English)
- Other tags (e.g. needs native check, translation check, change) to point out potential mistakes
- Comments and discussions
- Encouraging users to translate into their native language only
- Encouraging users to state what languages they know and how well they know them in their profiles

Although all of these work on a small scale, they cannot cover the full 2 million sentences and who-knows-how-many translation links. Furthermore, they are subjective – a tag is only as reliable as the user who applied it, and comments, though they sometimes reach consensus among several native speakers, sometimes do not. The second-to-last point is somewhat contested, since many users like to practice translating into languages that they haven’t mastered (myself being one), and some new users simply aren’t aware that translating into their native language is encouraged. The last point is also subjective and not something that all users do.

My overall feeling is that Tatoeba remains unreliable. Or rather, it is reliable for *me*, because I’ve been here for long enough to know who the trustworthy users are (and they are not always Advanced Users and Corpus Maintainers), and will treat their sentences/translations as reliable and take the rest with a grain of salt. The new user who just joins Tatoeba has no idea, however. Someone who is not a user and simply wants to use Tatoeba for a quick reference will also have no idea regarding whom to trust. A number on a sentence that would convey some sort of statistical reliability would be worth a lot.

2) How would a good rating system function?

Not sentence-by-sentence. Back in the early days (2-3 years ago), this was the major criticism of implementing a rating system – there were simply too many sentences, and it was unrealistic to expect multiple users to rate each one of them for quality, as well as all of the translations. It’s even less realistic now, since Tatoeba has grown a bit in sentence quantity but not that much in the number of active users.

However, it is this last point that can be exploited, as Tatoeba succumbs in no small way to Pareto’s rule, with 80% of the contributions being the work of 20% of the contributors. Actually, the former might be larger and the latter smaller. Some days it feels like there are just 10-20 users contributing 95-99% of the sentences. This makes rating all sentences and translations possible simply by rating the users, who are a lot fewer in number.

So, what’s a good scheme for rating users? Let’s just say that the ratings will be between 0 (unreliable) and 100 (full confidence). It’s probably not good to have one such rating, since a user could be really good at writing, e.g., sentences in French (100), but be horrible when it comes to Mandarin (0), and lumping those into a score of 50 is misleading. There should be a rating per user per language. Additionally, there needs to be a distinction between writing natural sentences and writing translations. As such, a user should have a rating for every possible language pair as well (at the moment this would be 119*118/2 pairs). As the number of languages grows, these numbers will increase, but storage should be feasible, especially since any rating set will be extremely sparse (i.e. most users only translate between, let’s say, 10 language pairs at most, and not 119*118/2).
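The sparsity argument suggests storing ratings keyed only by what a user has actually touched, rather than allocating all 119*118/2 = 7021 pair slots per user. A minimal sketch of such a layout (hypothetical data structures, not Tatoeba's schema):

```python
from collections import defaultdict

def pair_key(lang_a, lang_b):
    """Order-independent key, so eng->fra and fra->eng share one rating slot."""
    return tuple(sorted((lang_a, lang_b)))

# (user, language) -> [positive, total] naturalness rating state,
# initialized per the proposal with 0 positive and 100 total points
naturalness = defaultdict(lambda: [0.0, 100.0])
# (user, language pair) -> [positive, total] translation rating state
translation = defaultdict(lambda: [0.0, 100.0])

# Entries come into existence only when first rated:
naturalness[("userN", "eng")][0] += 50.0
translation[("userN", pair_key("fra", "eng"))][0] += 50.0

stored = len(naturalness) + len(translation)  # 2 entries, not 119 + 7021
```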

For the actual implementation, it would be sufficient to have a thumbs up/down for every sentence that has an owner (to rate how natural it is) and a thumbs up/down for every link (to rate the translation). I would recommend using a weighted rating algorithm that would work like this:

i) New User gets 0 positive points and 100 total points upon joining. Their rating is 100*(positive/total) = 0.

ii) If their sentence/translation gets rated by User A (who has a rating of X for the *same* sentence/translation type) and the rating is positive, then New User’s new rating for that specific type is now 100*(positive+X)/(total+X). If the rating is negative, the new score is 100*(positive)/(total+X).

Here’s an example for User N as rated for the naturalness of his English sentences:

User N joins (positive = 0, total = 100, rating = 0). User N writes an English sentence. User A (who has an English rating of 50) rates the sentence positively. User N is now: positive = 50, total = 150, rating = 33.3. User B, who has an English rating of 10, rates the sentence negatively. User N is now at: positive = 50, total = 160, rating = 31.25. If User N now adds a French translation and links the two, they have three different ratings (one for the naturalness of their English sentences, one for the naturalness of their French sentences, and one for the quality of their English-French translations).
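This worked example can be reproduced with a short sketch of the proposed update rule (hypothetical code, simply mirroring the formulas in (i)-(ii)):

```python
class UserRating:
    """One rating slot: a language (naturalness) or a language pair (translation)."""

    def __init__(self):
        self.positive = 0.0
        self.total = 100.0  # 100 starting total points, per the proposal

    @property
    def rating(self):
        return 100.0 * self.positive / self.total

    def receive_vote(self, rater_rating, is_positive):
        # A vote is weighted by the rater's rating X for the same type:
        # positive -> 100*(positive + X)/(total + X)
        # negative -> 100*(positive)/(total + X)
        if is_positive:
            self.positive += rater_rating
        self.total += rater_rating

n_english = UserRating()
n_english.receive_vote(50, True)    # User A (rating 50): 100*50/150 = 33.3
n_english.receive_vote(10, False)   # User B (rating 10): 100*50/160 = 31.25
```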

Starting with 100 total points prevents a user from jumping to a rating of 100 right away with a single good rating, and the weighting makes it possible to give more power to users who have already been recognized as trustworthy with respect to particular languages or language pairs. Of course, we couldn’t initialize everyone with a rating of 0, since a rating-0 vote carries no weight and the system would never get off the ground. So, I would propose picking a single representative for each language / language pair who is trusted based on the community’s experience and giving them a starting rating of 50 for that language / language pair.

3) Benefits and potential drawbacks

i) Every sentence and translation that has an owner would simply inherit the owner’s score, thereby immediately providing a 0-100 number that would tell the naïve visitor if the sentence/translation was trustworthy. This would give some indicator of the reliability of the majority of sentences/translations. The drawback here is that one still needs to interpret the score somehow (i.e. a score of 86 doesn’t have any concrete meaning – it doesn’t mean that there is an 86% chance of the sentence/translation being “right”).

ii) Rogue contributors (who put up bad translations for fun) or new members who didn’t read or don’t care about the rules will be handled very efficiently. Instead of having to correct all of their added sentences or, in some rare cases, to even block such users’ access, it is sufficient for a few reliable users to downrate a few of their bad sentences to make it clear that these sentences are not trustworthy. In fact, not rating up is sufficient, since the rating of any new user would be 0 by default.

iii) Arguments would hopefully lessen in their quantity/intensity as well if there’s a “neutral” rating system in place that’s somehow based on collective opinion. Well, I hope. This may or may not happen.

iv) The debate over whether or not to translate only into your native language would disappear, since bad translations would just reflect badly on the user (either discouraging them from translating or simply accepting the fact that their translations in their non-native languages will not be viewed as reliable).

v) A certain drawback is that such a system would not really do much for languages or language pairs with very few contributors (due to the monopoly on those languages by the few contributors), but I don’t see how one could really solve this problem without getting more contributors…

vi) Storage and more database calls could be a drawback, but I don’t see this as being that bad. In the worst case, you can imagine Tatoeba with 1,000,000 users and 6,000 languages, which would result in 6,000,000,000 (naturalness) + (6,000*5,999/2)*1,000,000 (translation) total ratings to track. That’s a lot, but that’s the worst case and will never be achieved (as many users only work with a handful of languages, not 6,000).
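The worst-case arithmetic in (vi) is easy to verify:

```python
users = 1_000_000
langs = 6_000

naturalness_ratings = users * langs   # 6,000,000,000 user-language ratings
pairs = langs * (langs - 1) // 2      # 6,000 * 5,999 / 2 = 17,997,000 pairs
translation_ratings = pairs * users   # 17,997,000,000,000 (~1.8e13) ratings
```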

Anyway, that’s all I’ve got. Just wanted to put it out there as an idea, since I think it could solve a number of issues that Tatoeba has with respect to the reliability of its sentences (as well as other things). Might be worth trying, and it would be easy to remove if it doesn’t work. I suggest a big discussion on this regardless, since this appears to be a major problem with Tatoeba at the moment.

FeuDRenais, 9 January 2013 at 15:16:24 UTC

It is quite a mess... although, to repeat myself, my "goal" with starting this whole font discussion was not to fix the fonts for myself, but for the random visitor who might be browsing with god-knows-what and god-knows-where.

All this did get me motivated to do some fieldwork, so I went into an Uyghur netbar (internet café) to see what things would look like on an "authentic" Uyghur (well, Chinese) machine of the kind many surfers here use. Anyway, it's IE 8, and pretty much everything looks terrible (the Tatoeba website in general, as well). None of the fonts previously put up by Sharptoothed display the sentence correctly here, so I take back my argument about sans-serif being reliable... even that doesn't seem to give anything great here.

All in all, tough problem!

FeuDRenais, 9 January 2013 at 07:11:59 UTC

I fear that such machines may not really exist...

FeuDRenais, 9 January 2013 at 07:05:11 UTC

No clue, haven't tried.

What would you do with this data, though? How could it be made into something implementable?

FeuDRenais, 9 January 2013 at 06:03:56 UTC

I'm sorry, halfb1t. I suppose that I should have stated "the whole page looks so much nicer FOR ME... ON MY COMPUTER... WITH MY BROWSER..."

(I thought those were implicit)

Next time I'll write a wall of text à la you.

My only request here is that Tatoeba try something like sans-serif for all of its Uyghur sentences, since I suspect that this will be better (perhaps much better) than doing nothing (across many computers/browsers).

FeuDRenais, 8 January 2013 at 19:35:23 UTC

All those families, and yet the whole page looks so much nicer if you just put "font-family:sans-serif".

Then again, it's probably worth noting that most Uyghurs are from China, and more than just a few people in China still use IE6. And who knows what looks best when it comes to that mess of a relic...

FeuDRenais, 7 January 2013 at 20:39:53 UTC

I think that's what both sysko and liori said, more or less. You just define style rules for each language individually.

It would also be nice to right-align languages that are read right-to-left, but that's probably asking for too much, eh?
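The per-language style rules sysko and liori described could look something like this in CSS (hypothetical selectors; they assume each sentence element carries a `lang` attribute, which may not match Tatoeba's actual markup):

```css
/* Assumed markup: each sentence element has a lang attribute, e.g. lang="ug" */
[lang="ug"] {
    font-family: sans-serif; /* the most robust choice for Uyghur so far */
}

/* Right-to-left scripts: setting direction (not just text-align) also makes
   punctuation and mixed-direction text behave correctly */
[lang="ug"], [lang="ar"], [lang="he"] {
    direction: rtl;
    text-align: right;
}
```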

FeuDRenais, 7 January 2013 at 19:41:04 UTC

>>> With (2), feuDRenais should get just what suits him by specifying sans-serif as his default font.

Not sure if I understood everything you wrote, but if you're implying that a user should just specify the fonts themselves to find what suits them, I disagree. It's not the job of the website user to have to do these things. I'm not so much bothered by Uyghur displaying incorrectly for me personally (since I can still read and understand it) as I am by people who are learning it (in whatever capacity) being forced to read a corrupted version of the script.

A solution that I would propose is to use the specifications that are the most robust over different systems/browsers (sans-serif clearly seems the best for Uyghur so far). If that's what you're saying, too, then cool, we agree.

FeuDRenais, 6 January 2013 at 17:47:16 UTC

(Well, in Firefox/IE. In Chrome/Opera, Arial and sans-serif are still good, but a few of the other fonts are good/acceptable as well).

Anyway, Arial or sans-serif for Uyghur would be my request if you're going to implement this, sysko.