Wall (7,342 threads)

sacredceltic, August 4, 2010 at 10:33:13 PM UTC (originally in French)

I would like to know whether leaving the language unspecified when adding sentences slows down insertion and puts more load on the server, or whether the check is performed in any case.

xtofu80, August 4, 2010 at 10:43:52 PM UTC

I think discussions on the wall should be held in English. Only discussions in the comment section of a particular language might be held in that very language, since a reader who cannot read the comment might also not be interested in the sentence anyway.

sysko, August 4, 2010 at 10:47:49 PM UTC

I think we should restrict people on the language used, because I know some contributors who are not confident with English, or at least not confident with the vocabulary of the question or suggestion they want to make.
So I think we should leave the choice of language to the person asking the question, if he wants to or can do it in English. It would be a pity to keep someone from asking a question only because he can't express himself in English.

sysko, August 4, 2010 at 10:49:33 PM UTC

*I think we should NOT restrict

CK, August 5, 2010 at 5:46:58 AM UTC (edited October 26, 2019 at 4:06:07 AM UTC)

[not needed anymore- removed by CK]

sacredceltic, August 5, 2010 at 8:24:09 AM UTC (originally in French)

I totally disagree with this diktat, which is actually quite shocking. How, exactly, would English be more "polite" than German, Italian, Spanish, French, ... or any other language?
Politeness works both ways, so English speakers can also make the effort to speak other languages or have them translated, just as everyone else does. Why should English speakers be spared that cost while the others bear it?
Are you prepared to pay for the time others spend translating for you? No, of course not, so you have to share that cost.
Next, since the abilities of non-native English speakers are constantly called into question by native speakers, communication exclusively in English is unequal, because some people claim the right to judge other people's vocabulary while imposing their own language. The dice are loaded.
Finally, what is happening in this community is what happens in every other international community where English is the only working language: only English speakers still have a say.
Studies show that international organizations where several languages are accepted as working languages are much more inclusive, and therefore richer in contributions from more people. At the World Health Organization, for example, it was shown that when debates were held in English, certain nationalities never spoke up, so that their work is never taken into account. Exclusively English-speaking organizations are, at best, one-eyed organizations!

arcticmonkey, August 5, 2010 at 8:29:07 AM UTC

Paranoia much?

sacredceltic, August 5, 2010 at 8:42:30 AM UTC (originally in German)

If paranoia means thinking that writing in German or French is not impolite, then yes, I am paranoid!

arcticmonkey, August 5, 2010 at 9:34:01 AM UTC

That's not what I was hinting at. I actually agree that everyone should be allowed to use the language of their choice. However, your fear of an Anglo-Saxon world conspiracy and google is beyond me.

sacredceltic, August 5, 2010 at 10:04:31 AM UTC (originally in French)

I don't know where you saw that I was afraid of an "Anglo-Saxon conspiracy". Do you have any references to cite?
Besides, I use Google every day, I even own shares in it, and I am an expert in Google search engine optimization. I have used practically ALL Google products since their launch. So I know Google AS INTIMATELY AS POSSIBLE.
I am therefore perfectly qualified to judge that Google is QUITE SIMPLY NOT a linguistic reference. The mere suggestion makes every professional linguist howl with laughter.

arcticmonkey, August 5, 2010 at 10:38:29 AM UTC

"Only English speakers still have a say" (quoted from the French)

I happen to study English linguistics and I know for a fact that professional linguists use google all the time. There are even dedicated tools that allow you to parse google search results for the purpose of corpus analysis. This is one of those tools: http://www.webcorp.org.uk/

sacredceltic, August 5, 2010 at 10:56:37 AM UTC

Well, again, maybe most linguists do use Google as a reference for English (although I really doubt that, because I happen to know a few who would just frown at the sheer idea...), but I can assure you this is absolutely not the case for many other languages, probably the vast majority of them.
And by the way, who talked about "parsing Google" ? The argumentation that was thrown at my face here was systematically grounded on raw Google results...

sacredceltic, August 5, 2010 at 11:00:41 AM UTC (originally in French)

"Only English speakers still have a say" is not particularly paranoid, since what is being proposed is quite simply to silence everyone else on this wall.
I am not fighting English, which I use myself; I am simply defending the principle of linguistic fairness.
Everyone must make the same effort to understand others. English is in no way a "neutral" language, and even less a "universal" one. There is therefore no reason to grant it a privileged status.

sysko, August 5, 2010 at 10:13:28 AM UTC (originally in French)

Even though I don't always agree with Sacredceltic, his previous message said nothing at all about a conspiracy.
Let's try to avoid this kind of ad hominem attack, which only makes things worse.

sacredceltic, August 5, 2010 at 10:17:56 AM UTC (originally in French)

"Ad hominem" attacks? Aren't you being a bit paranoid, Sysko?

sysko, August 4, 2010 at 10:53:08 PM UTC

So, for those who want to know, the question was: "Does choosing the language directly instead of "autodetect" when adding sentences slow down the process of adding sentences, and does it slow down the server?"
My answer is that, since for the moment we use Google's API for language detection (even though we plan to use our own system; I'm not a fan of depending on Google stuff), it will not slow down the server, but it may speed up the time it takes to add a sentence (though I wonder if that is "visible" to a user).
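
To illustrate the point, here is a minimal sketch of why an explicit language choice can save time: the detection call is simply skipped. The names `resolve_language`, `AUTO_DETECT` and `fake_detect` are hypothetical, and the detector is a local stub standing in for the external API; none of this is taken from the actual Tatoeba code.

```python
from typing import Callable, List, Optional

AUTO_DETECT = "auto"  # hypothetical marker for the "autodetect" choice


def resolve_language(
    text: str,
    chosen: Optional[str],
    detect: Callable[[str], str],
) -> str:
    """Return the language code to store with a new sentence.

    `detect` stands in for the external detection service. It is only
    called when the contributor did not pick a language explicitly.
    """
    if chosen and chosen != AUTO_DETECT:
        return chosen      # explicit choice: no remote call at all
    return detect(text)    # autodetect: one extra round trip


calls: List[str] = []


def fake_detect(text: str) -> str:
    """Stub detector that records each call instead of using the network."""
    calls.append(text)
    return "fra"


explicit = resolve_language("Bonjour.", "fra", fake_detect)
detected = resolve_language("Bonjour.", AUTO_DETECT, fake_detect)
print(explicit, detected, len(calls))  # the stub was called once, for autodetect only
```

Either way the site's own server does little work; the difference is the wait for the external service, which may or may not be noticeable to the contributor.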

sacredceltic, August 4, 2010 at 10:58:53 PM UTC (originally in German)

I take a different view...

sysko, August 4, 2010 at 10:43:46 PM UTC (originally in French)

For now, language detection is done through the Google API (for historical reasons; we are waiting for an upcoming overhaul of the site before switching to our own automatic detection algorithms).
So it does not put any extra load on the server, but specifying the language makes adding faster, since there is one less thing to do. That said, I have never taken the time to compare the two to see whether the time saved on the user side is "observable".
The option is in fact mainly there for languages that the Google API does not support, or for which it produces many wrong detections (such as Shanghainese, Latin, etc.).

saeb, August 4, 2010 at 8:44:06 PM UTC

using an idiom as a translation:
http://tatoeba.org/eng/sentence...11354#comments

what's the official position on this?

saeb, August 4, 2010 at 9:35:55 PM UTC

one more thing, we need to strictly define when two sentences 'match', when a direct link is appropriate, when an indirect link is appropriate, and when an indirect link isn't acceptable.

There are way more linguistic nuances than just idioms and we really can't start being picky about every single one of them...

xtofu80, August 4, 2010 at 10:02:45 PM UTC

I think it would be wise to mark non-idiomatic translations of proverbs by a tag, because often translations of proverbs are rather weird.

blay_paul, August 4, 2010 at 10:15:19 PM UTC

> I think it would be wise to mark non-idiomatic
> translations of proverbs by a tag

I already do. If you check, you'll see there are "translated-proverb" tags out there.

FeuDRenais, August 5, 2010 at 8:50:36 AM UTC

Likewise, I mark idiomatic translations as "Adapted Translation" and non-idiomatic ones as "Literal Translation".

blay_paul, August 4, 2010 at 10:14:31 PM UTC

> when an indirect link isn't acceptable.

An indirect link is _always_ acceptable, as long as the two direct links it consists of are both acceptable.
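
The rule above can be read as a two-hop lookup over approved direct links. A minimal sketch, assuming sentences are identified by integer ids and direct links are undirected pairs; `indirect_links` is a hypothetical helper, not something from the Tatoeba codebase:

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple


def indirect_links(direct_pairs: Iterable[Tuple[int, int]]) -> Set[Tuple[int, int]]:
    """Return every pair joined through exactly one intermediate sentence.

    A pair (a, b) is reported when some sentence is directly linked to
    both a and b, and a and b are not directly linked themselves.
    """
    neighbours = defaultdict(set)
    for a, b in direct_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    indirect = set()
    for linked in list(neighbours.values()):
        for a in linked:
            for b in linked:
                # a < b reports each unordered pair only once
                if a < b and b not in neighbours[a]:
                    indirect.add((a, b))
    return indirect


approved = [(1, 2), (2, 3), (3, 4)]
print(sorted(indirect_links(approved)))  # [(1, 3), (2, 4)]
```

Because the indirect pairs are derived purely from the direct ones, removing a bad direct link removes every indirect link that depended on it.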

blay_paul, August 4, 2010 at 8:48:31 PM UTC

I don't know how 'official' my position is, but this is what I think.

Now, obviously, we'd like to have idioms included in the example sentences. We'd also like to have those examples translated. No controversy so far, I hope? ;-)

So, the point is that not every idiom will have an equivalent in every language it is translated to. Therefore it is impossible to always translate an idiom with an idiom.

Thus it must be acceptable, in some circumstances, to translate an idiom by a non-idiom phrase. Personally I see no reason why it should not always be allowed - but I prefer idiom - idiom translations if they are available and natural.

blay_paul, August 4, 2010 at 9:32:51 PM UTC

'Delete' tagging.

Please use @delete to mark these sentences for the attention of moderators. I think other tags that want action to be taken should also start with '@' (e.g. @Needs Native Check)

TRANG, August 4, 2010 at 12:06:59 PM UTC

To anyone who has time, here are a couple of hundred sentences to take care of:

http://tatoeba.org/eng/sentences_lists/edit/24

It's simply a list of all the sentences containing the character '/'.

Swift, August 3, 2010 at 2:31:54 PM UTC

It's great to see that we now have Japanese readings in furigana even though there are still a couple of things that need ironing out. blay_paul pointed out what needs to be done to move the parenthesis to the right place[1] but there is another issue that I haven't seen discussed.

The generated ruby terms for "大きな" and "持っている" in sentence nº251753 are "おおきな" and "もっ", respectively. This creates a bit of redundancy and isn't quite as nice as what I've seen as standard. At the same time, it's hardly the most pressing issue.

I installed MeCab to see if this might just be a simple matter of configuration but found little of use. The output also seems to be somewhat detailed if generated at all (i.e. the default and dump modes as opposed to the wakati mode). So, I figured the furigana feature might have some code to filter the output and choose which atoms to add into ruby terms.

Should this be the case, would it be easy enough to expand it to cut the hiragana ending of the readings, or is there some other way to clean this up in a fairly simple way?

[1] http://tatoeba.org/eng/wall/sho...9#message_1869
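
One way the trimming asked about above could work is to strip the trailing hiragana that a word and its reading have in common, so that only the kanji part carries a ruby annotation. This is a rough sketch under stated assumptions: the reading has already been converted to hiragana, only words containing kanji are passed in, and `trim_okurigana` is a hypothetical helper, not part of MeCab or of the site's furigana code.

```python
from typing import Tuple


def is_hiragana(ch: str) -> bool:
    """True for characters in the Unicode hiragana block."""
    return "\u3041" <= ch <= "\u309f"


def trim_okurigana(surface: str, reading: str) -> Tuple[str, str, str]:
    """Split off the trailing hiragana shared by a word and its reading.

    Returns (base, base_reading, okurigana). At least one character is
    always left in the base and in its reading.
    """
    cut = 0
    while (
        cut < len(surface) - 1
        and cut < len(reading) - 1
        and is_hiragana(surface[-1 - cut])
        and surface[-1 - cut] == reading[-1 - cut]
    ):
        cut += 1
    if cut == 0:
        return surface, reading, ""
    return surface[:-cut], reading[:-cut], surface[-cut:]


print(trim_okurigana("大きな", "おおきな"))  # ('大', 'おお', 'きな')
print(trim_okurigana("持っ", "もっ"))        # ('持', 'も', 'っ')
```

With this split, "大きな" would be annotated as 大 with the reading おお, followed by plain きな, which removes the redundancy described above. Words with kana in the middle would need a more careful alignment than this suffix match.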

TRANG, August 4, 2010 at 11:53:52 AM UTC

> blay_paul pointed out what needs to be done to move the
> parenthesis to the right place

Except that for me it doesn't output like Paul pointed out... It outputs like this:
を <ruby><rb>監視</rb><rt>(</rt><rt>かんし</rt><rt>)</rt></ruby> し た 。

I have the same version of Firefox as he does. But I unfortunately don't have time to figure out what could be wrong...

As for your issue, I wasn't the one who implemented the MeCab part so I can't say if we can simply choose another atom. If not, we could tweak the output, but it's not something I can work on right now either.

Swift, August 4, 2010 at 1:03:59 PM UTC

> Except that for me it doesn't output like Paul pointed out...

Odd! Wait, is this done client-side? I, by the way, get the same as Paul.

Besides, if that is actually your output, it's still wrong. The parentheses should be enclosed in rp-tags, not rt.

TRANG, August 4, 2010 at 2:22:09 PM UTC

Yes sorry, I wrote <rt></rt> instead of <rp></rp>. To me it does show up as <rp>.

And yes the ruby is done client-side. <ruby> tags are not W3C compliant so I didn't want to generate them server-side.

Here's the javascript for it:
http://js.tatoeba.org/js/furigana.js

Without the javascript, you would see something like this:
あの 山[やま] に 登る[のぼる] に は 完全[かんぜん] な 装備[そうび] が 必要[ひつよう] だ 。
(http://tatoeba.org/eng/sentences/show/231003)
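
For readers who want to see the transformation spelled out, here is a minimal sketch of turning the bracketed fallback text into ruby markup with `<rp>` parentheses. It is written in Python purely for illustration (the site does this in JavaScript), mirrors the `<rb>`/`<rp>`/`<rt>` structure quoted earlier in the thread, and assumes that a reading in square brackets belongs to the run of non-space characters immediately before it. The function name `to_ruby` is hypothetical.

```python
import re
from html import escape

# A run of non-space, non-bracket characters followed by a [reading]
TOKEN = re.compile(r"([^\s\[\]]+)\[([^\[\]]+)\]")


def to_ruby(text: str) -> str:
    """Convert 'word[reading]' tokens into <ruby> markup.

    The <rp> elements hold the fallback parentheses, which browsers that
    render ruby natively are expected to hide.
    """
    def repl(match: "re.Match[str]") -> str:
        base, reading = match.group(1), match.group(2)
        return (
            f"<ruby><rb>{base}</rb>"
            f"<rp>(</rp><rt>{reading}</rt><rp>)</rp></ruby>"
        )

    # Escape first so that any literal markup in the sentence stays inert.
    return TOKEN.sub(repl, escape(text))


print(to_ruby("山[やま] に 登る[のぼる]"))
```

Tokens without a bracketed reading, such as に, pass through unchanged.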

Swift, August 4, 2010 at 4:13:06 PM UTC

Actually, the ruby tag is an official W3C recommendation.[1] It just hasn't been picked up by most browsers. Internet Explorer has had support for it for a while (they even support vertical text; which is pretty cool) but even those who had plans to implement it have had to put it on the back-burner.

While it's not part of XHTML 1.0 it is part of XHTML 1.1[2], the HTML 5 draft[3] and as a module in CSS 3.[4]

Fixing the lang and name attributes mentioned in [2] and changing the Tatoeba DTD to XHTML 1.1 would make the ruby tag expected.

[1] http://www.w3.org/TR/ruby/
[2] http://www.w3.org/TR/xhtml11/changes.html
[3] http://dev.w3.org/html5/spec/te...e-ruby-element
[4] http://www.w3.org/TR/css3-ruby/

TRANG, August 4, 2010 at 4:49:35 PM UTC

Thank you very much, Swift!

I'll take care of this when I have more time.

CK, August 4, 2010 at 3:26:58 PM UTC (edited October 26, 2019 at 4:06:23 AM UTC)

[not needed anymore- removed by CK]

blay_paul, August 4, 2010 at 3:38:41 PM UTC

Ah, mystery solved.

I was using this : https://addons.mozilla.org/en-U...ox/addon/6812/

now it works without it.

Swift, August 4, 2010 at 3:57:26 PM UTC

And I'm using the same. OK, now we've found the culprit, but it would be nice to get ruby to work with both the intended and fall-back renderings.

blay_paul, August 4, 2010 at 9:34:13 AM UTC

Quoting and Copyright

Copyright came up again in the comments:
http://tatoeba.org/eng/sentence...50322#comments

Would it be possible to, say, get an opinion in writing from someone about whether single line attributed quotes here can be covered by the "educational use" exception in French copyright law?

If they are included with a suitable disclaimer and tagged (so they can all be removed at once if necessary) I don't think there is a significant risk of real trouble (but I'm not an expert :-(

sysko, August 4, 2010 at 10:02:38 AM UTC

The fact is that the "educational use" exception implies that you attribute the source. It would not be a problem if the sentences were by Tatoeba users, on Tatoeba, for Tatoeba only. The problem comes from the fact that we release those sentences under a licence that authorizes all possible uses, which includes commercial and non-educational use. Moreover, the export does not, for the moment, include attribution to the original author, and even for educational purposes you need to attribute.

CK, August 3, 2010 at 11:34:35 PM UTC (edited October 26, 2019 at 4:06:52 AM UTC)

[not needed anymore- removed by CK]

blay_paul, August 4, 2010 at 3:31:59 AM UTC

Big flaw of rating systems on Tatoeba? Too many sentences.

FeuDRenais, August 4, 2010 at 4:04:55 PM UTC

"Too many sentences"

This could be solved with some optimization, but I don't know enough about coding to say how easy it would be to implement.

There could, for example, be a rating "tab" at the bottom of all Tatoeba pages that just showed a pair of sentences together and asked the user for an (optional) 1-10 rating. The pair of sentences would, of course, be in languages that the user has translated/linked the most pairs of sentences in. Again, could be messy to program, but this would neatly encourage ratings in an optimal fashion, rather than haphazardly (in which case, of course, there's too many sentences).

Swift, August 4, 2010 at 2:39:05 AM UTC

Personally, I prefer simple systems because they are less likely to break. I'm hardly the most active member of this community, but I fail to see the great threat from rogue contributors.

Should the community want to implement something like this, the four-step method you propose may work. However, apart from the obvious problem that you note regarding edited sentences there is also the problem that our content is different from Youtube in fundamental ways.

General users of Youtube watch particular videos and on topics that they generally have a very specific reaction to. The rating reflects that reaction; a very simple, natural measure to a particular experience.

General users of Tatoeba search for words or sentence patterns and will often browse through a number of sentences for what best fits their needs. I reckon that the greater number of items to rate will decrease dramatically the incentive to rate sentences (unless there is something especially wrong; in which case one would hope the user would be inclined to fix it anyway).

I'd rather see myself as the hard critic or devil's advocate than the pessimist, but I have my doubts that voting would be something the general user will do, leaving the task to the core community ... which is already working on this very issue by leaving comments (at least in the case of strange-sounding and unacceptable). What constitutes "very natural" sentences is going to differ between areas and might be better dealt with in comments.

On the other hand, should it be presented in a fairly seamless manner (e.g. with easy to use buttons on the search results page) and users react well to it, it could help users pick the best results. I'm furthermore not enough of a nay-sayer to stop well-meaning volunteers from working on areas that the majority is for.

That said, I'm still unconvinced about the need for this, reckon it will be too complicated if it is to be any good and fear it would be more of a distraction than of use. I think the current approach to adopt sentences and comment on that which seems odd is working well for the time being.

saeb, August 4, 2010 at 3:52:48 AM UTC

and too few 'active' contributors...and even fewer moderators...

I wonder if a simple flagging system based on tags would do...for example a button that's available to all (registered) users that when pressed adds a 'moderator attention' tag and needs a comment to be added first for the button to work...

FeuDRenais, August 4, 2010 at 5:10:00 AM UTC

Assuming the worst about human nature, I think the system would be more robust if it was a simple 0 to 100% rating, and if the "gravity" of a user's vote was a function of their contributions (quality and quantity) in that language. Youtube can never do something like this, but Tatoeba's structure would allow it.

This way, anyone could vote. For example, if twenty people all downrated a specific (say, French) sentence just to be nasty (let's say it was making light of the Bible, and they were all very pious Christians), but had no contributions in French, their votes would have 0 effect on the rating. This would put a natural feedback on bad voting behavior, and would automatically give weight to "good voters" (i.e. "good contributors").
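
A rough sketch of how such a weighted rating might be computed. The weighting here is an assumption made for illustration: a voter's "gravity" is simply the number of sentences they have contributed in the sentence's language, whereas the proposal above also mentions quality. `weighted_rating` and the data shapes are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def weighted_rating(
    votes: List[Tuple[str, float]],
    contributions: Dict[str, Dict[str, int]],
    language: str,
) -> Optional[float]:
    """Average 0-100 votes, weighting each voter by their contribution
    count in `language`. Returns None when no voter carries any weight."""
    total_weight = 0.0
    weighted_sum = 0.0
    for user, score in votes:
        weight = contributions.get(user, {}).get(language, 0)
        total_weight += weight
        weighted_sum += weight * score
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


# Twenty voters with no French contributions all vote 0 ...
votes = [(f"troll{i}", 0.0) for i in range(20)]
# ... and two established French contributors vote 90 and 70.
votes += [("alice", 90.0), ("bob", 70.0)]
contributions = {"alice": {"fra": 300}, "bob": {"fra": 100}}

print(weighted_rating(votes, contributions, "fra"))  # 85.0
```

The twenty zero-weight votes leave the result untouched, which is the feedback effect described above. A real weighting would probably need a cap or a logarithm so that one very prolific contributor cannot dominate.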

FeuDRenais, August 4, 2010 at 5:47:23 AM UTC

Also, the rating system wouldn't be on the individual sentences, would it, but on pairs (i.e. the links between the sentences)? That would cover both the quality of translation and the naturalness...

TRANG, August 4, 2010 at 1:11:20 AM UTC

Submission policy

http://blog.tatoeba.org/2010/08...f-content.html

To quote myself:

"This article explains what kind of content we accept in Tatoeba, what kind of content we delete and what kind of content we review. Note that this article is not final. You have the right to object to something or to ask for more clarifications."

opti, August 3, 2010 at 5:26:21 PM UTC (originally in German)

Terms of use.

Please translate the terms of use into other languages as well.
So far they exist only in French and English.

MUIRIEL, August 3, 2010 at 8:12:51 PM UTC (originally in German)

Unfortunately that is not so simple, otherwise it would already have been done.

xtofu80, August 3, 2010 at 10:01:23 AM UTC

Resigning from Tatoeba:
Today, we found lots of bad sentences by a user who has not logged in for three months. If we want to change the sentences, we have to turn to the admins on a sentence-by-sentence basis. Have you given any thought to users who decide to leave the project?
1) Is it possible to resign, thus giving up ownership of one's sentences so that they become unowned? This possibility should be known to all new users, mentioned either in a tutorial or in a welcome mail.
2) Is there a standard procedure to check whether a user is still interested in contributing to Tatoeba?
I fear that in the long run, admins will be overwhelmed with correcting sentences by inactive users.

TRANG, August 3, 2010 at 12:07:03 PM UTC

It actually happened with Paul. He was away from Tatoeba and WWWJDIC for a whole year, and I wasn't sure if he'd ever come back. So what I did was transfer all his English-Japanese sentences to another user (fcbond).

There is no standard procedure to resign yet. There is also no way to check if a user is still interested in contributing... You have to resort to common sense.
If someone wants to stop working on Tatoeba, the best thing they can do is to inform the community about it, and to find other active members who are willing to adopt their sentences...

Anyway, as the community grows, it will definitely become more and more overwhelming for moderators, but we will adapt progressively. A few months ago, I was still the only one who could edit other people's sentences...
But in the long run, admins and moderators will not be the only ones with the permission to edit other people's sentences. Hopefully, we'll have a more advanced system where power and responsibilities will be better distributed, instead of hierarchical like it is now.

sacredceltic, August 3, 2010 at 12:17:30 PM UTC (originally in French)

You should introduce a reliability index for a contributor's sentences, for each language in which they contribute, based on the number of corrections made to the sentences they created.
That way you would have an objective tool for assessing each contributor's linguistic ability in each language, and you could grant correction rights according to that ability and to the language of the sentence to be corrected.
The only flaw I can see in this system is that "adopted" sentences should not be counted, because one often adopts a sentence to correct one mistake that hides one or two others behind it, which one does not notice at first because the main one is distracting...

MUIRIEL, August 3, 2010 at 1:00:30 PM UTC (originally in French)

I see more flaws in such a system:
1) Often I notice right after submitting a sentence that it contains a mistake (usually a typo), so I correct it straight away. An automatic system would count that as a mistake and I would become less reliable because of it, when in fact it says nothing about the reliability of my sentences.
2) Such a system surely could not tell serious mistakes from fairly trivial ones. A contributor who has never used the narrow non-breaking space in French will have lots and lots of corrections, but apart from that may be a very good translator. You cannot compare mistakes like "no punctuation", "one comma too many or a forgotten comma" (a frequent mistake in German) or "forgotten accent" with mistakes like "sentence that does not match the translation", "sentence that is grammatically incorrect", etc.

So for these reasons I am against such a blind system.

sacredceltic, August 3, 2010 at 1:13:46 PM UTC (originally in French)

It would be enough not to count corrections made within the first 5 minutes, and also to ignore corrections to punctuation and capitalization (otherwise I would probably score zero too).
But the current system is no doubt full of arbitrariness, and if we want to motivate people towards quality, an objective system is preferable. Moreover, "bad" translators or correctors will be tempted to leave, which will benefit overall quality.

xtofu80, August 3, 2010 at 1:36:40 PM UTC

I think the greatest boost in translation quality can be gained by discouraging non-native speakers from generating foreign-language sentences. I found that the most objectionable sentences (objectionable as in "this is horrible, you cannot possibly say it like this") were generated by non-native speakers. It would be far preferable if everyone basically produced translations in their own mother tongue. Of course, this does not need to be a fixed rule, since many of us are also fluent in a foreign language. But I read several sentences which clearly show that the user was not capable of translating into that language.

sysko, August 3, 2010 at 2:22:48 PM UTC

I think it's more about having a way to clearly tell a "just added sentence, waiting for validation" apart from a "you can trust us, it's a natural and correct sentence", rather than discouraging non-native speakers from contributing in a given language.
@sacredceltic, yep, I agree with you that we do need such a system. But as for automatic correction, the main problems are the time the developers of Tatoeba have and the power of the server behind it: such an algorithm, even if easy to write on paper and not so hard to write in code, is tricky to build without some computation, database checks, etc., and unfortunately the server, with the current features and current activity, has already reached its limit.
So unfortunately this will take some time to implement, until we find a fast, optimized way to do it.

sacredceltic, August 3, 2010 at 3:23:03 PM UTC

I think the algorithm could be made very simple and straightforward, once we agree on the rules.
It could be something like this:
Each time a sentence is corrected
=> check when the last update took place
=> if it is over x minutes from now (x to be assessed)
==>check if it matches the former version regardless of punctuation and spaces (easy!)
==> if not, +1 in the (un)reliability counter for that sentence.
-end-
It should not load the server much more than the present history functionalities do, I think...
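
The steps above could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the `Sentence` record, the `apply_correction` function and the 10-minute value for "x" are hypothetical and do not reflect Tatoeba's actual code or schema.

```python
import unicodedata
from dataclasses import dataclass
from datetime import datetime, timedelta

# The "x minutes" from the proposal; 10 is a placeholder, not an agreed value.
GRACE_PERIOD = timedelta(minutes=10)


@dataclass
class Sentence:  # hypothetical record, not Tatoeba's real schema
    text: str
    last_update: datetime
    unreliability: int = 0


def normalize(text: str) -> str:
    """Drop whitespace and Unicode punctuation so only letters and digits remain."""
    return "".join(
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def apply_correction(sentence: Sentence, new_text: str, now: datetime) -> Sentence:
    """Store a correction and bump the counter if it is late and substantive."""
    is_late = now - sentence.last_update > GRACE_PERIOD
    is_substantive = normalize(new_text) != normalize(sentence.text)
    if is_late and is_substantive:
        sentence.unreliability += 1
    sentence.text = new_text
    sentence.last_update = now
    return sentence
```

The check touches only the sentence being edited, so the extra cost per correction is one comparison of two short strings.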

Demetrius August 3, 2010 at 9:38:56 PM UTC

I don’t think the number of corrections can be equated with unreliability.

In fact, I think what we need is a rating system, which would allow users to vote for the translations they believe to be good.

By the way...
> check if it matches the former version regardless
> of punctuation and spaces (easy!)
Punctuation rules and spaces are also very important. A comma in the wrong place can change the meaning of a sentence more than a misspelled word.

FeuDRenais August 3, 2010 at 10:00:15 PM UTC

Quick word on the idea of a rating system:

It's good, but I'm hoping it'll have a trusted_user filter, or something of the sort. If not, it could become a mess like, say, YouTube, where completely unrelated things like boredom and political ambitions lead to total havoc.

Just taking (my beloved) Uighur as an example... If this website became relatively known to a large group of patriotic Chinese (or even anti-Muslims), there might be people who would just go and downrate Uighur sentences because they are Uighur. It sounds silly to bring this up here, but I've seen it happen on YouTube with perfectly good videos (and comments).

Just putting that out there in case the rating system is aimed to be available to ANY registered user in the future...
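
A minimal sketch of what such a trusted_user filter might look like, in Python. The vote format, the `trusted_users` set and the function name are all assumptions made for illustration; how users would become trusted is left open.

```python
def trusted_score(votes, trusted_users):
    """Sum +1/-1 votes, ignoring voters who are not in the trusted set.

    votes: iterable of (username, value) pairs, value being +1 or -1.
    trusted_users: set of usernames whose votes count (hypothetical).
    """
    return sum(value for user, value in votes if user in trusted_users)


# Two trusted upvotes and two drive-by downvotes on the same sentence.
votes = [("native_a", 1), ("native_b", 1), ("drive_by_1", -1), ("drive_by_2", -1)]
score = trusted_score(votes, {"native_a", "native_b"})
```

With the filter, the drive-by downvotes are simply not counted, so mass downrating by untrusted accounts cannot lower a sentence's score.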

sacredceltic August 3, 2010 at 11:13:19 PM UTC

The public vote for translations will always benefit the worst one, as in finance, where bad currency always beats the good one...
For punctuation, I was thinking only of those annoying trailing spaces and dots...

Scott August 3, 2010 at 3:43:49 PM UTC

I'm not sure I like the idea of rating users. And I don't like the idea of punishing users if they correct their sentences. This might incite them to never correct them to get a better rating.

I think there are a few things that we could do:
- Have a warning somewhere that tells users to be careful when they enter sentences in languages with which they're not familiar.
- Have a warning when you enter a sentence that reminds you to capitalize the sentence and end it with a period.
- Have a better system to track down sentences that need to be corrected by moderators.

sacredceltic August 3, 2010 at 4:00:16 PM UTC

But how do you assess the "languages with which they are not familiar" in order to raise the message?
And when you get a million sentences, the moderators will just be drowned!

As for the incentive not to correct, I disagree:
- most people here want to do good and they will if they must. What I have seen since I've been here is that most people correct or challenge the suggestions on their sentences very quickly, apart from a few "dead" contributors, which is another issue already raised by xtofu80.
- A batch procedure could check the number of days since an error has been reported and not cleared, and lower the score of the author accordingly.
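
A rough Python sketch of such a batch procedure, under stated assumptions: open error reports are given as `(author, reported_on)` pairs, and the penalty is one point per day a report stays uncleared. Both the data shape and the penalty rule are hypothetical.

```python
from collections import defaultdict
from datetime import date

PENALTY_PER_DAY = 1  # placeholder weight, to be agreed on


def overdue_penalties(open_reports, today, grace_days=0):
    """Return {author: penalty} for error reports that are still uncleared.

    open_reports: iterable of (author, reported_on) pairs (hypothetical shape).
    grace_days: days an author gets to react before any penalty applies.
    """
    penalties = defaultdict(int)
    for author, reported_on in open_reports:
        days_late = (today - reported_on).days - grace_days
        if days_late > 0:
            penalties[author] += days_late * PENALTY_PER_DAY
    return dict(penalties)
```

Run once a day, this only has to scan the reports that are still open rather than the whole sentence table.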

saeb August 3, 2010 at 4:28:47 PM UTC

I don't think it's a good idea to discriminate against users based on their 'score'...

sacredceltic August 3, 2010 at 5:20:41 PM UTC

Well, what do you suggest, then? A poll?

MUIRIEL August 3, 2010 at 6:56:39 PM UTC

I think the tool that shows users that they overrate their skills already exists: comments. Simple comments like the ones we post every day, pointing out a mistake in a sentence. Users will notice for themselves when they're corrected often, and can conclude for themselves that they overestimated their skills, if they did. I have never seen a user whose mistakes were corrected often who didn't stop contributing in the language concerned (or at least contribute fewer and simpler sentences). As for old contributions, you should also consider that comments didn't always exist on Tatoeba.

xtofu80 August 3, 2010 at 4:56:08 PM UTC

Right, it is not about discrimination, it is rather about education. What I was originally aiming at was an encouragement, for example in an introductory video/text/mail, to add only sentences in one's mother tongue or a language one is confident in. I still think that tatoeba is underdocumented / underguided. We should develop a nice tutorial, also some translation guidelines.

sacredceltic August 3, 2010 at 5:18:01 PM UTC

I disagree. I can testify that over all my career working with languages, people systematically overrate their skills, including in their native language.
Try the following test in a meeting with ANY group of people of different nationalities: ask if any of them doesn't understand proper English. You will be amazed that hardly anyone raises their hand...

blay_paul August 3, 2010 at 4:19:26 PM UTC

So what happens when a native speaker of a language, who has corrected a couple of thousand sentences in that language, points out that a sentence is unnatural and the sentence owner (not a native speaker) disagrees?

sacredceltic August 3, 2010 at 5:09:09 PM UTC

It follows the present procedure and goes up to a moderator...

blay_paul August 3, 2010 at 5:14:57 PM UTC

Oh, I forgot to mention. The person pointing out the mistake was a moderator in the first place.

sacredceltic August 3, 2010 at 5:23:13 PM UTC

Yeah, and this moderator would harass the contributor tirelessly...
Well, all this pushes towards an impartial mechanical assessment system. We'll set the scores then...

blay_paul August 3, 2010 at 2:35:28 PM UTC

> "just added sentence/ waiting for validation"

The problem is that 'validation' is not going to happen. Or, rather, that new sentences are added faster than sentences are 'validated' (not that there is a proper validation system yet, anyway).

sacredceltic August 3, 2010 at 3:27:13 PM UTC

But I think there is a good case for sysko's suggestion of a "to be validated status", because some people (me) would translate very fast on the fly and re-check later on after second thoughts.
For instance, multiple errors in one single sentence are often overlooked when correcting the first one. Only after some time can you re-read the sentence without the focus on the former error and your mind can be free hunt for the others more efficiently.

sacredceltic August 3, 2010 at 3:27:58 PM UTC

*to hunt

QED

FeuDRenais August 3, 2010 at 2:05:31 PM UTC

@xtofu: I understand, but disagree.

"to generate foreign language sentences outside their comfort zone" would be a more suitable, though even harder to realize, goal (I think). Harder because you need honest users who will admit their own limitations.

Flat-out forbidding translation into foreign languages has two apparent drawbacks, in my opinion.

1) Certain languages (e.g. Uighur, Uzbek, Tartar, Serbian, and others) would have no active (or any) contributors.

2) For a language learner, to put forth a *correct* translation in a foreign language, with the feeling that you've contributed to a community in the process, is a very good feeling, and an excellent supplement to book/class study (I do not mention spoken skills, since those are not yet very present at Tatoeba). To take this feeling away, and to say "no, you should not translate into languages that you are not completely proficient in"... kills a lot of the magic of this website. There should be no reason why a beginner-level student should not be allowed to translate beginner-level sentences.

I doubt you would disagree with me on this. The real problem that you mention, I think, is when people try to take on higher levels...

sacredceltic August 3, 2010 at 3:16:19 PM UTC

It is not about "flat-out forbidding translation into foreign languages", it is about granting the right to correct others, which was my initial proposal.
I agree with you that learners should not be discouraged. But learned ones should be able to correct them based on their higher skills, which is not the case now.
As there is no technology available to assess the true nationality or descent of anybody (hopefully!), and 99.99% of natives would never acknowledge their limits, the only path in my view is a mechanical, perfectly impartial approach to the problem.

sysko August 3, 2010 at 3:23:19 PM UTC

I agree with you, and we will still be able to judge whether the system works badly in some "extreme" case (I have none in mind, but my point is that even if it's automatic, human judgement will still be there to decide), so I think sacredceltic's idea is not bad.

sacredceltic August 3, 2010 at 3:37:21 PM UTC

But think: if a "superuser for language L" corrects a layman because he has been granted the right to do so as a result of this skills-determination algorithm, and he makes a mistake, you could always discipline him the same way if he is in turn corrected (maybe with a double sentence...), so his status of "superuser" would recede...
So in any case, it should be working...

blay_paul August 3, 2010 at 3:18:00 PM UTC

> it is about granting the right to correct
> others, which was my initial proposal.

That seems a little out of character.

xtofu80 August 3, 2010 at 2:27:17 PM UTC

Well, I used the word "discourage" to indicate that the issue is more complex than a flat-out prohibition of all contributions outside one's mother tongue, and I can partly agree with the notion of "generating foreign language sentences outside their comfort zone".

However, I fear that example sentences created by non-natives will often be slightly awkward. Personally, I even hesitate to contribute in English, often double-checking with Paul if I have a correction of an English sentence. I think it is a question of corpus quality. In my personal opinion, it is far better to use this website by translating sentences from the language you want to learn into your mother tongue. By doing so, you learn the natural sentence structure, collocations, phrases etc. of the foreign language. On the other hand, if you produce sentences in a foreign language, they will sometimes be flat-out wrong, sometimes be slightly wrong, sometimes sound weird etc. All of those not-quite-wrong but still awkward sentences will degrade the quality of the example sentence database. You are right that with this restriction we would lose the possibility of producing L2 output. But it would reduce the amount of poor-quality data on this website.

If people are looking for an opportunity to produce L2 content and be corrected by others, they can go to http://lang-8.com/, where you can write texts in your L2 and get corrected by others. Maybe it is only my personal opinion, but I would rather opt for a high-quality sentence database.

FeuDRenais August 3, 2010 at 2:53:58 PM UTC

You make a very good argument.

Perhaps a quota on sentences/day that increases as your "trust level" goes up in a particular language?

On a different note, a quota would also help deal with Tatoeba binging and the high sentence birth to sentence correction ratio.
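
To make the quota idea concrete, here is a minimal Python sketch. The numbers, the per-language integer "trust level" and the function names are assumptions for illustration only; how trust would be earned is left open.

```python
BASE_QUOTA = 10   # sentences/day at trust level 0 (placeholder value)
QUOTA_STEP = 20   # extra sentences/day per trust level (placeholder value)


def daily_quota(trust_level: int) -> int:
    """Quota grows linearly with trust; negative levels are treated as 0."""
    return BASE_QUOTA + QUOTA_STEP * max(trust_level, 0)


def may_add_sentence(trust_by_language, added_today, language):
    """True if the user is still under today's quota for this language.

    trust_by_language: {language code: trust level} for one user.
    added_today: {language code: sentences already added today}.
    """
    level = trust_by_language.get(language, 0)
    return added_today.get(language, 0) < daily_quota(level)
```

Because the quota is tracked per language, a user who is trusted in their mother tongue would still start at the base quota in a language they are learning.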

sacredceltic August 3, 2010 at 1:49:41 PM UTC

I agree, as a Japanese person was just teaching me my own native language, but the situation is more complex, as there are illiterate people among the natives as well and there are many people of mixed cultures. So I definitely think a tool is needed, and learned natives will become the judges for their own language as a natural consequence.

MUIRIEL August 3, 2010 at 1:10:58 PM UTC

ah, one more important point:
such a system would lessen people's willingness to correct sentences that are acceptable but not perfect.

blay_paul August 3, 2010 at 10:45:51 AM UTC

I agree that this has the potential to become a serious problem. However I think it can be alleviated significantly by some technical suggestions (some of which I've made before).

The comment system is no longer sufficient to keep up with requests for moderator action. The 'latest comments' disappear too fast, and it is too easy to miss ones that actually require moderator intervention - especially given the different languages involved.

I think there should be a specific way to mark comments as "action required", and all such comments should be included in a list until dealt with. The list should include sorting by date so that moderators will automatically know which ones have been waiting for two weeks or more without being dealt with.
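
As a rough illustration of the "action required" list, here is a Python sketch. The `Comment` fields and function names are hypothetical and not taken from Tatoeba's code; the two-week threshold is the one mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Comment:  # hypothetical record, not Tatoeba's real schema
    sentence_id: int
    text: str
    posted_at: datetime
    action_required: bool = False
    resolved: bool = False


def pending_actions(comments):
    """Unresolved 'action required' comments, oldest first."""
    flagged = (c for c in comments if c.action_required and not c.resolved)
    return sorted(flagged, key=lambda c: c.posted_at)


def overdue_actions(comments, now, limit=timedelta(weeks=2)):
    """Pending comments that have been waiting for `limit` or longer."""
    return [c for c in pending_actions(comments) if now - c.posted_at >= limit]
```

Unlike the 'latest comments' feed, a comment stays in this list until someone marks it resolved, whatever language it is written in.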