** New feature: unapproved sentences **
We are soon going to release a new feature, and I would like to take some time to talk about it. First of all, here's what this feature will do:
* Corpus maintainers will be able to mark a sentence as "unapproved".
* Admins will be able to change the "level" of a contributor. By default contributors have a level of 0, but admins can set this level to -1 so that any new sentence or translation from these contributors is marked as "unapproved".
* Unapproved sentences will still be in the database and will still be indexed whenever we run the indexing, but will be displayed in red on the website.
* Unapproved sentences will however NOT be exported into the CSV file that we distribute.
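To make the rules above concrete, here is a minimal sketch in Python. It is not Tatoeba's actual code: the names `Contributor`, `Sentence`, `add_sentence`, `display_color` and `export_csv_rows` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    username: str
    level: int = 0  # default level; an admin may set it to -1


@dataclass
class Sentence:
    text: str
    owner: Contributor
    unapproved: bool = False  # a corpus maintainer may also set this by hand


def add_sentence(text, owner):
    """New sentences from a level -1 contributor start out unapproved."""
    return Sentence(text=text, owner=owner, unapproved=(owner.level == -1))


def display_color(sentence):
    """Unapproved sentences are shown in red on the website."""
    return "red" if sentence.unapproved else "default"


def export_csv_rows(sentences):
    """Unapproved sentences stay in the database but are left out of the export."""
    return [s.text for s in sentences if not s.unapproved]


# Hypothetical users and sentences, for illustration only.
alice = Contributor("alice")
bob = Contributor("bob", level=-1)
corpus = [add_sentence("Hello.", alice), add_sentence("Helo.", bob)]
exported = export_csv_rows(corpus)
```

In this sketch both sentences remain in `corpus` (the "database"), but only the one from the level 0 contributor ends up in `exported`.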
The goal of this feature is to deal with 2 issues:
1. Bad quality sentences. We want Tatoeba to become more useful for language learners. The problem is that since everyone can contribute sentences and translations, some contributions are not reliable enough for language learning, but maybe not bad enough that it's clear they should be deleted.
2. Non CC-BY sentences. It often happens that new contributors copy-paste sentences from other language learning sources. This is a problem because Tatoeba redistributes the sentences under the CC-BY license and the content needs to be CC-BY compliant.
Setting those sentences as "unapproved" allows us to warn users that there is an issue with the sentence and that they should use it with extra care. This feature will also allow admins to act more quickly when a contributor is polluting the corpus: admins can lower the contributor's level so that all their subsequent contributions are marked in red. Contributors will notice this themselves, since their contributions appear in red as soon as they are saved.
This feature can obviously be tuned a lot more. Ideally we should treat bad quality sentences differently from non CC-BY sentences. Ideally we should set a level for each user for each language, instead of a single level per user. Ideally we should also have approved sentences, and possibly different levels of approved and unapproved sentences. We just don't have the time and resources to implement these things right now, but they are part of the next steps.
I think it would be quite useful if every user could set their level of mastery for each language they know, and if that weight were taken into account when they contribute a sentence. If admins mark too many of a user's sentences in some language as bad, that would lower the user's weight for that language. So I am proposing that the unit of control be the user's language rather than the whole user, and that the weight be a real number (floating point, not an integer). This approach would be much more complicated; it's just an idea.
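As a rough illustration of this idea, here is a small Python sketch. Everything in it is an assumption made for the example: the class name `LanguageWeights`, the fixed penalty of 0.25 per sentence marked as bad, and the default weight of 0.0 for a language the user has not rated.

```python
from collections import defaultdict


class LanguageWeights:
    """Per-user, per-language floating point weights (hypothetical design)."""

    def __init__(self, penalty=0.25):
        self.penalty = penalty  # assumed amount subtracted per bad sentence
        self.weights = defaultdict(dict)  # username -> {language code: weight}

    def set_self_assessment(self, user, lang, weight):
        """The user declares their own mastery of a language."""
        self.weights[user][lang] = float(weight)

    def record_bad_sentence(self, user, lang):
        """An admin marked one of the user's sentences in this language as bad."""
        current = self.weights[user].get(lang, 0.0)
        self.weights[user][lang] = current - self.penalty

    def weight(self, user, lang):
        """Current weight; 0.0 if the user never rated this language."""
        return self.weights[user].get(lang, 0.0)


# Hypothetical usage: one bad German sentence lowers only the German weight.
w = LanguageWeights()
w.set_self_assessment("some_user", "deu", 1.0)
w.set_self_assessment("some_user", "eng", 2.0)
w.record_bad_sentence("some_user", "deu")
```

The point of keying the weight by both user and language is that a mistake in one language does not affect the user's standing in another.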
A sentence can be perfect in itself and still be a very bad translation. What will be unapproved: the sentence or the translation?
That is not the same thing at all...
If we only exclude bad sentences, we do not solve much, because they are often corrected quickly: they are just the tip of the iceberg.
But translation errors persist for years...
The solution is to lock a sentence once it is perfect, so that it can no longer be modified, and then to approve its translations (only with another locked sentence).
Unowned sentences should be automatically set to -1 until they are adopted by a user with a higher level.
So only new contributions of a downgraded member will appear as "unapproved"?
As I see it, downgrading a member is pretty much the same as banning a member. That is to say: you are a bad contributor, we don't want your work here. So I guess (and hope) setting someone to -1 is an ultimate measure limited to severe cases where there are already a lot of 'dubious' contributions. In that case it might be better to automatically set all their previous contributions to "unapproved" as well, saving the CMs the time of marking or deleting them manually.
I think it may be useful to have a way to mark your own contribution as unapproved when you’re not sure about your own sentence.
That has already been done for years by tagging sentences with @needs native check.
Yes, but newbies can't tag...
Newbies can at least leave comments on any sentence. As for me, I prefer to see the project easy to understand and easy to use even for newbies. Any innovation could make it more complex.
In the tag system, the big problem is new tags, and maybe too little use of existing tags.
I think permission to use tags should be separate from permission to create new tags.
The problem is that many people overestimate their own abilities. I think it would be good if such a tag could be applied automatically.
Can the level be set differently for different languages? Like, level 0 in English, but level -1 in German?
From what Trang wrote, it appears that won't be part of the first version, but hopefully it'll be implemented at a later time.