** Tatoeba Day #5 **
Tatoeba Day starts now and, as usual, will end in 24 hours.
If you don't know what Tatoeba Day is about, you can find information about it here:
# New feature
The main theme of today's event is the new feature to mark sentences as "OK", "not OK" or "unsure". You can test this feature on the dev website (https://dev.tatoeba.org) and you can grab some popcorn and read the discussion about it here:
There are still a lot of things we could implement around this feature but in my eyes, it is functional enough to be released on Tatoeba next Monday. My question is: do you think so too?
If you think so, then simply reply to this post, and write "+1 for the new feature". This means you feel the feature is ready enough, and can already be useful even though it's not perfect.
If you don't think so, then let me know what is missing, or what could be improved before the feature gets released.
Depending on what everyone has to say, the release date of this feature may be delayed.
Even if there is a new feature in progress, you still get to make requests, like in the previous Tatoeba Days. The rules are the same:
1. Each participant is limited to 2 requests.
2. For the request to be implemented during the event, it must be an issue that is already logged in our issue tracker on GitHub (https://github.com/Tatoeba/tatoeba2/issues).
3. Exception: if the request is related to the new feature, it doesn't need to be logged in GitHub.
4. You will know if your request is going to be implemented if it is listed here: https://github.com/Tatoeba/tato...nes/2015-08-10
5. You will know if your request is testable on the dev website when it is "closed".
If you have any questions, feel free to ask, and happy Tatoeba Day!
I'm copying Trang's message from GitHub so that everybody can see it.
1. Replaced the wording "corpus" with "collection". Suggested by tommy_san.
2. Instead of "correct" / "incorrect" / "not sure", we're going to use "OK" / "not OK" / "unsure". Since we already have the tag "OK" that is part of Tatoeba's vocabulary, we might as well reuse it.
3. Fixed a bug that prevented users from marking a new sentence. Reported by Silja.
"My collection" of sentences that are either OK, not OK or that I'm unsure of doesn't make much sense to me. Wouldn't it be better to write "My collections"? Or maybe we could use "My ratings" or "My marks".
Besides, this page https://dev.tatoeba.org/collections/of/TRANG (list of sentences you think are OK, not OK or you're unsure of) also looks almost meaningless to me. How about displaying one of the three icons next to each sentence, or dividing the list into three, each of them with a caption?
> Wouldn't it be better to write "My collections"? Or maybe we could use "My ratings"
> or "My marks".
I think it depends on how you look at it.
I see it as one big basket in which you put all kinds of sentences. So to me "My collection" makes more sense.
"My collections" means that you see it as several baskets and you put each sentence in a different basket based on how you mark it.
"My ratings" may define more accurately the feature at the time being, but it doesn't represent the more long term plans I have for it.
> How about displaying one of the three icons next to each sentence or divide the list
> into three
On this matter, I'm personally hesitating between:
1) To not display all the sentences, and display by default the sentences marked "OK".
2) To display all the sentences by default, with an icon next to each one, as you suggested.
> "My ratings" may define more accurately the feature at the time being, but it doesn't represent the more long term plans I have for it.
What are those plans?
I want to shift away from the concept that Tatoeba is one single corpus and everybody is contributing to this one big corpus.
My idea of the future Tatoeba is a platform where each user has their own little Tatoeba (their collection, their corpus, whatever you want to call it). And Tatoeba is the union of all these collections.
The difference is that when you are building your collection, you do it according to your rules. Tatoeba gives you the tools to categorize your content, but doesn't force you to follow its own definition of what is valid, useful, acceptable, etc.
The challenge will be to give more intelligence to Tatoeba. To make it learn how to understand what each user wants, and how to pick the most relevant sentences from each collection to address each user's needs.
The reason "ratings" doesn't fit my definition is that when you think about "ratings", it feels more like looking at what others do, and just giving an opinion.
I prefer to use the wording "My collection" because it represents this idea that you're building something. You add sentences to your collection (either by creating them yourself, or by picking from existing sentences).
Then Tatoeba gives you the tools to mark/tag the sentences in your collection. Right now it's limited to only "OK"/"not OK". But the point is that in the long term, we would have many more indicators ("this is something I would say", "this is useful for beginners", "this is funny", "this is offensive", etc).
I see your point. I still think it would be handy if we had a term for each set of sentences that share some characteristics. Note that you yourself sometimes use this word as if one member has multiple collections ("your '0' collection" https://tatoeba.org/wall/show_m...essage_23869).
I'd like to have separate terms for each of the following sets of sentences.
(a) A set of sentences with common features collected by a user.
"Tommy's collection (?) of non-OK sentences", "... of good spoken Japanese sentences"
(b) A set of sentences in all the (a)s that belong to a user.
"Tommy's library (?)"
(c) A set of all the sentences on the platform.
"the Tatoeba Corpus (?)"
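To make the three levels (a), (b) and (c) concrete, here is a minimal Python sketch. The user names, labels, sentence IDs and the dictionary layout are all invented for the example; none of this reflects how Tatoeba actually stores sentences.

```python
# Hypothetical data: user -> {label -> set of sentence IDs}.
# Each inner set is one (a): sentences with a common feature, collected by a user.
labelled_sets = {
    "tommy": {
        "not OK": {101, 102},
        "good spoken Japanese": {102, 205},
    },
    "trang": {
        "OK": {205, 300},
    },
}


def user_library(user):
    """(b) Every sentence that appears in any of one user's (a) sets."""
    return set().union(*labelled_sets.get(user, {}).values())


def whole_corpus():
    """(c) Every sentence held by any user on the platform."""
    return set().union(*(user_library(u) for u in labelled_sets))
```

With this layout, (b) is just the union of a user's (a) sets, and (c) is the union of all the (b)s, so a sentence that sits in several sets is still counted once.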
> There are still a lot of things we could implement around this feature but in my eyes, it is functional enough to be released on Tatoeba next Monday. My question is: do you think so too?
I think we need more time to let users try the feature and discuss it.
[not needed anymore- removed by CK]
I wouldn't go with more than 3 options. It's definitely less accurate, but it would be way too overwhelming.
> Also, what are you going to do about all ratings when a sentence has been edited?
> I assume, you'll delete them all.
I've mentioned this in GitHub:
--> "Things to do next"
1. When a sentence is modified, all the reviews need to be marked as deprecated.
* Users need to have a page where they can view all their deprecated reviews.
* On the sentence's page, we need to display the deprecated reviews in grey, or maybe even hide them.
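The "deprecated reviews" idea above could look roughly like the following Python sketch. The class names, field names and the helper function are assumptions made up for illustration; they are not taken from the Tatoeba code base.

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    user: str
    rating: str               # "OK", "not OK" or "unsure"
    deprecated: bool = False  # set once the sentence text changes


@dataclass
class Sentence:
    text: str
    reviews: list = field(default_factory=list)

    def edit(self, new_text):
        """Change the text and flag all existing reviews as deprecated."""
        if new_text == self.text:
            return  # nothing changed, so the reviews stay valid
        self.text = new_text
        for review in self.reviews:
            review.deprecated = True


def deprecated_reviews_of(user, sentences):
    """Data for a hypothetical 'my deprecated reviews' page."""
    return [(s, r) for s in sentences for r in s.reviews
            if r.user == user and r.deprecated]
```

Keeping the deprecated reviews instead of deleting them leaves both display choices open: the sentence page can show them in grey or filter them out.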
Could you specify how you'd use the "unsure" and "unrated" options? What kinds of sentences fall into each of them?
Perhaps you could just name some of your lists that correspond to them.
I have added an option to activate/deactivate the feature.
By default it is deactivated and users can activate it from their settings.
The link "My collection" in the user's menu will be visible no matter what. This way users who don't know about the feature can find out about it.
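As a rough sketch of that behaviour in Python (the setting name and the UI element names are hypothetical, chosen only for this example):

```python
from dataclasses import dataclass


@dataclass
class UserSettings:
    # Hypothetical setting name. The feature is opt-in, so it starts off.
    collections_enabled: bool = False


def visible_ui_elements(settings):
    """Return which (made-up) UI elements a user would see."""
    elements = ["my_collection_link"]  # shown whether or not the feature is on
    if settings.collections_enabled:
        elements.append("rating_buttons")
    return elements
```

A user who never touches their settings would see only the link, which is what lets them discover the feature in the first place.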
Some very personal feedback:
I noticed I'm more inclined to mark a sentence "OK" with the new feature. This is because of the "unsure" option.
Previously, I picked up the sentences that I found either good or bad, and then put the rest of the sentences automatically into my #3 list. The main purpose of this list was to avoid seeing the sentences again later when I browse through some sentences I haven't proofread. I didn't have to worry about anything because almost no one paid attention to my lists.
I think most users have done more or less the same thing: add the OK tag to especially good sentences, add a comment or a tag to especially bad sentences, and just ignore all the others.
Now that all my ratings are displayed publicly, I find myself thinking much differently. I hesitate a lot before marking "unsure". I ask myself "Am I really unsure about this sentence? What would I answer if someone asked me what I am unsure about? Wouldn't the author feel bad to find out I'm 'unsure' about his/her sentence?" and sometimes mark it OK even if I'm still not sure whether it's really good.
This happens only when I'm really not sure about a sentence. It doesn't happen when I'm sure that the quality of a sentence is doubtful, and that it's not bad enough to be marked "not OK".
I wonder if other members feel the same way. It's all right if they don't. I'll just try to get used to it then. I'm just a bit concerned because I have the opinion that people should be rather picky when they evaluate sentences. I wouldn't like it if the new system prompted people to mark as positive those sentences that they wouldn't have tagged OK.
Would it then be better if only the "OK" and "not OK" reviews were displayed on the sentence's page? And the "unsure" is only visible to you.
There are cases where we do want to publicly express that we are unsure about certain sentences, so I think I'd prefer keeping all three options on the sentence's page and adding the fourth option that we discussed yesterday (https://tatoeba.org/wall/show_m...message_23866), which is only visible to ourselves. This is the same as CK's Option 6 (https://tatoeba.org/wall/show_m...essage_23879).
This extra option assumes that we have a page that shows all unrated sentences, which we don't have yet.
I'm keeping this in mind for the next evolutions of this feature though.
+1 for the new feature
I'm not sure yet. In the "old system", a corpus maintainer checks after 2 weeks whether a correction has been made and changes "@change" to "OK". What if the owner corrects it well, or what if the owner makes it even worse by introducing a new mistake?
I see on the dev site that the reviews don't change after a correction. Can the reviewer be notified somehow that a correction has been made?
> I see on the dev site that the reviews don't change after a correction. Can the reviewer be
> notified somehow that a correction has been made?
Yes, it's planned. Cf. https://github.com/Tatoeba/tatoeba2/pull/738, "Things to do next".
+1 for the new feature!
Tatoeba Day is now over.
You can find information about what Monday's update will include at the link below:
The new feature will be part of the update, but deactivated by default.
For those who have time on Sunday, please do some more tests on the dev website and report any bugs that you find.
The next Tatoeba Day is scheduled for August 29.
Thank you for your participation.