Note that Tatoeba aims to maintain a healthy atmosphere for civilized discussion. Please also read the following rules against bad behavior.
* Tatoeba Most Translated Sentences Charts *
Tatoeba "250 Most Translated Sentences" and "Most Translated Sentences In Each Language" charts have been updated:
**Welcoming back gillux**
I'm very happy to announce that starting from Monday next week and till the end of April, gillux is back as official staff of Tatoeba! He will then be back again from the beginning of July till the end of September.
During this time, I will be delegating to him some of the duties I usually take care of, such as merging pull requests and deploying new changes to the main website. This will allow me to focus more on completing the responsive UI. For the rest, he has some plans of his own. I will let him give you the details in due time :)
Welcome back, gillux!
Nice to see him back and looking forward to the next updates ～～
Welcome back, gillux! :D
Yeah, thank you Trang for letting me work again for Tatoeba. I’m very happy to be back. :-)
Last year, I worked at making it easier for new developers to join the project and start hacking, facilitating the setup of local development environments. While in the past, almost only Trang and I worked on the code, we have seen new people like Andreas, Danail, Yorwba, Aiji and others contributing more or less regularly. This really gives us hope that future developments on Tatoeba will be more smooth and diverse. Please join the fun if you’d like to help, anybody is welcome, including beginners.
Of course, there are still a zillion things to do. In the past, I’d organize my work mainly by going through our Github issues, but this time I would like to work differently. I believe that while Github issues are relevant, they are too specific and they prevent us from seeing the forest for the trees. I have the feeling that a lot of our problems are rooted in some more general/core issues I’d like to tackle instead. So my plan is to first perform some (ideally a lot of) UX tests to better define and identify these issues. I’m better at programming than UX, but hey, we are all learning, aren’t we?
Anyone is welcome to help in UX testing. You can send us your test reports to firstname.lastname@example.org or publish them on https://en.wiki.tatoeba.org/articles/show/ux-tests if the tested person is okay with that.
I’d also like to work on the following things, which may prove more or less connected to the problems UX testing will identify.
1. Tags are an incomprehensible mess I’d like to improve: https://tatoeba.org/fra/wall/sh...#message_33780
2. New members rarely stick around. While looking at the profiles, you can see a lot of people who joined the project at some point, contributed a few dozen sentences, and then nothing. It is normal that people leave at some point in any collaborative project, but if more people leave than join, it indicates something’s wrong.
3. Tatoeba should do more to help members connect with one another, see this thread: https://tatoeba.org/fra/wall/sh...#message_32097
4. The community is too western-centred. As an open multilingual project, I think we should set an example for diversity. However, if you compare this list https://en.wikipedia.org/wiki/L...er_of_speakers with this one https://tatoeba.org/fra/stats/users_languages, you’ll see that languages like French and German are overrepresented, while languages like Mandarin, Hindi, Arabic or Bengali are underrepresented.
5. English takes too much space. I don’t know how to put it another way. On the Wall, despite the "all languages are equal" message, at the end of the day members almost always use English. It’s good that we have English as a bridge language, but it has drawbacks. I think it deters people who are not confident with English from becoming a part of the community. Also, overusing English prevents us from thinking in different ways, having new approaches. So I think Tatoeba should provide better tools and spaces to allow anyone to truly and fearlessly express in any language.
welcome back :)
Sometimes this appears: a-,
I opened an issue on GitHub: https://github.com/Tatoeba/tatoeba2/issues/2039
If you have any additional information to share, such as your browser version and how often this happens, feel free to add it there.
Sorry for answering late, but can you give us more information when this happens (e.g. OS version, browser version)?
Today I performed a UX test on an old Tatoeba contributor. With his permission, I am publishing the contents of the tests on the wiki https://en.wiki.tatoeba.org/art...show/ux-test-2
I’d like to encourage anyone to perform similar tests, be it on new or regular users. It helps identify UX problems and understand people’s needs. You just need to watch people using Tatoeba to do something. They have to voice their thoughts so that you can understand what’s happening in their head.
I personally use a screencasting software + voice recorder or take notes while testing, and then write everything down as soon as I can while my memory is still fresh.
"S. clicks on tag List 907. "What the heck is that?"
Sentences with tags List 907
S. is confused. "It’s a tag on a giant corpus. Is it Tom’s corpus? No. I don’t get it."
S. clicks on the back button."
This part that is confusing will be eliminated when this github issue is resolved.
See the last comment.
I've already removed over 200,000 of the tagged sentences the slow way.
It's not a UX test because I was guiding the person, so I have only one piece of feedback. The person was really wondering whether she could add just any sentences: silly ones, over-excited ones, etc. I had to confirm that any correct sentence is welcome, not only basic ones, and push her to contribute the sentences she wanted.
Regarding new members
What I see is that new members (or rather, let's call them contributors too, since this is mostly about those who have already added sentences) are not at all aware of the basic rules.
They register and start translating and writing new sentences without knowing the rules.
And sometimes, when they come to a name, say Tom, they translate it into Chinese, for example, and then also add the original in parentheses -> "汤姆(Tom)"; or, where two separate sentences could be written because of gender, -> "Muitos(as) influenciadores(as)"; or they even mark it like this: "him/her/them".
The examples I just quoted came from newcomers, or from people who did not get much further than 100 sentences.
Why is that? Because they registered and nobody told them about the rules, or if somebody did, they only got a link, which they did not look at or did not understand, and then when someone dared to say 'You should rather split the sentence in two,' they most likely took offense.
Day after day Tatoeba loses potential new contributors because newcomers are handed nothing when they register.
I myself only received a message from CK the day after I registered, mainly because I had marked English as one of my languages. The first sentence of the message after the greeting was: "Here are some good ways to find sentences to translate."
Of course, the links to the rules were there at the bottom of the message, but they seemed only secondary within it, and then a person either visits the page or does not. Also, the section on the rules is available only in English, French, German, Russian and Esperanto, and the more detailed version in even fewer languages.
It should be translated into a few more languages, because what if someone speaks none of these? (Google Translate is no full guarantee.)
What would be good:
If new members automatically received a message about the rules when they register. And even better than a link: if, for example, they mark a language at native level, they would receive the rules in that language.
I've used a translation tool to read Pandaa's message, but if I understood correctly, I fully agree with most of what is said, especially the last section. We discussed it several times, but it may be time to increase the priority of reworking the homepage and the support for new users. I believe we already have some wiki articles ready for that. With the effort of some of us, we could translate them into many languages. My opinion is that translating only the most useful pages would be quite enough. So, two or three pages really...
I think it is rather that new members do not find what they were hoping for, and that is why they stop their activity within a few days.
Out of the few thousand members, fewer than 100 are active.
It is good to remember that hardly anyone reads instructions. The user interface should be built so that reading instructions is not necessary, because hardly anyone does so anyway. If you force someone to read instructions, they will quit without adding a single sentence.
Also, it is worth remembering that on essentially all websites a small minority produces most of the content. This applies to discussion forums, Stackexchange-style question-and-answer sites, open-source projects, science and so on. So it is worth setting one's expectations accordingly.
I would recommend investing more in the user interface and less in help pages.
If people do not read rules, then let only the most important ones be stated, the ones that actually come up.
That is exactly why I spoke up: newcomers do not know what is wrong with writing "he/she".
This is what the other contributors end up correcting, one sentence after another.
I have created an issue on GitHub for improving the experience for newly registered users: https://github.com/Tatoeba/tatoeba2/issues/2112
I think having a private message sent to every new user is a good idea. They can read it any time, it's not too intrusive so it's a good way to give them some pointers without overwhelming them.
I agree with Thanuir that most users will probably not read long instructions. So we should still try our best to improve the UI so that users can intuitively guess the rules. We cannot expect everyone to read several pages of text before they start contributing. But it is still very useful to have good documentation and we should make sure the user knows that documentation exists and where they can access this documentation.
I think it's a good occasion to mention that Aiji has been working on a draft of a new Quick Start guide:
This article is a good candidate to replace our current Quick Start guide in my opinion.
Sorry, maybe I misunderstood the point, or I was misunderstood.
"If new members automatically received a message about the rules when they register."
I did not say it explicitly, so let's clarify: I was talking about the internal mail section here.
Of course it could be arranged for something to be sent to the registered e-mail address as well, but it is better if the guide is at hand here on the site, and besides, something else entirely might still come out of this.
« Happy Friday!!! »
Friday is not a reason to relax. Once you relax, fate will throw an unpleasant gift to you.
> fate will throw an unpleasant gift to you
Bad fate! Go stand in the corner, we ain't want no stupid gifts from you.
* Tatoeba As A Graph *
Tatoeba internals represented as undirected graphs.
That's quite a heavy web!
By the way, are "circo", "fdp", "sfdp" and "neato" the names of the tools used to build the graphs?
** Selecting Sentences Using English Verbs **
This page has a list of verbs, sorted by their frequency of use in spoken English.
The search links have all forms of the verbs, so even irregular verbs such as swim-swam-swum get search results for all forms.
I can't seem to search anything. There is an error. Please help!
Thanks for letting us know, satokeigo. The search feature is back.
**Renaming of the "collection" feature**
There have been suggestions to rename the "collection" feature to "ratings".
After giving it some more thought though, we are not convinced that "ratings" would be the most appropriate name.
I personally think "proofreading" would fit better with the direction in which we would like this feature to go.
For context, when it was named "collection", there was another idea behind this feature: that each user could build their own personal corpus of sentences and they would proofread sentences along the way. You could technically choose to have a sentence in your collection that has an error, but you still add it because the sentence itself has some value to you. It just has to be corrected. However, time has shown that there is no strong desire or need for users to build personal corpora and the feature is used much more as a tool for quality control.
Do you agree that "proofreading" would be a suitable new name for this feature? Or do you have a better idea?
>Do you agree that "proofreading" would be a suitable new name for this feature? Or do you have a better idea?
Yes, I agree completely.
I translated it as "Reviews" into German for the German user interface because I found "collections" misleading (making it hard for people to understand what it was being used for).
I like "ratings" best and "reviews" second best.
Proofreading is about not only indicating the quality of text, but marking it up to indicate how it should be improved. That's not what this feature provides.
When I see a sentence that I think should be improved, I generally leave a comment containing the current text and my suggested replacement. That's what a proofreader does. I generally only use the checkmark/question mark/exclamation mark feature in two situations:
(1) The process of suggesting an improvement has come to a dead end, such as when the sentence sounds unnatural but it's too hard to fix, or when the owner responds aggressively to comments. In such a case, I use a question mark or exclamation mark to indicate that I consider the sentence problematic.
(2) A non-native speaker writes a sentence and asks for a native check. I use a checkmark to indicate that I think the sentence is fine.
I think either "rating" or "review" does the best job of describing this action.
Actually, I translated it as "ratings" (Bewertungen). 🙂
My concern with "rating" is that when I think of rating, I think of the system that is widely used to rate apps, products, restaurants, etc.
I would like this "collection" feature to make it out of the beta phase one day and my hope is that people don't associate it with something similar to rating products and services, and don't, for instance, rate sentences just to express support (or dislike) for the sentence owner.
When I gave it a second thought, I felt the only healthy way to use this feature is for proofreading. It is true that the feature, as it is now, only offers a way to mark sentences. But the feature is not complete and indicating how the sentence can be improved could be part of its evolution. For instance, when you click "not OK", you could have a form to add your suggested improvement and wouldn't have to go to the sentence's page for that.
But even as it is, the feature can be used for proofreading already. You can mark a sentence as "not OK" as a way to have a summary of the sentences where you posted a comment to suggest a correction. You could technically go to your list of comments for that, but it wouldn't be easy to see which ones have indeed been fixed. Whereas if you marked it as "not OK", you can see it from the "Outdated ratings".
Technically, you could download the users_sentences.csv provided on the Downloads page in order to create a list of sentences to proofread next. You would be able to know which sentences have already been proofread (and by who), so you can exclude from your list sentences that have been proofread by people you trust.
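To illustrate the idea above, here is a rough Python sketch of how such a list could be built. It is only a sketch under assumptions: I am assuming users_sentences.csv is tab-separated with the username in the first column, the sentence ID in the second and the rating (-1, 0 or 1) in the third. The function names, the `trusted` set and the sample rows are made up for illustration.

```python
import csv
import io

def proofread_by_trusted(rows, trusted):
    """Return the IDs of sentences rated OK (1) by at least one trusted member."""
    done = set()
    for row in rows:
        if len(row) < 3:
            continue  # skip empty or malformed lines
        username, sentence_id, rating = row[0], row[1], row[2]
        if username in trusted and rating == "1":
            done.add(int(sentence_id))
    return done

def to_proofread(candidate_ids, rows, trusted):
    """Keep only the candidates that no trusted member has marked OK yet."""
    done = proofread_by_trusted(rows, trusted)
    return [sid for sid in candidate_ids if sid not in done]

# Invented sample in the ASSUMED layout: username, sentence ID, rating, dates.
SAMPLE = (
    "alice\t101\t1\t2019-01-01\t2019-01-01\n"
    "bob\t102\t1\t2019-01-02\t2019-01-02\n"
    "alice\t103\t0\t2019-01-03\t2019-01-03\n"
)
rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="\t"))
remaining = to_proofread([101, 102, 103], rows, trusted={"alice"})
print(remaining)
```

With the real export you would open the downloaded file with `open("users_sentences.csv", newline="", encoding="utf-8")` instead of using the inline sample, and pick the candidate IDs from whatever language or list you want to work on.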
As for the "unsure" mark, it can be used for sentences where you have doubts whether it is correct or not and you are not able to find an answer, or need more time to find the answer. After all, we don't know everything about our own language.
First, we can do our best to ensure that the "healthy way" is used. Write a wiki article, add it to the functionality page, etc.
Second, whatever the name, if people want to twist the function to "express support or swift attacks" to another user, they will do it anyway. There's not much we can do to prevent it. Up to us to avoid giving credit to the action. For example, avoid adding rating bias in the search, do not consider sentences with more ratings better, etc.
> whatever the name, if people want to twist the function to "express support
> or swift attacks" to another user, they will do it anyway
Yes, that's clear, but that's not exactly what I'm concerned about. It's one thing to have very stubborn people who will care for their personal needs above everything else and will twist features to fit these needs no matter what, and it's another thing to have cooperative people who just misinterpreted the feature because the name was ambiguous to them.
If you have a feature that is called "Like" and a feature that is called "Bookmark", your choice will be different. It will make little sense for you to "Like" an article that makes you angry, but it makes sense that you "Bookmark" it.
Both features can provide the exact same service: just a page where you can see all the articles you liked/bookmarked, but the name influences how you use the feature.
At the end of the day, we can obviously go with any name and write documentation or add some info/tips about the feature on the website itself to clarify what it means. But we can avoid a lot of headache by choosing a good name.
My post above was meant to share my personal interpretation of "ratings" and the possible drawbacks of going for this name. But I do not know if people generally understand "ratings" the same way I do.
Renaming "collections" to "ratings" or "reviews" would still be a good improvement because it is pretty clear that no one uses this feature to build a collection; it is only used to rate/review sentences.
Also, hardly anyone reads instructions, and they do not directly affect the behavior of large groups of people.
Something like "correct" or "correctness" might discourage rating sentences one disagrees with.
Some other words, not so great I think, just for the sake of brainstorming.
For now, my preference goes to reviews and proofreading.
Some more brainstorming ideas.
proofreading (TRANG's idea)
My preference would be "rating", since I might say "I rated it OK" or "I gave it an OK rating."
I wouldn't say "I reviewed it OK", "I proofread it OK" or "I gave it an OK proofreading."
The word "review" could also be used for leaving comments on apps and products, so this word would have the same problem that TRANG mentioned above for "rating."
Personally, I only use the "OK" rating to let members know that the sentence is OK and ignore the other two ratings.
The "not OK" rating is something I don't use, since if something isn't OK, it might become so after leaving a comment.
There is a possibility that some members would not rate a grammatically-correct and natural-sounding sentence OK if the sentence was not factually correct or if they felt that the sentence was vulgar, pornographic, archaic, old-fashioned or otherwise not appropriate for their own use. I'm not sure this is really a problem since that's what professional proofreaders do.
Not rating something OK is fine.
Rating something not-OK because one disagrees with the sentiment in the sentence would slightly undermine the quality of the database. However, it is also something that can not really be avoided; what one can do is make it unintuitive and so only used by people committed to waging an ideological war.
I believe you can have any reasons you wish to ignore certain sentences which are perfectly fine.
To add to CK's list, other common reasons could be: sentences from certain users you ignore for any reason; sentences with names or words you dislike. That doesn't sound as "professional" as the reasons listed by CK, but I see no issues with that.
I like "reviews".
"Reviews" would have the advantage that this name is already used as the title of the section on the sentence's page. So at least that string won't need to be changed.
I think what we are doing is rating sentences on a 2-value scale, OK or not OK, with the 3rd option being "unsure" which is like "undecided" on an opinion poll.
* assign a standard or value to (something) according to a particular scale.
* assess (something) formally with the intention of instituting change if necessary.
* write a critical appraisal of (a book, play, film, etc.) for publication in a newspaper or magazine.
* a classification or ranking of someone or something based on a comparative assessment of their quality, standard, or performance.
* Members That Have Added Over 1,000 Ratings *
It might be nice to hear the opinions of some of the members who use this function a lot.
-1 means "not OK"
0 means "unsure"
1 means "OK"
CK : total : 752404
CK : 1 : 752404
PaulP : total : 179512
PaulP : -1 : 246
PaulP : 0 : 2549
PaulP : 1 : 176717
Guybrush88 : total : 27322
Guybrush88 : -1 : 1
Guybrush88 : 0 : 3
Guybrush88 : 1 : 27318
bill : total : 24215
bill : 0 : 1
bill : 1 : 24214
Pfirsichbaeumchen : total : 5265
Pfirsichbaeumchen : -1 : 41
Pfirsichbaeumchen : 0 : 231
Pfirsichbaeumchen : 1 : 4993
Selena777 : total : 4415
Selena777 : -1 : 11
Selena777 : 0 : 49
Selena777 : 1 : 4355
deyta : total : 3792
deyta : 1 : 3792
jegaevi : total : 3540
jegaevi : -1 : 4
jegaevi : 0 : 5
jegaevi : 1 : 3531
Gulo_Luscus : total : 3479
Gulo_Luscus : -1 : 2
Gulo_Luscus : 0 : 1
Gulo_Luscus : 1 : 3476
tornado : total : 3111
tornado : -1 : 219
tornado : 0 : 482
tornado : 1 : 2410
Thanuir : total : 3019
Thanuir : -1 : 141
Thanuir : 0 : 66
Thanuir : 1 : 2812
alexmarcelo : total : 2754
alexmarcelo : -1 : 25
alexmarcelo : 0 : 103
alexmarcelo : 1 : 2626
soliloquist : total : 2607
soliloquist : 1 : 2607
tulin : total : 2520
tulin : 1 : 2520
Wezel : total : 2003
Wezel : -1 : 40
Wezel : 0 : 450
Wezel : 1 : 1513
raggione : total : 1715
raggione : -1 : 8
raggione : 0 : 661
raggione : 1 : 1046
odexed : total : 1620
odexed : -1 : 22
odexed : 0 : 220
odexed : 1 : 1378
driini : total : 1542
driini : -1 : 3
driini : 0 : 7
driini : 1 : 1532
shekitten : total : 1508
shekitten : -1 : 14
shekitten : 0 : 38
shekitten : 1 : 1456
Raizin : total : 1467
Raizin : -1 : 6
Raizin : 0 : 46
Raizin : 1 : 1415
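For anyone curious how a tally like the one above could be produced, here is a small Python sketch. It is not the script that generated the list. It assumes you already have (username, rating) pairs, for example taken from the ratings export, and the function names and sample data are invented for illustration.

```python
from collections import Counter, defaultdict

def tally_ratings(pairs):
    """Count ratings per member, broken down by value (-1, 0 or 1)."""
    per_member = defaultdict(Counter)
    for username, rating in pairs:
        per_member[username][rating] += 1
    return per_member

def format_tally(per_member, minimum=1000):
    """List members with more than `minimum` ratings, largest total first."""
    ranked = sorted(
        per_member.items(),
        key=lambda item: sum(item[1].values()),
        reverse=True,
    )
    lines = []
    for username, counts in ranked:
        total = sum(counts.values())
        if total <= minimum:
            continue
        lines.append(f"{username} : total : {total}")
        for value in (-1, 0, 1):
            if counts[value]:  # leave out values the member never used
                lines.append(f"{username} : {value} : {counts[value]}")
    return lines

# Invented sample data; the threshold is lowered so the sample prints something.
sample = [("alice", 1), ("alice", 1), ("alice", -1), ("bob", 1)]
report = format_tally(tally_ratings(sample), minimum=0)
print("\n".join(report))
```

Rating values a member never used are left out, which matches how the list above shows only the "1" line for members who have only given OK ratings.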
* Some Stats *
In last Saturday's exported data, ...
1,039,935 sentences had ratings
2,306 of these had "not OK" ratings.
6,457 of these had "unsure".
3,401 of these sentences had ratings by more than one member.
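Figures of this kind could be computed along the following lines. This is only a Python sketch of one possible way of counting, not necessarily how the numbers above were obtained: it takes (username, sentence ID, rating) triples, counts a sentence as "not OK" or "unsure" if at least one member gave it that rating, and all names and sample data are invented.

```python
from collections import defaultdict

def rating_stats(triples):
    """Summarize ratings per sentence from (username, sentence_id, rating) triples."""
    raters = defaultdict(set)  # sentence ID -> members who rated it
    values = defaultdict(set)  # sentence ID -> rating values it received
    for username, sentence_id, rating in triples:
        raters[sentence_id].add(username)
        values[sentence_id].add(rating)
    return {
        "rated": len(raters),
        "not_ok": sum(1 for v in values.values() if -1 in v),
        "unsure": sum(1 for v in values.values() if 0 in v),
        "multiple_raters": sum(1 for r in raters.values() if len(r) > 1),
    }

# Invented sample: sentence 1 is rated by two members, one of them "not OK".
sample = [("alice", 1, 1), ("bob", 1, -1), ("alice", 2, 0), ("bob", 3, 1)]
stats = rating_stats(sample)
print(stats)
```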
I don't really have a strong opinion on renaming this feature, but I'd like to suggest that it be explained in a prominent place that these ratings only pertain to a particular sentence and not its translations.