**Renaming of the "collection" feature**
There have been suggestions to rename the "collection" feature to "ratings".
After giving it some more thought, though, we are not convinced that "ratings" would be the most appropriate name.
I personally think "proofreading" would better fit the direction in which we would like to take this feature.
For context, when it was named "collection", there was another idea behind this feature: that each user could build their own personal corpus of sentences and proofread sentences along the way. You could technically choose to have a sentence in your collection that has an error, but you still add it because the sentence itself has some value to you. It just has to be corrected. However, time has shown that there is no strong desire or need for users to build personal corpora, and the feature is used much more as a tool for quality control.
Do you agree that "proofreading" would be a suitable new name for this feature? Or do you have a better idea?
>Do you agree that "proofreading" would be a suitable new name for this feature? Or do you have a better idea?
Yes, I agree completely.
If you have some time, please go to our dev website and test anything you can :)
1) The "Translate" feature has been implemented in the new sentence design. To test it, you will need to enable the option "Display sentences with the new design..." in your settings.
More details: https://github.com/Tatoeba/tatoeba2/pull/2058
2) We will need testing on any feature that involves a creation date, for instance adding a comment or creating a list, just to make sure nothing is broken.
More details: https://github.com/Tatoeba/tatoeba2/pull/2052
If you find any issues, you can report them by replying to this thread.
I ran some tests to check both features and they work for me. The one involving creation dates also works with the Arabic UI (the one that triggered the original issue).
I also have a suggestion about the "Translate" feature: currently, it's only possible to edit the new translation and to open the related sentence's page. My suggestion is to also add the button that copies the translation to the clipboard (when activated in the user's settings), since this would make it much easier (at least for me) to add multiple translations to a single sentence from a mobile device.
Thanks for testing, Guybrush.
The copy feature will come later. It's not a default feature (it has to be activated in the "beta" options in the settings), so it will be one of the last things to be migrated to the new sentence design.
We're continuing to migrate towards the new sentence design. This time with the edit and delete features.
More details: https://github.com/Tatoeba/tatoeba2/pull/2077
If you have a moment, please test it on the dev website: https://dev.tatoeba.org/
Check this out, Trang: https://giant.gfycat.com/Spotle...tedHackee.webm
Every time I click save, nothing happens.
I didn't run into any issues. I was just wondering why the new design is only displayed for search results (not on the homepage, not on the sentence page).
We will need another round of testing before Sunday.
Please test the new sentence component in general (expanding/collapsing, going to the sentence's page, playing audio, translating, editing).
Additionally, please test the duplicate detection. These bugs should be fixed on the dev website:
Just one thing. When clicking on the "Fewer translations" link, the page doesn't go back up. So suppose you have 50 sentences displayed, you check the "254 more translations" of the first one, then you click the "fewer" link and you are at the bottom of the page. Not a big deal, but not ideal.
By the way, is Horus working on the dev site?
And if Horus is supposed to be working, how long should we expect to have to wait after adding a duplicate sentence before it gets merged?
Horus is not running on the dev website, but we don't need to check whether Horus is merging duplicates properly because there have been no changes in that regard.
We do, however, need to check whether duplicates are still properly detected upon submission, because there has been some refactoring of the code to fix the issues I mentioned above.
I was wondering if there are tags for grammatical constructions too. I would like to tag some of my sentences like this: zero/first conditional and so on. Is it possible?
I believe that tagging them would help language learners better understand the sentences in terms of grammar.
I haven't checked all the tags available, as there are too many of them, which is why I'm asking.
It's certainly possible to add and use tags for grammatical constructions, but give some serious thought to your last sentence ("I haven't checked all the tags available, as there are too many of them") when you think about how they are likely to be used (or not) by other people.
Thanks, Alan, for the answer and for checking.
It was a typo; I missed the second "o".
Is there any place where Tatoeba's useful endpoints are summarized? In particular, the endpoint to submit a new sentence (but also the ones for editing and commenting)? I remember TRANG explaining the POST details to someone on the Wall before, but I can't find it.
I know that we shouldn't encourage script contributions but I also know that it is not forbidden if one is reasonable, let's say POSTing every 5 seconds or so. Sometimes I feel that processing locally and sending everything via a script at the end would be much more profitable.
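The "POST every 5 seconds or so" idea can be sketched as a small throttling helper. This is only an illustration: `send_throttled` and its `send` argument are hypothetical names, and `send` stands in for whatever actually performs the request, since the real endpoint and its parameters are not given in this thread.

```python
import time
from typing import Callable, Iterable


def send_throttled(
    sentences: Iterable[str],
    send: Callable[[str], None],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() once per sentence, pausing between consecutive calls.

    `send` is a placeholder for the code that would perform the POST;
    the actual Tatoeba endpoint is deliberately not assumed here.
    Returns the number of sentences handed to send().
    """
    count = 0
    for sentence in sentences:
        if count > 0:
            # Wait before every call except the first one.
            sleep(delay_seconds)
        send(sentence)
        count += 1
    return count
```

Passing `sleep` as a parameter makes it possible to try the helper without actually waiting, for example by collecting the pauses in a list.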
I think you mean this thread: https://tatoeba.org/eng/wall/show_message/30691
I found it by downloading the wall.tar.bz2 and grepping for POST.
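For anyone who would rather do the same search from Python than with grep, here is a minimal sketch. It assumes a local copy of a `.tar.bz2` archive; the path you pass in and the layout of the files inside the archive are assumptions, and `grep_archive` is a hypothetical helper name.

```python
import tarfile
from typing import Iterator, Tuple


def grep_archive(archive_path: str, needle: str) -> Iterator[Tuple[str, str]]:
    """Yield (member name, line) for every line containing `needle`.

    Reads each regular file inside a bzip2-compressed tar archive
    without extracting anything to disk.
    """
    with tarfile.open(archive_path, "r:bz2") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, etc.
            handle = archive.extractfile(member)
            if handle is None:
                continue
            for raw_line in handle:
                # Decode leniently so odd bytes do not abort the search.
                line = raw_line.decode("utf-8", errors="replace")
                if needle in line:
                    yield member.name, line.rstrip("\n")
```

Something like `for name, line in grep_archive("wall.tar.bz2", "POST"): print(name, line)` would then list the matching lines.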
Ha ha, so nice of you :)
But there is no such file in the downloads section, if I'm not mistaken :P
I wonder if the Tatoeba Project needs someone or several people to maintain a history of milestones.
I started this page a while ago when I needed to know some of this information. I compiled most of it from TRANG's blog posts.
Perhaps some of this page could be used to start a wiki page that could be added to from time to time to document the progress of the project.
Maybe TRANG herself would want to do this, or already has a record that could be made more visible. Or, maybe one or two members could work on this together.
I think I'd be interested in seeing this done, but I don't really want to spend the time doing this myself.
**Introducing Tatoeba playground**
I was thinking, for a while, of sharing all the scripts I use to play with Tatoeba data. My goal was to provide clear tools out of the box, customizable, and that do not require any programming knowledge. And it's finally done, available online. All my thanks to Alan and Ricardo for their help and precious feedback.
So what is the playground? It is a collection of notebooks providing fully customizable functions to do pretty much anything you want. For now, the following are available out of the box, and more should come later if people are interested:
- All sentences. Filtered by language. All sentences containing a word.
- Sentences owned by a user. Sentences owned by anyone other than specific users.
- Corpus analysis: word counts (beta version; incorrect for most languages)
- Sentences that do not have final punctuation. Sentences that do not start with a capital letter
- Audio contributions (sentences with / without audio)
- Languages of a user. List of speakers of a language. List of natives of a language. List of natives of X speaking Y
It does not require any installation, as it is available online, here: https://mybinder.org/v2/gh/agro...yground/master
When you click on that link, the technology provided by Binder (https://mybinder.org) will build the playground environment for you. It may take a while, but when it's done, you can access the playground in a closed and safe environment.
For more information on how all that works, you can check the README available on this GitHub repository: https://github.com/agrodet/Tatoeba-playground
Some functions are useful for corpus maintenance, some are there just for fun. I'll let you judge :)
If you have any questions, feel free to ask here or directly on the GitHub repository: https://github.com/agrodet/Tatoeba-playground. Note that it has nothing to do with the Tatoeba development repository.
PS: I'm aware it might be difficult to use for non-English speakers.
Nice work! Requiring users to change values directly in the code is a daring choice of interface, but I think you explained everything well enough.
When filtering for sentences without final punctuation, your examples are all single characters, but I'd like to point out that it's also possible to define longer sequences. For example, quotation marks at the end of a sentence should be preceded by the punctuation of the quoted sentence: '."', '!"', '?"'.
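To illustrate the point about longer sequences, here is a minimal sketch of such a filter. It is not the playground's own code; `FINAL_PUNCTUATION`, `lacks_final_punctuation` and `filter_unpunctuated` are names made up for this example. It relies on `str.endswith` accepting a tuple of endings of any length.

```python
from typing import Iterable, List, Tuple

# Accepted sentence endings; multi-character sequences are allowed,
# e.g. a closing quotation mark preceded by the quoted punctuation.
FINAL_PUNCTUATION: Tuple[str, ...] = (".", "!", "?", '."', '!"', '?"')


def lacks_final_punctuation(
    sentence: str, endings: Iterable[str] = FINAL_PUNCTUATION
) -> bool:
    """Return True if the sentence ends with none of the given endings."""
    return not sentence.rstrip().endswith(tuple(endings))


def filter_unpunctuated(
    sentences: Iterable[str], endings: Iterable[str] = FINAL_PUNCTUATION
) -> List[str]:
    """Keep only the sentences that lack final punctuation."""
    endings = tuple(endings)
    return [s for s in sentences if lacks_final_punctuation(s, endings)]
```

With these endings, `He said "Stop!"` counts as properly punctuated, while `He said "stop"` is reported as missing final punctuation.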
Tatoeba has been very slow the last few days. Also, I've been getting the "Tatoeba is currently unavailable" message every now and then.
Is it only me?
I'm facing this problem too.
I see it, too.
I have the same problem.
I have the same issue.
I have also had this problem for the last two days.
Me too, I get the same message these days.
I have the same issue too.
Similar here. In addition (and maybe it can provide a clue), when I try to access Tatoeba from Mexico (thanks to a VPN), the DNS doesn't resolve... no access at all. Could there be an issue with the DNS server at the web hosting site?
Tatoeba is facing a DDoS attack. If you’re curious what a DDoS attack is, have a look at https://en.wikipedia.org/wiki/D...service_attack
We are working on it, but there is little we can do about it without blocking legitimate users. We’ll keep you updated.
Thank you very much for informing us. I hope that everything is sorted out soon.
Isn't the DDoS attack a kind of cybercrime?
I fail to understand to whose benefit something like that is, except for satisfying someone's weak mind.
Perhaps someone else was the target and Tatoeba just got caught because they had the wrong IP address... like getting shot in the wrong neighborhood.
Yeah, that's possible. Other reasons off the top of my head could be the following:
* They're targeting the hosting provider, not Tatoeba per se, so they're attacking some of their web sites
* Someone is holding a grudge against Tatoeba.
* Before attacking the real target, it might make sense for the attackers to use some random guinea pigs to test their scripts.
The attack looks finished now. So here is a follow-up.
It was an application-level attack that sent a lot of forged HTTP requests. The contents of the requests appeared to be randomly built from a set of real data: a combination of URLs, referers and cookies that looked valid individually but did not make any sense put together. After analyzing the traffic, I set up some countermeasures and the attack was contained, so you probably didn’t notice that it lasted for about two weeks. It peaked at 30-40 requests per second (or 1500 packets per second), which is rather moderate for a DDoS, but apparently enough to partially disrupt our service. Interestingly, dev.tatoeba.org and wiki.tatoeba.org were also targeted, but at an unnoticeable rate.
Note that it looks like the countermeasures partially blocked a few legitimate users connecting from China. Sorry about that! It should be okay now.
The meaning of this attack is still unclear. It happened at a time when most western people are on vacation. It was very limited in scale and not so hard to dodge. Also, the zombies were all located in China (which doesn’t necessarily mean the attacker is from China nor operating from China). It is worth noting that Tatoeba is not blocked by the GFW according to greatfire.org. Which makes me think that maybe…
* the attacker wants us to block all users from China
* the attacker wants us to think that the attack is related to China whereas it has nothing to do with it
* the attacker has limited resources (not competent, not rich, not state-sponsored)
* we are part of a bigger attack in which we are not the main target
* the attacker is testing our ability to respond to a DDoS attack
* the attacker is just holding a grudge against us
Thanks for keeping us in the loop, gillux, and thanks for having dealt with all that.
Amég mü eze hazajárulást ítunk kábé egy vík alatt, addig a felzárkózó folk öteze szentenszt, A mink előtt járó hándrídhuszat. Lehet kalikulálni, vát fog pászíren. (Diekt ítam esztet lefóditatatlanul.)
My dear Turkish friends (or people who speak Turkish),
There's an inactive account that you can adopt, correcting its sentences if needed.
Thank you for your interest.
We are aware of the issue. I collected the sentences by that account in a proofreading list. A considerable amount of them need improving.
With an average of 100 sentences per day, it will take 12 years to finish proofreading. Slowly but surely, I hope. :-)
This is my personal list, and I'm proofreading from recent to oldest, but there's also another collaborative, hidden list with reverse order. Other Turkish ACs and CMs are dealing with it.
I'm very happy to know this. :)
What about adding the Mingrelian language? There are a lot of sentences in Mingrelian with just the "?" flag. Its ISO code is XMF. I guess we can add it to Tatoeba.