Wall (7129 threads)
Tips
Before asking a question, make sure to read the FAQ.
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

Hello, Tatoeba was updated today. What’s new?
- Content report: there is a new "flag" button on the top-right of Wall posts and sentence comments to ease the report of inappropriate content to admins. This is a counter-measure to the increased spam Tatoeba is seeing recently, and also a feature some people have been asking for in the past.
- Two new languages have been added: Svan and Ao Naga. Cheers to abiniz and monsen_sanang_ai, respectively, for requesting them! Tatoeba now supports 429 languages.
Learn more about Svan: https://en.wikipedia.org/wiki/Svan_language
Learn more about Ao Naga: https://en.wikipedia.org/wiki/Ao_language
- A total of 50 language icons (flags) should now look nicer. The icons have been updated to a vector-based image instead of a raster image, which means they won’t look pixelated or blurry but always sharp, even when zooming in.

Thanks for the update, Gillux.
If I might use this opportunity to make a suggestion, I think it would be good to implement something that will make it more difficult for spam accounts to register themselves in the first place (captcha, email links etc.?) and delete tens of thousands of existing ones (there are quite possibly over a hundred thousand by now). Nine out of ten newly registered accounts are spammers. Of course, most of them remain silent and passive with ads and links in their profiles. This cannot fail to be a problem.

I agree. I have been watching the spam accounts come in literally by the minute.
I’ve been reporting spam comments as I see them, but it’s troubling that they can still have a bunch of links in their profile without making any posts. I’d like to help out with that if there’s any way I can. Probably I am too new of a user to have admin privileges, but if it would be helpful, I’m happy to throw together a script that will make a list of likely spam accounts based on common indicators (long link lists in the bio, certain keywords, lack of languages or sentences, etc). Then some admin can run through the list and delete the ones they see fit.
EDIT: just realized that user bios aren’t included in the downloadable data dumps, so maybe that won’t work.
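For what it's worth, the kind of script described above could look roughly like this. It is only a sketch: the `Account` fields, the keyword list and the thresholds are all made up for illustration, since the format of any bio export is unknown. The output is meant as a list for an admin to review by hand, not for automatic deletion.

```python
import re
from dataclasses import dataclass


# Hypothetical record layout; a real export may use different fields.
@dataclass
class Account:
    username: str
    bio: str
    sentence_count: int
    language_count: int


URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
# Placeholder keywords, for illustration only.
SPAM_KEYWORDS = ("casino", "betting", "loan")


def spam_score(account, max_links=3):
    """Count how many of the four indicators apply to one account."""
    bio = account.bio.lower()
    score = 0
    if len(URL_PATTERN.findall(bio)) > max_links:  # long link list in the bio
        score += 1
    if any(keyword in bio for keyword in SPAM_KEYWORDS):  # certain keywords
        score += 1
    if account.sentence_count == 0:  # no sentences contributed
        score += 1
    if account.language_count == 0:  # no languages listed
        score += 1
    return score


def likely_spam(accounts, threshold=3):
    """Return usernames worth a manual look, highest score first."""
    flagged = [(spam_score(a), a.username) for a in accounts]
    flagged = [(s, u) for s, u in flagged if s >= threshold]
    flagged.sort(key=lambda pair: (-pair[0], pair[1]))
    return [username for _, username in flagged]
```

An empty profile with no sentences only scores 2 here, so it stays below the default threshold; that is deliberate, since plenty of legitimate newcomers look like that too.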

frpzzd, thank you for your help. I have tried to figure out a common feature of spam accounts, but there is nothing really standing out, so I also think we’d need some human verification to confirm account purge.
However, I think it is too early to start deleting spam accounts. It would be better to find a way to stop the influx first, and only then start to clean up.
As you realized, we do not export user bios. If you are willing to help, I can provide you with a database export of user bios and other information.

Certainly, carrying out any kind of account purge without double-checking each account would be a bad idea. Maybe the job can be streamlined with some analysis of the spam accounts though.
If you think it is too early to begin removing the spam accounts, I guess there's no rush. But in any case, I'd definitely love to take a look at a data dump of user bios if you're willing to share one. Maybe there are some useful features that can be extracted.

Waldelfe, I understand your concern. I am aware of the constant influx of spammy accounts, and be sure that I am willing to stop it. But this is not an easy task.
About adding a confirmation email step, I think it is unlikely to help. According to my own research, these accounts are created by humans. They are already going through as many open-registration websites as possible, to create user pages containing a lot of links (see, for example, user xx88art56). Tatoeba is likely the only website in 2025 not requiring email confirmation. If they can do it on other websites which require email confirmation, they will keep doing it on Tatoeba regardless of email confirmation.
About adding a captcha, I don’t think the benefits of the captcha (fewer spam accounts that are barely noticeable) outweigh the drawbacks (fewer legitimate users are able to register). Besides, captchas are unlikely to stop the spammers because they are likely real humans.
You are welcome to join the ongoing discussion about prevention of spam accounts on GitHub: https://github.com/Tatoeba/tatoeba2/issues/1613

Why would fewer legitimate users be able to register if a captcha was added? And thank you for the update!

You are welcome, araneo.
Captchas notoriously lower accessibility: https://en.wikipedia.org/wiki/C...#Accessibility
Captchas can be challenging to solve, even for people who are not visually impaired. If you have poor mental or physical health, simply browsing a website may require a lot of energy; having to solve captchas on top of that does not help.
In general, I don’t think anybody is happy to be prompted to solve a captcha.
For these reasons, I’d rather not add more captchas on the web unless there is a really good reason to do so.

Thank you for the explanation and for taking that into consideration!

in my experience captchas are easily solved by modern language models, perhaps even more so than by humans https://www.youtube.com/watch?v=satnl1KTEXM
and even before that there are services employing humans from third-world countries to solve captchas for pennies
one approach i personally like is to hide submissions from new users until at least one sentence/post/whatever from them is approved by a moderator, but that might take effort to implement here since i don't think there's even a way to hide sentences yet

Interesting idea. Alternatively, for an approach that avoids making more work for the mods, we could consider hiding profiles (or just hiding profile bios) of users who have not written any sentences yet by default.
Many spammers seem to not write any sentences at all, and just fill up their bios with links. The spam accounts that write sentences are at least a little more likely to get noticed, flagged and deleted.
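As a rough illustration of that rule (not Tatoeba's actual code; the function names and the `min_sentences` parameter are hypothetical), the check could be as small as this:

```python
def bio_is_public(sentence_count, min_sentences=1):
    """A bio is shown only once the user has contributed enough sentences."""
    return sentence_count >= min_sentences


def render_bio(bio, sentence_count):
    """Return the bio text to display, or an empty string while it is hidden."""
    return bio if bio_is_public(sentence_count) else ""
```

Nothing is deleted under this approach: the bio is just not displayed until the account has at least one sentence, so a link-only profile has nothing to show.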

> - Content report: there is a new "flag" button on the top-right of Wall posts and sentence comments to ease the report of inappropriate content to admins.
Thanks! That's very useful!
The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

Hi Guys! This is Alexander from Brazil! I am very happy to contribute to the development of this site. And I am very happy that the participants here can contribute to the project.

Welcome, Alexander!

Hello, thank you for all your contributions to this project!

Hi, just a quick comment. I wish speakers of minority languages would contribute more in them. Many of these languages have fewer than 10 phrases. Let's try to build more of a digital presence for these linguistic groups.

Which languages are you talking about, if you have specific ones in mind?
Anyways, I agree. For a few weeks, I have also been mulling over how to attract more speakers of Panjabi, Gujarati, Telugu and other major languages of India to this site. They are some of the world languages that are the most underrepresented on Tatoeba (comparing the number of speakers in the world with the number of sentences in the corpus). From what I understand, multilingualism is also extremely common in India so these contributors would probably be very beneficial for the corpus.
For the moment, I don't have any good ideas for accomplishing this. Maybe someone else will chime in with thoughts.

Hello, thanks for your reply. Mostly I mean African and Indigenous languages. If you go to the all languages section, you will find many that have fewer than 10 phrases, such as Urhobo, Aymara, Haida and Cuyonon, to mention a few.
By the way, I also have Indian languages in mind. In the past I studied some Kannada, a Dravidian language; it has an amazing alphabet, quite artistic.
Now going back to the main topic, I think we might capture some people via social media. It's the idea I have now. On Facebook you may find lots of groups that promote polyglottery and defend minority languages.

✹✹ Stats & Graphs ✹✹
Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/