
Wall (5454 threads)

2019-06-12 14:48
The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.
2019-06-03 07:18 - 2019-06-03 07:21

A problem with placeholder names is that people start treating them as placeholders:

#7947998 Tom becomes Loša
#7860419 Tom/Mary becomes Mikita/Tania

The tag localized:Belarus has been invented for such 'placeholder replacement', but the problem is, the tag belongs to sentences, not to translations. If a sentence has many links, it won't be clear which one was un-localised.

As seveleu_dubrovnik puts it in his profile:

> I translate neutral names, social structures,
> grade names etc as neutral ones in the destination
> language's socio-linguistic landscape.
> E.g. "Police" → «міліцыя» [militsiya — this and further
> comments in square brackets are by Impersonator],
> "Tom / Mary" → «Юрась [Juraś], Лёха [Locha],
> Анатоль [Anatol] / Настасся [Nastaśsia], Света [Śvieta],
> Паліна [Palina] »,
> "Tom got an A" → be «Васіль атрымаў дзясятку» [Vasil got 10] /
> ru «Вася получил пятёрку» [Vasia got 5].
> I see such a localization as the only viable long-term way
> to enforce language equilibrium in a systematically polyglossic
> environment and as the only way to produce *natural* single
> language corpus.


This approach is not new. People have been turning Tom into Foma and Mary into Masha for quite some time.

They are etymologically related, but in practice Tom and Foma are a pair like Yury and George: no one would replace one with the other in real texts (except perhaps when discussing medieval or earlier history: Foma Akvinskiy = Thomas Aquinas; but then, most sentences about Tom are not about Thomas Aquinas and would sound quite strange if you tried to imagine them in such a context).

The examples of Tom becoming Foma, and Mary becoming Masha, are too numerous to list; just type Фома or Маша in the search.
2019-06-03 07:39
Are you suggesting a particular best practice, or avoiding a certain behaviour?
2019-06-03 08:01
I don't have a solution for it, sorry

Maybe we should acknowledge that different translation types exist, and find some way to mark them in Tatoeba? Maybe we need different link types ('direct translation', 'adapted') or link tags?
2019-06-03 09:17
We had a discussion about that with @seveleu_dubrovnik here:


And he came up with an idea to tag translations where he substitutes "placeholder names" with localized Belorussian names with "localized:Belarus", which seems like a sound idea to me.

An example:


In general, I'm pretty uncomfortable with translating Tom as Лёша or any other random but local name.

However, I do understand that in certain contexts this can be a valid or even desirable translation.
2019-06-05 19:58
> Maybe we should acknowledge that different translation types exist, and find some way to mark them in Tatoeba? Maybe we need different link types

I agree with this. Some users are in favor of using links for translations only, but in a linguistic sense, the definition of link is not restricted to translations. Synonymous sentences are semantically linked, too. Sentences with similar patterns are also logically linked. What could be linked and what couldn't is a bit relative and vague.

I think it would be more comfortable to have different types of links: one for translations, one for synonymous or closely related sentences in the same language, and one for sentences with similar patterns (or localized sentences like #7947998) in the same language. I'm not sure how all these categories could be shown at the same time without creating visual chaos, though. Some filtering/hiding options, or a tree view that can be collapsed and expanded, might be necessary.

Here is a rough image of the idea.
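
To make the idea a bit more concrete, here is a minimal Python sketch of typed links with a simple filter. It is only an illustration, not how Tatoeba stores links: the names `LinkType`, `Link` and `links_of_kind` are invented for this example, and the sentence IDs 1, 2 and 3 are made up.

```python
from dataclasses import dataclass
from enum import Enum


class LinkType(Enum):
    """Hypothetical link categories, following the three types proposed above."""
    TRANSLATION = "translation"   # sentence in another language
    SYNONYM = "synonym"           # same language, same or very close meaning
    PATTERN = "pattern"           # same language, similar pattern or localized variant


@dataclass(frozen=True)
class Link:
    source_id: int
    target_id: int
    kind: LinkType


def links_of_kind(links, kind):
    """Return only the links of the requested type, keeping their order."""
    return [link for link in links if link.kind is kind]


# Made-up sentence IDs, purely for illustration.
links = [
    Link(1, 2, LinkType.TRANSLATION),
    Link(1, 3, LinkType.PATTERN),
]

# A "show translations only" filter would then keep just the first link.
translations = links_of_kind(links, LinkType.TRANSLATION)
```

With something like this, the filtering/hiding options would amount to choosing which `LinkType` values to display.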
2019-06-05 11:26

I'm not sure some of the options in advanced search are working as intended. The one I have in mind is the "exclude sentences" one.

I wanted sentences in Swedish without translations in French (any type of link). I received the sentence 1433364, that is indirectly linked to the English 511978, itself (directly) linked to the French 1742035.

Is it a bug, or is it because that is a third-order link (a link of a link of a link)?

2019-06-05 18:35
It’s because it’s a third-order link.
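
For anyone who wants to check link orders themselves, here is a rough Python sketch that counts the hops between two sentences with a breadth-first search. It assumes links are available as plain pairs of sentence IDs, and it assumes the "indirect" Swedish-English link goes through exactly one intermediate sentence; that intermediate sentence is not named in the question, so `BRIDGE` is a made-up placeholder. The function name `link_distance` is invented for this example.

```python
from collections import deque


def link_distance(links, start, goal):
    """Return the number of link hops between two sentences, or None if unreachable."""
    # Links are treated as undirected: build an adjacency map in both directions.
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


BRIDGE = -1  # placeholder for the unnamed sentence between the Swedish and English ones

links = [
    (1433364, BRIDGE),   # Swedish sentence to the assumed intermediate sentence
    (BRIDGE, 511978),    # intermediate sentence to the English sentence
    (511978, 1742035),   # English sentence to the French sentence (direct link)
]

distance = link_distance(links, 1433364, 1742035)
```

Under those assumptions the French sentence is three hops away from the Swedish one, which matches the "third-order link" explanation.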
2019-06-03 10:37
** Stats & Graphs **

A new version of "Tatoeba Sentences & Translations Stats" is available. "Contributors" and "Natives" counters have been added.
2019-06-03 16:22
Thanks for that! :-)
2019-06-04 01:38
Thanks a lot!
2019-06-04 09:39 - 2019-06-04 09:40
It's interesting how many native speakers of constructed languages we have.

The discrepancies with the data on are due to the following.

1. If the number on my page is higher, it's because I've visited the profile pages of older members and added them to my list if they had that information there. If the information wasn't there, I wrote private messages to these members and included them if they responded, telling me what their native languages were.

2. If the number is lower, it's likely because I filter out all members claiming more than one native language. However, I do write to each of these members, asking them to tell me what their actual native language or their strongest language is, and add them to my list if they respond.

While, of course, there are those who have more than one native language, I'd rather err on the safe side.

According to last weekend's exported data, we have 820 members claiming to be native speakers of 2 or more languages. 80 of those claim 3 or more native languages, and 14 of those claim 4 or more, the highest being a member who claims 48 native languages.
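
For those curious how such counts can be derived, here is a minimal Python sketch of the filtering described above. It assumes the exported file is tab-separated with the columns language, skill level, username, details, and that skill level 5 means "native level"; please check both against the downloads page before relying on it. The function names and the sample members (alice, bob) are made up.

```python
import csv
import io
from collections import Counter

NATIVE_LEVEL = "5"  # assumption: 5 is the code for "native level" in the export


def native_languages_per_member(rows):
    """Count, for each username, how many languages are marked as native level."""
    counts = Counter()
    for row in rows:
        if len(row) < 3:
            continue  # skip malformed or empty lines
        _lang, skill, username = row[0], row[1], row[2]
        if skill == NATIVE_LEVEL:
            counts[username] += 1
    return counts


def members_claiming_at_least(counts, n):
    """Number of members who claim n or more native languages."""
    return sum(1 for c in counts.values() if c >= n)


# Made-up sample in the assumed format: lang, skill level, username, details.
sample = "eng\t5\talice\t\nfra\t5\talice\t\ndeu\t3\talice\t\nrus\t5\tbob\t\n"
rows = csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)

counts = native_languages_per_member(rows)
multi_native = members_claiming_at_least(counts, 2)
# Members kept after filtering out everyone with more than one native language.
single_native = sorted(user for user, c in counts.items() if c == 1)
```

To run it on the real export, the `io.StringIO(sample)` part would be replaced by the opened, extracted file.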

2019-06-04 10:10
> the highest being a member who claims 48 native languages

Google the Almighty is among us!
2019-06-04 11:27
It is important to note that the UI string in the Languages interface says "Native level" and not "Native".

Tatoeba does not, in fact, collect data on a user's native language; it only asks users whether they feel their language skill is at a "native level".
2019-06-04 10:02 - 2019-06-04 10:03
** Old Screen Shot **

This is a screen shot of part of the main page from 2014-11-12 at 12.41.12 UTC

2019-05-29 12:35
** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
2019-05-29 20:00
Thanks for the data. Have you ever considered including numbers of native speakers for each language in the stats? Watching their development might be interesting. The file "user_languages.tar.bz2" on the downloads page has the necessary information for that, I believe.
2019-05-30 10:57
Technically it's possible, and it would be relatively easy to add numbers of native speakers to the stats. My only caveat is that the data in the "user_languages.tar.bz2" file doesn't reflect the real situation, since many members have never specified their native languages. There are also members who specified their native languages incorrectly. All in all, I like the idea and will implement it eventually. Thanks for your advice. :-)
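
As a rough illustration of how little code this needs, here is a Python sketch that counts distinct native-level members per language. It assumes each record from the export has already been parsed into a (language, skill level, username) tuple and that skill level 5 stands for "native level"; both are assumptions about the file, and the function name and sample members are made up.

```python
from collections import defaultdict

NATIVE_LEVEL = "5"  # assumption: 5 is the code for "native level" in the export


def native_speakers_per_language(rows):
    """Map each language code to the number of distinct native-level members."""
    members = defaultdict(set)
    for lang, skill, username in rows:
        if skill == NATIVE_LEVEL:
            members[lang].add(username)  # a set ignores duplicate records
    return {lang: len(names) for lang, names in members.items()}


# Made-up records: (language, skill level, username).
rows = [
    ("eng", "5", "alice"),
    ("eng", "5", "bob"),
    ("eng", "3", "carol"),   # not native level, so not counted
    ("fra", "5", "alice"),
    ("eng", "5", "bob"),     # duplicate record, counted once
]

native_counts = native_speakers_per_language(rows)
```

The counts inherit the caveats above: members who never filled in their languages are missing, and incorrect claims are counted as given.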
2019-05-30 12:27 - 2019-05-30 12:37
Thank you. I'm aware that native speaker counts might not be so reliable, but they would at least give a rough picture.

The last stats show that there's a sudden boost in Turkmen this month. Nearly 50 self-declared native Turkmen accounts have been created recently.

Unfortunately, I suspect that most, if not all of those accounts might belong to the same person, possibly a non-native speaker. I explained my suspicion here:

The real problem here isn't using multiple accounts or contributing in non-native languages, but falsely claiming to be a native speaker. Turkmen sentences here shouldn't be trusted as a reliable source until this issue is cleared up.

Having native speaker counts in the stats would help us notice such unusual changes, too.
2019-05-30 16:40 - 2019-05-30 16:45
> Unfortunately, I suspect that most, if not all of those accounts might belong to the same person

If your suspicions are based only on the fact that those accounts were created at about the same time, this might be a wrong assumption.

Tatoeba might have been advertised in some Turkmen community, hence the boost in user registrations. This happens from time to time to many languages, and is especially noticeable when the language hasn't had many contributors before.

The use of the letter ı, since it doesn't exist in Turkmen, is suspicious though, indeed.

Having read this:

and this:

I was left with the impression that the Common Turkic Alphabet can be used in some cases to write Turkmen, and it does have the letter ı, but, well, I'm not a specialist.
2019-05-30 20:02
I sent you a PM.
2019-06-03 09:10
Ok, I've reviewed all the evidence and it does seem very probable that all those accounts belong to the same person.

However, before anyone does anything about that, we still need a trusted native speaker to verify how good those translations are. I'm afraid it's virtually impossible to act without such verification.
2019-05-30 12:29
2019-05-30 17:25
What the heck are these data?
2019-06-02 19:27
Hello everyone!
I have a question for you. What does sorting sentences by Relevance do? Is this a new thing? I don't think I've seen it before.
2019-06-03 05:18 - 2019-06-22 05:34
2019-06-03 06:25
Thanks for the explanation!
I knew about the =" " one, but I don't remember seeing Relevance before. Maybe I just didn't notice it. But then I more or less figured out on my own what it's for.
2019-06-02 09:49
** Native Speakers with Native Language Sentences **

If you prefer to read sentences by native speakers when you're studying a language, this page will help you.

I updated this page based on the June 1, 2019 exported data.
2019-06-02 13:45
When you sent me the link of that page a few days ago, Turkmen was not included on the list. But now it is. You should exclude Turkmen. Those self-declared native Turkmen accounts are fake. There has probably never been a native Turkmen speaker on Tatoeba, unfortunately.
2019-05-29 02:13
** Unowned Sentences That May Be Adopted **
2019-06-01 21:51
2019-06-01 20:56
The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.