
Wall (5415 threads)

sharptoothed
2019-06-10 14:58
** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/
maaster
2019-06-10 17:39
Great. Thanks.
Guybrush88
2019-06-11 13:18
thanks
Boto
2019-06-12 14:54
The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.
Boto
2019-06-12 14:48
The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.
Impersonator
2019-06-03 07:18 - 2019-06-03 07:21
ON PLACEHOLDER NAMES, once again


A problem with placeholder names is that people start treating them as placeholders:

#7947998 Tom becomes Loša
#7860419 Tom/Mary becomes Mikita/Tania

The tag localized:Belarus has been invented for such 'placeholder replacement', but the problem is, the tag belongs to sentences, not to translations. If a sentence has many links, it won't be clear which one was un-localised.

As seveleu_dubrovnik puts it in his profile:

> I translate neutral names, social structures,
> grade names etc as neutral ones in the destination
> language's socio-linguistic landscape.
> E.g. "Police" → «міліцыя» [militsiya — this and further
> comments in square brackets are by Impersonator],
> "Tom / Mary" → «Юрась [Juraś], Лёха [Locha],
> Анатоль [Anatol] / Настасся [Nastaśsia], Света [Śvieta],
> Паліна [Palina] »,
> "Tom got an A" → be «Васіль атрымаў дзясятку» [Vasil got 10] /
> ru «Вася получил пятёрку» [Vasia got 5].
> I see such a localization as the only viable long-term way
> to enforce language equilibrium in a systematically polyglossic
> environment and as the only way to produce *natural* single
> language corpus.

_______

This approach is not new. People have been turning Tom into Foma and Mary into Masha for quite some time.

They are etymologically related, but in practice Tom and Foma are a pair like Yury and George: no one would replace one with the other in real texts (except perhaps when discussing medieval or earlier history: Foma Akvinskiy = Thomas Aquinas; but then, most sentences about Tom are not about Thomas Aquinas and would sound quite strange if you tried to imagine them in such a context).

The examples of Tom becoming Foma, and Mary becoming Masha, are too numerous to list; just type Фома or Маша in the search:
https://tatoeba.org/rus/sentenc...rom=und&to=und
https://tatoeba.org/rus/sentenc...rom=und&to=und
Thanuir
2019-06-03 07:39
Are you suggesting a particular best practice, or avoiding a certain behaviour?
Impersonator
2019-06-03 08:01
I don't have a solution for it, sorry

Maybe we should acknowledge that different translation types exist, and find some way to mark them in Tatoeba? Maybe we need different link types ('direct translation', 'adapted') or link tags?
deniko
2019-06-03 09:17
We had a discussion about that with @seveleu_dubrovnik here:

#7857041

And he came up with the idea of tagging translations where he substitutes "placeholder names" with localized Belarusian names as "localized:Belarus", which seems like a sound idea to me.

An example:

#7947998

In general, I'm pretty uncomfortable with translating Tom as Лёша or any other random but local name.

However, I do understand that in certain contexts this can be a valid or even desirable translation.
soliloquist
2019-06-05 19:58
> Maybe we should acknowledge that different translation types exist, and find some way to mark them in Tatoeba? Maybe we need different link types

I agree with this. Some users are in favor of using links for translations only, but in a linguistic sense, the definition of link is not restricted to translations. Synonymous sentences are semantically linked, too. Sentences with similar patterns are also logically linked. What could be linked and what couldn't is a bit relative and vague.

I think it would be more comfortable to have different types of links: one for translations, one for synonymous or closely related sentences in the same language, and one for sentences with similar patterns (or localized sentences like #7947998 ) in the same language. I'm not sure how all these categories could be shown at the same time without creating visual chaos, though. Maybe some filtering/hiding options or a tree view with a collapsing/expanding feature might be necessary.

Here is a rough image of the idea.

https://prnt.sc/ny3d99
Rockaround
2019-06-05 11:26
Hello,

I'm not sure some of the options in advanced search are working as intended. The one I have in mind is the "exclude sentences" one.

I wanted sentences in Swedish without translations in French (any type of link). I received the sentence 1433364, that is indirectly linked to the English 511978, itself (directly) linked to the French 1742035.

Is it a bug, or is it because that's a third-order link (link of a link of a link)?

Thanks!
gillux
2019-06-05 18:35
It’s because it’s a third-order link.
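
To make the link order concrete, here is a minimal sketch that counts how many links separate two sentences. It is not Tatoeba's search code. The intermediate sentence "X" is a hypothetical placeholder (its ID is not given above), and the cutoff of two links for the "exclude" filter is an assumption inferred from the answer above.

```python
from collections import deque


def link_distance(links, start, target):
    """Return the number of links on the shortest path, or None if unreachable."""
    graph = {}
    for a, b in links:
        # Links are treated as undirected.
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


links = [
    (1433364, "X"),     # Swedish sentence to a hypothetical intermediate sentence
    ("X", 511978),      # intermediate sentence to the English sentence
    (511978, 1742035),  # English sentence to the French sentence
]

MAX_ORDER = 2  # assumed: the filter only looks at direct and second-order links
distance = link_distance(links, 1433364, 1742035)
excluded = distance is not None and distance <= MAX_ORDER
```

Under these assumptions the French sentence is three links away, so the Swedish sentence is not excluded.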
sharptoothed
2019-06-03 10:37
** Stats & Graphs **

New version of "Tatoeba Sentences & Translations Stats" is available. "Contributors" and "Natives" counters have been added.
https://tatoeba.j-langtools.com/transtop/
soliloquist
2019-06-03 16:22
Thanks for that! :-)
Ricardo14
2019-06-04 01:38
Thanks a lot!
CK
2019-06-04 09:39 - 2019-06-04 09:40
It's interesting how many native speakers of constructed languages we have.

The discrepancies with the data on http://bit.ly/nativespeakers are due to the following.

1. If the number on my page is higher, it's because I've visited the profile pages of older members and added them to my list if they had that information there. If the information wasn't there, I wrote private messages to these members and included them if they responded, telling me what their native languages were.

2. If the number is lower, it's likely because I filter out all members claiming more than one native language. However, I do write each of these members asking them to tell me what their actual native language is or what their strongest language is and add them into my list if they respond.

While, of course, there are those who have more than one native language, I'd rather err on the safe side.

According to last weekend's exported data, we have 820 members claiming to be native speakers of 2 or more languages; 80 of those claim 3 or more native languages, and 14 of those claim 4 or more native languages, the highest being a member who claims 48 native languages.
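
The filtering described above could be sketched roughly as follows. This is an illustration, not the script actually used: the rows and usernames are made up, and the (language, level, username) layout and the level "5" standing for "native level" are assumptions about the exported data.

```python
from collections import Counter

# Hypothetical rows: (language code, skill level, username).
rows = [
    ("eng", "5", "alice"),
    ("fra", "5", "alice"),
    ("tur", "5", "bora"),
    ("eng", "3", "bora"),
    ("eng", "5", "carol"),
]

NATIVE_LEVEL = "5"  # assumed encoding of "native level"


def native_claims(rows):
    """Count how many native-level languages each member claims."""
    claims = Counter()
    for _lang, level, user in rows:
        if level == NATIVE_LEVEL:
            claims[user] += 1
    return claims


def single_native_counts(rows):
    """Count native speakers per language, skipping members who claim several."""
    claims = native_claims(rows)
    counts = Counter()
    for lang, level, user in rows:
        if level == NATIVE_LEVEL and claims[user] == 1:
            counts[lang] += 1
    return counts


claims = native_claims(rows)
multi_native = sorted(user for user, n in claims.items() if n >= 2)
per_language = single_native_counts(rows)
```

Members in `multi_native` would be the ones to contact individually before adding them to the list.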


deniko
2019-06-04 10:10
> the highest being a member who claims 48 native languages

Google the Almighty is among us!
sabretou
2019-06-04 11:27
It is important to note that the UI string in the Languages interface says "Native level" and not "Native".

Tatoeba does not, in fact, collect data on what a user's native language is; it only asks users whether they feel their language skill is at a "native level".
CK
2019-06-04 10:02 - 2019-06-04 10:03
** Old Screen Shot **

This is a screen shot of part of the main page from 2014-11-12 at 12.41.12 UTC

https://prnt.sc/nxed75

sharptoothed
2019-05-29 12:35
** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/
soliloquist
2019-05-29 20:00
Thanks for the data. Have you ever considered including numbers of native speakers for each language in the stats? Watching their development might be interesting. The file "user_languages.tar.bz2" on the downloads page has the necessary information for that, I believe.
sharptoothed
2019-05-30 10:57
Technically it's possible and it would be relatively easy to add numbers of native speakers to the stats. The only note I'd like to make is one about the fact that data contained in the "user_languages.tar.bz2" file doesn't reflect the real situation since many members have never specified their native languages. There are also members who specified native languages incorrectly. All in all, I like the idea and will implement it eventually. Thanks for your advice. :-)
soliloquist
2019-05-30 12:27 - 2019-05-30 12:37
Thank you. I'm aware that native speaker counts might not be so reliable, but they would at least give a rough picture.

The last stats show that there's a sudden boost in Turkmen this month. Nearly 50 self-declared native Turkmen accounts have been created recently.

https://tatoeba.org/eng/users/for_language/tuk

Unfortunately, I suspect that most, if not all of those accounts might belong to the same person, possibly a non-native speaker. I explained my suspicion here: https://tatoeba.org/eng/sentenc...omment-1098662

The real problem here isn't using multiple accounts or contributing in non-native languages, but falsely declaring oneself a native speaker. Turkmen sentences here shouldn't be trusted as a reliable source until this issue is cleared up.

Having native speaker counts in the stats would help us notice such unusual changes, too.
deniko
2019-05-30 16:40 - 2019-05-30 16:45
> Unfortunately, I suspect that most, if not all of those accounts might belong to the same person

If your suspicions are based only on the fact that those accounts were created at about the same time, this might be a wrong assumption.

Tatoeba might have been advertised in some Turkmen community, hence the boost in user registrations. This happens from time to time to many languages, and is especially noticeable when the language hadn't had a lot of contributors before.

The use of the letter ı, since it doesn't exist in Turkmen, is suspicious though, indeed.

Having read this:

https://en.wikipedia.org/wiki/Turkmen_alphabet

and this:

https://en.wikipedia.org/wiki/C...urkic_Alphabet

I was left under the impression that the Common Turkic Alphabet can be used in some cases to write in Turkmen, and it does have the letter ı, but, well, I'm not a specialist.
soliloquist
2019-05-30 20:02
I sent you a PM.
deniko
2019-06-03 09:10
Ok, I've reviewed all the evidence and it does seem very probable that all those accounts belong to the same person.

However, before anyone does anything about that, we still need a trusted native speaker to verify how good those translations are. I'm afraid it's virtually impossible to do anything about it without such verification.
Guybrush88
2019-05-30 12:29
thanks
Colbo
2019-05-30 17:25
What the heck are these data?
jegaevi
2019-06-02 19:27
Hello everyone!
I have a question for you. What does it do when I sort sentences by Relevance? Is this a new thing? I don't think I've seen it before.
Etvreurey
2019-06-03 05:18 - 24 days ago
.
jegaevi
2019-06-03 06:25
Thanks for the explanation!
I knew about the =" " one, but I don't remember having seen Relevance before. Maybe I just didn't notice it. But then I more or less figured out for myself what it's for.
CK
2019-06-02 09:49
** Tatoeba.org Native Speakers with Native Language Sentences **

http://bit.ly/nativespeakers

If you prefer to read sentences by native speakers when you're studying a language, this page will help you.

I updated this page based on the June 1, 2019 exported data.
soliloquist
2019-06-02 13:45
When you sent me the link of that page a few days ago, Turkmen was not included on the list. But now it is. You should exclude Turkmen. Those self-declared native Turkmen accounts are fake. There has probably never been a native Turkmen speaker on Tatoeba, unfortunately.
CK
2019-05-29 02:13
** Unowned Sentences That May Be Adopted **

http://tatoeba.byethost3.com/st...5-unowned.html
Ricardo14
2019-06-01 21:51
Thanks