
It would be nice to have a "similar sentences" tab somewhere on the side while you're viewing a particular sentence.

Or rather, when given 100 search results, optimize it so that the first 10 have the greatest variety and are the most mutually diverse (with diversity decreasing as you go up the pages).
That would require some math and some criteria to characterize what is "diverse" with respect to one another (i.e. ranking of importance for different vocabulary, with names having lowest importance), but it would be an elegant way to do it.
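
To make the "mutually diverse first" idea a bit more concrete, here is a rough sketch of one way it could be done: a greedy reordering that always picks the result least similar to the ones already chosen. Everything here is an assumption for illustration (the function names, the word-overlap similarity measure, the 0.1 weight for names); it is not how Tatoeba's search actually works.

```python
def tokens(sentence):
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,!?\"'").lower() for w in sentence.split()} - {""}


def weighted_overlap(a, b, low_weight_words, low_weight=0.1):
    """Weighted Jaccard similarity; words in low_weight_words (e.g. names) count less."""
    ta, tb = tokens(a), tokens(b)

    def weight(word):
        return low_weight if word in low_weight_words else 1.0

    union = sum(weight(w) for w in ta | tb)
    if union == 0:
        return 0.0
    return sum(weight(w) for w in ta & tb) / union


def diversify(results, low_weight_words=frozenset()):
    """Greedy reordering: each next result is the one least similar to those already picked."""
    remaining = list(results)
    ordered = []
    if remaining:
        ordered.append(remaining.pop(0))  # keep the top hit in first place
    while remaining:
        best = min(
            remaining,
            key=lambda c: max(weighted_overlap(c, s, low_weight_words) for s in ordered),
        )
        remaining.remove(best)
        ordered.append(best)
    return ordered


results = [
    "Tom likes apples.",
    "Mary likes apples.",
    "The weather is cold today.",
    "John likes apples.",
]
names = {"tom", "mary", "john"}  # hypothetical low-importance vocabulary
for sentence in diversify(results, names):
    print(sentence)
```

With names weighted low, the three "likes apples" sentences count as near-identical, so the weather sentence moves up to second place. The greedy loop is quadratic in the number of results, which should be fine for 100 hits.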

I mean automate search results to not display similar sentences in the same language.

Yes. Slavic languages have quite a few declensions for names. Turkish ones as well, and then you have things like vowel harmony to worry about.
Personally, I don't much like the idea of "efficient names". If it's a problem of database size, then sure. Otherwise, optimizing presentation methods would be a better approach to the problem being discussed, IMO.

Intentional misinterpretation.

It's not that sacredceltic's point isn't valid - it is. There's certainly an exploitable flaw.
It's just that I've yet to see any real group warfare here, and the example that he used is, IMO, unnecessarily melodramatic. But could it happen? In theory, yes.

But just to contradict myself, I sometimes wonder why it's "Taiwan, Province of China" in the list for the profile... In my opinion, "Taiwan" would be more neutral.

> For a Han Chinese from the People's Republic of China, saying that Tibet is occupied by China is offensive.
If it's offensive to someone, it's offensive (those are your own words).
If a Tibetan joins this site, they shouldn't use it to bring up this "occupation", but to learn and to translate sentences. Maybe I'm wrong, but I don't see the connection between Tatoeba and human rights.

Bijective, but only in theory. Here:
http://tatoeba.org/eng/sentences/show/661001
A sentence that is linked, but not really.

Why don't you guys ask dominiko if he has a problem with this "theft", instead of blowing things out of proportion and acting like Scott has committed some ridiculous act?
You're a third party here, sc. "Le comportement correct" may only be "correct" in your little microcosm, don't forget.

Is this satire?

Isn't that a bit of an extreme opinion? The acronyms are there because they're useful.
I can understand that the idea of English-based acronyms bothers you, but that doesn't change the fact that they're still useful. If you want, invent neutral acronyms that everyone can use. Even if it were 须检 (or whatever), it would be more practical to type that than "Needs Native Check" or its translated equivalent.

> Well, if you're Uyghur, I'd be surprised if "@NNC" meant much to you...
Every community has its own language, TTB included. If there are unclear acronyms like NNC, we could put them in a FAQ or something like that.
If it's still not clear because of the language barrier, I think the problem would be solved by translated general tags. Do you agree?

What is this now? TTB witch hunts?

That's good speed though. If you partitioned by language and string length it probably would go fast enough not to bother a regular user too much.
But I agree that it doesn't really make sense if the duplicate script is going to do the job anyway.
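
For what it's worth, the partitioning idea could look roughly like the sketch below: bucket sentences by (language, length) and only compare a new sentence against buckets of the same language and nearly the same length. The class name, the `length_slack` parameter and the 0.9 threshold are all made up for illustration, and Python's `difflib` is used purely as a stand-in similarity measure.

```python
from collections import defaultdict
from difflib import SequenceMatcher


class SentenceIndex:
    """Hypothetical in-memory index partitioned by (language, string length)."""

    def __init__(self, length_slack=2):
        self.length_slack = length_slack
        self.buckets = defaultdict(list)

    def add(self, lang, text):
        self.buckets[(lang, len(text))].append(text)

    def candidates(self, lang, text):
        """Yield only sentences in the same language with a similar length."""
        n = len(text)
        for length in range(max(0, n - self.length_slack), n + self.length_slack + 1):
            yield from self.buckets.get((lang, length), [])

    def near_duplicates(self, lang, text, threshold=0.9):
        """Run the expensive comparison on the small candidate set only."""
        return [
            s
            for s in self.candidates(lang, text)
            if SequenceMatcher(None, text, s).ratio() >= threshold
        ]


index = SentenceIndex()
index.add("eng", "I like tea.")
index.add("eng", "I like tea!")
index.add("fra", "I like tea.")  # other language: never compared
index.add("eng", "A much longer sentence that is skipped entirely.")
print(index.near_duplicates("eng", "I like tea."))
```

The trade-off is that a near-duplicate whose length differs by more than the slack is never seen, so the slack would have to be tuned against how strict the duplicate check needs to be.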

> FeuDRenais's proportion should take the languages into account.
"All languages are equal on Tatoeba."

+1

Out of curiosity, how much time would it take to do something as brute as run a similar_text() comparison between a new translation and all the existing sentences already in the database? Really, really long, I'm guessing?
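
Just to show what I mean by "brute", here is a minimal sketch of the naive version. PHP's similar_text() has no direct Python equivalent, so this assumes `difflib.SequenceMatcher.ratio()` as a rough stand-in (a different algorithm, but the same kind of pairwise string score). The function name and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def most_similar(new_sentence, existing, threshold=0.9):
    """Compare one new sentence against every existing one (one full scan per call)."""
    hits = []
    for sentence in existing:
        score = SequenceMatcher(None, new_sentence, sentence).ratio()
        if score >= threshold:
            hits.append((score, sentence))
    return sorted(hits, reverse=True)  # best match first


existing = ["I like tea.", "I like tea!", "The cat sleeps."]
for score, sentence in most_similar("I like tea.", existing):
    print(f"{score:.2f}  {sentence}")
```

Every new translation costs one comparison per sentence already in the database, and each comparison itself gets slower as the strings get longer, which is where the "really, really long" worry comes from.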

I wonder what the worst case number is.
Let's just go nuts, and say that it's, I dunno, 80,000. So, roughly 10% of your contributions are duplicates.
I would say that *even then*, if you told me that 1 out of every 10 of my translations was a link instead of a brand new thing, I wouldn't throw a fit and start complaining about mass inefficiency. It's really not that big of a deal...

Also, (840000-13000)/840000 = 98.5% efficiency. I don't know what everyone's standards are, but that's pretty damn good, IMO.
(or is the total number of duplicates over all time >> 13000?)
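
For anyone who wants to plug in other numbers, the arithmetic is trivial to script. The 840,000 total and the 13,000 and 80,000 duplicate counts are just the figures thrown around in this thread; the function name is made up.

```python
def efficiency(total, duplicates):
    """Share of contributions that are not duplicates."""
    return (total - duplicates) / total


print(f"{efficiency(840_000, 13_000):.1%}")  # 98.5%
print(f"{efficiency(840_000, 80_000):.1%}")  # 90.5%
```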