
Wall (5,551 threads)

CK
10 hours ago - 10 hours ago
We now have 44,717 members.

https://tatoeba.org/eng/users/a...direction=desc

Only 27% of these members have contributed sentences.
Pfirsichbaeumchen
10 hours ago
Can you find out how many have contributed sentences? Spam accounts are added every day.
CK
10 hours ago - 10 hours ago
12,177 members own 1 or more sentences.

3,040 members have contributed exclusively in their own native languages.
4,314 members without identified native languages have contributed in just 1 language.
2,811 members have contributed in 2 languages.
1,001 members have contributed in 3 languages.
394 members have contributed in 4 languages.
215 members have contributed in 5 languages.
104 members have contributed in 6 languages.
72 members have contributed in 7 languages.
45 members have contributed in 8 languages.
26 members have contributed in 9 languages.
21 members have contributed in 10 languages.
16 members have contributed in 11 languages.
19 members have contributed in 12 languages.
11 members have contributed in 13 languages.
9 members have contributed in 14 languages.
10 members have contributed in 15 languages.
5 members have contributed in 16 languages.
8 members have contributed in 17 languages.
4 members have contributed in 18 languages.
4 members have contributed in 19 languages.
4 members have contributed in 20 languages.
5 members have contributed in 21 languages.
5 members have contributed in 22 languages.
2 members have contributed in 23 languages.
2 members have contributed in 24 languages.
3 members have contributed in 25 languages.
2 members have contributed in 26 languages.
1 member has contributed in 27 languages.
1 member has contributed in 28 languages.
1 member has contributed in 29 languages.
1 member has contributed in 30 languages.
1 member has contributed in 31 languages.
1 member has contributed in 32 languages.
1 member has contributed in 33 languages.
1 member has contributed in 34 languages.
1 member has contributed in 35 languages.
1 member has contributed in 36 languages.
1 member has contributed in 38 languages.
1 member has contributed in 39 languages.
1 member has contributed in 41 languages.
1 member has contributed in 42 languages.
1 member has contributed in 44 languages.
1 member has contributed in 47 languages.
1 member has contributed in 50 languages.
1 member has contributed in 52 languages.
1 member has contributed in 54 languages.
1 member has contributed in 60 languages.
1 member has contributed in 63 languages.
1 member has contributed in 157 languages.
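
The 27% figure quoted earlier in this thread can be checked against the two totals given here. A minimal sketch in Python; the variable names are illustrative assumptions, and only the two counts come from the posts above:

```python
# Totals quoted in the thread above.
total_members = 44_717
members_with_sentences = 12_177

# Share of members who own at least one sentence, as a percentage.
share = members_with_sentences / total_members * 100
print(f"{share:.1f}% of members own at least one sentence")
```

Rounded to the nearest whole number, this matches the 27% mentioned in the first post.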
deniko
8 hours ago
> 1 member has contributed in 157 languages.

Our overlord! The Messiah!
fjay69
4 hours ago
I'm pretty sure it's Horus.
deniko
4 hours ago
I don't think Horus adds new sentences, just deletes some and creates new links for duplicates.

My bet is on Balamax
MaartenMutsaers
8 hours ago
Jap-Eng translator & annoying millennial here.

I have a question.
Can we start using "tech" instead of typing "technology" all the time? Or aren't we there yet? Just curious. My superior (senior) tells me "tech" is still too informal. I think it doesn't matter.
Thanuir
7 hours ago
Hey,

For a question-and-answer website about English usage, you might have more luck at the Stack Exchange sites: https://english.stackexchange.com/ for most questions you might have and https://ell.stackexchange.com/ for simple questions.

Though please familiarize yourself with the website before using it, and provide some context for the question. Maybe it has even been asked already over there?

Tatoeba is mostly a website for translating actual and concrete sentences, rather than a forum for how to translate things.
Thanuir
6 hours ago - 6 hours ago
But if you really want to use Tatoeba for this, here are the sentences that use "tech":

https://tatoeba.org/est/sentenc...sort=relevance

Here are the sentences that use "technology":

https://tatoeba.org/est/sentenc...sort=relevance

You can check those examples for how people in this community have used those words. While you are here, maybe you would like to translate some of those sentences to Dutch, since such translations are the main purpose of this website?
MaartenMutsaers
6 hours ago
Got it. I think my English proficiency exceeds my Dutch, but I will definitely give it a go.
Thanuir
6 hours ago
Hi,

If English is your strongest language, then it might be better to translate to English.

Here are some presumably high-quality Dutch sentences without English translations: https://tatoeba.org/est/sentenc...o=&sort=random

Here is the same for Japanese: https://tatoeba.org/est/sentenc...o=&sort=random

You can play around with the search settings to get more sentences, or to search for sentences with particular words. See the sidebar.
CK
5 hours ago
You can also search for both words at once using the bar |, which means "or", while limiting your search to sentences that appear on my proofread list.

tech|technology

https://tatoeba.org/eng/sentenc...o=&sort=random

You will notice that there are a number of sentences with "technology" that would seem strange if the word "tech" were used in its place.

Aiji
2019-09-14 01:39
https://tatoeba.org/fra/user/profile/LINGUISTE

All of his sentences seem to me to have been written in "clever monkey" mode. Many of them make no sense. Unfortunately, as always, some of them are correct.

I suggest blocking these sentences so that people do not translate nonsense.
marafon
2019-09-14 12:00 - 2019-09-14 12:01
+1
TRANG
2019-09-15 19:51
Is there really a problem with leaving them open to translation and letting each contributor decide for themselves which sentences deserve to be translated?

We don't really need to force things. At a glance, only a minority of these sentences have been translated.

We should first check whether any sentences that make no sense have been translated. If so, we should ask the translators for their view: did they just translate on autopilot, or did they translate because the original sentence was of some use to them?
Aiji
2019-09-16 09:31
Oh yes, there is a point.
I have dealt with thousands of orphan sentences and sentences that are clearly not in correct French, and yes, there is a point. I am sure that if we ask AlanF_US for his opinion, he will agree with me.

The problem is that people who think they understand the sentence translate it into what they think they understand. And since they are often not native French speakers, they assume that errors which sound fine in their own language are not a problem.
A striking example would be « Je te serre ta main. », which is clearly not correct but translates word for word into many languages.
Another problem is all of Maxence's sentences, for example, which conversely started out as (horrible) translations of Spanish sentences. The problem arises when Portuguese translations come in, because the incorrect French sentence makes sense as is in Portuguese.

And there would be plenty more examples like that. So yes, forcing things can do some good. For example, by arranging that when a duplicate gets merged, it belongs to the user who is not in the red (or to the most recent one, or something else). It's just a suggestion.
CK
30 days ago - 30 days ago
I don't read French, so this is from Google Translate.

The problem is that people who think they understand the sentence translate it into what they think they understand. And as often, they are not French natives, they think that errors that sound good in their language do not pose a problem.

Additional comment.

This is also one of the primary problems with non-native sentence contributions, which is one reason I keep trying to get members to contribute in their own native languages. We could likely drastically improve the quality of our corpus if all members would do so.
soliloquist
30 days ago - 30 days ago
> This is also one of the primary problems with non-native sentence contributions,
> which is one reason I keep trying to get members to contribute in their own native
> languages. We could likely drastically improve the quality of our corpus if all
> members would do so.

This is true when one adds original sentences in their native language like you do, but not so true with translations. Awkward word-for-word translations are a huge problem, too. It's possible to filter out non-native or unowned sentences when searching, but such unnatural translations by native speakers are not so easy to avoid. We don't have an option to exclude sentences by particular users or sentences with particular tags.
AmarMecheri
29 days ago
Hi,
I have already reported the sentence « Tu pars sous ton nez. » (which I find incoherent).
But I stopped there, without reading the remaining sentences.
AlanF_US
28 days ago - 28 days ago
The most useful thing you can do if you see bad sentences is to write comments on them. This accomplishes multiple things:

(1) It allows the author, or a corpus maintainer, to fix the sentence.
(2) It allows people to determine from the comments left on the author's sentences whether their sentences can be trusted in general.

I like to write comments in a form that indicates the current contents and the suggested rewording. This allows people to see easily whether the comment is still valid. For example:

My dog have fleas. -> My dog has fleas.

Such comments are quick and easy to write, especially with the use of the "copy sentence" button at the right of each sentence. Advanced contributors should also add the "@change" tag so that the sentence will be seen when corpus maintainers review all the sentences with that tag.

Of course, you don't have to go through all of a contributor's sentences. But if you see 10 bad sentences and write comments on them, someone who looks at the comments on that contributor's sentences will see the pattern.

The other piece of this approach, naturally, is that when you choose sentences to translate by an owner you are not already familiar with, you should look at the comments left on their sentences to see whether there are a lot of complaints on them.

If you leave comments on sentences, you give people a concrete basis for deciding not to translate sentences from that contributor (or for administrators to take some other action, though that is less likely). The comments will also persist and be searchable when someone checks the contributor's profile, whereas Wall messages eventually drop off the front page and become hard to find.
AmarMecheri
27 days ago - 27 days ago
@AlanF_US
I agree with your comment and thank you for writing this very helpful reminder:

... "The most useful thing you can do if you see bad sentences is to write comments on them. This accomplishes multiple things:

(1) It allows the author, or a corpus maintainer, to fix the sentence.
(2) It allows people to determine from the comments left on the author's sentences whether their sentences can be trusted in general.

I like to write comments in a form that indicates the current contents and the suggested rewording. This allows people to see easily whether the comment is still valid."

It's wonderful!

NB: Last but not least, it encourages non-English-speaking contributors to improve their English and allows them to explain (at least in the comments) the meaning of sentences written in their native languages that find no competent translators (because those languages are considered minor or infrequent, their speakers belonging to linguistic minorities).
TRANG
28 days ago
What you describe is not really a problem from my point of view.

If a contributor decides to translate "Je te serre ta main" and the translation is a perfectly correct sentence in the target language, we correct the original sentence that sounds wrong and that's it. Even if the original sentence was not correct, it will at least have inspired the creation of a correct sentence, and there is no harm in that.

Admittedly, one can ask whether the translator really understood the sentence. How do we correct the sentence while making sure it keeps the same meaning as its translations? In some cases, we need to discuss with the translators how they understood the sentence. But I find it hard to imagine that there are many cases that complex. For your example, changing "Je te serre ta main" to "Je te serre la main" will not affect the translations in any way.

At worst, if it is just too much trouble to decide how to correct the original sentence, we can always unlink it from its translations.

I am not saying that LINGUISTE's sentences are of good quality and great usefulness. But blocking them does not solve much. It only imposes a relatively subjective point of view on what a "good sentence" is.

If a sentence makes absolutely no sense and serves little purpose, it is very unlikely to be translated. People prefer to translate sentences that are useful to them, that are interesting, and that mean something.

If you have evidence that sentences that sound wrong are markedly more likely to generate translations that sound wrong, I would understand your wanting to prevent such sentences from being translated. But in my view that is not the case. Perfectly correct sentences are just as likely to generate translations that sound wrong. The main variable is the translator's skill level, not the quality of the original sentence.

Your problem, I get the impression, is rather that it is frustrating to review the sentences of a user who is not very rigorous and did not put much effort into creating them. And I understand that very well. But I think this is a problem that is solved, on the one hand, a bit further upstream, by limiting the number of sentences/translations a new contributor can create, and on the other hand, by implementing filters so that each user can ignore the sentences they do not like according to generic criteria.
CK
28 days ago - 28 days ago
This search will find over 900 of his sentences that do not yet have translations.

https://tatoeba.org/eng/sentenc...=&sort=created

I think that if a native French speaker had the time to at least go through these and delete the bad ones, it would help the project a lot. It would also help a lot if, at the same time, an OK rating were added to the good sentences.

It's a lot more difficult to deal with bad sentences once they are linked to other sentences, so doing this now would likely avoid a lot of hassle.
TRANG
28 days ago
You're suggesting that we take the easy way. But in this case, I don't think the easy way is the best way.

I, personally, do not mind if someone takes the liberty to delete some of these sentences because I, personally, do not care about them. But on principle, I would have zero arguments to defend why we deleted them. The deletions would be a personal, subjective choice.

Yes, it takes more effort to deal with bad sentences that have been linked, rather than deleting them before they get to be linked. But it can be a worthwhile effort:

1) We get to practice improving sentences. Native speakers could learn things about their own language, or sharpen their literary skills, by doing this exercise.
2) We can explain to those who added the translations why it was a bad sentence, and maybe they can learn something from it and choose their next sentences to translate more wisely.
3) It can be an eye-opener on what really is a good or bad sentence.
CK
28 days ago
I assume from your response that these French sentences are probably a lot better than the original message implied, and that they are the kind of sentences that you yourself might use. I had sort of figured that maybe these were more of those bot-created-type of sentences that have come from several usernames contributing French sentences, so I thought that it would be better to quickly eliminate the non-translated bad ones before they caused any more problems. The ones that are already linked to other sentences would take a little more time to take care of.
TRANG
27 days ago
I have not proofread those sentences and I have no idea how bad they really are. Some might be bad beyond repair. But mostly, I was questioning the argument that "it's a lot more difficult to deal with bad sentences once they are linked to other sentences".

Because to me, if a sentence makes no sense and has no value, it's very unlikely that it will be translated. If it has been translated (by a sentient, intelligent being), then it may not be bad enough to be deleted and it's worth taking the time to figure it out.
AmarMecheri
27 days ago
@CK
Have you thought about (non-French) Francophones for whom French was, and still is, the working language throughout their working lives (some of whom speak French better than their mother tongue, and sometimes better than some actual French speakers)? Do you consider them native speakers?
Friendly yours. AmarMecheri
AmarMecheri
yesterday - yesterday
All that for this, as we would say in French!
This controversy about the flag was stirred up long before I joined Tatoeba (July 28, 2018) ... by one or two people who went out of their way to multiply themselves under several usernames (some of which have produced no sentences, nothing but vehement comments).
We have explained at length, at least since that date, that we adopted the Kabyle flag simply for convenience of visual identification. But we were not listened to; the result speaks for itself!
There will be a big tchektchouka (Algerian ratatouille) because of this confusion between the KAB and BER logos. We hope someone will fix it as soon as possible!
As a Kabyle proverb puts it: "Awal am teṛṣaṣt; mi yeffeɣ ur d-iţţuɣal!" [A word is like a (rifle) bullet; once it is out, it does not come back!]
Cordial greetings
BakirHamou
yesterday
The mobilization of your movement's activists to come to the aid of the two contributors, whom I appreciate on the linguistic level but not on the political level, following the change of the icon for the Amazigh language and its Kabyle variant, will have no effect, because the designers of this site are scientists who are immune to political manipulation.
belkacem77
yesterday
@BakirHamou

You are extremely dangerous, not only for Kabyle identity but for the whole of North Africa. I hope you have understood that even Arabic speakers have understood the need for diversity, decentralization and mutual respect. In case you do not know it yet, the entire Algerian people is out in the streets because of the crappy thinking that you are defending and imposing on others, and not only on the linguistic level.

You were suckled on Arabo-Islamism. But they have become aware of the need to change, while you are stuck in primitive Stalinism combined with the primitive "arabo-iẓẓanique" culture you inherited from your Bedouin masters.

Controversy clings to you like a shitty smell.
BakirHamou
17 hours ago
At last you are back at your usual level, UZULIX.
AmarMecheri
18 hours ago - 17 hours ago
@BakirHamou

Chouf...., a ssi khouna! Well, well, brother of circumstance!
Licking boots is not enough for you. Now you are so full of your innate knowledge that you award merit "by decree" to people who do not need your praise to know whether or not they are scientists.
samir_t
yesterday - yesterday
Here is already a sentence in Moroccan Tashelhit written in the Kabyle corpus.
https://tatoeba.org/eng/sentences/show/8256445
Now that the Berber and Kabyle flags are almost alike, I remind contributors to be careful: do not write in Moroccan Tashelhit, for example, or in another variety, under the Kabyle flag. You must look at the letters associated with the Berber flag.
belkacem77
yesterday
@samir_t

We are going to have a lot of problems with this flag. Because of the segmentation, it will be difficult later on to do bulk processing, which risks delaying the other projects that have been launched.
We are not out of the woods yet.
samir_t
yesterday - yesterday
Indeed, it does not really solve the problem. These flags are almost identical and can no longer be told apart at first glance. Personally, I find it confusing, since it takes me time to work out whether a sentence has already been translated into Kabyle or into "Berber".
TRANG
yesterday
I have reported this problem to the person in charge of our language icons. I agree that it is not ideal for the icons of these two languages to be so similar, but for now it is the best we can do. Thank you for your patience.
TRANG
yesterday
I think the bulk of the problem comes not from the flag but from our language auto-detection tool.

By default, when a contributor has added two or more languages to their profile, the language selected when adding a sentence is auto-detected. Unfortunately, our auto-detection is not perfect, and sometimes a sentence is marked as Kabyle when it is in another language.

The problem already occurred before the flag change, for example: https://tatoeba.org/eng/sentences/show/8218688.

If this happens more and more often, we can investigate whether the auto-detection algorithm can be improved. In the meantime, we will probably need to encourage contributors to select the exact language of the sentences they add instead of leaving the selection on "Auto detect".
belkacem77
yesterday
I have shared on GitHub an automatic-detection algorithm written in Python. It is based on n-grams. Perhaps it could be useful to the technical team.

But I am sure that this flag is also ambiguous for many people.
TRANG
yesterday
We also use an n-gram-based algorithm. The source code is on GitHub too, in case you would like to help improve it:
https://github.com/Tatoeba/Tatodetect
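
For readers unfamiliar with the idea, here is a minimal sketch of n-gram-based language detection in Python. It is not the Tatodetect implementation and not the algorithm mentioned in the previous post; the function names (`char_ngrams`, `build_profile`, `detect`), the toy training sentences and the scoring rule are all assumptions made for illustration only.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the character n-grams of a sentence, padded with spaces."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_profile(sentences, n=3):
    """Map each n-gram seen in the training sentences to its relative frequency."""
    counts = Counter()
    for sentence in sentences:
        counts.update(char_ngrams(sentence, n))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def detect(text, profiles, n=3):
    """Pick the language whose profile gives the sentence's n-grams the highest total weight."""
    grams = char_ngrams(text, n)
    scores = {
        lang: sum(profile.get(gram, 0.0) for gram in grams)
        for lang, profile in profiles.items()
    }
    return max(scores, key=scores.get)


# Toy training data (hypothetical); a real detector would train on a large corpus.
profiles = {
    "eng": build_profile(["the dog is in the garden", "this is the house"]),
    "fra": build_profile(["le chien est dans le jardin", "c'est la maison"]),
}

print(detect("the cat is in the house", profiles))
print(detect("le chat est dans la maison", profiles))
```

With so little training data the sketch only separates very different languages. Closely related languages that share most of their n-grams are much harder to tell apart with this kind of scoring, which would be consistent with the mislabelled sentences discussed in this thread.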

As for the ambiguity caused by the similarity between the Kabyle and Berber icons, I have answered samir_t on that subject.

Note that we have the same problem for other languages, in particular:
https://tatoeba.org/eng/sentenc...ne/indifferent
https://tatoeba.org/eng/sentenc...ne/indifferent

It is a design problem for which we do not yet have a definitive solution. We will have to be patient.
samir_t
yesterday - yesterday
It is not only the auto-detection. Sometimes we see a large number of sentences that at first glance carry the same flag, since the three letters distinguishing them do not necessarily catch the eye, which means we have to take time to examine them. In the future I would like to see a different way of distinguishing the Kabyle corpus; the Berber one could stay as it is without adding letters (BER), since it alone represents the whole space of Berber languages.
BakirHamou
yesterday
We are on linguistic ground here. In the interest of our language, let us avoid conflating it with ideology, because history teaches us that this confusion always leads to fascism. I thank the Tatoeba team for the effort they made to reach this consensus.
belkacem77
yesterday
Controversy clings to you like a foul smell. The Algerian people are out in the streets to denounce the primitive Stalinism that you are defending. Don't worry, everything will move toward political and linguistic decentralization. It is an absolute necessity for any good governance, and not only in linguistic matters but in territorial ones too. There is no escaping it. The Berlin Wall has fallen. Arabo-Islamism is defeated. Make way for peoples and for diversity.
BakirHamou
yesterday
I share your analysis. Indeed, make way for linguistic diversity in Algeria: Tamazight is now a national and official language alongside Arabic.
belkacem77
yesterday - yesterday
Anglo-Saxon culture has defeated French Jacobinism, which is itself falling apart. Breton, Corsican, Basque, Catalan, Scottish, Welsh, ... are surging, because the West has understood that stability lies in diversity.
I suggest you go and develop your newspeak in the Middle East; it will be natural soil for you.

Hitler died on April 30, 1945.
Stalin died on March 5, 1953.
Arabo-Islamism is defeated.
Berberism is dead.
Make way for a world of peoples, colors and diversity.

The segmented Berber languages will leap forward without your ideological constraints, as is the case for Kabyle.
TheBahi
yesterday
It is because all the proposals we made with the admins were rejected by one side or the other.
I too find that the flags look alike, so I stand by my proposal of a simple green and yellow flag, or even a flag close to the JSK emblem.
AmarMecheri
yesterday - yesterday
Dacu-t uxeṣṣsaṛ-agi? What is this mess?
What is this diktat? In the name of whom and of what will people who call themselves Kabyles and contribute NOTHING WHATSOEVER (to the Kabyle corpus) impose the banner of the JSK (which I respect but which has nothing to do with the question)? By what right will token Kabyles and the long-assimilated or freshly assimilated continue to upset the noble work of others? Have we ever said anything about what you put in the BER corpus, which I have never consulted, except when someone translates my Kabyle sentences into (Kabyle) BER? I have never tried to correct anyone, and there is no lack of opportunities, especially regarding the ideological, distinctly Arabo-Islamic, if not subversive, content of certain contributions.
So leave us in peace, go on! Go to sleep, go, as FELLAG says!
https://www.youtube.com/watch?v=7MAbJ_zfXss
samir_t
23 hours ago - 23 hours ago
@AmarMecheri
In Algeria, those in power have always wanted to infantilize the people, so many of those who serve them still think they can do the same and dictate to others which colors to choose, what rank to give their language, how to live, and so on. But these are people living in another time; the very fact of being afraid of some colors and wanting to fight them at all costs shows the era in which their minds still vegetate.
AmarMecheri
18 hours ago
@samir_t
While the world moves forward, "the society that goes backward" will plunge Algeria back into the depths of an obscurantism "enlightened" by the petty, well-understood interests of the shaggy ones with flamboyant toupees and of those with long turbans anointed with camel piss.
belkacem77
23 hours ago
@TheBahi

Not a joke, I hope. Kabylia has its flag. For the majority, it is a local flag that has nothing to do with politics. A visual identity and nothing more.

Since you wanted to philosophize further, here is your result.

By the way, I prefer the flag of my favorite club, Maccabi Tel-Aviv, not the JSK.

Every contributor to the Kabyle corpus has a favorite club: FC Barcelona, Real Madrid, Paris Saint-Germain, AC Milan, Bayern Munich...

So there you are. You have achieved your objective. But Bla Ṛebbi, we will keep working on this Kabyle corpus, even if it means adopting the flag we find most detestable, until we outdo your crappy ratatouille, which serves only to produce sentences to please your mentors.

We were at peace, moving forward in good faith, until you stepped in like demons fleeing hell. Yes, but you are the pure product of your religion.
Thanuir
2019-09-05 09:15
*What to do with elliptical expressions*

A fine example, but let us discuss the general case: https://tatoeba.org/dan/sentences/show/748252 , or "If I had brushed my teeth..."

I personally have a mild preference for allowing such in the corpus, but only a mild one.

* They might not be complete sentences (the example is not), in the sense of having a main clause and maybe some other stuff, since the main clause can be implied, as in the example.
On the other hand, there are many other things, such as greetings and other interjections, which are also not complete sentences, but still an established part of the corpus.

* They are valid utterances that are and can be used in conversations and in literature. Often the context implies the omitted part of the sentence.
One could, as always, say that the context should be added as a part of the sentence. But most sentences and "sentences" are always improved by adding context and more material, yet this is not a corpus of books.

* They add marginal linguistic content, since the ellipsis is an accepted part of many languages. Also, based on the linked sentence, there seems to be some variety in how the ellipsis is expressed, though not much. (Japanese is different and Spanish may be different, if having the spaces there is right.)

Summarising:

I do not see a compelling reason for removing such utterances, though also little reason to encourage contribution of them.

But please provide other perspectives on this.
AlanF_US
2019-09-05 21:59
> I do not see a compelling reason for removing such utterances, though also little reason to encourage contribution of them.

I agree with you. I don't add them myself. However, I wouldn't remove one unless there were some other issue with it.
CK
2019-09-06 05:37 - 2019-09-06 05:38
> * They are valid utterances that are and can be used in conversations and in literature. Often the context implies the omitted part of the sentence.

This problem can be solved by following the first rule on our Rules and Guidelines page.
https://en.wiki.tatoeba.org/art...how/guidelines

This shows one example of how context can be added to show usage of non-sentence expressions.

Can you think of a dialogue in which the item at #748252 would naturally fit?
If so, I'd suggest that you add it, so we have a good example.



Thanuir
2019-09-06 07:43
For this particular sentence, I would imagine a situation where there is something wrong with one's teeth, and this could have been avoided by brushing them. Though I would add the word "only" to the sentence. I'll leave writing natural English dialogue to those who are more skilled at it.

Could you be more explicit about the harm the "sentence" is causing so as to merit its deletion?
Thanuir
2019-09-06 07:51
(I added a couple of uses of ellipsis but with unrelated meaning. https://tatoeba.org/dan/sentences/show/8166172 and https://tatoeba.org/dan/sentences/show/8166177

The first is more natural than the example here and the second should be complete by any measure.)
Thanuir
2019-09-06 07:47
From the comment thread of the sentence, by @CK:
"Note that since there is no context, the Japanese could refer to more than one tense and the subject could also be something other than "I.""

This is not very relevant. For example, many English sentences with "you" do not have sufficient context to determine if it is the singular, the plural or the general "you", or the degree of formality. This leads to several different translations in many other languages.
Thanuir
2019-09-06 08:16
Further by @CK, copied here because I see it as relevant for the general case:

"It's "relevant" because this is not complete and the context is not given, so this is not a complete thought. On the other hand, complete sentences are felt to be "complete", at least in some sense, so alternate translations make more sense, doesn't it?"

I do not see, yet, the hard line between this and "Take a shower before you go swimming." in terms of ambiguity. (I obviously see the hard line in terms of grammar.)

Both could be uttered in a number of different circumstances. The grammatically incomplete example at least specifies the subject, whereas the alternative, above, might be a general principle of hygiene, or something said to a single person or to several people.

Is the completeness here a matter of grammar, or is there something else at play?
Thanuir
2019-09-06 07:56
Relevant blog post from Trang, as linked by @brauchinet in the comments:

https://blog.tatoeba.org/2010/0...f-content.html

A quote:

"
As far as I'm concerned, I think Tatoeba can handle a loose definition of "sentence". We don't strictly need to have an entity with at least a verb. To me, when spoken, everything is a sentence. When written, the main difference between a sentence and a non-sentence is punctuation. That's all. For the rest, as long as people can imagine context where the "sentence" can be expressed, then it's a sentence.
So yes, I'm roughly saying that you can take all the words in the dictionary, add punctuation and perhaps a capital letter, you'd turn it into a sentence. I don't encourage it because it's not useful (dictionaries do that already), but one-word sentences are still tolerated. I'll trust people's common sense for adding only one-word sentences that are significant (for instance, "Hello" is, "House" isn't).
"
Thanuir
19 hours ago - 19 hours ago
UPDATE: The disputed sentence is gone, with no announcement in the sentence discussion or on the Wall, in spite of the unresolved dispute here.

I do not feel terribly sorry for the sentence, but the procedure does not look very good:

There was a discussion about what to do with these types of elliptical sentences.
Consensus was not reached.
A particular case of elliptical sentences was resolved silently (without any notice).

I think that it would be a very good idea to inform of the action taken in these cases. Here the sentence was not particularly important and people were not particularly passionate about it, as far as I see, but I do think that consistency in handling disputed sentences would be good. And sentences where people are more committed should certainly be handled with more communication.
BakirHamou
2 days ago
Mutual intelligibility between the varieties of the Amazigh language is fairly high, especially between dialects that are geographically close, such as Kabyle and Chaoui, or Kabyle and Chenoui (northwestern Algeria).

Mutual intelligibility is also high among the so-called Zenati dialects, even when they are geographically distant. Thus, Chaouis and Chenouis understand each other perfectly well, just as they understand the Rifian dialect spoken in Morocco and all the dialects spoken in the huge string of oases scattered across southwestern Algeria (Béchar, Naama, El-Bayadh, Adrar, Timimoun, etc.). These latter dialects, in turn, owing to their geographic proximity to the Moroccan dialects spoken farther west in North Africa, are also understood by the Berber-speaking populations of central and southern Morocco (populations who speak Chleuh).
AmarMecheri
yesterday - yesterday
@BakirHamou
Since you keep insisting that Kabyle is a dialect, why are more than 90% of your sentences in BER actually in Kabyle?
Own it, then, and stop writing in Kabyle disguised as ber!
****
And who are you to decree which language has the status of a dialect and which does not?
belkacem77
yesterday
@AmarMecheri
The notion of a dialect is a political notion used to belittle languages and create complexes in the name of Jacobinism. Rest assured, the brobros have learned the same reflexes from the Arab-Islamists.

In dialectology (a linguistic science), the notions of dialect, variety, and language blur into one another. Linguists lay out the definition well.
As for a "parler" (a local way of speaking), it has more to do with pronunciation, stress, and accent.

Leave these cherubs alone. Just take a look at the sentences related to current events in Algeria. They are bootlickers. Don't rack your brain over them.
sharptoothed
3 days ago
** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/
Guybrush88
2 days ago
thanks :)
sharptoothed
2 days ago
you're welcome :-)
CK
7 days ago
** Is it legal to use CC-BY sentences on Tatoeba.org? **

There seems to be a discrepancy on this page.

https://blog.tatoeba.org/search?q=cc-by

Trang says, "Anything that basically doesn't say "You can do absolutely whatever you want with this" is NOT compatible with CC-BY."

However, later she says, "Anything that is under CC-BY is compatible with CC-BY. "

My interpretation is that, since people use our data under a CC-BY license, we can't use other people's CC-BY material, because those who use our data can't also give CC-BY credit to the other source. Trang's first statement seems to indicate this, since people who release material under CC-BY require that they be given credit for the material and do not grant the right to "do absolutely whatever you want with this."
TRANG
7 days ago
I've updated the article.
https://blog.tatoeba.org/2011/0...d-content.html

The line you quoted was more of a simplified guideline for people who are not too familiar with licenses. It was also written back in 2011 when we overall had much less experience with licenses. It is obviously not a precise legal statement.

In general, you should avoid making interpretations out of blog posts when it comes to licenses. You should instead read the license text and make your own interpretation based on that text, as it is the original source.
CK
6 days ago
I think you're mistaken and that you should ask a lawyer first if you are going to encourage members to take someone else's CC-BY material and put it into the Tatoeba Corpus, which is distributed under its own CC-BY license, which only requires attribution to tatoeba.org.

FROM:
https://creativecommons.org/lic...by/4.0/deed.en

Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
shekitten
6 days ago - 6 days ago
CC-BY is literally designed specifically to work like this. Reusing CC-BY content and making it also CC-BY is ideal. There are more restrictive licenses (not permitted on Tatoeba) that REQUIRE this.
CK
6 days ago
Some licensors choose the BY license, which requires attribution to the creator as the only condition to reuse of the material.
FROM: https://creativecommons.org/faq/

How can someone using the Tatoeba Corpus and properly crediting the Tatoeba Project, know that they also need to give credit to a third party as well? If you don't specifically give credit to the third party, then you would be violating the third party's CC-BY license, I think.



Thanuir
5 days ago
Yes, someone publishing Tatoeba data would have to figure out how to credit the third parties (if they do not want to breach the terms of CC licenses), too, which would be quite challenging.
TRANG
6 days ago
As shekitten said, CC BY is designed for reuse. No one in their right mind would choose a CC BY license if they didn't want their content to be reused somewhere else. So it is nonsense to say that you cannot reuse CC BY content into another CC BY content.

What you are pointing out is that we may not be doing the attribution properly. So let's break it down:

1) You must give appropriate credit

That is done by adding a comment on the sentence with a link to the original source. I believe that is appropriate enough, but I suppose the best way to be sure is to contact the author to confirm. And indeed, we may not have been very diligent on that, so we could add it as a guideline that when copying from another CC BY content, one should always try to contact the author to ask about attribution. Now if the author doesn't reply, I think we're still very, very safe sticking to adding a comment with a link to the original source.

2) provide a link to the license

This is done indirectly: since we provide a link to the source, the source will have the link to the license. In the comment on Tatoeba, I think mentioning the license name and version is fine and there is no need to additionally put a link to the license itself.

3) and indicate if changes were made

This is done with the logs: whenever someone edits the sentence, the logs indicate when and how the sentence has been modified.


With all of that, I think we're okay.


Now I understand very well that one flaw of CC BY is that if you want to be 100% sure that you're doing attribution properly, it can be very tedious work, because indeed, you would have to drag along all the attributions from previous reuses. And that is actually why we agreed to introduce CC0 when Common Voice approached us. We know how painful that is with CC BY, and we know that CC0 would alleviate this pain. With CC0, there is no need to worry about this whole trail of attribution when content is reused in content that is then again reused, and again reused.

In our case with CC BY though, we are following some common sense and we assume that someone who shares their work under CC BY is okay with indirect attribution. Meaning that if I create CC BY content and you reuse my CC BY content into your own CC BY content and shekitten reuses your CC BY content, I'm okay if shekitten only gives attribution to you and not to me. Because indirectly, I'm still being given attribution (through you).

I think it is a fairly reasonable assumption. But if for some reason, we are copying from someone who is not okay with this concept of indirect attribution, then we can figure out something. We can readapt the way we give credit by adding some warning on the Downloads page about the people who are not okay with indirect attribution so that projects that reuse our content will know that they need to mention these people. But again, we don't have to reject copied sentences from external CC BY sources right off the bat because it's really borderline paranoia to do so.
Thanuir
6 days ago
The link to the license should be direct. (The target website might vanish.) But luckily there already is a direct link to CC-BY license on the sentence page. I have no idea what happens if the original was licensed under a non-French version of the license; is the French license sufficient or should one also link to the original license?

...

Here is more information on what is proper attribution: https://wiki.creativecommons.or...mparison_chart

In particular, the legality is not a matter of what the author intended or wanted to accomplish, but rather of the license.
In any case, only the original source needs to be attributed, not any intermediate sources.
TRANG
6 days ago
> is the French license sufficient or should one also link to the original license?

The French license would not be enough. Each version of CC BY should be considered as a different license, even if they are very similar.

For the context, this topic was brought up because of this sentence:
https://tatoeba.org/eng/sentences/show/8242255

So going from this example, if we want to be absolutely strict about attribution, then we would have to ask shekitten to also post a link to the CC BY 4.0 license (not just mention the license name). And we may have to do other things in order to be 99.999999% safe legally speaking.

> In any case, only the original source needs to be attributed

It's very clear that when possible the original source needs to be attributed. But when content gets mixed and remixed, it can be difficult and confusing to find out who is the very first author. And in such cases, I think we are still safe if we are only attributing to the intermediate source. No one is going to sue Tatoeba because we didn't give them attribution directly, but instead gave attribution to someone who reused their content. They will most likely just let us know and we can update the information when we find out that we were referencing an intermediate source.

My whole paragraph about "indirect attribution" was mostly to argue on the fact that there is no imminent danger by referencing an intermediate source unknowingly and therefore we do not need to reject every sentence copied from other CC BY sources (concretely, it doesn't make sense to mark shekitten's Láaden sentences in red).

CK still thinks that no matter what, it is wrong to let contributors copy into Tatoeba sentences from other CC BY sources and that I risk being sued for it. In other words, that we should completely forbid people from copying sentences from other CC BY content.

I think that enforcing such a rule would be unreasonable. I know there is a risk and I know we are not handling the whole legal aspect perfectly, but that's normal considering that we have grown on a very scarce (nearly non-existent) budget and considering that the topic of intellectual property in the internet era is still a fairly new territory.

If anyone would like to help out and investigate the safety of allowing Tatoeba members to copy CC BY sentences, and what else we can do at this stage to avoid any risk of a lawsuit, I would be infinitely grateful. On my side, I cannot put any more effort into this topic.
shekitten
6 days ago - 6 days ago
I think the issue of how to attribute the sentences - whether to just attribute Tatoeba or to directly attribute the original source - is ultimately up to whoever is making use of Tatoeba data to resolve.

It's indisputable that it is possible to release a CC-BY work (such as Tatoeba) that makes use of other CC-BY works (such as individual sentences), and I'm not the first person who has ever done this. From our end, all we have to do is attribute the original source.

From the POV of the downstream service, that's for them to resolve. I would say the most legal and ethical practice is to attribute both, which is entirely within the realm of possibility on their end and is something they should already be doing.
Thanuir
5 days ago
Personally, I think that there is nothing morally wrong with breaking copyright laws, as they hurt humanity in general and the mission of Tatoeba in particular.

I guess the risk of being sued over the content of Tatoeba is tiny. I think the risk of being sued on a legally sound basis is even tinier, since even copying entire single sentences from a book while breaking the order of sentences is not obviously wrong, AFAIK. (This is not legal advice.)

I read that some computational linguistics researchers, when they want to share a corpus, put the sentences in a random order so that the original work can not be recovered from there, but their approach is ad hoc and has not been tested in court. I do not remember the source anymore.

I would suggest linking to the source and the license when using CC-BY-licensed content. The effort is not big when one should link to the source anyway and the source should link to the license, so one can simply copy the link from there.
shekitten
5 days ago
I agree that there's nothing morally wrong with breaking copyright laws, but I still think there is something morally questionable about not attributing a source - particularly in cases where you are making money off of that source.
CK
5 days ago - 5 days ago
* Here are 2 statements by shekitten and my comments.


> From the POV of the downstream service, that's for them to resolve ...

I think that when the Tatoeba Project distributes their corpus with the understanding that it's OK to use it if attribution is given to the Tatoeba Project, the implication is that it is free to use with no other restrictions.


> ... but I still think there is something morally questionable about not attributing a source ...

I think when a person releases their material under a CC-BY license that they do it with the expectation that they will receive attribution when their material is reused. I don't think that they expect it to be reused without attribution as suggested in another comment above.

So, regardless of the legal aspect, I would say it's morally wrong to reuse CC-BY material that is going to be redistributed without the required attribution that the person who chose the CC-BY license wants and expects.




* Additional comments.

If it is indeed possible to add CC-BY material to the Tatoeba Corpus and redistribute it under Tatoeba's own CC-BY license, then perhaps TRANG should find all the parallel corpora with CC-BY licenses and import them all into the Tatoeba Corpus, like she did with the public domain Tanaka Corpus.

Also, if this is true, would it mean that anyone who reuses bilingual pairs from http://www.manythings.org/anki in another project would not need to give attribution to the Tatoeba Project anymore?

I think that it would be a lot better and safer for us to not include CC-BY licensed material by others in the Tatoeba Corpus. We have a number of native speakers who can easily add their own material without needing to reuse (steal?) CC-BY material.
Thanuir
5 days ago
There is no "stealing" with copyright breaches, and there is no copyright breach on Tatoeba users' side, here, so please do not use needlessly inflammatory language.

I agree that the current situation is inconvenient for people who would like to republish Tatoeba content. If someone wanted to do that, they would have to figure out a way of identifying which sentences need further attribution, and then either provide that or exclude those sentences.

It would be helpful to those people if the sentences requiring further attribution would be marked somehow, perhaps with a different "license" option: CC-BY license with additional attribution required or something like that as a licensing option, maybe.
TRANG
4 days ago
I want to stress that no one said that CC BY material should not be given attribution. It is very, very clear that we should give attribution when reusing CC BY material.

But you are apparently advocating for "viral" attribution and you are also advocating to forbid people from mixing CC BY content just because those who reuse the remix might not give proper attribution. I don't know if you realize that this point of view is also morally questionable and is a creativity killer...

We are going to do our best to be as fair as possible to everyone who is a content creator, but we cannot take measures that are disconnected from reality.

> perhaps TRANG should find all the parallel corpora with CC-BY licenses
> and import them all into the Tatoeba Corpus

I would but I'm not interested in quantity. Tatoeba still has too many flaws and there's really, really a lot of challenges to solve on a software engineering level, on a UI/UX level, on an organizational level... Having more sentences is at the very bottom of my priorities. We are not scalable enough for the corpus to grow much faster than ~2000 sentences per day.

> Also, if this is true, would it mean that anyone who reuses bilingual pairs
> from http://www.manythings.org/anki in another project would not need to
> give attribution to the Tatoeba Project anymore?

Yes, those who reuse the Anki bilingual pairs do not need to give attribution to Tatoeba. These subsets have been processed and reorganized in a different manner than what Tatoeba originally provides. There has been actual work put into reorganizing the data and it's enough work that Tatoeba does not need to be attributed anymore. Giving attribution to manythings.org alone would be completely fine and I would personally find it outrageous if people were forced to also give attribution to Tatoeba.
CK
4 days ago - 4 days ago
I still think you're wrong about not needing to credit the person who has released something as CC-BY, if it is then distributed as CC-BY by someone else. I still think that you should not be distributing someone else's CC-BY material to others under your own license, and assuming that it is then OK for others to use that material if they give your website credit, but not give credit to the original person who released material under a CC-BY license. I think this is morally wrong, not following the spirit of CC-BY and a copyright infringement. A copyright owner has the right to control distribution of his/her material. If they choose to distribute their material for just the cost of attribution, their rights are being violated if that is not done.
TRANG
4 days ago - 4 days ago
If that is really what you believe, why do you only give attribution to Tatoeba when you reuse the Tatoeba corpus in your projects instead of giving attribution to every contributor individually?

Or why haven't you protested against the release of the Tatoeba corpus since the beginning?

Each contributor has provided their sentences to Tatoeba under CC BY (or CC0 since early 2019), and Tatoeba is only packaging them into one big corpus.
CK
2 days ago - 2 days ago
> If that is really what you believe, why do you only give attribution to Tatoeba when you reuse the Tatoeba corpus in your projects instead of giving attribution to every contributor individually?

I thought that the message on the downloads page meant that developers could use the Tatoeba Corpus if they credited tatoeba.org. Even now, the message on the downloads page implies that.

Actually, most of my projects have a direct link to each sentence's page on tatoeba.org and the username of the owner of the sentence. I think perhaps a couple of projects only include a link to the page on tatoeba.org.

The only project that didn't have that was http://www.manythings.org/anki. I have corrected that today, by inserting one extra field on each line to include attribution. This does add a bit to the file sizes, but shouldn't really bother people too much.

You can see a quick screenshot, so you don't need to download a file.

https://imgur.com/a/08iX5Gh



Ricardo14
3 days ago
CK,

I have agreed that all the sentences I post on Tatoeba now "belong" to Tatoeba. That said, whatever Tatoeba wants, needs to do with them, I'll have no objection. I truly believe that it's every single user's feeling. :)
CK
3 days ago - 3 days ago
I thought that we all agreed to let the Tatoeba Project release our sentences under their CC-BY license. However, at this time, the "Terms of Use" are only in French, so I'm not sure what people are agreeing to now.
shekitten
3 days ago
Tatoeba is attributing our CC-BY content by our username, and then doing the standard thing that people do with CC-BY work: reusing it with attribution and releasing the whole thing as CC-BY. Using a CC-BY license is a way of giving forwards permission to anyone who wants to reuse your work.

If you really want to be within the letter and arguably the spirit of CC-BY, you should be attributing individual contributors. This is what people are generally supposed to be doing when they reuse content from Wikipedia as well.
TRANG
2 days ago
You can find the old Terms of Use here:
https://en.wiki.tatoeba.org/art...erms-of-use-v1

It says: "for any text to which you hold the copyright, by submitting it, you agree to license it under the Creative Commons Attribution License 2.0 (fr)."

The new Terms of Use were written with the goal that nothing should change for people who still contribute under CC BY, but by taking into account that Tatoeba will expand to allow more than just CC BY.

The whole section about intellectual property describes that.
https://tatoeba.org/eng/terms_of_use#section-6

But the two relevant paragraphs are:

"L’infrastructure technique de Tatoeba utilise par défaut, pour la contribution de phrases textuelles, la licence Creative Commons Attribution 2.0 France (CC-BY 2.0 FR)."
= This is saying that we use CC BY 2.0 FR as the default license.

"Lors de la contribution, sur notre Site Internet, d’une phrase dont vous êtes propriétaire, en votre qualité d’auteur·e, vous attribuez une licence à cette phrase."
= This is saying that when you contribute a sentence that is your own sentence, you are applying a license to this sentence.

If you combine these two paragraphs, the idea is that when someone submits a sentence to Tatoeba, by default, they license it under CC BY 2.0 FR.
Thanuir
5 days ago
Yes, providing attribution is the polite thing to do.
Thanuir
4 days ago - 4 days ago
From the English CC-BY 2.0 legal code, https://creativecommons.org/lic...2.0/legalcode. This is just an example; one needs to read the relevant license of what is to be added to Tatoeba if one wants to be sure.

I am quoting or referencing the parts that might be problematic for Tatoeba, or that are otherwise good to know. Not a lawyer, not legal advice, and so on. I am not suggesting any particular way of going forward, here, just trying to figure out what the license exactly says. A native speaker or someone with background in law should go through the text, too.

...

From part 1, definitions:
"
"Collective Work" means a work, such as a periodical issue, anthology or encyclopedia, in which the Work in its entirety in unmodified form, along with a number of other contributions, constituting separate and independent works in themselves, are assembled into a collective whole. A work that constitutes a Collective Work will not be considered a Derivative Work (as defined below) for the purposes of this License.

"Derivative Work" means a work based upon the Work or upon the Work and other pre-existing works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which the Work may be recast, transformed, or adapted, except that a work that constitutes a Collective Work will not be considered a Derivative Work for the purpose of this License. For the avoidance of doubt, where the Work is a musical composition or sound recording, the synchronization of the Work in timed-relation with a moving image ("synching") will be considered a Derivative Work for the purpose of this License.
"

Basically, this means that the sentence itself could be part of a collection, but any translations are derivative works, and any edits also create a derivative work. Any larger collection that includes both original CC-BY sentences and their modifications is a derivative work (I think).

...

This part is 3.b, i.e. rights given to, for example, the Tatoeba project:
"to create and reproduce Derivative Works"

This is 3.d.
"to distribute copies or phonorecords of, display publicly, perform publicly, and perform publicly by means of a digital audio transmission Derivative Works. "

Recall that translations are derivative works. I think this implies that every translation of an outside sentence with the CC-BY license should attribute the original source, provide copyright notice (if any), and link to the relevant license. This is currently unfeasible on Tatoeba, since many user interfaces for translating do not suggest the original sentence is under specific attribution requirements.

...

This part is from under restrictions, 4.a.:
"...You may not offer or impose any terms on the Work that alter or restrict the terms of this License or the recipients' exercise of the rights granted hereunder. You may not sublicense the Work. You must keep intact all notices that refer to this License and to the disclaimer of warranties...."

I do not know if Tatoeba is sublicensing the sentences or translations thereof.

I do not know whether all the different CC-BY licenses are compatible enough with CC-BY 2.0 French that we do not "alter or restrict the terms of this License or the recipients' exercise of the rights granted hereunder". I suspect it would be better to use the same license as the original.

We also need to add any potential disclaimers of warranties from the source, if any are written there.

...

Restrictions, 4.b. This is important.
"
If you distribute, publicly display, [...] the Work or any Derivative Works or Collective Works, You must keep intact all copyright notices for the Work and give the Original Author credit reasonable to the medium or means You are utilizing by conveying the name (or pseudonym if applicable) of the Original Author if supplied; the title of the Work if supplied; to the extent reasonably practicable, the Uniform Resource Identifier, if any, that Licensor specifies to be associated with the Work, unless such URI does not refer to the copyright notice or licensing information for the Work; and in the case of a Derivative Work, a credit identifying the use of the Work in the Derivative Work (e.g., "French translation of the Work by Original Author," or "Screenplay based on original Work by Original Author"). Such credit may be implemented in any reasonable manner; provided, however, that in the case of a Derivative Work or Collective Work, at a minimum such credit will appear where any other comparable authorship credit appears and in a manner at least as prominent as such other comparable authorship credit.
"

This outlines exactly what information should be included when giving credit to the creator.

First, how credit should be given. It should be "reasonable to the medium or means You are utilizing";
and later
"Such credit may be implemented in any reasonable manner; provided, however, that in the case of a Derivative Work or Collective Work, at a minimum such credit will appear where any other comparable authorship credit appears and in a manner at least as prominent as such other comparable authorship credit."

Second, the contents of the credit notice. It should include the name of the Original Author if supplied; the title of the Work if supplied; to the extent reasonably practicable, the Uniform Resource Identifier, if any, that Licensor specifies to be associated with the Work, unless such URI does not refer to the copyright notice or licensing information for the Work; and in the case of a Derivative Work, a credit identifying the use of the Work in the Derivative Work (e.g., "French translation of the Work by Original Author").
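
As a rough illustration of the list above, the supplied fields could be assembled into a single credit line. This is only a sketch: the `Attribution` class, its field names, and the wording of the output are assumptions made for illustration. They are not taken from the license text or from Tatoeba's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attribution:
    """Hypothetical container for the credit fields listed in 4.b."""
    author: Optional[str] = None          # name or pseudonym, if supplied
    title: Optional[str] = None           # title of the Work, if supplied
    uri: Optional[str] = None             # URI the licensor associates with the Work
    derivative_use: Optional[str] = None  # e.g. "French translation"


def credit_line(a: Attribution) -> str:
    """Build a one-line credit from whichever fields were supplied."""
    work = f'"{a.title}"' if a.title else "the Work"
    by = f" by {a.author}" if a.author else ""
    if a.derivative_use:
        # Derivative Work: identify how the original Work was used.
        text = f"{a.derivative_use} of {work}{by}"
    else:
        text = f"{work}{by}"
    if a.uri:
        text += f" ({a.uri})"
    return text
```

Fields that were not supplied are simply left out, which mirrors the "if supplied" wording in the clause. Whether such a line would count as "reasonable to the medium" on a sentence page is a separate question that the sketch does not answer.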

...

Section 7, termination:
"This License and the rights granted hereunder will terminate automatically upon any breach by You of the terms of this License."
CK
3 days ago
For Stylish users, I've updated this.

Tatoeba.org - Hide Top 5 Language Stats on Home
https://userstyles.org/styles/1...-stats-on-home

Description:

This hides the top 5 language stats on the main page and brings the previews of the latest Wall postings up closer to the top of the page.

To see all the language stats, you will need to go to this URL: https://tatoeba.org/eng/stats/s...es_by_language

This is what it looks like on my computer.
https://userstyles.org/style_sc...g?r=1571027016

There are other userstyles for tatoeba.org, written by me and others.
Here's a search.

https://userstyles.org/styles/b...eba&type=false
Pfirsichbaeumchen
3 days ago - 3 days ago
Replying to this thread rather than starting a new one because the topic is related.

The "top five language stats" seem to occupy too much space. The fields for each language seem to be too tall, resulting in too much line spacing. The font seems to be too large. This is what it looks like for me: https://imgur.com/a/mIuNdpq.

I wonder if it wouldn't be better to either remove it completely or make it (much) smaller and display more languages instead. The numbers of the top five languages aren't really so interesting and don't show a lot of diversity. Making the languages displayed random would be another option.
CK
3 days ago - 3 days ago
Personally, I'd vote for something more like what's on the main non-logged in page instead of these Top 5 languages.

https://imgur.com/a/Brg7euP

I think regular members would find this more interesting.

Also, perhaps adding a note about when the "day" began, or making these numbers reflect the last 24 hours, would be nice. Maybe the line about "supported languages" isn't needed, since the link directs to a page that shows all the supported languages.

Another idea would be to display the last 2 or 3 full days of these stats instead, not including today's stats.

https://tatoeba.org/eng/contrib...meline/2019/10


That said, I've grown accustomed to seeing Wall message previews at the top, which I like, so I'm not sure we really need anything above those.
Pfirsichbaeumchen
3 days ago
Good suggestions worth considering. That would indeed be more interesting. 🙂
sabretou
3 days ago
I would like it if the Top 5 languages listed the top 5 contributed-to languages of the day, or maybe the week, as opposed to an unchanging all-time list.
TRANG
2 days ago
> The "top five language stats" seem to occupy too much space.

This is part of the transition to the responsive UI. You will see that in general, anything with a list of clickable items will take more space. This was already done for:
- the list of tags
- the links on the sidebar on the profile page

My guess is that the space it takes is not really the problem itself, but rather that the information displayed is too irrelevant for you.

> I wonder if it wouldn't be better to either remove it completely

There was an issue about it that I already wanted to solve several weeks ago:
https://github.com/Tatoeba/tatoeba2/issues/842

I aborted my plans because the whole Kabyle discussion was a bit too overwhelming and I had no time to really think carefully about what this block can be replaced with.

I had two ideas in mind:
- Displaying the same stats that are displayed on the homepage for non-authenticated users.
- Displaying the top 5 languages, but limited to the languages that the user has added in their profile.
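
The second idea can be sketched as a simple filter-and-sort. This is only an illustration, not how Tatoeba computes its stats: the function name `top_languages`, the dictionary of per-language sentence counts, and the list of profile language codes are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def top_languages(
    sentence_counts: Dict[str, int],
    profile_languages: Iterable[str],
    limit: int = 5,
) -> List[Tuple[str, int]]:
    """Return up to `limit` (language, count) pairs, largest count first,
    keeping only languages the user listed in their profile."""
    wanted = set(profile_languages)
    filtered = [(lang, n) for lang, n in sentence_counts.items() if lang in wanted]
    # Sort by count descending, then by language code for a stable order.
    filtered.sort(key=lambda pair: (-pair[1], pair[0]))
    return filtered[:limit]
```

A user with fewer than five profile languages would see a shorter list, and a user with none would see an empty block, so a fallback such as the first idea would probably still be needed.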

Removing the whole block was also something I considered, but doing that would remove the possibility to access stats of all languages. The link to https://tatoeba.org/eng/stats/s...es_by_language is only available from the stats block and I wasn't sure where is the best place to put it if this block was to be removed.
Pfirsichbaeumchen
2 days ago - 2 days ago
I immediately noticed it for the links in the sidebar on my profile page. I was hoping it was only temporary because I use that sidebar a lot and it now requires me to scroll up and down a lot because of the immense spacing. I didn't think that could be intentional. I would probably have kept silent if I hadn't noticed that design choice spreading, in this case to the top five languages.

I think all suggestions made for the latter so far would be improvements to what there is now.

On the other hand, the field where you write comments seems to have shrunk. There I would find something bigger more comfortable. It felt good the way it was before. 🙂
TRANG
2 days ago
The increase in spacing is intentional because we will be reusing design patterns from Material Design (https://material.io/) as much as possible. In Material Design, there is more emphasis on space because it takes the mobile experience into account. If you are browsing from a mobile phone, you need more space between items in order to be able to tap on the desired item.

Having a "compact mode" would definitely be a possibility in the future, but as of now, we are still designing with the default paddings and margins that come with the AngularJS Material framework (https://material.angularjs.org/).

I have taken note of the comment form. It has indeed shrunk, but that was not intentional.
CK
2 days ago
> The link to https://tatoeba.org/eng/stats/s...es_by_language is only available from the stats block and I wasn't sure where is the best place to put it if this block was to be removed.

A logical place for the link would be in the drop-down menu, just below "Show activity line."
Perhaps both could go under "Community" rather than "Contribute."