damascene, April 23, 2016 at 7:31:44 AM UTC

Hello,

I was invited to this project by a friend. I contributed Arabic sentences and used the website to help me learn Turkish.

Here is my experience:
1. My first concern was the "CC BY" license. I like the viral characteristic of the GPL. I had hoped my work would not be taken by some commercial company, locked up, and sold. "Share Alike" sounded better. Anyway, I thought it would not be that bad, as it could help people learn Arabic.

2. The heated discussion about the Arabic language being split into different languages, because SIL, the organization that wrote ISO 639-3, thinks that Arabic is actually 30 languages, without providing a study to support that.

https://github.com/Tatoeba/tatoeba2/issues/1084 https://github.com/Tatoeba/tatoeba2/issues/1079 http://linguistics.stackexchang...ne-for-english

In the end I got banned from the #tatoeba IRC channel after a disagreement and an exchange of words with a member. We apologized to each other afterward, but my ideas do not seem to fit there. Even after I talked to the leader of the project, it was not his decision, and I could not get back. It just does not seem like a comfortable and fair environment.

3. I like a piece of Turkish poetry. I posted it to Tatoeba, only to find that I had broken many rules. Well, even YouTube does not follow such strict rules.

Then I found that even sentences from Wikipedia are not compatible with Tatoeba, because it does not allow CC SA ("Creative Commons Share Alike") text.

This is the comment I got: "@possible copyright infringement

From the song "Hoşçakal" by Ayşe Gökalp. Please do not add lyrics to songs; it is against our copyright policies.

Also, damascene, please do not include annotations in your sentences, it is one of our rules in the Quick-Start Guide: http://en.wiki.tatoeba.org/arti...ow/quick-start
So in the future, for sentences like "...that we lived (together)." you would have to add two separate translations: one "...that we lived." and one "...that we lived together."

We would also recommend contributing sentences in your own native language. You could be helping us much more that way, since people would be able to trust that what you have contributed is likely to be good and natural-sounding.

[#1230823] If you translate from your second language into your own native language, rather than the other way around, you're less likely to make mistakes.

[#1907470] It's very easy to sound natural in your own native language, and very easy to sound unnatural in your non-native language.

Even if some sentences by non-native speakers are good, it's really hard to trust that they are good, so members would be helping us much more by limiting their contributions to sentences in their own native languages. Remember that the purpose of the Tatoeba Project is to create example sentences that can be used for studying languages. It’s not really a place to be contributing non-native language sentences for others to correct for you.

[#3946394] We recommend adding sentences and translations in your strongest language. If you are interested primarily in having your sentences corrected, you should try a site like Lang-8.com, where that's the focus."

If I want to translate some text or a sentence that I like into English, I'll have to ask a native speaker to add it, instead of there being a way for them to review it with the click of a button? And what about fair use? A sentence or three from a poem is fine by all normal standards.

I am really disappointed by all of this. I hope you will be more friendly and understanding.

cueyayotl, April 23, 2016 at 9:13:01 AM UTC

First of all, we are sorry you feel this way. However, there simply have to be certain conventions by which we abide in order for this project to be successful.

1,3) I will reply to these two issues together. Copyright laws are stronger in some countries than they are in others. Even YouTube follows copyright laws and takes down millions of videos that break them. If you like poetry, then great! So do we :) Just make sure that the actual poetry cannot have any copyright protection applicable. If the artist died more than 50-100 years ago, you can be sure that their works can be uploaded here on Tatoeba.

If you add sentences in other languages, they may contain mistakes and be overlooked by native speakers (remaining uncorrected). This will unfortunately result in lowering the quality of the corpus, which we have to minimize at all costs. Even if your sentences are corrected, they may still sound unnatural because the sentence structure you suggested may not be the best, so that the sentence gets corrected AROUND the structure that you suggested, rather than reworded to sound more natural.
If you want to practice English, Lang-8 is an excellent site to do so! I often use that site myself :)

2) sil.org is full of great studies supporting their decisions for separating languages; definitely check them out. Your case is similar to mine; my native language is Nahuatl and we too consider it a single language. However, in all honesty, some linguistic varieties are just TOO distinct to be considered to be the same "linguistic unit"; it is even worse for Zapotec (which has been spoken in the area for thousands of years). We simply cannot have all varieties of Nahuatl added into ONE category; we must divide them into "linguistic units" which we call here "languages". Doing so is no easy task, as no matter what you do, somebody will disagree. But, it must be done through research and field studies, and careful analyses of linguistic variety, especially when dealing with a linguistic continuum.
You must recognize that a rose by any other name... is still a rose. What we call the Arabic macrolanguage, you call the Arabic language, and what we call the Gulf Arabic language, you call the Gulf Dialect of Arabic. My Nahuatl language is divided into 30 "languages" through the ISO 639-3 classification, and to me they are "linguistic varieties", but in the end, I know what each "linguistic unit" means, and so does everyone else... so now we can work together.

We hope that you understand, and we do apologize if we hurt your feelings, as that is most definitely not our intention. Please do let us know if you have any other concerns that need to be addressed; we will be here for you :)

damascene, April 24, 2016 at 8:34:28 AM UTC

Fair use of others work
=================

If I copy two to four lines of a 30-line poem, should I attach a death certificate of the poet, or other legal documents? I do not see this as practical. It's fair use, and it is annoying to find such non-standard, strict rules about it, which also prevent the use of Wikipedia's CC-SA licensed text.

Translating to non native language
=================
As for me translating into English: I do not think I'll translate into English again. I did not want to practice English; I just wanted to share a poem. I do not feel that English should always be the feeder and other languages the receivers. It would be better if other cultures were translated into English without waiting for native English speakers to learn their languages and then translate them. It's English that we are talking about. It's the last language you should worry about; it has billions of study materials. I believe English is a necessity, but that does not mean it should be placed higher than other languages.

Language learning website Lang-8
=================
As for Lang-8, I did not see what license they are using. Are they non-profit? Do they share their sentence database publicly, or is it closed for their own use? Could it disappear, as happened to Livemocha, along with all user contributions?

SIL Specifications
=================
I do not want to extend the discussion on this, as I do not like to get into more arguments about it without having enough knowledge of other languages. But I'm sure we can understand each other easily without needing to learn another language, as described in https://en.wikipedia.org/wiki/M...ntelligibility, which differs between languages/varieties.

Thank you for taking the time to respond.

Ooneykcall, April 24, 2016 at 9:44:39 PM UTC (edited April 24, 2016 at 9:45:25 PM UTC)

>If I copy two to four lines of a 30-line poem, should I attach a death certificate of the poet, or other legal documents? I do not see this as practical. It's fair use, and it is annoying to find such non-standard, strict rules about it, which also prevent the use of Wikipedia's CC-SA licensed text.

Tatoeba appears to operate on some sort of common understanding here. I think Tatoeba admins just hate the thought of having to deal with copyright enforcement (having to respond to their demands), so if there is a remote possibility of that happening, copyright-infringing sentences will get deleted 'to be on the safe side', but if it's a little-known source that few have heard of (so any copyright action is very unlikely to happen), it will most likely not be picked up.

odexed, April 23, 2016 at 9:27:13 AM UTC (edited April 23, 2016 at 10:11:30 AM UTC)

Hi, damascene

Nice to see you here, and it's a pity that you are disappointed.
Here are some of my thoughts on what you wrote.
1) The GPL is a software license; I don't think it's appropriate for Tatoeba. Creative Commons licenses are good for all kinds of creative works. There is CC BY-NC 2.0, which is for non-commercial use, but Tatoeba permits its data to be used by other projects (even commercial ones), so if you contribute sentences you have to agree to that.

2) We don't split the Arabic language; we represent the current situation as other qualified linguists do. You may have no trouble with the Arabic spoken in different countries, but to me they are really different. For example, تذهب الآن إلى السوق in Syrian Arabic would be انت هلق بروح ع السوق and in Saudi Arabia انت دخّين تروح ع السوق
There are even differences in pronunciation; for example, ق in Saudi Arabia sounds different from how it's pronounced in Syria. Or ج in Egypt and in other countries.
Also, there are differences in grammar, and the vocabulary is totally different.

3) It takes some time to get used to the rules but we all obey them.
Anyway your contribution is valuable and as a student of Arabic I read your sentences too. Thank you.

damascene, April 24, 2016 at 8:48:24 AM UTC

Hi Odexed,

Nice to see you too.
1) I liked the spirit of the GPL. Its counterpart in CC would be CC-SA (Share Alike), so that users' work cannot, for example, be used by Google or Yandex without them giving back to the community they took from. I'm not against commercial use, but I think if you benefit from my work you should also share it back.

2) The three examples you provided are just different choices of words.
For example:

(فلما _ذهبوا_ به واجمعوا ان يجعلوه في غيابت الجب واوحينا اليه لتنبئنهم بامرهم هذا وهم لا يشعرون) Yusuf 12:15
Here it uses ذهب, as in Syrian Arabic.

In انت هلق بروح ع السوق they used the word راح, as in:
(ولسليمان الريح غدوها شهر _ورواحها_ شهر واسلنا له عين القطر ومن الجن من يعمل بين يديه باذن ربه ومن يزغ منهم عن امرنا نذقه من عذاب السعير) Saba 34:12

For the word حين:
(الذي يراك حين تقوم) Ash-Shu'ara (The Poets) 26:218

As for ع, it's shorthand for على.

So they are just one language using different combinations. I see it as similar to the difference between Scottish English, British English, and Indian English.
If you wrote it as you hear it, I'm sure it would not be the same thing.

damascene, April 28, 2016 at 5:33:42 AM UTC

4. I've found a sentence, translated into many languages, that says "Putin is a **** head"

I do not like Putin, but how did such a sentence get here? Does Tatoeba allow personal attacks on public figures? Can I put in any name I want?

User55521, April 28, 2016 at 7:32:54 AM UTC

We have a precedent of a sentence #2926572 (Mandela was the terrorist, Mandela was the murderer.) being deleted, so probably you can get this one deleted for the same reasons. However, apparently no one cared about Putin enough to get the sentence deleted so far.

odexed, April 28, 2016 at 7:58:38 AM UTC (edited April 28, 2016 at 8:00:42 AM UTC)

The censorship on Tatoeba is quite vague. Some sentences like the one mentioned by Impersonator are being deleted, others like #4111307 and #3459609 are considered as freedom of expression. So you could chance it.

cueyayotl, April 28, 2016 at 10:16:19 AM UTC

I agree with odexed that censorship here is quite vague, but it is more important to have more of a variety of sentences rather than many sentences of that format: "X is a **** head!" just by replacing 'X' with one of many other names. In the end, it may just be best to use names like Tom or Mary.
I think the main instance where such sentences were deleted was when using one of our users' names.