Wall (4263 threads)

TRANG
6 hours ago
** Tatoeba Day #21 **

Tatoeba Day starts now and as usual will end in 24 hours!

Tatoeba Day is an event where you can get involved a little bit (or a lot) more than you usually would, to help make Tatoeba more awesome. You can find more details about it here:
http://en.wiki.tatoeba.org/arti...ow/tatoeba-day

After the last Tatoeba Day I started a small campaign to try and find more developers for Tatoeba. I was pleasantly surprised to receive more responses than I expected (in total, around 25 people responded to the call).
Although it is uncertain how much each person who contacted us will get involved in the project, it is really nice to see that so many people are interested. A few of them have already started submitting code for small bug fixes/enhancements. I'm really looking forward to seeing how things will progress :)

For this Tatoeba Day, now that we have new members in the dev team, I'm hoping that with some of them we can fix as many issues as possible. Priorities will be bugs[1] and enhancements that are "accepted"[2].

You can already test the following issues on the dev website:
- https://github.com/Tatoeba/tatoeba2/issues/955
- https://github.com/Tatoeba/tatoeba2/issues/1119

I'll keep updating this thread as new issues are fixed.

Happy Tatoeba Day!

-----

[1] https://github.com/Tatoeba/tato...ue+label%3Abug
[2] https://github.com/Tatoeba/tato...tus%3Aaccepted
Guybrush88
6 hours ago - edited 6 hours ago
First request: https://github.com/Tatoeba/tatoeba2/issues/765

Maybe it's time to update the strings and the URLs to cover the current user roles, since they still show the old role names, which might be confusing for new users.
Guybrush88
6 hours ago
Second request: https://github.com/Tatoeba/tatoeba2/issues/720

This would be pretty useful to anyone who wants to browse sentences with a given tag in a given language. For example: if I'm learning English/French/any language, I'd like to browse the sentences that were tagged as 'OK', and I would find it extremely useful to see the exact number of sentences with that tag, and not just the page number.
Guybrush88
4 hours ago - edited 4 hours ago
I also saw these two nasty bugs:

https://github.com/Tatoeba/tatoeba2/issues/1057 ("Odia cannot be searched")
https://github.com/Tatoeba/tatoeba2/issues/1056 ("Kannada cannot be searched")
honestlang
5 hours ago - edited 5 hours ago
Hey, just want to say that I am glad to be part of this website and I apologize for my little mishaps but thankfully I am already getting used to the site.

I hope I can be of really good use on here and I hope to be able to play a huge role in the development of Tatoeba. Thanks.
CK
2 days ago - edited 2 days ago
** New Audio Files **

Today, I uploaded 84 new Latin audio files by alexmarcelo, so now he has 926 Latin audio files.

https://tatoeba.org/eng/sentenc.../show/4134/und

I also added 783 English audio files today.

https://tatoeba.org/eng/sentenc.../show/4000/und
CK
yesterday - edited yesterday
Today (2016-04-29), I uploaded 28 Naga (Tangshang) audio files by Khamlan.

https://tatoeba.org/eng/sentenc.../show/6053/und

I also added 181 English audio files today.

https://tatoeba.org/eng/sentenc.../show/4000/und
GrizaLeono
3 days ago
Sometimes the language flag is wrong when you choose "auto-detect" for a sentence you are about to submit. If you choose to indicate the language yourself, you have to make that choice again every time.

In my opinion, it would be preferable for the program to leave the choice unchanged until the author changes it.
alexmarcelo
3 days ago
Your flag choice is kept when you move to the next page, but I agree this could be improved.
GrizaLeono
2 days ago
In more detail:
I use, for example, the following page:
https://tatoeba.org/epo/sentenc...ferent/page:84
Whenever I manually select "Esperanto" in the list for the translated sentence, the system changes that selection to "Auto-detect" at the next sentence.
After I translate the last sentence on that page with "Esperanto" selected, that selection is indeed kept on the next page (85). But within the page the selection always jumps back to "Auto-detect".
GrizaLeono
yesterday
You are right: if you enter a new page with the desired flag, it stays for the whole page.
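
The behaviour requested in this thread (keep the manually chosen language until the author changes it) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TranslationForm`, `AUTO_DETECT`, `choose` and `submit` are hypothetical names, not Tatoeba's actual code.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical code standing in for the "auto-detect" option.
AUTO_DETECT = "auto"


@dataclass
class TranslationForm:
    """Hypothetical model of the language selector on a translation form."""

    selected_lang: str = AUTO_DETECT

    def choose(self, lang: str) -> None:
        # The author explicitly picks a language (or auto-detect again).
        self.selected_lang = lang

    def submit(self, text: str) -> Tuple[str, str]:
        # "Sticky" behaviour: the selection is NOT reset after a submit,
        # so the next sentence reuses the author's last choice.
        return (self.selected_lang, text)
```

With this model, a user who calls `choose("epo")` once gets `"epo"` on every later `submit` until they call `choose` again.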
maaster
yesterday
Hi everyone,

Lately I've been working on animal noises.
raggione and I were thinking about a list of animal noises. One already exists (What do animals "say" in your language?). I wonder whether an "animal noises" tag would be better. It would be easier to handle (to add sentences to it and to search by it).
In Hungarian there are about 40-50 different animal onomatopoeias (I don't know exactly how many; it may be fewer or more) and even more sentences.
I know there are already too many lists and tags. I'm waiting for your views and remarks, pros and cons.

Hi all!

Lately I have been enjoying working with animal sounds, as you may have noticed.
I would like to know whether you think an "animal noises" tag would be a good idea; it would make tagging and searching by tag easier.

Greetings!

These days I am working on animal sounds.
We thought a list of animal sounds would be good. One already exists, but when I tried adding sentences to it, I found it complicated. In short: we should create an 'animal noises' tag to make the whole thing simpler.
I know that a lot of lists and tags already exist. But still.
I look forward to your remarks, opinions ... and new sentences (as well as corrections :)
(I know you all speak English too, but it does my brain cells good to write in German as well :)
CK
yesterday - edited yesterday
What you are suggesting is likely more appropriate for a standard multi-lingual dictionary rather than for the Tatoeba Project which focuses on collecting sentences.

At one time, someone added a lot of (perhaps less-than-natural-sounding) English sentences that were sort of related to this idea, but using verbs for how animals "speak."
damascene
7 days ago
Hello,

I was invited to this project by a friend. I contributed in Arabic and used the website to help me learn Turkish.

Well, here is my experience:
1. My first concern was the "CC BY" license. I like the viral characteristic of the GPL (it spreads like a virus). I hoped my work would not be taken by some commercial company and then locked up and sold. "Share Alike" sounded better. Anyway, I thought it would not be that bad, as it could help people learn Arabic.

2. The heated discussion about the Arabic language being split into different languages, because the SIL organization, which wrote ISO 639-3, thinks that Arabic is actually 30 languages, without providing a study to support that.

https://github.com/Tatoeba/tatoeba2/issues/1084 https://github.com/Tatoeba/tatoeba2/issues/1079 http://linguistics.stackexchang...ne-for-english

In the end I got banned from the #tatoeba IRC channel after a disagreement and an exchange of words with a member. We apologized to each other afterward, but my ideas do not seem to fit there. Even after I talked to the leader of the project, it was not his decision, and I could not get back in. It just does not seem like a comfortable and fair environment.

3. I like a line of poetry in Turkish. I posted it to Tatoeba, only to find that I had broken many rules. Well, even YouTube does not follow such strict rules.

Then I found that even sentences from Wikipedia are not compatible with Tatoeba, because it does not allow CC SA ("Creative Commons Share Alike") text.

This is the comment I got: "@possible copyright infringement

From the song "Hoşçakal" by Ayşe Gökalp. Please do not add lyrics to songs; it is against our copyright policies.

Also, damascene, please do not include annotations in your sentences, it is one of our rules in the Quick-Start Guide: http://en.wiki.tatoeba.org/arti...ow/quick-start
So in the future, for sentences like "...that we lived (together)." you would have to add two separate translations: one "...that we lived." or "...that we lived together."

We would also recommend contributing sentences in your own native language. You could be helping us much more that way, since people would be able to trust that what you have contributed is likely to be good and natural-sounding.

[#1230823] If you translate from your second language into your own native language, rather than the other way around, you're less likely to make mistakes.

[#1907470] It's very easy to sound natural in your own native language, and very easy to sound unnatural in your non-native language.

Even if some sentences by non-native speakers are good, it's really hard to trust that they are good, so members would be helping us much more by limiting their contributions to sentences in their own native languages. Remember that the purpose of the Tatoeba Project is to create example sentences that can be used for studying languages. It’s not really a place to be contributing non-native language sentences for others to correct for you.

[#3946394] We recommend adding sentences and translations in your strongest language. If you are interested primarily in having your sentences corrected, you should try a site like Lang-8.com, where that's the focus."

If I want to translate some text or a sentence that I like into English, I'll have to ask a native speaker to add it, instead of there being a way for them to review it with the click of a button? And what about fair use? A sentence or three from a poem is fine by all normal standards.

I am really disappointed by all of this. I hope you will be more friendly and understanding.
cueyayotl
7 days ago
First of all, we are sorry you feel this way, however there simply have to be certain conventions by which we must abide in order for this project to be successful.

1,3) I will reply to these two issues together. Copyright laws are stronger in some countries than they are in others. Even YouTube follows copyright laws and takes down millions of videos that break them. If you like poetry, then great! So do we :) Just make sure that the actual poetry cannot have any copyright protection applicable. If the artist died more than 50-100 years ago, you can be sure that their works can be uploaded here on Tatoeba.

If you add sentences in other languages, they may contain mistakes and be overlooked by native speakers (remaining uncorrected). This will unfortunately result in lowering the quality of the corpus, which we have to minimize at all costs. Even if your sentences are corrected, they may still sound unnatural because the sentence structure you suggested may not be the best, so that the sentence gets corrected AROUND the structure that you suggested, rather than reworded to sound more natural.
If you want to practice English, Lang-8 is an excellent site to do so! I often use that site myself :)

2) sil.org is full of great studies supporting their decisions for separating languages; definitely check them out. Your case is similar to mine; my native language is Nahuatl and we too consider it a single language. However, in all honesty, some linguistic varieties are just TOO distinct to be considered to be the same "linguistic unit"; it is even worse for Zapotec (which has been spoken in the area for thousands of years). We simply cannot have all varieties of Nahuatl added into ONE category; we must divide them into "linguistic units" which we call here "languages". Doing so is no easy task, as no matter what you do, somebody will disagree. But, it must be done through research and field studies, and careful analyses of linguistic variety, especially when dealing with a linguistic continuum.
You must recognize, that a rose by any other name... is still a rose. What we call the Arabic macrolanguage, you call the Arabic language, and what we call the Gulf Arabic language, you call the Gulf Dialect of Arabic. My Nahuatl language is divided into 30 "languages" through the ISO 639-3 classification, and to me they are "linguistic varieties", but in the end, I know what each "linguistic unit" means, and so does everyone else... so now we can work together.

We hope that you understand, and we do apologize if we hurt your feelings, as that is most definitely not our intention. Please do let us know if you have any other concerns that need to be addressed; we will be here for you :)
damascene
6 days ago
Fair use of others' work
=================

If I copy two to four lines from a 30-line poem, should I attach the poet's death certificate? Or other legal documents? I do not see this as practical. It is fair use, and it is annoying to find such non-standard, strict rules about it, which also prevent the use of Wikipedia's CC-SA-licensed sources.

Translating into a non-native language
=================
As for translating into English: I do not think I will translate into English again. I did not want to practice English; I just wanted to share a poem. I do not feel that English should always be the feeder and other languages the receivers. It would be better if other cultures were translated into English without waiting for a native English speaker to learn their language and then translate it. It's English we are talking about; it is the last language you should worry about. It has billions of study materials. I believe English is a necessity, but that does not mean it should be placed higher than other languages.

Language learning website Lang-8
=================
Lang-8: I could not see what license they use. Are they non-profit? Do they share their sentence database publicly, or is it closed for their own use? Could it disappear, along with all user contributions, as happened to Livemocha?

SIL Specifications
=================
I do not want to extend the discussion on this, as I do not like to get into more arguments about it without having enough knowledge of other languages. But I'm sure we can understand each other easily without needing to learn another language, as described in https://en.wikipedia.org/wiki/M...ntelligibility, which differs between languages/varieties.

Thank you for taking the time to respond.
Ooneykcall
6 days ago - edited 6 days ago
>If I copy two to four lines from a 30-line poem, should I attach the poet's death certificate? Or other legal documents? I do not see this as practical. It is fair use, and it is annoying to find such non-standard, strict rules about it, which also prevent the use of Wikipedia's CC-SA-licensed sources.

Tatoeba appears to operate on some sort of common understanding here. I think Tatoeba admins just hate the thought of having to deal with copyright enforcement (having to respond to their demands), so if there is a remote possibility of that happening, copyright-infringing sentences will get deleted 'to be on the safe side', but if it's a little-known source that few have heard of (so any copyright action is very unlikely to happen), it will most likely not be picked up.
odexed
7 days ago - edited 7 days ago
Hi, damascene

Nice to see you here, and it's a pity that you are disappointed.
Here are some of my thoughts about what you wrote.
1) The GPL is a license for software; I don't think it is appropriate for Tatoeba. Creative Commons licenses are good for all kinds of creative works. There is CC BY-NC 2.0, which is for non-commercial use, but Tatoeba permits the use of its data in other projects (even commercial ones), so if you contribute sentences you have to agree to that.

2) We don't split Arabic language but represent the current situation as other qualified linguists do. You may have no trouble speaking Arabic from different countries but for me they are really different. For example, تذهب الآن إلى السوق in Syrian Arabic would be انت هلق بروح ع السوق and in Saudi Arabia انت دخّين تروح ع السوق
There are even differences in pronunciation, for example ق in Saudi Arabia sounds different from how it's pronounced in Syria. Or ج in Egypt and in other countries.
Also, there are differences in grammar, and the vocabulary is totally different.

3) It takes some time to get used to the rules but we all obey them.
Anyway your contribution is valuable and as a student of Arabic I read your sentences too. Thank you.
damascene
6 days ago
Hi Odexed,

Nice to see you too.
1) I liked the spirit of the GPL. Its counterpart in CC would be CC-SA (Share Alike), so users' work, for example, cannot be used by Google or Yandex without them giving back to the community they took from. I'm not against commercial use, but I think if you benefit from my work you should also share it back.

2) The three examples you provided are just different choices of words.
For example:

(فلما _ذهبوا_ به واجمعوا ان يجعلوه في غيابت الجب واوحينا اليه لتنبئنهم بامرهم هذا وهم لا يشعرون) Yusuf 15:12
here it uses ذهب as in Syrian Arabic

in انت هلق بروح ع السوق they used the word راح as in:
(ولسليمان الريح غدوها شهر _ورواحها_ شهر واسلنا له عين القطر ومن الجن من يعمل بين يديه باذن ربه ومن يزغ منهم عن امرنا نذقه من عذاب السعير) Saba 34-12

for the word حين:
(الذي يراك حين تقوم) Ash-Shu'ara (The Poets) 26:218

for the ع: it's a shortcut for على

so they are just one language using different combinations, I see it similar to the difference between Scottish English/British English/Indian English.
If you write it as you hear it I'm sure it would not be the same thing.
damascene
2 days ago
4. I've found a sentence, translated into many languages, that says "Putin is a **** head"

I do not like Putin, but how did such a sentence get here? Does Tatoeba allow personal attacks on public figures? Can I put in any name I want?
Impersonator
2 days ago
We have a precedent of a sentence #2926572 (Mandela was the terrorist, Mandela was the murderer.) being deleted, so probably you can get this one deleted for the same reasons. However, apparently no one cared about Putin enough to get the sentence deleted so far.
odexed
2 days ago - edited 2 days ago
The censorship on Tatoeba is quite vague. Some sentences like the one mentioned by Impersonator are being deleted, others like #4111307 and #3459609 are considered as freedom of expression. So you could chance it.
cueyayotl
2 days ago
I agree with odexed that censorship here is quite vague, but it is more important to have more of a variety of sentences rather than many sentences of that format: "X is a **** head!" just by replacing 'X' with one of many other names. In the end, it may just be best to use names like Tom or Mary.
I think the main instance where such sentences were deleted was when using one of our users' names.
TRANG
4 days ago
There will be a small change to make the search button look more clickable:
https://github.com/Tatoeba/tatoeba2/issues/736

The dev website has been updated to include this change so you can test it there: http://dev.tatoeba.org/eng/

Please let me know if this change doesn't feel suitable for you (and why).


I'd also like to inform/remind everyone that there will be a Tatoeba Day this Sunday (May 1st).
alexmarcelo
4 days ago
Bravo !
CK
4 days ago
It looks better, I think.

Perhaps it would be a good idea to change all buttons on tatoeba.org so they also have the same look.

"show another", "show more...", "Send", "OK", "Submit comment", "« previous", "next »", "random", "Submit", "Submit translation", ...

AlanF_US
4 days ago
It looks good to me.
alexmarcelo
4 days ago
::: Request to become an Advanced Contributor* :::
User number 68437, ToinhoAlam (https://tatoeba.org/eng/user/profile/ToinhoAlam), who contributes primarily in Portuguese, is applying for the AC status. Any pros and cons should be sent directly to me, Pfirsichbaeumchen or Shishir, unless ToinhoAlam publicly states that the discussion can be held here.

*Advanced Contributors are able to link sentences and add tags.
sharptoothed
5 days ago
** Sentences & Translations Stats **

These stats have been updated:

http://tatoeba.j-langtools.com/transtop/

Graphs have been updated, too.

http://tatoeba.j-langtools.com/graphs.html
alexmarcelo
5 days ago - edited 5 days ago
I'm glad to see Latin and Marathi in that top 10. :-)
sharptoothed
4 days ago
Oh yes, this dead language looks pretty alive. :-D
Ricardo14
7 days ago
##New languages have been added on Tatoeba##

> Malay (Vernacular) - zlm
> Naga (Tangshang) - nst
> K'iche' - quc
> Arabic (Gulf) - afb
> Minangkabau - min
> Temuan - tmw
> Chinese (Jin) - cjy
> Maithili - mai
> Madurese - mad
> Banjar - bjn

Thanks a lot, @cueyayotl, for your efforts! We couldn't have done it without you!
brauchinet
6 days ago
Did you know that Temuan has 23,000 native speakers according to Wikipedia? Is one of them contributing? Or are these languages just a playground for one omnilingual person?
CK
6 days ago - edited 6 days ago
That's also a concern of mine.

We now have over 150 languages (over half) that each have fewer than 100 sentences. (See https://tatoeba.org/eng/stats/s...es_by_language)

Are we just adding these so certain members can show off their (likely less than perfect) language ability, or so that the Tatoeba Project can brag that it supports a large number of languages?

I personally think that if we don't have dependable contributors for these languages, we shouldn't be adding them.
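
A count like the one mentioned above could be reproduced from a sentence export with a short script. The following is a minimal sketch in Python, assuming each row is tab-separated as sentence id, language code, text; that layout and the function name `languages_below` are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List


def languages_below(rows: Iterable[str], threshold: int = 100) -> List[str]:
    """Return language codes that have fewer than `threshold` sentences.

    Assumes each row looks like "<id>\t<lang code>\t<text>" (an assumed layout).
    """
    counts: Counter = Counter()
    for row in rows:
        parts = row.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip malformed rows instead of guessing
        counts[parts[1]] += 1
    return sorted(lang for lang, n in counts.items() if n < threshold)
```

Passing an open file object as `rows` works as well, since the function only iterates over lines.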
alexmarcelo
6 days ago
I'll have to agree with that.
cueyayotl
6 days ago
Some of us come from places where people have been or are routinely ostracized or even killed for speaking minority languages. Organizations like the Summer Institute of Linguistics come to our villages, document and study our languages, and then bogart the information in volumes and other materials which require payment to be accessed. They in the end rarely help the people whose languages they study; I have personally experienced cases where the SIL uploaded a dictionary onto their site and no one from the villages where they speak the language knew about it. Tatoeba gives us the opportunity for us to share our cultures and languages freely, to create learning materials and recordings free of charge so that anyone who wants to learn a language, can. Realistically, we cannot add the thousands of "languages" that have ISO 639-3 codes, but we must do all in our power to include everybody.

>> Are we just adding these so certain members can show off their (likely less than perfect) language ability, or so that the Tatoeba Project can brag that it supports a large number of languages?

I have no idea how one can come up with such petty reasons... Maybe someone who likes playing numbers games?
Ricardo14
4 days ago
I do. But ONLY when I play video games with my brothers (I actually have 3). I am quite sure that I wouldn't be in charge of adding them if I were going to treat it as a playground. @brauchinet, @CK and everybody: if you feel that my work here is messing up Tatoeba, even with Trang's supervision, because I am adding languages which have 100 sentences or fewer FOR NOW, just let me know and I'll gladly stop adding languages, and unfortunately many languages won't be represented. Having few sentences in a certain language for now does not make it "not reliable, something that doesn't deserve to be on Tatoeba". But not having languages with potential contributors is really dangerous, since we may (and will) lose several potential members who don't even know how to add a sentence (FOR NOW) and who might not find a way to request their language before it is added.
For EACH language I added, I did some research. Is it a language? Are the sentences valid? Can we get more contributors? Do their translations match? Are they possible copyright infringements? Thanks to @cueyayotl and @odexed, those languages fulfill all those "minimal requirements" and were implemented. But if some think that I like to waste time and damage the corpus rather than studying languages or giving classes, feel free to ask me to stop, because my goal is to have as many languages as possible, as long as they meet all the requirements.
That's it.
cueyayotl
6 days ago
We've lost a huge number of possible contributors due to us not having their languages on our site. Those who have been on this site long enough know that it is possible to contribute in languages not yet added to the site, but unfortunately new users have no way of knowing this. I've added either languages of previous users who have left the site or languages with many native speakers in order to increase our chances of having a native speaker (in fact I was able to catch a native Minangkabau speaker who created an account and wasn't adding sentences: hopefully we will see much more of her here), don't worry though, I always consult native speakers before making contributions :) As has been seen with languages like Marathi, Berber and Macedonian, all it takes is a single, dedicated user to create a large corpus. Now that we can add languages to our profiles and filter out other languages, there is really no reason why we couldn't add the over 7000 "languages" with ISO 639-3 codes; it is purely for scholarly purposes and exploiting the potential of those who browse the site, "bragging" that Tatoeba supports many languages has nothing to do with it.
pullnosemans
6 days ago
maybe it would be a good idea to display an "is your language not in our database yet? click this link for instructions on how to add a new language" kinda message on the front page or another page that is very frequently visited.
cueyayotl
6 days ago
+1

This is definitely a great idea, and I hope it can help us minimize the number of languages with few contributions.
Ricardo14
4 days ago
+1
corvard
6 days ago
I don't think we have lost anything.
cueyayotl
6 days ago
Thank you for your input; I'm sorry if my words were confusing, but I meant to say that we have "failed to obtain" or, in other words, that we have "lost opportunities to expand our corpus".