I wonder if Khmer should really be called 'kmera' in Esperanto (as opposed to 'ĥmera'). In Cambodian esperantist circles, we use 'kmera', plus it is closer to its pronunciation in Khmer (kmae).

Good afternoon, ⵓⵎⴰⵔ
What a fantastic video! Are you able to transcribe it? It seems phonologically close enough to Kabyle to be able to transcribe without many problems.
So that's why... languages DO change a lot within 100 years, but it seems that even in Émile Laoust's time they were already different :/ there is no excuse: if they were already different enough to be classified as distinct languages before, then certainly now, they should be classified as distinct languages (and have distinct ISO 639-3 codes).
>> I'm planning to start working on a gigantic "basic corpus" suitable for any endangered language, including a language like South American Yanomamo.
In some languages, you can't count past '2'. Others don't have colors. Forget asking a Yanomamo native how to say "I rode my camel across the desert." :) I guess there could never be a list where all the sentences applied to all forms of traditional life, so I suppose it just comes down to creating a list of sentences without neologisms/technology :)
Neologisms can pose problems, as is the case for 'credit card' as you mentioned before. In Mexico, most tribal languages would just use the Spanish word, as I would assume the Algerian tribal languages would just use the Arabic word. It is OK to have these translations, but it wouldn't be very helpful in the long run. (I do remember a clever translation of 'ATM' I heard in one of the Mixtec languages, though)
It'd be great to have such a "basic corpus" of sentences suitable for daily village life. I really wonder if just using a collaborative list would suffice. Also... be careful when you are linking sentences. If you translated some of Tatoeba's sentences into Algerian Arabic FIRST, the Blida Atlas dialect should be linked to Algerian Arabic only, until you understand this dialect well enough to conclude that it is a proper translation of the English.
>> How do you do to document the full conjugation pattern of a language?
If you find ANYONE who can conjugate a new verb you come across, HOLD ON TO THEM LIKE GOLD! In your case, the languages you wish to document are similar to your own, so the patterns will emerge much more quickly. Some languages will be more Arabic-influenced than others, but since you speak both Arabic and a Northern Berber language, you shouldn't have too many problems (unless one of the languages you wish to document has had some Nilo-Saharan influence as well).
At any rate, you must intensively research any languages within the same language family that have been documented and see the documented verb patterns, as well as those of any language that may have influenced it, in order to get an idea of what meanings verbs can convey in another language. Languages like Chinese (any), Vietnamese, Khmer, etc. do not conjugate their verbs, but others, such as Korean, may not conjugate by grammatical number, but DO have a very rich conjugation scheme based on tense, modality AND aspect. There are literally THOUSANDS of ways to conjugate a single verb in Korean, and I have yet to see a full list (though if I see a verb in a sentence, by all the conjugation patterns I have learned, I could identify all the 'data' it contains :) ).
Research as much as you can, and always remember the context in which a sentence was translated, as it can help give clues as to the deeper meanings of a particular instance of a verb.

Thank you, TRANG.
>> Is because they just don't feel interested in the project in general. Or is it because they found Tatoeba too complicated to use?
The reasons for not joining have been plenty. In MY case, most stem from people being too lazy or not having proper motivation. Though there ARE those who have complained that Tatoeba was too complicated to use. Example: user jeronimoconstantina (who initially did have problems with the interface) invited fellow Kapampangan native speaker reyjay1, who simply couldn't figure it out, and to this day has not contributed anything to Tatoeba.
Most people who refuse to join say that they WILL join someday, but retain a "let somebody else do it" attitude. Others say that there is no way that they would work for free. Others don't want to work for others (as bizarre as this sounds, it DOES happen. I've encountered a native Abkhaz speaker who was building his own English-Abkhaz corpus... when I asked if he could volunteer on Tatoeba and then use Tatoeba as a source on his site, he apologized because he did not work for others). And, still others don't see the value of Tatoeba (often with the excuse that 'crowdsourcing' can never yield a quality corpus). I cite, for example, user ManguPurty. He has tried his hardest to get more native Ho speakers to join, but being just 17 years old, others see him as being too young to take an initiative in something with real value. If I had the funds, I would fly to ManguPurty's village and do the recordings myself, as this Ho language is a language I have a genuine interest in (being an Austroasiatic language, related to Khmer and Vietnamese). The Ho language, unfortunately, has never had a standardized orthography... even in their Varang Kshiti script, or in Devanagari.
So, I have to second Amastan's motion of having an 'exclusively oral corpora for these languages', and not just that, but the possibility for multiple recordings of the same sentence (as in forvo.com).
I'm not sure what else could be done to attract more users...

I know that it isn't the same as the Chenoua dialect, but the SIL STILL considers what you would call the Blida Atlas Dialect as ISO 639-3 CNU (which it calls 'Chenoua' language). Maybe you should send SIL a message persuading them to give the Blida Atlas Dialect a separate ISO 639-3 code, otherwise it COULD cause problems later. I would support that decision 100% :)
The grammatical information you gave is very interesting. I have notes from Teotitlán del Valle Zapotec, which the SIL classifies under ISO 639-3 ZAB, though it is COMPLETELY unintelligible with Guelavía Zapotec (under the SAME ISO 639-3 code), but mutually intelligible with Mitla Zapotec (ISO 639-3 ZAW). I don't know who exactly came up with some of these linguistic divisions, but it is time we locals made some changes. This, unfortunately, is why my Colombian friend I spoke of before did not want to join.
Anyway, as for your question, use a combination of both. In a "classical method" account/conversation, it may be difficult to add translations in other languages, as many concepts are completely lost in translation. That is OK, you have several languages into which you can try to translate, so that we can have a good idea of what the original expression was. As most of the people who speak these nearly-extinct languages have never taught their language, they are not familiar with the scheme in which we learn Indo-European languages, namely "I eat." "You eat." "He eats." etc. I have been unfortunate enough to ask for translations of these consecutively and come up with no apparent pattern after changing the verb a few times (usually it is because they changed the aspect or modality of the verb between sentences). Sentence to sentence translation without context is generally very difficult for these people. What I recommend is to CREATE a story, or outline for a conversation, using sentences from Tatoeba. Maybe even use the same outline on a different day to get a different translation. Ask questions, and try your hardest to comprehend. Keep in mind that in your translations, 1st and 2nd person can get switched around, so try to pick up on those patterns SOON. I cannot tell you how many times I asked how to say something like "Your name is Tom." and received the equivalent of "My name is Tom."
One last thing: record EVERYTHING!! Even if you have HOURS of audio, record ABSOLUTELY EVERYTHING that they say. You won't believe how much gold will slip by when you are not recording.
I'm excited for you, I really am :D

I totally agree with you. If I could, I WOULD go out and get 100.000 sentences of each endangered language. Unfortunately, I am now in South Korea, without access to any of these languages. Next year, I plan to go back to Cambodia for a year. I have studied Cham [CJA] (among other minority languages) for quite some time now, completely puzzled as to how to document them (nobody writes them). I DID manage to bring Ngeq [NGT] to Tatoeba, though I am entirely dissatisfied with the amount of data I collected (I DO have more in my notes, though). Now, armed with more technology, I am excited to return and record as MUCH audio as I possibly can. Sa'och [SCQ] (spoken in Veal Renh commune: about 10.7193N, 103.8275E) died out during my stay in Cambodia, but hopefully other languages don't have to.
I think 100.000 sentences is MORE than enough to learn the basics of a language :P
But, it is a very nice goal.
How many people speak Blida Atlas Berber? Since we follow SIL's conventions for naming languages, we WOULD have to classify Blida Atlas Berber under the name 'Chenoua', though I am sure that, as long as no one is contributing in another dialect of 'Chenoua', TRANG would allow it to be named "Blida Atlas Berber (Chenoua)" [Cf: Odia (Oriya)], at least temporarily :)

These sentences were contributed by a native speaker using the Southern-Priangan dialect of Sundanese.
I am trying to convince him to join the project and contribute for himself, but have thus far failed.
If you would like his contact information, please send me a private message.
I have sent you a private message with 160 translations provided by the same person which I didn't add to Tatoeba. Please let me know if they are good, or if you believe that they should just be deleted. Thank you :)

Is there a way to change a sentence with audio? Several sentences have audio that is different from the written sentence, and also I have been discussing with some of our members whose native languages do not have a standardized spelling. It would be great for them to record audio, and then if there is ever a standardized spelling for their language, they can go back to their sentences and change their spellings.
One day, I would like to go back to Mexico and speak to people from different villages to help conserve their languages here on Tatoeba; however, the great majority of people speaking minority languages do NOT know how to write them. We could have them write in an ad hoc way, but record audio faithfully, and then those of us who DO know how to write indigenous languages can leave an orthography as comments for them to either modify their written sentences, or fix them for them.
Is there also a possibility for multiple recordings of the same sentence? Spanish is pronounced differently in every country where it is spoken (and even within the same country), and it would be great to have as many different pronunciations as possible (as in forvo.com). I see absolutely no reason why it could not be so.

One could argue that the Mexican flag should be used instead (though it is what we are using to represent the Nahuatl language) as it is the country with the most Spanish-speakers.
I personally think the (national) Spanish flag is the right choice, just as the Union Jack is the right choice for the English language.

'most likely', not 'likely' :)

Congratulazioni!! :D

Conlangs CAN be added; they just need to have an ISO 639-3 code. Most of the Conlang sentences we have are in Ithkuil, in Rothongua, in Kah (by user Yauh), and in whatever user 'artur' wrote with. Of all of these, Ithkuil is the most likely to receive an ISO 639-3 code. I'd be more than happy to see Ithkuil added to Tatoeba.
In the end, it is Trang's choice, but it would be such a waste to throw away these users' hard work.

+1
I sometimes get them, but other times, I don't.

That's definitely good for translation purposes; however, sometimes when studying languages with others, I search random sentences often to find a structure with which I am not familiar in the target language. I usually have to refresh 'Display random sentences' a few times before finding anything.

It definitely helps :) thank you!

Contribute >> Translate Sentences >> Display random sentences
Would it be possible to add more possibilities? 20, 35, 50, 75, 100?
Quite often 15 has not been enough.

A way to remove a single sentence from a list, without having to browse through the list.
We now can see which lists the sentence belongs to. How about a link to remove the sentence from the list (if the list is ours, or is collaborative).
Also, language filter for lists.

And thank you so much for your hard work and dedication :)
The Ilocano sentences kept coming and coming! It was hard to keep up with the list :P

Right, I don't think it is urgent or anything, but it is really difficult to search example sentences containing a specific Kapampangan word, because of the above reason: online dictionaries usually give the word without diacritics. The languages that most need this are precisely the Philippine languages, of which we will soon be adding Cebuano and Ilocano as well.
If it is too much of a problem to implement this, that is OK, maybe we can find a way around it. Thank you :)

This is interesting. If we look at sentence #4652117, we have the word 'Miⁿko' for chief, which is a valid alternate spelling of 'Mi̱ko'.
Someday, it'd be great to search for "Miko" in Choctaw and have both of the above spellings appear. :)
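To make the idea concrete, here is a rough sketch of how such a search key could be built in Python. This is only an illustration, not how Tatoeba's search actually works: the function name fold_choctaw is made up, and it assumes that the superscript n (U+207F) is purely a nasalization mark that can be dropped along with the macron below (U+0331).

```python
import unicodedata

# Assumption: superscript n (U+207F) only marks nasalization, so it is
# dropped in the search key just like the combining macron below.
NASAL_MARKS = {"\u207f"}


def fold_choctaw(text: str) -> str:
    """Return a lossy search key without diacritics or nasal marks."""
    # NFD splits a letter from its combining marks (i + U+0331, etc.).
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch
        for ch in decomposed
        if not unicodedata.combining(ch) and ch not in NASAL_MARKS
    ]
    return "".join(kept).casefold()


# All three spellings end up with the same key.
spellings = ["Mi\u207fko", "Mi\u0331ko", "Miko"]
keys = {fold_choctaw(word) for word in spellings}
```

If the sentences and the query are both folded this way before comparing, typing "Miko" would match either spelling, while the sentences themselves stay stored as written.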

That's perfect! Philippine languages CAN be written with diacritics, but are usually not in practice. The same is true for Native American languages, and even Latin.
I give for example in Tagalog (taken from http://www.unilang.org/course.php?res=79)
kaibígan - friend
kaibigán - desire
kaíbigan - sweetheart
káibigán - mutual consent
Though, in writing, usually you can tell one from another by context, 'friend' and 'sweetheart' can be confused.
It would be great, AlanF_US, if you could do this for particular diacritics for:
Tagalog, Kapampangan, Chavacano, Hiligaynon, Chamorro, Nahuatl, Choctaw, Mohawk, Shuswap, Navajo, Lakota... maybe even for the African languages as well. Also, for Latin, I should, for example, be able to type 'vacuō' and get results for 'vacuo' and 'vacvo' as well, and vice versa. Is there any way to do this?
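Just to illustrate what I mean, something along these lines might work. It is only a sketch under my own assumptions: the names LETTER_FOLDS, search_key and matches are hypothetical, the language keys are ISO 639-3 codes, and the Latin v/u and j/i equivalences are my guess at what would be useful, not an existing Tatoeba setting.

```python
import unicodedata

# Hypothetical per-language letter equivalences, applied after the
# diacritics are stripped. Only Latin ("lat") is filled in here.
LETTER_FOLDS = {
    "lat": str.maketrans({"v": "u", "j": "i"}),
}


def search_key(text: str, lang: str) -> str:
    """Fold case, strip combining marks, then apply language rules."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Languages without an entry are left as they are after stripping.
    return stripped.translate(LETTER_FOLDS.get(lang, {}))


def matches(query: str, sentence: str, lang: str) -> bool:
    """True if the folded query occurs inside the folded sentence."""
    return search_key(query, lang) in search_key(sentence, lang)
```

With this, 'vacuō', 'vacuo' and 'vacvo' all reduce to the same key for Latin, and the four Tagalog words above all reduce to 'kaibigan'. The folding is deliberately lossy, so a search would return every accent variant and the reader would pick the right one from context.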