Wall posts by xtofu80 (81 in total)

xtofu80, 19 October 2010 at 18:59:42 UTC

> It is OK to use idioms "internally" in your own culture, but trying to convey them outside of it doesn't make sense at all.

A huge portion of translation work is done not on texts that were explicitly designed for a foreign audience, but on texts or speech that were simply produced in the mother tongue and only later chosen for translation into other languages. For example, hardly any German author thinks about whether the sentence he is writing is easy or difficult to translate into Korean. Consequently, problems such as idioms can never be avoided, even though idioms are sometimes untranslatable (often they are translatable; I have learnt a lot of Japanese idioms which have German counterparts, such as
"焼け石に水"="ein Tropfen auf einen heißen Stein", literally "water on a scorched stone" and "a drop on a hot stone").

Just a question about Esperanto:
So do
to teach
to train
to drill
to educate
all have the same root?

xtofu80, 19 October 2010 at 00:10:47 UTC

I think that the corpus has to be bigger if there are many surface forms, that is, if a lot of inflection is used. The reason English is chosen as a pivot language in machine translation is that there are many translations from and into English, but fewer between the other languages themselves. For example, you will find much more material between Thai and English and between English and French than between Thai and French. Consequently, engineers try to use English as a pivot language. To use Esperanto instead, one would need text corpora of the same size (and from the same domain) for Thai-Esperanto and Esperanto-French, or, as you said, the necessary amount of text might be even larger because of the different number of surface forms.
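
To make the pivot argument concrete, here is a minimal sketch. The sentence-pair counts are invented purely for illustration (they are not real corpus statistics), and `pair_size`, `pivot_bottleneck` and `best_pivot` are hypothetical helper names. The underlying assumption is that a pivot system is limited by the smaller of its two corpora.

```python
# Hypothetical numbers of translated sentence pairs per language pair.
# These figures are made up for illustration only.
CORPUS_SIZES = {
    frozenset({"tha", "eng"}): 1_000_000,
    frozenset({"eng", "fra"}): 5_000_000,
    frozenset({"tha", "fra"}): 50_000,
    frozenset({"tha", "epo"}): 2_000,
    frozenset({"epo", "fra"}): 80_000,
}


def pair_size(a, b):
    """Return the (hypothetical) amount of parallel text between a and b."""
    return CORPUS_SIZES.get(frozenset({a, b}), 0)


def pivot_bottleneck(src, pivot, tgt):
    """A pivot route is only as well supplied as its smaller corpus."""
    return min(pair_size(src, pivot), pair_size(pivot, tgt))


def best_pivot(src, tgt, candidates):
    """Pick the candidate pivot with the largest bottleneck corpus."""
    return max(candidates, key=lambda p: pivot_bottleneck(src, p, tgt))


if __name__ == "__main__":
    for pivot in ("eng", "epo"):
        print(pivot, pivot_bottleneck("tha", pivot, "fra"))
    print("best:", best_pivot("tha", "fra", ["eng", "epo"]))
```

With these made-up numbers the English route is supplied by a million sentence pairs, while the Esperanto route is capped by its small Thai-Esperanto corpus.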

xtofu80, 19 October 2010 at 00:04:43 UTC

So what makes you believe that Esperanto preserves more of the original meaning if used as an intermediary language? If I translate from Japanese to Czech, why would Esperanto preserve the meaning better than English?
It is true that using pivot languages for translation leads to a loss of information, and I also agree that the choice of an intermediate language will influence the translation quality. However, I doubt that Esperanto will be the solution to all problems. Instead, I assume that the "optimal" pivot language will be different for each language pair.
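
A toy sketch of how a pivot can lose information. It uses Japanese to Chinese via English rather than the Japanese-Czech pair mentioned above, only because the effect is easy to show there: Japanese and Chinese both distinguish elder and younger brother, while English has a single word. The one-word dictionaries and function names are hypothetical and stand in for a real translation system.

```python
# Toy word-level dictionaries; a real system translates whole sentences.
JA_TO_EN = {"兄": "brother", "弟": "brother"}   # elder / younger brother
EN_TO_ZH = {"brother": ["哥哥", "弟弟"]}         # elder / younger brother
JA_TO_ZH_DIRECT = {"兄": ["哥哥"], "弟": ["弟弟"]}


def translate_via_pivot(word):
    """Japanese -> English -> Chinese: the pivot word merges two senses."""
    pivot_word = JA_TO_EN[word]
    return EN_TO_ZH[pivot_word]


def translate_direct(word):
    """Japanese -> Chinese without a pivot keeps the distinction."""
    return JA_TO_ZH_DIRECT[word]


if __name__ == "__main__":
    for word in ("兄", "弟"):
        print(word, "direct:", translate_direct(word),
              "via pivot:", translate_via_pivot(word))
```

Whether a pivot merges a distinction depends on the three languages involved, which fits the guess that the best pivot differs from one language pair to the next.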

xtofu80, 18 October 2010 at 23:49:50 UTC

Thanks for the link, I found the website very interesting. However, gramtrans is, as the name suggests, not a statistical machine translation (SMT) system but a system based on grammars constructed by hand. While an SMT system can be trained automatically given a huge bilingual corpus (the larger the corpus, the better the quality), grammar systems need hand-tweaking by linguists for decades, as there are many exceptions and special cases (maybe Esperanto has fewer exceptions than natural languages like English or German). I have yet to see a large-scale SMT system trained for Esperanto...

xtofu80, 18 October 2010 at 10:45:59 UTC

How about EU parliamentary speeches and news bulletins, are those available in Esperanto on a large scale?

xtofu80, 17 October 2010 at 00:10:14 UTC

Well, it will fail for financial reasons and for others as well. The problem is that to use Esperanto or a variant of it as a pivot language, you need a huge collection of already translated texts with Esperanto on one side, and such a collection, as far as I know, does not exist. Most such databases exist between living languages, where there was a natural need to translate documents. For example, the first texts to be used were parliamentary debates in Canada, which had to be translated between French and English. However, there is no natural need to translate texts into Esperanto, and hence no comparable body of texts is available for use in a machine translation system.

xtofu80, 16 October 2010 at 08:44:24 UTC

The disadvantage of writing in languages other than English is that not all users can follow the discussions. Since most people nowadays learn English as their first foreign language, writing posts in English is simply pragmatic and has nothing to do with imperialism.

xtofu80, 6 October 2010 at 08:49:47 UTC

Hello Trang. In general, your idea of supporting new users is a nice one. However, the "unofficial" system in which the more experienced user contacts the new user feels a bit awkward, especially with the "please link my profile in your profile" passage.
If we make contact via private message, it seems as if we want to "control" other users. It would therefore be good if this system were publicly explained on the introductory pages. It doesn't need to be a feature that requires a big implementation effort, but it should be transparent to both the experienced and the new user.

In general, I feel that the communication components of Tatoeba need to be brushed up. While the core system itself works pretty well, apart from commenting on a sentence there is hardly any good way to communicate. For example, if a topic is being discussed on the wall and new topics appear, the old topics simply vanish.

xtofu80, 4 September 2010 at 00:16:51 UTC

Well, Moore's law is about computer capacity, not about human capacity. If you are looking for a nice example sentence, but instead of getting 10 nice and distinct ones you get 500 variants of the same sentence, you don't want to have to go to page 117 to find the next distinct sentence.

xtofu80, 3 September 2010 at 12:36:06 UTC

I am not sure whether it makes sense to produce too many translation variants. Instead of writing ten translation variants for a single sentence, contributors should rather translate ten different sentences. By browsing through different sentences, language learners will then see that there are several ways to translate a sentence, e.g. they would encounter sentences which use "must" and sentences which use "have to". Creating too many translation variants for one sentence is a) boring and b) overwhelming, because it produces a long list of translations.
The reason why I do not like it is that when I search for a certain word, say "umbrella", instead of getting ten nice, different sentences in which umbrella is used, I get a thousand translation variants in which "must" is replaced by "has to", which has no bearing on the usage of the word umbrella.

An overuse of translation variants will make Tatoeba an unwieldy monster and, more problematically, rather useless to browse through.

Another problem is that all those translation variants will again be translated into other languages, creating an exponential number of sentence variants. All of this is rather useless in my humble opinion.
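
For the umbrella search problem, one conceivable remedy on the search side is to hide results that are near-copies of a result already shown. The sketch below is only an illustration of that idea, not a description of how Tatoeba's search works. The word-overlap measure (Jaccard similarity), the 0.5 threshold and all function names are assumptions chosen for the example.

```python
import re


def tokens(sentence):
    """Lower-cased set of words in a sentence, ignoring punctuation."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a, b):
    """Share of words two sentences have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def distinct_results(sentences, threshold=0.5):
    """Keep a sentence only if it is not too similar to one already kept."""
    kept = []
    for s in sentences:
        if all(jaccard(s, k) < threshold for k in kept):
            kept.append(s)
    return kept


results = [
    "I must take an umbrella.",
    "I have to take an umbrella.",
    "She left her umbrella on the train.",
]

if __name__ == "__main__":
    for sentence in distinct_results(results):
        print(sentence)
```

Here the "have to" variant shares four of seven words with the first sentence and is dropped, while the train sentence is kept because it only shares the word "umbrella".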

xtofu80, 18 August 2010 at 05:52:53 UTC

This sounds nice. Actually I used a similar system for looking up English sentences, namely the BYU system for browsing the British National Corpus, freely available for students:
http://corpus.byu.edu/bnc/

But does such a system exist for Japanese corpora?
I mean, besides the search facility, the corpora are the other essential ingredient. That's why I tried to use Google, because they do have the data.

xtofu80, 17 August 2010 at 15:10:24 UTC

Finding new sentences:
When there is a new word I want to learn, I look up example sentences in Tatoeba. I translate the sentences into my native language to get used to the word's collocations and nuances, and then I add some sentences to my SRS (Anki in my case).

But when I do not find a sentence here, I have great difficulty finding meaningful sentences on the web.
Even if I use the option "in the text of the page", I mostly get meaningless results such as headlines or single words, but not nice sentences. Does anyone have any ideas about how to find sentences in a more efficient way?

The above problem led me to the idea that users should be able to ask for sentences containing certain words, because native speakers might come up with sentences more easily.

xtofu80, 16 August 2010 at 16:43:14 UTC

Sure. And here it is:[ ]. Lol, do you see the difference?

xtofu80, 16 August 2010 at 13:36:19 UTC

I am not sure whether it is a bug or a curiosity ("feature"), but when you want to search for multiple Japanese words, the spaces between them have to be typed in English input mode. Typing a space while the Japanese IME is active leads to different results (always no results?).
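
One possible explanation, which is an assumption and not something confirmed for Tatoeba's search: Japanese IMEs typically insert the full-width ideographic space (U+3000) instead of the ASCII space (U+0020), and a search that only splits on the ASCII space would then see the whole query as a single word. A minimal sketch of normalizing the query before searching, with hypothetical function names:

```python
def split_query(query):
    """Split a search query into words.

    str.split() without arguments splits on any Unicode whitespace,
    which includes the ideographic space U+3000 typed by a Japanese IME.
    """
    return query.split()


def normalize_query(query):
    """Rejoin the words with plain ASCII spaces."""
    return " ".join(split_query(query))


if __name__ == "__main__":
    ime_query = "傘\u3000雨"    # words separated by a full-width space
    ascii_query = "傘 雨"       # words separated by an ASCII space
    print(split_query(ime_query))
    print(normalize_query(ime_query) == normalize_query(ascii_query))
```

After normalization both ways of typing the space give the same two search words.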

xtofu80, 13 August 2010 at 22:41:37 UTC

I wonder if this whole duplicate business could not be solved by automatically merging a duplicate sentence at the time it is entered into Tatoeba.

So if I add a new sentence, the server looks it up in the database, and if it is identical to an existing entry, both are merged, which is basically a link operation to the sentence I am translating from. That should be as demanding as a simple search in the database, plus one link operation.
As a consequence, the database would at no point in time contain duplicates.
(We would still have to allow multiple entries, though, when a sentence is spelled identically in two languages.)
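
The merge-on-insert idea can be sketched as follows. This is an in-memory illustration under stated assumptions, not Tatoeba's actual schema: the `SentenceStore` class, its method names, and the whitespace-only normalization are all hypothetical. The lookup key includes the language, so a sentence spelled identically in two languages still gets two entries.

```python
class SentenceStore:
    """Toy sentence database that never stores the same sentence twice."""

    def __init__(self):
        self._ids = {}        # (language, text) -> sentence id
        self._links = set()   # frozenset({id_a, id_b}) translation links
        self._next_id = 1

    @staticmethod
    def _normalize(text):
        # Assumption: sentences differing only in whitespace are duplicates.
        return " ".join(text.split())

    def add(self, lang, text, translation_of=None):
        """Add a sentence, reusing the existing entry if it is a duplicate."""
        key = (lang, self._normalize(text))
        sentence_id = self._ids.get(key)       # one lookup
        if sentence_id is None:
            sentence_id = self._next_id
            self._next_id += 1
            self._ids[key] = sentence_id
        if translation_of is not None and translation_of != sentence_id:
            # one link operation, also for an already existing sentence
            self._links.add(frozenset({sentence_id, translation_of}))
        return sentence_id

    def sentence_count(self):
        return len(self._ids)

    def link_count(self):
        return len(self._links)


if __name__ == "__main__":
    store = SentenceStore()
    english = store.add("eng", "OK.")
    first = store.add("deu", "OK.", translation_of=english)
    second = store.add("deu", "OK.", translation_of=english)
    print(first == second, store.sentence_count(), store.link_count())
```

Adding the German sentence a second time returns the existing entry, so the store holds two sentences (one per language) and a single link between them.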

xtofu80, 5 August 2010 at 13:37:52 UTC

I agree, though in highly inflected languages it is sometimes interesting to see how the gender of the subject changes the other words in the sentence. But if you think they are of no use, simply do not add such sentences, and do not translate all of those variants. That's how I do it.

xtofu80, 4 August 2010 at 22:43:52 UTC

I think discussions on the wall should be held in English. Only discussions in the comment section of a sentence in a particular language might be held in that very language, since a reader who cannot read the comment is probably not interested in the sentence anyway.

xtofu80, 4 August 2010 at 22:02:45 UTC

I think it would be wise to mark non-idiomatic translations of proverbs with a tag, because translations of proverbs are often rather weird.

xtofu80, 3 August 2010 at 16:56:08 UTC

Right, it is not about discrimination, it is rather about education. What I was originally aiming at was an encouragement, for example in an introductory video/text/mail, to add only sentences in one's mother tongue or a language one is confident in. I still think that Tatoeba is underdocumented / underguided. We should develop a nice tutorial and also some translation guidelines.

xtofu80, 3 August 2010 at 14:27:17 UTC

Well, I used the word "discourage" to indicate that the issue is more complex than a flat-out prohibition of all contributions outside one's mother tongue, and I can partly agree with the notion of "generating foreign language sentences outside their comfort zone".

However, I fear that example sentences created by non-natives will often be slightly awkward. Personally, I even hesitate to contribute in English, often double-checking with Paul when I have a correction to an English sentence. I think it is a question of corpus quality. In my personal opinion, it is far better to use this website by translating sentences from the language you want to learn into your mother tongue. By doing so, you learn the natural sentence structure, collocations, phrases etc. of the foreign language. On the other hand, if you produce sentences in a foreign language, they will sometimes be flat-out wrong, sometimes slightly wrong, sometimes weird-sounding etc. All of those not quite wrong, but still awkward sentences will degrade the quality of the example sentence database. You are right that this restriction would remove the opportunity to produce L2 output. But it would reduce the amount of poor-quality data on this website.

If people are looking for an opportunity to produce L2 content and be corrected by others, they can go to http://lang-8.com/, where you can write texts in your L2 and get corrected by others. Maybe it is only my personal opinion, but I would rather opt for a high-quality sentence database.