Wall (7134 threads)

[not needed anymore- removed by CK]

Your stats are wrong: rene1596 is not a native French speaker. He is a native Russian speaker.

I notice that, all of a sudden, the last sentences I entered don't appear on the home page... Bug?

Yes, and it hasn't been fixed yet...

.-.-.

This won't end well. ;)

Hey everyone, a few days (weeks?) ago I posted something about a script I wrote that helps me find Mandarin Chinese sentences that I can easily understand because I already know all the characters. I've finally written some documentation and put it on my blog:
http://www.der-patriotische-kap.../findsentences
Here's the abstract:
This script helps to find example sentences for Mandarin Chinese on tatoeba.org, based on a file with all the characters that the sentences may (exclusively) consist of. It outputs a file with the found sentences, which can be more or less conveniently imported into Anki, a spaced repetition learning software.
In addition to the sentences, the script searches a copy of the Chinese-English CEDICT dictionary to fetch translations for words consisting of two or three characters which occur in the sentences. It searches tatoeba.org for native Mandarin audio, and downloads it if there is an audio file.
Furthermore, it is possible to generate a set of audio files: if the sentences have native Mandarin audio, it will check the German or English translation for audio files. If those have audio as well, it will concatenate the audio files using the following pattern: Chinese - translation - Chinese - 1 second pause, preferring German over English. If there is no native audio file for the German or English translation, it will synthesize an audio file using the text-to-speech program espeak. Put these files on your MP3 player.
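
For readers curious how the character-based filtering described above could work, here is a minimal sketch. It is not the script from the blog post: the function names, the sample data and the decision to look only at the basic CJK Unified Ideographs block are all assumptions made for illustration.

```python
def is_han(ch: str) -> bool:
    """Rough check: is ch in the basic CJK Unified Ideographs block?"""
    return "\u4e00" <= ch <= "\u9fff"


def readable_sentences(sentences, known_chars):
    """Keep sentences whose Chinese characters are all in known_chars.

    Punctuation, digits and Latin letters are ignored, so only Han
    characters decide whether a sentence passes. Sentences without any
    Han character are dropped.
    """
    known = set(known_chars)
    result = []
    for sentence in sentences:
        han = [ch for ch in sentence if is_han(ch)]
        if han and all(ch in known for ch in han):
            result.append(sentence)
    return result


# Hypothetical sample data, not taken from Tatoeba.
known = "我你好是人"
sample = ["你好!", "我是人。", "他是学生。"]
print(readable_sentences(sample, known))
```

The third sample sentence is rejected because it contains characters that are not in the known set.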

Cool!!! That's exactly what Trang was dreaming of: someone making use of the massive stock of sentences for some learning application! I wish you the best for both learning and programming!

Just a little UPDATE to my Userscript:
*** Tatoeba Random Languages ***
http://userscripts.org/scripts/show/104964
Now you can lock a given language by double-clicking it (a lock symbol will appear on the flag), so you can temporarily bypass the randomness of the script! Enjoy!

You have excellent initiatives, Jakov, congratulations! I needed exactly that!

Regarding your comment http://tatoeba.org/deu/wall/show_message/8532 I also wondered whether it wouldn't be even better to pick only untranslated sentences for the random selection.

Good idea!


Hm, one could combine CK's information (i.e. those who declare themselves native) with the information obtained through my userscript "show user profile inline" (which I plan to improve, because I found out that it is much easier to fetch the sentence counts by downloading the per-language pages than by downloading all of the user's pages, and I think it doesn't work entirely well right now): e.g. one could set a threshold; if 50% of a user's sentences are in a certain language (and exceed a certain minimum, e.g. 100 sentences), he would be considered quasi-native and his username would always be marked whenever there is a sentence in that language.
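
The threshold idea above could be sketched roughly like this. The function name, the parameter names and the example counts are hypothetical, and treating "exceeds the minimum" as "at least the minimum" is an assumption.

```python
def quasi_native_languages(counts, min_share=0.5, min_sentences=100):
    """Return languages in which a user would count as quasi-native.

    counts maps a language code to the number of sentences the user
    owns in that language. A language qualifies when it holds at least
    min_share of all the user's sentences and at least min_sentences
    sentences in absolute terms.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(
        lang
        for lang, n in counts.items()
        if n >= min_sentences and n / total >= min_share
    )


# Hypothetical per-language sentence counts for two users.
print(quasi_native_languages({"epo": 900, "deu": 600, "eng": 50}))
print(quasi_native_languages({"fra": 60, "eng": 10}))
```

The second user has mostly French sentences but stays below the absolute minimum, so no language is marked.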

Could you add a "no Tom" option? That would be the icing on the cake!
I love your script, Jakov!

Why not?! It wouldn't be complicated: you only need to load the sentences that have been prepared and search them with a RegExp.
That way one could also skip sentences from certain users, for example one's own.
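
A rough sketch of that kind of filter, written in Python rather than as a userscript. The row layout (owner, text), the function name and the sample rows are assumptions for illustration only.

```python
import re


def filter_sentences(rows, banned_words=("Tom",), banned_users=()):
    """Drop rows whose text contains a banned word or whose owner is banned.

    rows is an iterable of (username, text) pairs. Banned words are
    matched as whole words, so "Tom" does not match "Tomorrow".
    """
    pattern = None
    if banned_words:
        alternatives = "|".join(re.escape(word) for word in banned_words)
        pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    banned = set(banned_users)
    kept = []
    for user, text in rows:
        if user in banned:
            continue
        if pattern is not None and pattern.search(text):
            continue
        kept.append((user, text))
    return kept


# Hypothetical sample rows.
rows = [
    ("CK", "Tom is here."),
    ("jakov", "Tomorrow is Monday."),
    ("me", "Hello."),
]
print(filter_sentences(rows, banned_users={"me"}))
```

Only the "Tomorrow is Monday." row survives: the first row mentions Tom and the third belongs to an excluded user.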

That would be great!

*** untranslated sentences ***
Not all languages are equal when it comes to non-translation
[ita] 10.03% of the Italian sentences are not translated => 4200 untranslated sentences!
[por] 8.94% of the Portuguese => 4490!
[deu] 8.67% of the German => 7630!
[eng] 5.20% of the English => 11030!
[epo] 4.77% of the Esperanto => 5700!
[fra] 3.70% of the French => 3930!
[rus] 2.76% of the Russian => 930
[pes] 2.19% of the Persian => 245
[spa] 2.15% of the Spanish => 1574
[vie] 1.97% of the Vietnamese => 61
[dan] 1.71% of the Danish => 99
[ukr] 1.33% of the Ukrainian => 230
[cmn] 1.25% of the Chinese => 419
[nld] 0.67% of the Dutch => 152
[jpn] 0.39% of the Japanese => 615
[heb] 0.35% of the Hebrew => 39
[tur] 0.25% of the Turkish => 99
[pol] 0.22% of the Polish => 91
[hun] 0.22% of the Hungarian => 39
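
Figures like these could be computed along the following lines. This is only a sketch: it assumes a mapping from sentence id to language and a list of link pairs, and it is not how the numbers above were actually produced. All ids and data are made up.

```python
from collections import Counter


def untranslated_share(sentence_langs, links):
    """Return {lang: (untranslated_count, percent_untranslated)}.

    sentence_langs maps a sentence id to its language code; links is an
    iterable of (sentence_id, translation_id) pairs. A sentence counts
    as translated if it appears on either side of at least one link.
    """
    linked = {sid for pair in links for sid in pair}
    total = Counter(sentence_langs.values())
    untranslated = Counter(
        lang for sid, lang in sentence_langs.items() if sid not in linked
    )
    return {
        lang: (untranslated[lang], 100.0 * untranslated[lang] / total[lang])
        for lang in total
    }


# Hypothetical toy data: two Italian and four English sentences, one link.
langs = {1: "ita", 2: "ita", 3: "eng", 4: "eng", 5: "eng", 6: "eng"}
print(untranslated_share(langs, [(1, 3), (3, 1)]))
```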

What's interesting is that the most common languages on Tatoeba are, proportionally, the least translated. Probably, when someone contributes in a "rare" language, they often do so either by only translating, or by adding sentences and then translating them shortly afterwards.
Contributors in languages that are fairly well represented on Tatoeba, on the other hand, take more liberties, probably thinking "Someone will end up translating me anyway".
Well, my analysis is rather stating the obvious...

Your theory doesn't work very well. Japanese is very popular, but also very well translated (0.39% untranslated). The least translated languages are Italian and Portuguese, but they are not the most popular.

>Japanese is very popular
There are very few native speakers contributing in Japanese. Even if you remove the fake natives from CK's fanciful statistics, the number of contributors in Japanese is minimal compared to the other popular languages. The vast majority of the Japanese sentences were imported from the Tanaka Corpus. Very few were created on Tatoeba...
Compared to Esperanto, French, German or Spanish, the popularity of Japanese on Tatoeba is quite relative...

Facts about the Tanaka Corpus
--> http://www.edrdg.org/wiki/index.php/Tanaka_Corpus

And in fact I have just revised the figures for Japanese, which I hadn't updated for a while (because the number of sentences in that language was growing relatively slowly), and it turns out the situation is getting worse. 1.65% of the Japanese sentences are now untranslated, i.e. 2676 sentences.

I can't explain this massive increase, by the way.
Either Japanese sentences without translations have been added,
or links have been broken, deliberately or not...

Yes, indeed, but there are also quality problems. The sentences that are not translated have found no takers...

or the level required to translate them isn't available, or their subject interests nobody...
Mount Fuji... meh!

http://tatoeba.org/eng/contribu...r/mayok/page:5
See translations from Jun 19th 2011,
can we trust that?

Of course it could be someone else who just thought one could trust Google Translate.

I translated quite a few German sentences from mayok and used to trust them...

I can see that at least some are machine-translated. The Yiddish sentence reads "Antshuldikn mir, vos mol iz es?"
That's a mistranslation from English "Excuse me, what time is it?", time here being translated in the sense of "two times, three times, four times" (which is "mal" in German and hence "mol" in Yiddish).
I know that the Yiddish Google Translation is usually quite bad.
I believe it's *him* again.

I take my last comment back. I didn't see his other contributions and comments. Okay, no. I believe it's a user who once upon a time trusted Google Translate too much.
The Yiddish sentence is definitely wrong, but his German seems to be native.

Yep I don't think this account is a fake one from bora, my bora-fake-account-detector-3000-next-gen-deluxe-edition(tm) has recognized no "bora"-like pattern in his behaviour.

I confirm that, although my version is not as state-of-the-art as yours...

This user seems to leave and answer comments, so maybe drop him a comment about the languages in which he has added sentences but which he hasn't listed in his profile.

Why am I seeing sentences of mine without translations, when in fact they are all translations that I made?

Does the log show whether someone unlinked them?
If so, the name of whoever did it should appear.
If not, the link to the sentence it was originally linked to should appear as well, and you can re-link it yourself :)

No Marcelo, my translations appear without any details, as if they had never been translations. I don't see that they were unlinked, but obviously they were: I made several translations of the Universal Declaration of Human Rights from Spanish into Portuguese and they are unlinked, but I can't see who did it.
@sacredceltic: I'll send it to sysko, many of my translations have been unlinked :(

Don't worry about this :) Actually I started working on this issue this weekend, but the internet connection there makes the task much, much slower. But in the end it will get fixed; the data is backed up often enough.

I don't mean to be insistent but,
how is this going?
is this normal?
it seems a large part of the links were corrupted

I will have time to restore them around Friday, not before, but in the end it will be done.
No, that was not normal, like any crash. It happened because of the combination of the boracasli stuff and (actually mainly this) the fact that Google suddenly decided to crawl our website at a rate of 5 requests per second, so Tatoeba simply said "no I can't, or you should give me a raise" and went on strike.
But don't worry, we have backups of the data, so in the end everything will get back to normal.

should be fixed by now.

It seems to be OK now.

English untranslated sentences went from 107XX to 8XXX all of a sudden ^^



I'm sorry about that. Don't lose hope. Each time sysko has been confronted with this problem, he could quickly restore the links...

I think the situation is that only half of the link is present. Check on both sides whether there is a trace of your translation in the history of the sentences.
If the link is present only on one side, then it is not visible from the other side...
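
If links really are stored as two one-way halves (that is an assumption here, not something confirmed in this thread), finding the broken ones could look like this sketch. The function name and the ids are made up.

```python
def half_links(links):
    """Return the one-way links whose reverse direction is missing.

    links is an iterable of (from_id, to_id) pairs. A healthy link is
    expected to be stored in both directions; any pair without its
    mirror image is reported.
    """
    link_set = set(links)
    return sorted((a, b) for a, b in link_set if (b, a) not in link_set)


# Hypothetical ids: the link 10 -> 20 exists, but 20 -> 10 is missing.
stored = [(1, 2), (2, 1), (10, 20)]
print(half_links(stored))
```

Restoring would then mean adding the missing mirror pair for each reported link.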

Is it possible to get a summary of links and unlinks over, say, a month?

@hayastan
It's possible that links have been destroyed (and have been restored later?). I saw a few examples myself. Could you find an example? If so, send it to sysko.

It seems to me that they were not unlinked (at least according to the log)... I don't see the translation of this sentence, for example, even though they are linked...
http://tatoeba.org/eng/sentences/show/1245225
If you click on 554863 in the history, you'll see that they are in fact linked.

Now Borocasli is sending swastikas to my email address. Great!
Should I file a complaint for harassment?

On the way to fix everything...

Great news!

Thank you!

Thanks sysko :)

Thanks sysko!