Wall (7 134 threads)
Tips
Before asking a question, make sure to read the FAQ.
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

Why does it take like 3 minutes to switch pages? I can't work well. I'm getting bored.

between which pages?

For example:
http://tatoeba.org/eng/sentence...rent/page:6750
and
http://tatoeba.org/eng/sentence...rent/page:6751
or
http://tatoeba.org/eng/sentence...rent/page:6749
I usually translate directly from these pages, but I always have to wait about 3 minutes for them to open... And while other pages are loading in different tabs, the translations are not added either.
Maybe the filters are too complex, though...

Yep, unfortunately this set of pages gets very slow, especially at high page numbers, because the database can't use any optimization once there are already a lot of filters. I can't really do anything about this in the short term :(
The only thing that may indirectly speed it up: today I discovered that two weeks ago Google went back to crawling Tatoeba at a rate it determines itself (which is very high, around 2 requests/second), which puts an artificial heavy load on the server. A few hours ago I requested a lower crawl rate; it takes around a day to apply, so in a day you should notice a speed improvement on this kind of page.
The mid-term solution I'm thinking about is to use the API to build a desktop Tatoeba client with the database stored locally (it's not that heavy, something like 300 MB assuming you're interested in all languages, links, etc.) that stays in sync through something like pushsub, so that this kind of heavy filtering is done on your own computer, which will be far faster as your computer is used only by you.
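To illustrate the local-client idea, here is a minimal sketch of filtering on your own machine, assuming the sentences and links have already been downloaded. The table layout, the function names and the choice of SQLite are all assumptions made for illustration, not Tatoeba's actual schema or API. It also resumes each page from the last id seen instead of counting through every earlier page, which is one common way to keep high page numbers cheap.

```python
import sqlite3

def build_local_db(sentences, links):
    """Load (id, lang, text) rows and (sentence_id, translation_id) links
    into an in-memory SQLite database (hypothetical local schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT, text TEXT)")
    db.execute("CREATE TABLE links (sentence_id INTEGER, translation_id INTEGER)")
    db.execute("CREATE INDEX idx_lang ON sentences (lang, id)")
    db.executemany("INSERT INTO sentences VALUES (?, ?, ?)", sentences)
    db.executemany("INSERT INTO links VALUES (?, ?)", links)
    return db

def untranslated_page(db, lang, target, after_id=0, page_size=10):
    """Return sentences in `lang` that have no translation in `target`,
    starting after `after_id` (the last id shown on the previous page)."""
    return db.execute(
        """
        SELECT s.id, s.text FROM sentences s
        WHERE s.lang = ? AND s.id > ?
          AND NOT EXISTS (
              SELECT 1 FROM links l
              JOIN sentences t ON t.id = l.translation_id
              WHERE l.sentence_id = s.id AND t.lang = ?)
        ORDER BY s.id LIMIT ?
        """,
        (lang, after_id, target, page_size),
    ).fetchall()

# Tiny made-up dataset: sentence 1 is already translated into French.
db = build_local_db(
    [(1, "eng", "Hello."), (2, "fra", "Bonjour."),
     (3, "eng", "Thank you."), (4, "eng", "Good night.")],
    [(1, 2), (2, 1)],
)
print(untranslated_page(db, "eng", "fra"))
```

To get the next page, pass the last id of the current page as `after_id`.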

Thank you, sysko! That would be great!
I'll try to work on "lower pages".

About detecting the language automatically.
What method will be used for detecting the language?
I wrote on the wall about trigrams or n-grams. It seems a very good and simple method.
I don't know if sysko knows it.
The message mentioned http://detectlanguage.com/
That message was read by sysko, but not by others. I can't find the message, but maybe sysko has a search tool for the wall.
It might also be interesting to record the sentences for which automatic classification fails, for future training of the system.
I hope to improve my English a little. Sorry for my writing.
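The suggestion to record failed classifications could be sketched as follows. The CSV layout and the function name are hypothetical; the only idea taken from the post is to keep each sentence whose detected language was later corrected, so it can be used for retraining.

```python
import csv
import io

def log_misclassification(writer, text, detected, corrected):
    """Write one row when the corrected language differs from the detected one.
    Returns True if a row was logged, False otherwise."""
    if detected == corrected:
        return False
    writer.writerow([text, detected, corrected])
    return True

# An in-memory buffer stands in for a real log file in this sketch.
buffer = io.StringIO()
writer = csv.writer(buffer)
log_misclassification(writer, "No hay problema.", "por", "spa")  # wrong guess, logged
log_misclassification(writer, "Good morning.", "eng", "eng")     # correct guess, skipped
print(buffer.getvalue())
```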

Sysko planned this a long time ago, but since he’s so busy nowadays...
He had an idea that the languages someone uses should affect probability. E.g. if you input some text, it is more likely to be Spanish than French, because you haven't added any French sentences before (or maybe languages would be selected in the profile).
So that’s one reason not to use off-site processing: if this is done on Tatoeba, it's going to be better.
But it seems no one has time to implement this. :) What is more, no one knows when the new Tatoeba version will be ready. The new Tatoeba will use its own database engine, so the code will be incompatible... So maybe we should wait for sysko to finish the new version first.
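The idea that a user's languages should affect the probability can be sketched as a prior multiplied by the detector's score. This is only an illustration under stated assumptions: the smoothing constant, the function names and the example numbers are made up, and nothing here is known to match what sysko planned.

```python
def language_prior(user_sentence_counts, languages, smoothing=1.0):
    """Turn a user's per-language sentence counts into a prior.
    Smoothing keeps languages the user never used possible, just unlikely."""
    weights = {lang: user_sentence_counts.get(lang, 0) + smoothing for lang in languages}
    total = sum(weights.values())
    return {lang: weight / total for lang, weight in weights.items()}

def rerank(likelihoods, prior):
    """Combine detector scores with the user prior and normalise to sum to 1."""
    scores = {lang: likelihoods[lang] * prior[lang] for lang in likelihoods}
    norm = sum(scores.values())
    return {lang: score / norm for lang, score in scores.items()}

# A user with 500 Spanish sentences and no French ones (made-up numbers).
prior = language_prior({"spa": 500}, ["spa", "fra"])
# The text-only detector slightly prefers French for this input.
posterior = rerank({"spa": 0.45, "fra": 0.55}, prior)
best = max(posterior, key=posterior.get)
print(best, posterior)
```

With these numbers the user's history outweighs the detector's slight preference for French, which is the behaviour described in the post above.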

Actually, I have already implemented it and it works with sufficient precision (more than 95%; I don't remember exactly, maybe 98%). I now need to turn it into a web service. The goal is to make it independent of Tatoeba itself, so that it can be used by other people (and I think we're not the only ones with this problem). So I think I will finish it first before continuing with the new version (because the new version will need it, and it will not be hard to interface with the current code, as we were already using an external service before).

That's great, sysko!
Magnificent, sysko!
What method do you use for recognition?
At first I thought about character statistics. But this may be ineffective with short texts like sentences.
N-grams work very well, I think.
In Spanish there is no "sh", but in English there is. I think it is a good method for short sentences.

Yep, that is basically the method I use (with some weights). When I release it, I will take the time to explain it in detail.
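For readers unfamiliar with the n-gram approach discussed here, a minimal character-trigram classifier might look like the sketch below. It is not sysko's implementation: the weights he mentions are not described in this thread, so this version uses plain add-one smoothing, and the training sentences are toy data invented for the example.

```python
import math
from collections import Counter

def trigrams(text):
    """Split text into overlapping 3-character sequences, padded with spaces."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def train(samples):
    """samples maps a language code to example sentences.
    Returns a trigram frequency profile (Counter) per language."""
    return {lang: Counter(g for s in sentences for g in trigrams(s))
            for lang, sentences in samples.items()}

def detect(text, profiles):
    """Return the language whose profile gives the text the highest
    log-probability, using add-one smoothing for unseen trigrams."""
    vocab = len({g for profile in profiles.values() for g in profile})
    best_lang, best_score = None, -math.inf
    for lang, profile in profiles.items():
        total = sum(profile.values())
        score = sum(math.log((profile[g] + 1) / (total + vocab))
                    for g in trigrams(text))
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang

# Toy training data; a real detector would train on thousands of sentences.
profiles = train({
    "eng": ["she should wash the dishes", "the fish is fresh", "this is the shop"],
    "spa": ["ella debe lavar los platos", "el pescado es fresco", "esta es la tienda"],
})
print(detect("she is in the shop", profiles))
```

Trigrams containing "sh" appear only in the English profile here, which is the kind of signal the Spanish "sh" remark above refers to.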

Thanks for the explanation.
I look forward to the detailed explanation.

Wow, cool! Sysko, you’re great! :)

The other reason to use our own service is that we have a lot of languages that other services are very unlikely to be able to detect, because Tatoeba is the largest easily available dataset for these languages (like Shanghainese or Berber).

A suggestion for the login method:
http://tatoeba.org/spa/wall/sho...#message_12719

A little deduplication wouldn't go amiss...

Yesterday, on the train taking me to the back of beyond in China (18 hours on the train, fortunately with a power socket), I started writing a script that converts the database dump into a version usable by the new database (the goal is to resynchronize the database with the new version for my tests, since the number of sentences has doubled). For that I need a file without duplicates, so it led me to revisit this, and what I have written is far more elegant than what I was using before. I still need to finish it, and afterwards I will run a deduplication.
The main thing to take away is that I'm writing code again, and since the hardest part is always getting back into it, that's a rather good sign...
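The duplicate-free file mentioned in this post could be produced along the following lines. This is a hedged sketch, not sysko's script: it assumes a duplicate means the same language and exactly the same text, and that the lowest id is the one to keep; both rules are assumptions.

```python
def deduplicate(sentences):
    """sentences: iterable of (id, lang, text) tuples.
    Returns (unique_rows, merge_map), where merge_map maps each
    duplicate id to the id of the sentence that is kept."""
    kept = {}        # (lang, text) -> id of the first sentence seen
    unique = []
    merge_map = {}
    for sentence_id, lang, text in sorted(sentences):  # lowest id wins
        key = (lang, text)
        if key in kept:
            merge_map[sentence_id] = kept[key]
        else:
            kept[key] = sentence_id
            unique.append((sentence_id, lang, text))
    return unique, merge_map

# Made-up rows: sentence 3 duplicates sentence 1.
rows = [(3, "eng", "Hi."), (1, "eng", "Hi."), (2, "fra", "Salut.")]
unique_rows, merge_map = deduplicate(rows)
print(unique_rows, merge_map)
```

The `merge_map` is what a later step would use to re-point links and comments from the removed sentences to the kept ones.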

Great!

OK, I've finished it and it's running. I just noticed that most of the time spent updating the database goes into the queries that merge comments (and since there aren't that many comments, they often merge nothing), so it should be possible to cut the execution time on the server by checking whether the sentence has comments or not.
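The optimization described here, skipping the comment-merge query when the duplicate has no comments, might be sketched like this. The table name `sentence_comments`, the column names and the shape of the inputs are hypothetical; the real schema is not shown in this thread.

```python
MERGE_SQL = "UPDATE sentence_comments SET sentence_id = ? WHERE sentence_id = ?"

def comment_merge_queries(merge_map, comment_counts):
    """Build (sql, params) pairs that move comments from a duplicate to the
    kept sentence, but only for duplicates that actually have comments.
    merge_map: duplicate id -> kept id; comment_counts: sentence id -> count."""
    queries = []
    for duplicate_id, kept_id in sorted(merge_map.items()):
        if comment_counts.get(duplicate_id, 0) == 0:
            continue  # nothing to merge, so no query is sent to the server
        queries.append((MERGE_SQL, (kept_id, duplicate_id)))
    return queries

# Made-up data: only duplicate 3 has comments, so only one query is generated.
queries = comment_merge_queries({3: 1, 5: 2}, {3: 2})
print(queries)
```

The comment counts can be fetched once with a single grouped query, so the check itself does not add one round trip per sentence.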

Thanks, sysko!

Right, I'm dropping with fatigue now, but the algorithm itself appears to work. All that's left is to write the part that generates the queries to run against the database, but that won't be the hardest part, since it's already written in my earlier scripts.

Many thanks to dominiko for his valuable support in finding errors!

I've upgraded the status of Amastan and Sadhen to advanced contributors
And upgraded to Corpus maintainers:
al_ex_an_der
Pfirsichbaeumchen
GrizaLeono
Biga
A big thanks to them for all that they do for Tatoeba.
If I've forgotten anyone, just reply to this message. These days I'm slowly working through all the things I was asked to do while I was quite busy, and I don't yet have an efficient "todo list" for Tatoeba (it's on my todo list...).

Congratulations to all the new corpus maintainers and advanced contributors! Bravo à tous!

Thanks Scott ^v^
Thank you, Eldad ^v^

Vielen Dank für Ihr Vertrauen und Ihre Unterstützung. Ich fühle mich geehrt und freue mich sehr über diese Beförderung. Ich hoffe, daß jeder damit einverstanden ist. Ich werde weiterhin mein Bestes tun, um beim Projekt behilflich zu sein.
Thank you so much for your confidence and support. I feel honoured and am very glad of this promotion. I hope it meets with everyone’s approval. I shall continue to do my best to help with the project.

ENG: I simply would like to sign this statement of Pfirsichbaeumchen and I only want to add a translation into Esperanto.
DEU: Ich würde gern diese Erklärung von Pfirsichbaeumchen einfach unterschreiben und möchte nur eine Übersetzung ins Esperanto anfügen.
EPO: Mi ŝatus simple subskribi tiun ĉi deklaron de Pfirsichbaeumchen kaj mi deziras aldoni nur tradukon al Esperanto.
Multan dankon pro via konfido kaj via subteno. Mi sentas min honorata kaj mi tre ĝojas pro tiu ĉi promocio. Mi esperas, ke ĉiu konsentas. Mi daŭre klopodos helpi en la projekto laŭ miaj plej bonaj kapabloj.

Alexander: Tatoeba is not just an "example translation website", it's a whole world, a whole universe, a whole community where you can LEARN NEW LANGUAGES and find FRIENDS WHO WOULD VOLUNTEER TO HELP YOU IN LEARNING THEM!!! It's wonderful!!!

ENG: So let's be friends and enjoy our very cooperative community as well as the richness and diversity of languages and their cultures!
EPO: Do ni estu amikoj kaj ĝuu nian tre kunlaboreman komunumon same kiel la riĉecon kaj variecon de la lingvoj kaj iliaj kulturoj!
DEU: Lasst uns also Freunde sein und uns an unserer so kooperativen Gemeinschaft ebenso wie am Reichtum und der Vielfalt der Sprachen und ihrer Kulturen erfreuen!

Welcome to the new advanced contributors and corpus maintainers!

Alex:
Tanemmirt ula i kecc a ameddakel ɛzizen :-)
Thanks to you too my dear friend :-)

Congratulations, my friend!
It's about time! ;-)

Thanks Eldad ;-)

You deserve it!

Thanks Alex ^v^

Sysko:
Tanemmirt s tussda. Aya d iseɣ imi ay qqleɣ d imttekki n weswir aɛlayan deg Tatoeba.
Thank you very much. It's an honor for me to become an advanced contributor on Tatoeba.
^v^

Thanks, dear Sysko. I did not read all the messages on the wall. But I noticed that I could change sentences that do not belong to me... I thought this was a security flaw in the programming, so I reported it on the wall... and I proved it...

Welcome, new advanced contributors and corpus maintainers!
Congratulations, and thanks for your most appreciated contribution!

Yesterday* you were wonderful. 4041 new sentences!!!
Hieraŭ* vi estis mirindaj. 4041 novaj frazoj!!!
Gestern* wart ihr wundervoll. 4041 neue Sätze!!!
*2012-07-14

and 4522 new sentences yesterday 2012-07-15

That's a new record!

That's great!!! I love to have new material to translate ^v^

Bravo!

I just noticed by chance that I could change a sentence that I do not own.

Congratulations, GrizaLeono!
You are now an all-powerful Tatoeban ;-)

Thank you, dear Eldad.
Should I now plough two furrows? :)

You've been promoted, Leono! You're now a corpus maintainer. Congratulations!

Thanks, dear Alexmarcelo.
I suppose the first thing I have to do is translate the tasks for corpus maintainers into Esperanto... Maybe I have already done this but forgot.
Where can I find the text about that?

Maybe you mean this one:
http://blog.tatoeba.org/2010/05...n-tatoeba.html

Yes, you have already translated it, but the link isn't working here. I can't open your link:
https://doc-0o-1g-docs.googleus...i77v6nfauo495n
Forbidden
Error 403

Dear Alexmarcelo,
please check whether you have access to
https://docs.google.com/documen...vsK_Ikc0w/edit
Thanks in advance, Leo
In the meantime the terminology has changed: I once translated "corpus maintainer" as "tekstara bontenanto".
In fact, the easiest way to get texts translated is to send them to https://translations.launchpad.net/tatoeba
There, some volunteers translate into various languages, if one reminds them that a text to translate has arrived on that site.

Thank you, Leono!

It is "forbidden" for me too.
Is this the right text? I will translate it again.

Of course, I should undo my "proof". :(

I entirely agree with you that adding such "insulting" sentences is not a good idea. We should avoid them (and in my opinion, DELETE them). The only problem when you change a sentence is that its translations won't match anymore...

I have already written my opinion to Griza Leono: in that example no person was insulted; it was simply the expression of the opinion that learning Esperanto means wasting a lot of time. Although I, like GrizaLeono, have an entirely different opinion about Esperanto, I believe that here we must remain strictly impartial and calmly tolerate other opinions.

Incidentally, I learned that with that change GrizaLeono wanted to test and report what he believed at the time to be a technical fault in the program. Fortunately the cause was not a bug but a promotion, and now we are all happy.

Here is the proof: look at sentence 448246, which I obviously do not own. It insulted our language. I took advantage of the bug in the software to change the insult into praise :)