Wall (7,138 threads)

[not needed anymore- removed by CK]

I see the rationale for a rating system, but don't think it would solve the perceived problem.
Consider a language that is spoken in an area large enough to permit different variations to have formed (which happens to be the case in every language I know). Knowing which variant is more common isn't going to help the people speaking accurately in areas where less common, but equally valid, variants are used.
Furthermore, the collection of Tatoeba members can hardly be used as a representative sample of usage.
I'm therefore in favour of a simple OK/not-OK system where OK sentences can be further qualified with tags (e.g. local variants, archaic/colloquial/slang/etc.).
I've discussed with sysko the idea of notes on sentences (separate from comments) and shared pages where one can link to from multiple sentences. These could be used to further describe usage. This is a lot more work to set up, but a single-dimensional rating system would imply more information than it can convey.

As a similar issue I think that a usefulness rating would be helpful. I'm particularly thinking about some of the Tanaka Corpus sentences.
For example
"It is an apple" is a perfectly good English sentence but says little about apples.
"Eve bit into the apple with a crunch as the juice ran down her face." contains more potential information about apples but might not be the best sort of example in a dictionary.
I think it's worth exploring the usefulness of a given example to second language learners through a rating system as most of Tatoeba's sentences are destined for dictionary or learning resources.
(apologies if I've missed other threads about this, but I find topics on the Wall by accident. I'd like to bang the drum for a forum again :)

For duplicates, I'm going to test another solution that handles them semi-automatically:
I will replicate the Tatoeba database on my personal computer.
I will run a slower but safer script on it (one I couldn't run here without slowing down tatoeba.org for several hours) that outputs all the modifications that need to be made to the database.
I will then run that output script on tatoeba.org.
That way it should be OK.
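
A minimal sketch of the workflow described above, assuming a hypothetical `sentences(id, lang, text)` table and using in-memory SQLite in place of the real database. The table layout and the function name `plan_duplicate_removal` are illustrative only, not Tatoeba's actual schema or script. The slow scan runs on a copy, and only the resulting statements are replayed on the live database. Relinking of translations is left out here for brevity.

```python
import sqlite3


def plan_duplicate_removal(replica):
    """Scan the replica and return SQL statements that delete exact duplicates.

    The sentence with the lowest id in each (lang, text) group is kept.
    """
    first_seen = {}
    statements = []
    query = "SELECT id, lang, text FROM sentences ORDER BY id"
    for sentence_id, lang, text in replica.execute(query):
        key = (lang, text)
        if key in first_seen:
            statements.append(f"DELETE FROM sentences WHERE id = {int(sentence_id)};")
        else:
            first_seen[key] = sentence_id
    return statements


# "Live" database (hypothetical schema and data).
live = sqlite3.connect(":memory:")
live.executescript("""
CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT, text TEXT);
INSERT INTO sentences VALUES
  (1, 'epo', 'Saluton!'),
  (2, 'epo', 'Saluton!'),
  (3, 'fra', 'Salut !'),
  (4, 'epo', 'Saluton!');
""")

# Step 1: replicate the database elsewhere.
replica = sqlite3.connect(":memory:")
live.backup(replica)

# Step 2: run the slow analysis on the copy; it only outputs a script.
plan = plan_duplicate_removal(replica)

# Step 3: replay the (fast) output script on the live database.
live.executescript("\n".join(plan))
remaining = [row[0] for row in live.execute("SELECT id FROM sentences ORDER BY id")]
print(plan)
print(remaining)  # [1, 3]
```

The point of the split is that the live server only executes a short list of precomputed statements, while the expensive scan happens elsewhere.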

Some notable figures from the past few days:
- Esperanto overtook French in number of sentences. It is more than a third of the way to the top spot.
- Russian reached 20,000 sentences ^^
- Dutch overtook Arabic again; Hungarian and Hebrew are both coming up fast.
- Swedish finally broke out of a slump and reached 1,000 sentences, and Persian is very close to that mark.

Thanks for keeping track of this! ^^

Italian has also passed 10,000 sentences, but nobody notices :P

I saw it! I just didn't post a note on the Wall; I assumed someone else had already done so ^^
Spanish has also overtaken Polish. ^^

I urge the French speakers to step up their efforts, because otherwise the Esperanto enthusiasts will overtake you. It would be a real shame to see a toy language overtake French.

(Originally in German:) I don't think it's a disgrace. But it does show that Esperanto should not be underestimated ("unterschätzen"). :-)

What's unterschätzen?

unterschätzen: to underestimate (French: sous-estimer)

(Restated in French by the author:) I don't think it's a disgrace. Still, it shows that Esperanto should not be underestimated. :-)

First I'll remove the Esperanto duplicates, and then we'll talk again :p

If you'd like to remove duplicates in Esperanto, I prepared a quick list of the phrases which are exact duplicates.
Please see http://www.is.titech.ac.jp/~zakirov8/epo-dup.html .
There is also a link to a list in TSV format.
The only issue I see is that many of the duplicate phrases already have lots of translations, so what we need is not only to delete the duplicates, but also to relink the translations.
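
To illustrate the relinking issue, here is a small sketch under assumed data structures: sentences as a dict of `id -> (lang, text)` and translation links as a set of id pairs. The function name `merge_duplicates` and the toy data are hypothetical. Each group of exact duplicates is collapsed onto its lowest id, and every link that pointed at a removed duplicate is redirected to the surviving sentence.

```python
from collections import defaultdict


def merge_duplicates(sentences, links):
    """Collapse exact duplicates and redirect their translation links.

    sentences: dict mapping id -> (lang, text)
    links: set of (id_a, id_b) pairs, treated as undirected
    Returns (surviving_sentences, relinked_links).
    """
    groups = defaultdict(list)
    for sentence_id in sorted(sentences):
        groups[sentences[sentence_id]].append(sentence_id)

    # Map every id to the lowest id that has the same language and text.
    canonical = {}
    for ids in groups.values():
        for sentence_id in ids:
            canonical[sentence_id] = ids[0]

    surviving = {sid: data for sid, data in sentences.items() if canonical[sid] == sid}

    relinked = set()
    for a, b in links:
        a, b = canonical[a], canonical[b]
        if a != b:  # drop links that would now point from a sentence to itself
            relinked.add((min(a, b), max(a, b)))
    return surviving, relinked


sentences = {
    1: ("eng", "Hello."),
    2: ("epo", "Saluton."),
    3: ("epo", "Saluton."),  # exact duplicate of sentence 2
    4: ("fra", "Bonjour."),
}
links = {(1, 2), (3, 4)}

surviving, relinked = merge_duplicates(sentences, links)
print(sorted(surviving))  # [1, 2, 4]
print(sorted(relinked))   # [(1, 2), (2, 4)]
```

Without the relinking step, deleting sentence 3 would silently lose its French translation link.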

It is nice to remove duplicates, but wouldn't it be even nicer to think about why they are created?
Usually, I think, people just don't see that there is already an Esperanto translation, because it is an indirect translation of the second degree or more, so they go ahead and translate.
If it is possible to create a script for duplicate sentences, wouldn't it be possible to create something that shows every translation already in the translation chain? This would reduce the work of eliminating duplicates to nearly nothing...

Unfortunately, as discussed before, the reason we can't show the whole translation graph is that ordinary database systems are really bad at this kind of operation. So the best we can do with the current system, with every possible optimization, is a chain two degrees deep.
In theory it would be possible, but it would be slow as hell.
That's why we've started to build our own database server for our specific needs, to make this possible.
So in the future it will be possible.
http://static.tatoeba.org/425123.html (it's a page shot of the version I have on my computer; don't pay attention to how ugly it is). As you can see there, we view all the translations, whatever the depth.
And anyway, our database will be able to detect duplicates on the fly. :)
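
As an illustration of the difference between a two-degree chain and the whole translation graph, here is a sketch of a breadth-first traversal with an optional depth limit. The adjacency-dict representation and the name `translations_within` are assumptions made for the example; this is not how the site or the new database server is implemented.

```python
from collections import deque


def translations_within(graph, start, max_depth=None):
    """Return {sentence_id: depth} for sentences reachable from `start`.

    graph: dict mapping a sentence id to the ids it is directly linked to
    max_depth: stop expanding at this depth; None means traverse everything
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if max_depth is not None and depths[current] >= max_depth:
            continue  # do not expand beyond the allowed depth
        for neighbour in graph.get(current, ()):
            if neighbour not in depths:
                depths[neighbour] = depths[current] + 1
                queue.append(neighbour)
    del depths[start]  # the starting sentence is not its own translation
    return depths


# Toy chain: 1 - 2 - 3 - 4 (each link is stored in both directions).
graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}

print(translations_within(graph, 1, max_depth=2))  # {2: 1, 3: 2}
print(translations_within(graph, 1))               # {2: 1, 3: 2, 4: 3}
```

In memory this is cheap; the cost discussed above comes from doing the same walk with repeated joins in a relational database.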

* Total number of sentences linked *
How about indicating the (approximate) total number of sentences linked? This could be calculated once a day/week/month and might be of some help. So on http://tatoeba.org/eng/sentences/show/93453 we would see "+2 hidden translations" (below) or "(There is a) total of 4 translations" (above), as http://tatoeba.org/eng/sentences/show/333724 shows.
Everyone who wants to translate would know there are already some hidden translations, and so would take care to look them up before risking adding a duplicate that will be deleted later anyway.
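
A sketch of how such a counter could be computed in an offline batch job, assuming the links are available as an adjacency dict and that translations up to two degrees away are the "visible" ones. The function name `hidden_translation_count` and the toy graph are hypothetical.

```python
from collections import deque


def hidden_translation_count(graph, start, visible_depth=2):
    """Return (total, hidden) translation counts for one sentence.

    total: every sentence reachable from `start` through translation links
    hidden: those further away than `visible_depth` links
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour not in depths:
                depths[neighbour] = depths[current] + 1
                queue.append(neighbour)
    total = len(depths) - 1  # exclude the sentence itself
    visible = sum(1 for depth in depths.values() if 0 < depth <= visible_depth)
    return total, total - visible


# Toy chain of five sentences: 1 - 2 - 3 - 4 - 5.
graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

total, hidden = hidden_translation_count(graph, 1)
print(f"total of {total} translations, +{hidden} hidden")
# total of 4 translations, +2 hidden
```

Because the result is stored rather than computed on each page view, it can be slightly out of date, which is why the suggestion above calls it approximate.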

* Identification of translation chain and language *
How about assigning a second identification to every sentence which denotes the translation chain (graph) and the language? So in the example http://tatoeba.org/eng/sentences/show/93453 the first sentence, the Japanese one, would get the identification 93453-jpn, the second, the English one, would get 93453-eng, the French one 93453-fra, the Chinese one 93453-cmn and the last, the Esperanto one 93453-epo. A second Esperanto translation would get 93453-epo2.
If, before translating, everyone first had to tell the system the planned target language, the system could check whether a translation (or two translations...) already exists and show it or them. If the second identification were already assigned, this database lookup would not take long.
Perhaps it would take a bit of work to assign these second identifications, but it would more or less eliminate the problem with duplicates.
In effect, this procedure would mean doing the time-consuming search of the complete translation graph in the database once, and later just reading the stored result.
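
The proposal can be sketched as follows, assuming each chain is named after the lowest sentence id it contains (the post names it after the sentence being displayed; that detail is changed here to keep the example deterministic). The function name `assign_chain_labels`, the union-find approach and the ids are all illustrative assumptions.

```python
from collections import defaultdict


def assign_chain_labels(languages, links):
    """Give every sentence a label of the form '<chain>-<lang>[n]'.

    languages: dict mapping sentence id -> language code
    links: iterable of (id_a, id_b) translation links
    """
    parent = {sentence_id: sentence_id for sentence_id in languages}

    def find(x):
        # Follow parents up to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller id as root so the chain is named after it.
            low, high = sorted((root_a, root_b))
            parent[high] = low

    counters = defaultdict(int)
    labels = {}
    for sentence_id in sorted(languages):
        key = (find(sentence_id), languages[sentence_id])
        counters[key] += 1
        suffix = "" if counters[key] == 1 else str(counters[key])
        labels[sentence_id] = f"{key[0]}-{key[1]}{suffix}"
    return labels


languages = {10: "jpn", 11: "eng", 12: "fra", 13: "epo", 14: "epo"}
links = [(10, 11), (11, 12), (12, 13), (11, 14)]

labels = assign_chain_labels(languages, links)
print(labels[13], labels[14])  # 10-epo 10-epo2
```

With the labels stored, checking whether chain 10 already has an Esperanto translation becomes a plain lookup instead of a graph traversal.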

This system would be hell to maintain:
1 - computers are fast at dealing with numbers, but become slow when dealing with characters
2 - it would be easy to do if it were all about trees, but unfortunately we're dealing with graphs, so your proposal brings the following problems:
* we would need to update it when we delete a sentence
* the same when we merge two graphs by adding a link
Moreover, it still wouldn't solve the real problem, which is traversing the graph, as you would still need to traverse it to discover that there is already an epo2, and so on.
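
The two maintenance problems listed above can be seen on a toy graph. This sketch recomputes chain identifiers from scratch (each chain named after its lowest sentence id, an assumption for the example) and compares them before and after an edit. The function name `chain_ids` and the data are hypothetical.

```python
def chain_ids(nodes, links):
    """Map each sentence id to the smallest id in its connected chain."""
    adjacency = {node: set() for node in nodes}
    for a, b in links:
        if a in adjacency and b in adjacency:  # ignore links to deleted sentences
            adjacency[a].add(b)
            adjacency[b].add(a)

    chain = {}
    for node in sorted(adjacency):
        if node in chain:
            continue
        # First unvisited node in sorted order is the smallest of its chain.
        chain[node] = node
        stack = [node]
        while stack:
            current = stack.pop()
            for neighbour in adjacency[current]:
                if neighbour not in chain:
                    chain[neighbour] = node
                    stack.append(neighbour)
    return chain


nodes = {1, 2, 3, 4, 5}
links = {(1, 2), (2, 3), (4, 5)}

before = chain_ids(nodes, links)
after_link = chain_ids(nodes, links | {(3, 4)})   # adding a link merges two chains
after_delete = chain_ids(nodes - {2}, links)      # deleting a sentence splits one

print(before)        # {1: 1, 2: 1, 3: 1, 4: 4, 5: 4}
print(after_link)    # {1: 1, 2: 1, 3: 1, 4: 1, 5: 1}
print(after_delete)  # {1: 1, 3: 3, 4: 4, 5: 4}
```

A single added link relabels every sentence in one of the chains, and a single deletion can only be handled by traversing the chain again, which is the traversal the stored identifiers were meant to avoid.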

To be honest, before you propose another solution:
we've been thinking about this for a year, and there's no simple solution to this problem with the current architecture. As we have few developers, I prefer to focus my free time on the new version rather than on finding and developing yet another workaround, which would only delay the new version that will solve these problems in a smart way.

OK, let's wait for the new version. Thank you for your explanations.

- How is the programming progressing?
- Is there a solution in sight about the problem of the hidden translations?
- If you are not enough programmers, should we try to find programmers for Tatoeba?

This looks nice. Maybe it could be used only by those who want to translate, if it is slow.

In fact, what I've shown was made with the new version.
It's hell to code with a normal database, and unfortunately we only have one server, so if the server takes 10 seconds to generate my page, during those 10 seconds people who don't care about it will also have to wait 10 seconds.

So as a side effect it would affect the performance of everyone, not only of those who want it.

The famous collateral effect :-|
OK, I see. So we shall wait for the new database.
And in the meantime, maybe we could spread the enthusiasm about putting more translation links. They help a bit.

So yes, it is possible; the script was easier to do, and was done as a temporary solution while we wait to finish the new version.

Are there any?

3000 ^^

Oh good, that gives me hope ;)

That's hope for four days, since Esperanto already has 1,600 more sentences than French, and at the moment about 400 Esperanto sentences are being added per day.

It's a pity that an artificial language is more widespread on the project than a living language (which is far from the case in the real world). It actually calls the seriousness of the Tatoeba project into question.

Let's look at the facts first: in Hungary, Esperanto ranks eighteenth among the languages people know, http://www.nepszamlalas.hu/eng/...ad01_13_0.html . There are currently more than 135,000 articles in the Esperanto Wikipedia, http://eo.wikipedia.org , which puts it in twenty-third place compared with the other language editions, http://stats.wikimedia.org/EO/Sitemap.htm . The Chinese provide information to the world in about ten languages, including Esperanto, http://esperanto.china.org.cn . So quite a few national languages rank behind Esperanto...
Since people who speak Esperanto generally also speak many other languages (they are probably more polyglot than members of other language communities), it is natural that they take an interest in a project like Tatoeba. I don't see any disadvantage for the project.

What bothers me about Esperanto is that it is representative of only a limited number of languages (Romance, Germanic, Slavic, Greek and isolates, plus agglutinative languages for the structure), in both its substrate and its morphology. Yet each language is tied to particular cognitive mechanisms (ways of orienting oneself in space...).
Of the 3,000 to 7,000 languages currently recorded, Esperanto would represent, let's say, a hundred? And on that basis people want to promote it as a universal language? That really shows very little regard for the 95% of existing languages.

I too would like a language based on more languages. But it seems that with each language added it becomes harder to learn, for everyone.
Esperanto is far from an ideal solution; it is only the best one known (or one of the best among constructed languages). Esperanto is much closer to quite a few languages than, for example, English, French or German are. So from that point of view, it is preferable to national languages as a candidate universal language.
And it is quite clear that Esperanto can be learned in a third of the time needed to reach the same level in a national language. So in the same amount of time one speaks Esperanto much better. That is why quite a few Chinese, Japanese, Vietnamese and others learn Esperanto.

Are you joking or are you serious, aandrusiak?

Besides, the "toy language" has a habit of overtaking other languages. When Esperanto was published in 1887, about five people spoke it, so Esperanto was one of the last of roughly 7,000 languages at the time. Today Esperanto is generally found somewhere among the top 15 to 35 languages, sometimes among the top 50.
So Esperanto has already overtaken more than 6,900 languages in only 123 years. I have the impression that no other language in the whole history of humanity has made such progress in just one century.

Fortunately, this artificial language will never become the national language of a country, at least unless all the Esperanto enthusiasts get together and buy an island to live on and speak their language, so as to declare it the national language of their Esperanto Paradise.

We don't need one more national language. We need a language for international communication. That language must be easier than national languages. I studied English for 8 years and the result was not very good. In Europe we spend a great deal on translating and studying. Billions. Esperanto is very easy (10 times easier!) and it is neutral.

Above all, you shouldn't preach your easy language. That proves once again the sectarian character of this movement.

I agree that preaching Esperanto is not always a good idea.
Apart from that, it is worth distinguishing between the community of people who speak Esperanto and the Esperantist movement, and even within the latter between people who propose Esperanto in a moderate way and others who propose it in an almost exaggerated way that recalls the behaviour of a sect.
I know the world would be much easier to understand if we knew that all the inhabitants of country A were intelligent, those of country B nasty and all the people of country C kind, but that is not reality. Likewise, people who speak Esperanto have quite different characters...



GOODBYE

Bug?
Home page http://tatoeba.org/eng/home
More latest comments (show more...) http://tatoeba.org/eng/sentence_comments/index
Filter by language http://tatoeba.org/eng/sentence_comments/index/hun
(2) second page on the top link http://tatoeba.org/eng/sentence...dex/hun/page:2
Press End key/Go down, (3) third page or any
http://tatoeba.org/eng/sentence...s/index/page:3
...The language filter is now missing. The bottom links are not updated according to the language filter.
Sorry if this has already been posted, or if the Wall is not the best place to submit it.

It's a bug. Thanks for reporting :) It will be fixed soon.

[not needed anymore- removed by CK]

I have a Mac, and it seems to work well. But I'm really not sure if the rendering is 100% correct as I can't read the script.
Anyway CK, you need a font!
http://sites.google.com/site/macmalayalam/
http://www.prokerala.com/malayalam/

[not needed anymore- removed by CK]

I think it's just because you don't have the right font. On computers (I don't know exactly about Mac, but on Linux, Windows, etc. this is the case) character rendering behaves as follows:
1 try to display the character with the font specified by the software
2 if the font is not present or the character can't be rendered by this font, there's a set of rules for using fallback fonts
3 if no font can render the character, display a box
So even if the CSS used a font that has no Malayalam characters, your OS would have used another one that does.
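
The three steps above can be sketched as a small lookup, assuming each font is modelled as a set of characters it can draw. The font names, the coverage table and the function name `render_with_fallback` are all hypothetical; real operating systems use far more elaborate fallback rules.

```python
MISSING_GLYPH_BOX = "\u25a1"  # the box shown when no font has the character


def render_with_fallback(char, requested_font, fallback_fonts, coverage):
    """Return (glyph, font_used) following the three steps above.

    coverage: dict mapping a font name to the set of characters it can draw
    """
    # Step 1: the font requested by the software; step 2: the fallback list.
    for font in [requested_font, *fallback_fonts]:
        if char in coverage.get(font, set()):
            return char, font
    # Step 3: nothing can draw it, so show a box.
    return MISSING_GLYPH_BOX, None


# Hypothetical fonts: one Latin-only, one covering a Malayalam letter.
coverage = {
    "PageFont": set("abcdefghijklmnopqrstuvwxyz"),
    "SystemMalayalam": {"\u0d2e"},  # MALAYALAM LETTER MA
}

print(render_with_fallback("a", "PageFont", ["SystemMalayalam"], coverage))
# ('a', 'PageFont')
print(render_with_fallback("\u0d2e", "PageFont", ["SystemMalayalam"], coverage))
# ('മ', 'SystemMalayalam')
print(render_with_fallback("\u0d2e", "PageFont", [], coverage))
# ('□', None)
```

The last call corresponds to a machine with no Malayalam font installed, which is when boxes appear.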

Maybe this table answers your question (sorry if I didn't completely understand it... :)
http://en.wikipedia.org/wiki/He...isting_support

In what way are they not displaying correctly? The language appears to be Malayalam, which uses its own script. On my Android phone, it's all boxes. On my XP PC, I see the letters, but can't be sure if they are connected correctly without learning more. I don't have a Mac so I can't directly answer your question.

It's Unicode encoding; maybe you don't have the right fonts for Malayalam?

They display fine for me. (Firefox, Windows 7)

Tag auto-completion script turned off?
It looks like the auto-completion script for entering tags on sentences has gone. I'd actually got used to it as well.

what about now ?

Looks like it's working now.

How do I type the circumflexed letters here? I tried sx, but it wasn't converted into the proper circumflexed Esperanto letter. hans

There are many programs and tools. A few are here:
http://esperantilo.org/
http://members.aon.at/aldone/konvertileto.html
https://addons.mozilla.org/de/firefox/addon/3684/
https://addons.mozilla.org/de/firefox/addon/4016/
http://de.wikipedia.org/wiki/Es...g#Das_X-System
http://www.akueck.de/eoskribo.htm
http://www.apple.com/downloads/...ardlayout.html
http://www.esperanto.mv.ru/Ek/
I hope this helps you.
If you used Linux (Ubuntu, for example) you wouldn't have such problems. ;)

And a transliterator for Firefox:
https://addons.mozilla.org/de/firefox/addon/883/
For many languages, including Esperanto
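
For readers curious what such converters do, here is a minimal sketch of an x-system conversion (cx, gx, hx, jx, sx, ux to the circumflexed letters). It is an illustration only, not the code of any of the tools linked above, and the function name `x_to_unicode` is made up for the example.

```python
import re

# The six x-system digraphs and the letters they stand for.
X_SYSTEM = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}


def x_to_unicode(text):
    """Replace x-system digraphs (e.g. 'sx') with Esperanto letters (e.g. 'ŝ')."""

    def replace(match):
        base = match.group(0)[0]
        letter = X_SYSTEM[base.lower()]
        # Preserve the capitalisation of the base letter.
        return letter.upper() if base.isupper() else letter

    return re.sub(r"[cghjsuCGHJSU][xX]", replace, text)


print(x_to_unicode("Cxi tie mi sxatas mangxi."))  # Ĉi tie mi ŝatas manĝi.
print(x_to_unicode("antaux"))                     # antaŭ
```

Because the letter x is not used in ordinary Esperanto words, the replacement is unambiguous for Esperanto text, though it would mangle foreign words that contain those letter pairs.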

[not needed anymore- removed by CK]

I'd like to add that translating proverbs is different from translating the simple sentences we mostly work on here, so working on proverbs is an interesting (if somewhat difficult) exercise.
Please also use the list "Proverbaro Esperanta" to find items to translate:
http://tatoeba.org/epo/sentences_lists/edit/153