Wall (6,037 threads)
Before asking a question, make sure to read the FAQ.
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.
Could we translate sentences into languages not listed on the site?
I can translate it into Malay language.
P.S. Great site!!
For Malay, Wikipedia tells me there are a lot of different varieties. Since we distinguish between dialects, is this a specific form of Malay or is it "standard Malay"?
Is this flag suitable?
Yes, it's standard Malay, and there aren't many dialects in use nowadays. The best known is probably Kelantanese, but standard Malay is spoken all over the country.
Yes that's the flag of the nation =)
OK. By the way, could you correct the Chinese sentences I've commented on, please? :)
the answer is just below
spoiler : yes you can ;-)
Thanks for your contributions in Chinese :)
Malay should now be added as a supported language :)
You'll still have to add a few sentences to check that the language detection works properly, though.
Why can't I find my sentences in your search engine?
Yes, we're not indexing on the fly. The main reason is that I didn't (and still don't) have time to figure out how to do that ^^'
Usually I launch the indexing process once a month, but considering the increase in contributions, I think it'll be more like once a week now...
If you don't mind me asking, what kind of database engine is behind tatoeba.org? SQL Server, MySQL or something else?
It's MySQL :) But for the search feature we're using Lucene (http://lucene.apache.org/java/docs/).
I regularly use SQL Server, so I'm not much help with MySQL, but maybe this link will help: http://wiki.apache.org/lucene-java/UpdatingAnIndex
Right now though, I must say it doesn't speak much to me... Also, MySQL is not really the issue here (because I know MySQL and it doesn't help me :P).
The issue is to know how to use Lucene (which is written in Java). I just have to take the time to read the documentation.
The search engine part of Tatoeba was coded as a school project, at a time when I didn't have much knowledge in programming but had a good partner who knew Java and so he pretty much did all the coding.
Someday I'll have to look into his code. I'll probably have to upgrade to the latest version of Lucene as well because our code is from like, 2 years ago. Someday... When I have time.
I'm not the one who made the search engine part, but it seems that the index is not updated in real time, most likely for performance reasons, so your sentences will be available in a little while :)
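To illustrate why new sentences stay invisible until the next indexing run, here is a minimal sketch of a search index that is only rebuilt in batches. It is a toy inverted index in plain Python, not Lucene and not Tatoeba's actual code; the `ToyIndex` class and the sample sentences are made up for the example.

```python
from collections import defaultdict


class ToyIndex:
    """A toy inverted index: word -> set of sentence ids (hypothetical, not Lucene)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, sentence_id, text):
        # Incremental ("on the fly") update: index a single sentence right away.
        for word in text.lower().split():
            self.postings[word].add(sentence_id)

    def rebuild(self, sentences):
        # Batch update: drop the whole index and re-index every sentence.
        self.postings.clear()
        for sentence_id, text in sentences.items():
            self.add(sentence_id, text)

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


database = {1: "Can you deliver this today"}
index = ToyIndex()
index.rebuild(database)

# A new contribution is stored in the database, but the index is not touched.
database[2] = "Please deliver this tomorrow"
stale = index.search("tomorrow")   # the new sentence is not found yet

# After the next batch run the sentence becomes searchable.
index.rebuild(database)
fresh = index.search("tomorrow")
```

Calling `index.add(2, database[2])` at contribution time would be the "on the fly" alternative; the batch approach is simpler but leaves a window in which new sentences cannot be found.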
What do I do if Tatoeba doesn't recognize the correct language of a new translation?
("Bill war in Japan." is German and is not about a war in Japan :D!)
Just click on the sentence's flag and set the right one :)
With regards to Chinese entries, can we have some way of distinguishing between Traditional and Simplified entries?
In fact I was thinking of adding an option to convert sentences from Simplified Chinese to Traditional Chinese, and vice versa. Wouldn't it be better that way?
Yeah that's a good idea. Means that all the existing entries, in either Traditional or Simplified will be preserved.
The other thing regarding Chinese translations that probably needs consideration, is that there are 3 or 4 major regions where Chinese is spoken (Taiwan, Hong Kong, PRC, Singapore), but each region often has a slightly varied vocabulary set to represent the same meanings in another language. I'm no expert on this, but I'm pretty sure a Taiwanese person would translate the English word "Potato" to "馬鈴薯" whereas in the PRC (Mainland) they more commonly translate it to "土豆". Maybe we need the ability to choose the "Region" of our Chinese translations?
Yep, we have recently migrated the language codes from ISO 639 alpha-2 (language names coded on 2 letters) to alpha-3,
which allows us to make more precise distinctions between languages (as you can see, there's already Shanghainese).
But for the moment the problem is not really technical but mostly ergonomic: "how do we present it in a nice way, without overloading a sentence with a billion buttons?"
Moreover, the same problem exists with French, Canadian French, etc., so I agree, it's something we will need to handle one day or another.
We also need to keep in mind that a beginner may not want to see these regional variations and may prefer to focus only on the "standard" version, so here comes the ergonomic problem again.
For the moment, if you plan to add "regional" sentences, just add the region in (), so that people will be aware it's not standard Mandarin.
I will let you know when we start handling this :)
By the way, thanks for your contributions :)
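As a rough illustration of the alpha-2 to alpha-3 migration mentioned above, here is a minimal sketch. The table and the `to_alpha3` helper are hypothetical and not Tatoeba's code. The three-letter codes are real ISO 639-3 codes, but mapping `zh` to `cmn` (Mandarin) is a choice made for this sketch; `wuu` (Wu Chinese, which covers Shanghainese) has no two-letter code at all, which is the kind of distinction alpha-3 makes possible.

```python
# Hypothetical migration table: ISO 639 alpha-2 code -> alpha-3 code.
ALPHA2_TO_ALPHA3 = {
    "en": "eng",
    "fr": "fra",
    "de": "deu",
    "es": "spa",
    "ja": "jpn",
    "zh": "cmn",  # assumption: old "zh" entries are treated as Mandarin
}


def to_alpha3(code):
    """Return the alpha-3 code for a language code (illustrative helper)."""
    if len(code) == 3:
        # Already alpha-3, e.g. "wuu", which has no alpha-2 equivalent.
        return code
    # Raises KeyError for unknown codes instead of guessing.
    return ALPHA2_TO_ALPHA3[code]


migrated = [to_alpha3(code) for code in ["fr", "zh", "wuu"]]
```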
Yeah I understand.
(When you get round to it, you could possibly make the flag icon a drop-down list of regions for that language, so that if we want to we can mark the translation as region specific.)
By the way I really like your site :).
I'm an Australian studying in Mainland China.
In fact, for the moment the flag icon is used to change the language when the tool that automatically detects a sentence's language does a bad job (which happens with Shanghainese/Mandarin, or with close languages such as Ukrainian and Russian).
A workaround for those adding sentences in right-to-left languages (such as Arabic)
who get a strange character order (see http://tatoeba.fr/eng/sentences/show/340400 for an example):
just edit your sentence and add the right-to-left mark (U+200F, written &#8207; as an entity, or &rlm; in HTML) to the end. It's the XML entity that indicates a switch in writing direction :). For some strange reason, independent of Tatoeba, I got the same problem in different text editors while trying to reproduce this bug: this control character is sometimes missing.
I will try to quickly find an automatic way to make it work properly.
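One way an automatic fix could look is sketched below. It is only a guess at the approach, not Tatoeba's code: the helper names are made up, and it simply appends U+200F (RIGHT-TO-LEFT MARK) to any sentence that contains right-to-left characters and does not already end with the mark.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def contains_rtl(text):
    # Bidirectional classes "R" (e.g. Hebrew) and "AL" (Arabic letters)
    # identify right-to-left characters.
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def ensure_trailing_rlm(text):
    """Append the right-to-left mark if the sentence is RTL and lacks it."""
    if contains_rtl(text) and not text.endswith(RLM):
        return text + RLM
    return text


arabic = "\u0645\u0631\u062d\u0628\u0627."  # an Arabic word followed by a period
fixed = ensure_trailing_rlm(arabic)
unchanged = ensure_trailing_rlm("Hello.")
```

The function is idempotent, so it can be run on every save without stacking up control characters.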
I've only just joined a few minutes ago.... I have favorited several sentences, but my profile still says I have 0 favorite sentences. Does it just take a while for them to show up, or is there some problem?
Also, what does it mean to "adopt" a sentence?
Sorry for newbish questions, but your site lacks a good "about" page that introduces all this to newcomers. :/
Adopting means the sentence now belongs to you: you will be the only one allowed to make changes to it, and you will receive an email notification (if set in your profile) if someone comments on it.
That way we're sure that there will be no "edit wars" or people editing too many sentences.
As for favorites, you will see them soon :)
Have you checked http://tatoeba.fr/eng/pages/help? (It's at the bottom right; maybe it's not very visible.)
Tatoeba should use a license without "by" like CC0:
Attribution is unnecessary and impractical.
In fact it's only a legal problem: European law says one can't abandon his moral over a text, except 50 years after his death, 70 years in France, so CC0 can't be chosen.
Anyway, we're checking whether there's any problem with moving to a less restrictive licence such as CC-BY; we will know for sure by the end of the week.
I like CC-SA (it's almost Public Domain!) http://creativecommons.org/licenses/sa/1.0/ sadly "retired".
Unfortunately, as explained in my last message, attribution is mandatory under European/French author's rights, and it's still not clear whether CC0 works in France or not, so we prefer to be safe, given that suing for copyright violation is "in fashion" in France...
So the most "free" we can do is CC-BY (for the moment my research hasn't shown anything against it, but I prefer to check the jurisdictions of the main countries), until CC0 becomes clearer for countries that have the notion of moral rights (basically all European countries). For further information, you can read the CC discussion pages, where you can find a more precise technical explanation :)
*his moral right
Globally, that means we must attribute the works of contributors, as we're based in Europe, and so is a major part of the contributions (except the original Tanaka Corpus sentences). After some internal discussion we've realized that maybe CC-BY can be used, as Tatoeba MUST attribute works. If people then want to reuse the contributions without attributing them to the original contributors, that will be their problem (in fact no problem, as long as they don't reuse, without attribution, sentences or corrections from European contributors or from other countries where the public domain differs from the US definition).
So the licence is only there to make things clear.
By the way, we wouldn't have taken so long to choose a licence if there were no threats or possible legal problems; I far prefer coding to looking into law books.
The content will now be licensed under CC-BY 2.0 FR, which is, for the moment, the least restrictive we can do under European law.
Is this a bug? When I do something like this:
-Add a new translation to a sentence in a language that was not present (e.g. Spanish).
-Press "show another" or go to another sentence to edit.
-Do a search for the sentence I edited in the first place, because I want to modify the Spanish translation.
The Spanish translation does not appear, and actually the sentence number of the sentence found does not match. If I add a Spanish translation to this, the sentence becomes duplicated (all languages). It occurs, for instance, in sentences 339047 and 339048.
Is this normal? Thanks in advance.
Hmm, well, if I understood what you did, it's not a bug.
There is one thing in my todo list that I really should do (if only I had more time), and that is: hide all the translations when someone clicks on "Translate". Then people will probably understand better that they are not translating a group of sentences, but only one particular sentence.
So in your case, the sentence 339047 was a translation of an English sentence:
Can you deliver this? <-> Le importaría repartir esto?
When you added your translation, you also INDIRECTLY translated the Japanese sentence. Because initially you had:
配達してもらえませんか。<-> Can you deliver this?
And when you added your translation, here's what happened:
配達してもらえませんか。<-> Can you deliver this? <-> Le importaría repartir esto?
(See? Indirect translation.)
But when you did your search, you probably searched the Japanese sentence. And the search results only display sentences and their DIRECT translations.
So, you added a translation to the Japanese sentence, and the whole thing became linked this way:
Le importaría repartir esto? <-> 配達してもらえませんか。<-> Can you deliver this? <-> Le importaría repartir esto?
And now you have to know that when you BROWSE a sentence, we display both the direct AND the indirect translations. Which is why you will see two Spanish translations for http://tatoeba.org/eng/sentences/show/121527.
One is the direct translation. The other is the indirect translation.
Hopefully I understood your problem properly and my explanation was somewhat clear...
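The direct/indirect distinction above can be sketched as a small graph of links between sentences. This is only an illustration of the idea, not Tatoeba's code; the labels `ja`, `en`, `es_a` and `es_b` are made-up names for the Japanese sentence, the English sentence, the Spanish sentence linked to the English one, and the Spanish sentence linked to the Japanese one.

```python
# Each sentence maps to the set of sentences it is directly linked to.
LINKS = {
    "ja": {"en", "es_b"},
    "en": {"ja", "es_a"},
    "es_a": {"en"},
    "es_b": {"ja"},
}


def direct(links, sentence):
    """Sentences linked to `sentence` itself (what the search results show)."""
    return set(links[sentence])


def indirect(links, sentence):
    """Translations of translations: exactly two links away."""
    two_steps = set()
    for neighbour in links[sentence]:
        two_steps |= links[neighbour]
    # Drop the sentence itself and anything that is already a direct link.
    return two_steps - links[sentence] - {sentence}


direct_of_ja = direct(LINKS, "ja")      # English + the second Spanish sentence
indirect_of_ja = indirect(LINKS, "ja")  # the first Spanish sentence, via English
```

Browsing the Japanese sentence shows both sets, which is why two Spanish translations appear there, while a search shows only the direct set.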
Ok, I'm sorry if I caused trouble. I didn't know exactly how the direct & indirect translations work. I'll be more careful from now on. Thanks for the explanation.
Anyway, you're not the first and you won't be the last :P
It's true that the difference between direct and indirect translations is not something that immediately comes to mind.
Which will come first: French reaching 26,000 sentences or Chinese reaching 3,000?
(Congratulations to the Spanish and Esperanto contributors, they have reached 2,000 and 100 sentences! :D)
Apparently Chinese reached 3,000 first =P
Yep, wow, 300+ sentences added today. Congratulations to our hardcore translators in French, Chinese and Spanish.
Hey all, here's a bug report :
- can't set my birth date ;
- can't create paragraphs in the "something about you" field.
- the sentences-owned counter seems to have a glitch: it displays 10 instead of 215 on my profile. It seems to count only my last ten contributions, since I had contributed far more than 10 times by my last login...
Thanks, we'll look into this :)
It probably won't be fixed before Tuesday though. As far as I'm concerned I have a midterm exam on Monday.
The last bug is known; I'm working on a fix :)
Only the birthdate bug remains now.
You can set your birthdate now. There's just the problem that anyone who hasn't set their birthdate is born on November 30, 1999... :D