Wall (7,215 threads)
Tips
Before asking a question, make sure you have read the FAQ.
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.
Hi, I just spotted what I believe is an error in the automatic katakana-to-romaji converter: チェ is rendered as "CHIE" instead of "CHE". Hope it helps.
Cheers.
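A "CHIE" result is what you would typically get when each kana is converted on its own (チ = chi, ェ = e) instead of treating チェ as a single unit. Here is a minimal sketch of longest-match conversion. The tables and function names are made up for illustration and cover only a handful of kana; this is not the code Tatoeba actually runs.

```python
# Illustrative tables only: a tiny subset of katakana, not a full converter.
DIGRAPHS = {
    "チェ": "che",
    "シェ": "she",
    "ジェ": "je",
}
SINGLE = {
    "チ": "chi",
    "シ": "shi",
    "ジ": "ji",
    "ス": "su",
    "ェ": "e",
}


def romanize_per_char(text: str) -> str:
    """Naive conversion: every character is looked up on its own."""
    return "".join(SINGLE.get(ch, ch) for ch in text)


def romanize(text: str) -> str:
    """Longest-match conversion: try two-kana digraphs before single kana."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            # Unknown characters are passed through unchanged.
            out.append(SINGLE.get(text[i], text[i]))
            i += 1
    return "".join(out)
```

With these tables, `romanize_per_char("チェ")` gives "chie" while `romanize("チェ")` gives "che".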
I was just wondering. The corpus is under a CC-BY license, but when you download the database dump, there is no way to re-associate a sentence with its original contributor.
Anyway, I guess it could be interesting to know who's contributing in which languages, for collaborative translation research purposes.
Well, the fact that the corpus is under CC-BY means you have to mention Tatoeba if you are going to reuse it, not that we have to indicate the original contributor in the dump.
It works like this: people provide their work under CC-BY, so each user's work is attributed in Tatoeba itself, through the logs. But then we're reusing everyone's work to make something else: the corpus, which we also redistribute under CC-BY.
But now, if you also need to have the username of the original contributor, or any other information that is not provided by default, you can just ask and I'll see what I can do =)
Hey guys. Finnish in the house.
Love your interface by the way. Great job.
Hey, hey! I'm bourdu, collaborating with the parent poster on a Finnish-French-English-Japanese corpus. Keep up the good work!
Wow seems we have found the Batman and Robin of translations :D
Yep, nice to see Finnish active again; it's been a long time since anyone contributed in this language. Congrats on having already added so many sentences, and don't hesitate to report anything you find strange, any improvement you'd like, or any behaviour that bothers you. We're eager to make Tatoeba a better place :)
Hi, new guy here,
Have you ever considered adding Brazilian Portuguese as a language?
Brazilians and Portuguese can understand each other pretty well, but what sounds "natural" differs a lot between them.
Hi vbkun, and welcome :)
We haven't considered adding Brazilian Portuguese, for one thing because no one requested it. But also, it's not exactly a new language.
I feel it is a problem similar to the distinction between British English and American English. The difference may be more significant in the case of Portuguese ("Brazil vs. Portugal"), but technically it is still considered the same language...
So right now, the best solution I can suggest is to add a tag [Brazil] at the end of sentences that are in Brazilian Portuguese.
I also don't think it's really a big problem; I was mainly just worried about people learning the language vs. what sounds natural ;).
Because I guess in most cases (maybe 90% of them, or even a bit more) the sentences are understandable for both sides.
Guess I'll then add those tags in my future contributions for pt-br ;)
Typo:
Il n'y a pas (encore) de résultats pour cette recheche mais vous pouvez nous aider à alimenter (...)
=> Il n'y a pas (encore) de résultats pour cette recheRche mais vous pouvez nous aider à alimenter (...)
Thanks, it's fixed.
There are a number of sentences where some words are tagged with a weird bracket syntax, like this one:
"There are days where I feel like my {brain}{1} wants to abandon me."
Do these have any meaning to the system, and should they be preserved when adding new translations?
No, you don't have to preserve them when adding new translations. They used to be used, and they may be used again someday, but right now they don't mean anything anymore to the system.
You can find a short explanation at the very bottom of this page:
http://tatoeba.org/eng/pages/do...mple-sentences
Then should they be removed?
They should, but not by you. I mean, someday I'll just run a script that will do the work, so don't worry about it :)
I'll also have to back up these sentences somewhere before removing the brackets, because they still provide information that could be useful someday.
I see. It'd be so much simpler to run a script than to remove them one by one :)
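For the curious, a cleanup script along those lines might look roughly like the sketch below. It assumes the annotations always have the `{word}{number}` shape seen in the example sentence, and that sentences are available as a dict of id to text; both are assumptions for illustration, and the function names are hypothetical rather than anything from the Tatoeba codebase.

```python
import re

# Assumed annotation shape: {word}{number}, as in "{brain}{1}".
ANNOTATION = re.compile(r"\{([^{}]+)\}\{\d+\}")


def strip_annotations(sentence: str) -> str:
    """Replace each {word}{n} annotation with the bare word."""
    return ANNOTATION.sub(r"\1", sentence)


def clean_all(sentences: dict) -> tuple:
    """Return (cleaned, backup).

    `cleaned` maps every sentence id to its text without annotations.
    `backup` keeps the original text of only those sentences that changed,
    so the annotation data is not lost.
    """
    cleaned = {}
    backup = {}
    for sentence_id, text in sentences.items():
        new_text = strip_annotations(text)
        if new_text != text:
            backup[sentence_id] = text
        cleaned[sentence_id] = new_text
    return cleaned, backup
```

Keeping the untouched originals in `backup` mirrors the point above about saving the annotated sentences before removing the brackets.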
Looks like there’s something wrong with the Chinese pinyin transliteration... the pinyin of some traditional Chinese characters is not displayed properly :S
True. We installed our own pinyin converter and started using it recently, but perhaps the data we used is not complete...
Anyway, we switched back to the old method, so it is now working again for traditional Chinese characters.
Hi, how about adding Latin as a separate language? It appears to be a core language for almost all of Europe's languages (from a cultural point of view, of course, not a linguistic one). Latin proverbs and sayings are actually the original versions of corresponding idioms in many languages.
As with any other language, we're open to it; we just need people to add some sample sentences in it. As soon as you can provide around 5 sentences, we can consider adding it.
Great, how about:
http://tatoeba.org/eng/sentences/show/349115
http://tatoeba.org/eng/sentences/show/349096
http://tatoeba.org/eng/sentences/show/349347
http://tatoeba.org/eng/sentences/show/349457
http://tatoeba.org/eng/sentences/show/349463
As for the flag, consider the 'SPQR' symbol :)
How about this:
http://en.wikipedia.org/wiki/File:Spqrstone.jpg
http://en.wikipedia.org/wiki/Spqr
Yes, that's exactly what I was thinking about! Now an icon of 20x30 pix :)
Language added. The SPQR file human600 gave did not render properly at 20*30, so we've chosen something else, but if you have something better to propose, tell us :)
Looks great! Thanks!
Why are the flags made lighter than they are in reality? The Dutch flag now looks exactly like the Luxembourg flag.
That's because back at the time when I introduced the flags as the indicator of the language of a sentence, I felt that the real colors were too strong and needed to be lightened a little bit to fit more to the overall colors of the site.
But you're right, it leads to issues such as having a flag of a country looking like the one of another country.
But anyway, if the Luxembourgish language is added someday, its flag will be lightened too, so it will still look lighter than the Dutch flag :) (and the contrast between red and blue is kept, as the red is also lighter)
The readings for <arabic numeral>日 are incorrect, but those for <kanji numeral>日 are correct. Figured I'd leave a note here rather than on some particular entry, as it applies to many.
Thanks for the note. I've corrected them.
Great! Furthermore, <numeral>日間 should also have the same reading as <numeral>日+かん. When there is no numeral, it is read ひあい.
Further <numeral>日 exceptions:
*一日の長(いちにちのちょう)
*一日一日(いちにちいちにち)
*一日一夜(いちにちいちや)
*一日増しに(いちにちましに)
*一日置き(いちにちおき)
*一日片時(いちにちへんじ)
*一日路(いちにちじ)
*一日千秋(いちじつせんしゅう/いちにちせんしゅう)
*一日を過ごす(いちにちをすごす)
*一日三秋(いちじつさんしゅう/いちにちさんしゅう)- anthy only gives the kanji when inputting the former reading.
*一日一善(いちにちいちぜん)
*一日二日(いちにちふつか)
*三日月(みかづき)
*七日鮫(なのかざめ)- anthy cannot produce this.
I didn't dig any deeper into this, but I imagine there are loads of special cases to be found.
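One way to handle cases like these is to check a list of whole-word exceptions first and only then fall back to a generic numeral rule, after normalising Arabic and kanji numerals to the same key. The sketch below is only an illustration: the function name and tables are hypothetical, the exception table holds just three entries from the list above, the fallback covers only days 1 to 10, and it assumes the day-of-month reading (so 1日 gives ついたち, not いちにち).

```python
import unicodedata

# Whole-word exceptions taken from the list above; checked before the generic rule.
EXCEPTIONS = {
    "一日一日": "いちにちいちにち",
    "一日置き": "いちにちおき",
    "三日月": "みかづき",
}

# Day-of-month readings for 1-10 (assumption: the date reading is wanted).
DAY_READINGS = {
    1: "ついたち", 2: "ふつか", 3: "みっか", 4: "よっか", 5: "いつか",
    6: "むいか", 7: "なのか", 8: "ようか", 9: "ここのか", 10: "とおか",
}

KANJI_DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
                "六": 6, "七": 7, "八": 8, "九": 9, "十": 10}


def day_reading(word: str):
    """Return a kana reading for <numeral>日, or None if not covered."""
    # NFKC turns fullwidth digits such as "3" into ASCII "3".
    word = unicodedata.normalize("NFKC", word)
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    if not word.endswith("日"):
        return None
    numeral = word[:-1]
    if numeral.isascii() and numeral.isdigit():
        number = int(numeral)
    else:
        number = KANJI_DIGITS.get(numeral)
    # Both numeral styles now share one lookup, so they cannot disagree.
    return DAY_READINGS.get(number)
```

Because Arabic and kanji numerals end up on the same lookup key, `day_reading("3日")` and `day_reading("三日")` return the same reading, while `day_reading("三日月")` is caught by the exception table first.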
And I remembered I didn't reply to you about this.
If you have been reading some of the latest messages you might have guessed that we're in the process of switching to another software for romanization, which will hopefully be more of a long term solution.
With the current one (KAKASI), I've reached the point where adding new rules to get correct romanization was too much of a hassle...
Yes, I'd noticed. Hope the new software proves better. I also noticed that the next release might have kana in addition to romaji which would be a welcome feature.
In fact, the release may at first replace romaji with kana, and if people prefer to have both, then we will work on having both.