
For audio on Tatoeba... has this site been considered for a collaboration of some sort:
http://rhinospike.com
And if recording with a laptop's mic isn't an option, how should the recordings be done? Will it become more convenient in the future?

What about allowing users to directly link to audio files hosted on free (or not) hosting sites that allow audio streaming (for easier integration)?

We've discovered Rhinospike quite recently actually. But you know, we were in the phase of releasing all these new things, we didn't have time to consider contacting anyone (because we still need to live our lives and stuff ^^). Anyway, we just checked their license, it appears they are distributing their audio under CC-BY, just like we do! So yes, we will be contacting them and see what happens :)
Now, concerning not recording with your laptop's microphone, the key point was not really that you cannot record with your laptop's mic, but that *quality* is important. Usually, your laptop's mic will not give the best quality. However, I do not have any experience in sound, and I don't know what the cheapest way to get the best sound is. But this is more something that you'll have to discuss with Shtooka.
Now the problem we're facing is a "Fast, cheap, good. Pick two" type of problem. Tatoeba is cheap for sure. If we pick "fast", then we could have lots of audio, not necessarily optimal for beginners, but always better than nothing. If we pick "good", then the audio we get will be more sustainable, in a way (no need to throw it away and replace it), but this will exclude people who will not have the patience and determination to aim for quality (like 99% of people).
Despite this, I decided to pick "cheap and good". We are not in a hurry, and actually, we NEED time. The whole audio integration process has barely started. We don't really have a good platform to add audio... Right now, it's quite artisanal. You would have to rename your audio files with the id of the sentence and send them to us via email or something, so that we can upload them to our server. The good thing, though, is that Shtooka has good software that will let you record "massively".
Anyway, besides that, audio is not really the "core" function of Tatoeba. Not that I don't want it, I've ALWAYS wanted it. But Tatoeba itself still needs a lot of improvements in order to reach a broader audience... not just in terms of features but also in terms of performance (that is, making it faster, and making sure it doesn't crash if we suddenly have thousands of visitors a day).
We decided to introduce audio in order to get awareness and mostly feedback (just like you did), so that we can start thinking about how we can adapt Tatoeba to make the audio integration easier. But things won't really be settled before AT LEAST a couple of months ^^

Congratulations on the new kana versions of Japanese sentences. I like them much better than the old romaji. I see that it's overdoing the spaces a little, e.g. 背の高さ is becoming せ の たか さ when it's really a single "word", but that is really a minor point.
The generated simplified Chinese looks good too (not that I can make out too much of it.)

For those wondering, the wall is now ordered by last reply and should be a bit faster than before :)

* To those who can link/unlink.
You have to be careful when unlinking. In order not to get it wrong, always link everything you can before you unlink.
The thing is, if you have the chain A-B-C, and you cut A-B, you won't be able to link A-C because C will not appear anymore.
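To make the A-B-C case concrete, here is a rough sketch in Python. It is only a toy model, not how Tatoeba is implemented: the `LinkGraph` class and its method names are made up for illustration, and it assumes that a sentence page shows direct translations plus translations of translations, and nothing further away.

```python
from collections import defaultdict


class LinkGraph:
    """Toy model of sentence links (hypothetical, not Tatoeba's real code)."""

    def __init__(self):
        self.links = defaultdict(set)

    def link(self, a, b):
        # Links are symmetric: A-B is the same as B-A.
        self.links[a].add(b)
        self.links[b].add(a)

    def unlink(self, a, b):
        self.links[a].discard(b)
        self.links[b].discard(a)

    def visible_from(self, sentence):
        # Assumption: a page shows direct translations and
        # translations of translations, but nothing beyond that.
        direct = set(self.links[sentence])
        indirect = set()
        for other in direct:
            indirect |= self.links[other]
        return (direct | indirect) - {sentence}


# Wrong order: cutting A-B first makes C disappear from A's page.
wrong = LinkGraph()
wrong.link("A", "B")
wrong.link("B", "C")
wrong.unlink("A", "B")
print(wrong.visible_from("A"))  # C is gone, so A-C can no longer be linked

# Right order: link A-C while C is still visible, then cut A-B.
right = LinkGraph()
right.link("A", "B")
right.link("B", "C")
right.link("A", "C")
right.unlink("A", "B")
print(right.visible_from("A"))
```

In the second run A keeps its direct link to C, which is the point of linking everything you can before unlinking.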

So, any chance of 'trusted user' status? There's a sentence here
http://tatoeba.org/eng/sentences/show/21085
that needs unlinking from one of the others.

Ah, I forgot to say, even if you have the right to (un)link, you cannot (un)link everything.
If you want to unlink A-B, you have to be the owner of either A or B.

Is the owner of the other sentence informed if you unlink your sentence and his sentence?
If not, I suggest implementing this... even "trusted users" shouldn't be trusted too much ;).
great update, btw =)))!

I trust people a lot :P
But yes, your suggestion will be implemented. It's just a matter of time (as always).

I know, you even trust me :P...

Why? Shouldn't we trust you? :p

Got it. I think that one is sorted out now.
Note that you need to refresh the page after 'owning' a sentence to see the link / unlink icons.

クレジット or 謝辞 ?
Anyone have an opinion on whether クレジット, 謝辞 (or, indeed, something else entirely) should be used with the following interface elements?
https://translations.launchpad....AC%9D%E8%BE%9E

Hellu!
How can I help to translate the Tatoeba interface into Italian?
There are a few errors and not everything is translated.
thanks!

You need to register here, as this is the service we use to manage the translation of tatoeba
https://translations.launchpad..../+translations
and then from there you can translate the interface/ correct mistakes
feel free to ask if you have questions :)

Very goood, I've started with the translation!
When will everything be updated?
Anyway, it's a work in progress (but I'm really interested in completing it), because I can't find some of the sentences on the site >.<

cool
It will be updated soon :p
See http://tatoeba.org/fre/wall/show_message/360 when you can't find some sentences, and if you still can't find them, ask us ;-) You can send me or Trang a PM.

(just trying a thing)

Regarding translations, I noticed that the German ones all seem to use "du" rather than "Sie". Is that just the way the cards fell and best left alone?

Yes in general it's better to leave alone a sentence if by itself it doesn't have any spelling or grammatical mistake. Never mind if it's a correct translation or not.
=> http://blog.tatoeba.org/2010/02...eba.html#rule5
We will soon introduce a way to "unlink" sentences for problems related to incorrect translations.
Anyway, if you would like to see more "Sie", you have the right to add another translation. It's not forbidden to have two German translations for the same English sentence :)

Sorry, that may have been a bit too unclear. I'm referring to the Tatoeba interface translation, not the sentences in the database.

Oooh, okay. Haha.
Well my friend Muiriel told me that it is more likely to see the use of "du" on German Web pages. She was the one who translated most of it. And as far as I'm concerned I can't really have an opinion on the matter, I'm nowhere near fluent in German.
But if you (or other German speakers) have an opinion on this (du vs. Sie), feel free to express it. We're not reluctant to change if there's a good reason for it.

Fair enough; I don't read enough German websites to contradict your friend. Of course commercial sites would refer to their customers with Sie, but Wikipedia seems to mainly use du in help and project namespaces (with occasional Sie's). The semi-commercial dict.leo.org, however, also uses Sie, but there may well be plenty of other examples of similar sites that use the informal. I'll ask around as well.

I think it's best to use the informal. The formal in German is dying out, especially among the young generation. I'd say it's mostly young people who are using this site, anyway.
Maybe it could be decided on by sentence. Like "Hey, let's go to the movies" would be informal because you're rarely going to say that to someone you don't know, and "I'd like to have a talk with you about my salary" would be formal, because it's in a business situation.

Re Trang's: "Yes in general it's better to leave alone a sentence if by itself it doesn't have any spelling or grammatical mistake. Never mind if it's a correct translation or not."
I think an exception to this is when there is a Japanese-English pair which have different meanings. Although Paul and I have removed or corrected hundreds of these over recent years, there are still quite a lot there. Our approach has been to amend one or both so that they agree in meaning. I think this needs to continue. When there is a 3rd or 4th language sentence linked, and that sentence has been translated from the English, it is more complicated. Then I think it is better to add an extra English sentence which matches the Japanese, then delink the non-matching pair.
I regularly encounter Japanese sentences with spelling mistakes, e.g. 性交 [8-}] when it should have been 性行 (they are pronounced the same). Then there is no alternative to correcting the Japanese.

Oh, I saw the discussion just now^^:
As Swift mentions, Wikipedia uses "du" when appealing to its contributors. Facebook also uses du. I just decided to follow their example as it seems the most common solution to me and as I also prefer "du". But if anyone doesn't like it, let's discuss it :)!
Trang, did you synchronize my latest Launchpad translations? The status is "Translation unchanged since last synchronized", however I see many English sentences in the German version that I think I have translated?
It would be great if other native speakers could help me translate Tatoeba completely and better :)!
https://translations.launchpad.net/tatoeba
What I have already translated could surely be improved in many cases. And some things I haven't translated at all, because I haven't come up with a good solution so far.

I think "Sie" sounds much too reserved, communitywise "du" would be the better choice. But I don't visit many German websites either.
@Muiriel: I'll take a look, maybe I'll translate a few more sentences today. There's still a long morning ahead, after all. ;)

Good, it seems there's a fair consensus on this, then. Thanks for the input.

I didn't update the translations yet. It will be done this Saturday, some time in the afternoon. If you can translate more in the meantime, go ahead :P

I can't edit translations? I make myself owner of a sentence, edit the false English translation to make it better, but it reverts to the previous translation. Am I missing a step here?

No it's actually a bug. We're aware of this and it will be fixed this weekend (or perhaps sooner). We'll let you know. Sorry for the inconvenience =/
In the meantime, the best way to proceed is, first to click on the sentence in order to view it in the "Browse" page (the page where there are also the comments and the logs displayed). Then adopt the sentence. Then refresh the page. Then edit the sentence.
I know it's quite annoying, but it's the only (temporary) solution.

Merci TRANG, for all your hard work on the site.

The bug is corrected :)
And you can also thank sysko, he did a lot of work as well.

Confirmations in LaunchPad?
OK, quick question here. I've made a lot of 'awaiting confirmation' translation suggestions in LaunchPad but it doesn't seem like they will be confirmed (or corrected) any time soon. I guess there aren't many native Japanese speakers with a lot of free time hanging around here. Do you think it would be best to 'confirm' them, even though I can't guarantee good Japanese, in order to get the interface translation finished quicker?
It would, of course, still be possible to correct translations later anyway.

I think, as you've said, we don't have many native Japanese speakers, so for now "confirming" them is the best solution. With an interface even in "not perfect" Japanese, we're more likely to attract more Japanese users, and I think they will report any mistakes they see to us.

Thanks for taking on the task of translating into Japanese by the way ^^
And yes, I agree that it's better to "confirm". We're not going to be very perfectionist on the interface translations, even if it may not look very "serious" from a language website NOT to have correct translations...
But well, for the reason sysko mentioned, it's better than nothing.

A proposal for the Japanese romanisation:
I think both romaji and kana readings should be shown on the site. While there's some agreement that serious students of the language should be reading kana, having the romaji makes the site more accessible for silly things like learning to say "I love you" in twenty languages.
As for how to generate it, I'd suggest using the B lines where possible. If there's no B line, or if the B line does not match the text (for instance because of names in the sentence), generate the reading with MeCab, which looks pretty solid.
This may require names and other unindexed items to be added to a B line if the romanisation needs to be corrected. Just the reading will do.
Does this sound reasonable?
I'm motivated to work on this if necessary, but it will probably be a little while before I have the time. First up would be to create the B line to reading converter, and then use it to test MeCab's accuracy on our data.
There are entries like '|1' in front of most parentheses that aren't described at http://www.edrdg.org/wiki/index.php/Tanaka_Corpus. I'm guessing they're indices for the reading?

Much as I love those indices in the B-lines (I invented them years ago), I think it might be better to go straight to MeCab. There are some tricks you would need to apply, e.g. where MeCab says a particle is "助詞,格助詞" you would leave it with spaces around it, and where it is "助詞,接続助詞" you would attach it to the preceding word.
Those "|1" are an artifact of the days when Paul Blay was maintaining the indices in MSAccess and needed a way of disambiguating some words. They are not carried through to the B lines in WWWJDIC (I didn't know what they were until Trang explained them to me.)
If you are using MeCab, use the IPADIC version.
The B-lines would be necessary if you were to create links to WWWJDIC, as MeCab breaks up expressions/compound nouns/etc.
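The particle trick mentioned above could look roughly like this. It is only a sketch: it does not call MeCab at all, the `space_tokens` helper is a made-up name, and the token list is hand-written in the style of IPADIC feature strings rather than copied from real MeCab output. The only rule it applies is the one described above: a 助詞,接続助詞 token is glued to the preceding word, and everything else (including 助詞,格助詞) keeps a space around it.

```python
# Feature prefixes whose tokens get attached to the preceding word.
# Only the case named in the post is listed; more could be added.
ATTACH_TO_PREVIOUS = ("助詞,接続助詞",)


def space_tokens(tokens):
    """Join (surface, feature) pairs into a spaced reading.

    `tokens` is a list of (surface, feature) tuples, where `feature`
    is a comma-separated part-of-speech string.
    """
    words = []
    for surface, feature in tokens:
        if words and feature.startswith(ATTACH_TO_PREVIOUS):
            # Conjunctive particle: glue it onto the previous word.
            words[-1] += surface
        else:
            # Everything else stays a separate, space-delimited word.
            words.append(surface)
    return " ".join(words)


# Hand-written sample, not real MeCab output.
sample = [
    ("ねこ", "名詞,一般"),
    ("が", "助詞,格助詞"),
    ("さかな", "名詞,一般"),
    ("を", "助詞,格助詞"),
    ("たべ", "動詞,自立"),
    ("て", "助詞,接続助詞"),
    ("ねる", "動詞,自立"),
]
print(space_tokens(sample))
```

Other cases would presumably need their own rules, so treat the tuple of prefixes as a starting point.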

One of us has begun looking into replacing kakasi, but we have never used MeCab. So if you want, I can give you his email address so that you two can work out how to use/configure MeCab?

Sure. I have only used it as a command-line tool (and in shell scripts), but I see it has bindings for Python, Perl, Ruby and Java. I just installed it with apt-get (Debian). I use the ipadic (mecab-ipadic) rather than the default juman dictionary.
You need to make sure you get the utf-8 configuration. Mine is euc-jp.
It's simple to use: "echo 日本語の分節 | mecab".
Feel free to ask.

The output seems to contain only katakana. Is there a way to have hiragana?

Only by doing your own conversion - those morphological analysis systems don't really care whether it's one or the other.
In EUC-JP and in raw Unicode the conversion is simple, e.g. あ is 3042 and ア is 30A2 and so on. It's a little more messy in UTF8 but quite doable with a simple algorithm. Of course where it is katakana in the original text, you would leave it that way.
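For what it's worth, the conversion can be sketched in a few lines of Python, working on Unicode code points so the UTF-8 messiness never comes up. The function name is made up, and this is a minimal sketch rather than anything Tatoeba uses: it relies on the katakana block (ァ U+30A1 to ヶ U+30F6) sitting exactly 0x60 above the matching hiragana, and it leaves everything else alone, including the long-vowel mark ー.

```python
KATAKANA_START = 0x30A1  # ァ
KATAKANA_END = 0x30F6    # ヶ
OFFSET = 0x60            # distance between the katakana and hiragana blocks


def katakana_to_hiragana(text):
    """Shift katakana letters down to hiragana, leaving other characters as-is."""
    return "".join(
        chr(ord(ch) - OFFSET) if KATAKANA_START <= ord(ch) <= KATAKANA_END else ch
        for ch in text
    )


print(katakana_to_hiragana("ニホンゴ"))
print(katakana_to_hiragana("コーヒー"))
```

To keep words that were katakana in the original text, you would simply skip this call for tokens whose surface form is already katakana.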

OK, that's what we did while waiting for your answer, so we will keep it :)
So it's highly probable that it will be included in the next release.