Wall (1 discussion)

Tips

Before asking a question, make sure you have read the FAQ.

Our aim is to maintain a healthy atmosphere for civilized discussion. Please read our rules against bad behavior.

MUIRIEL, 7 March 2010 at 09:44:26 UTC

Great updates :)!!!

cburgmer, 7 March 2010 at 10:07:32 UTC

+1 :)

TRANG, 9 March 2010 at 21:19:48 UTC

+1 :D

sysko, 9 March 2010 at 22:27:31 UTC

hope you will love the next ones too ^^

Hautis, 31 January 2010 at 10:05:39 UTC

Hi, I spotted a small bug in the list which you get when searching for "Example sentences with the words:".

When giving a new version of the original language, this sentence appears "Are you sure you want to translate this sentence into a sentence in the same language?", but I cannot click on the OK-button. In other views it works fine.

I use the latest Firefox on Mac OS X.

Cheers.

TRANG, 9 March 2010 at 21:33:13 UTC

This problem should not happen anymore :) We simply took out the "same language" warning because it is no longer useful.

xtofu80, 8 March 2010 at 16:16:41 UTC

A general question about Chinese:
Is there a policy regarding simplified and traditional Hanzi?
I just found a sentence (nº346168) with traditional Hanzi posted by someone from Hong Kong. So far the "policy" seems to be that both scripts are put under the same category. This might be a burden for people learning the language who do not recognize the differences.

sysko, 8 March 2010 at 16:27:39 UTC

The plan is to specify the script in the sentence's information, and to make it possible to switch from one script to the other while viewing the sentence.

xtofu80, 8 March 2010 at 23:17:18 UTC

That would be really awesome if you could implement this.

sysko, 9 March 2010 at 22:28:51 UTC

Yep, I will try to do this ASAP. As a Chinese learner, it's also something I have wanted for a long time ^^

sysko, 20 March 2010 at 22:46:37 UTC

Implemented :)
Chinese sentences now have a 汉 icon when they are simplified and a 漢 icon when they are traditional. Moreover, in addition to the pinyin, the equivalent of the sentence in the other script (traditional for a simplified sentence) is also displayed.

http://tatoeba.org/eng/tools/index

New tools for Chinese have also been added :)

It's just a draft, so if there's something missing, or if you see some possible improvement, tell me and I will include it ;-)
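
To make the script detection described above concrete, here is a minimal sketch in Python. It is not Tatoeba's implementation: the character table `PAIRS` and the function names are hypothetical, and the table is a handful of hand-picked pairs rather than a complete mapping. Character-by-character conversion also ignores the cases where one simplified character corresponds to several traditional ones.

```python
# Illustrative, hand-picked (simplified, traditional) character pairs.
# Assumption: a real tool would need a complete mapping table.
PAIRS = [("汉", "漢"), ("语", "語"), ("学", "學"), ("这", "這"), ("说", "說")]

SIMPLIFIED_ONLY = {s for s, _ in PAIRS}
TRADITIONAL_ONLY = {t for _, t in PAIRS}
TO_TRADITIONAL = {s: t for s, t in PAIRS}
TO_SIMPLIFIED = {t: s for s, t in PAIRS}


def detect_script(sentence: str) -> str:
    """Return 'simplified', 'traditional', 'mixed' or 'unknown'."""
    chars = set(sentence)
    has_simplified = bool(chars & SIMPLIFIED_ONLY)
    has_traditional = bool(chars & TRADITIONAL_ONLY)
    if has_simplified and has_traditional:
        return "mixed"
    if has_simplified:
        return "simplified"
    if has_traditional:
        return "traditional"
    return "unknown"


def other_script(sentence: str) -> str:
    """Convert character by character to the opposite script."""
    if detect_script(sentence) == "simplified":
        table = TO_TRADITIONAL
    else:
        table = TO_SIMPLIFIED
    return "".join(table.get(ch, ch) for ch in sentence)
```

With this table, `detect_script("我学汉语")` reports a simplified sentence, and `other_script` produces the traditional form that would be displayed next to it.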

blay_paul, 6 March 2010 at 19:26:24 UTC

Advice for those leaving corrections in comments.

I strongly suggest you 'favorite' any sentences that you leave corrections for so you can check whether they are actually corrected or not.

sysko, 6 March 2010 at 19:53:19 UTC

good suggestion

xtofu80, 6 March 2010 at 16:25:01 UTC

Does anyone else have problems with the updated version as well?
I use Ubuntu 9.10 with Firefox 3.5.8, and since the update today I cannot add new sentences. For each sentence I see this turning circle on the left (before, I saw it only when I edited the sentence), and pressing "submit translation" does not seem to work.

TRANG, 6 March 2010 at 17:02:27 UTC

Ah, well try doing CTRL+F5 on the homepage (or any page where you systematically see the loading animation when it shouldn't be there).

The problem is that your browser is still displaying from an old CSS file, and so the layout will look strange. You have to force Firefox to update its cache.

blay_paul, 6 March 2010 at 17:14:26 UTC

That fixed it, thanks.

sysko, 6 March 2010 at 18:21:37 UTC

Yep, this time the problem comes from Firefox.
We will try to figure out how to force Firefox to update its cache without requiring any intervention from the user.
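
One common way to get a browser to refetch a changed stylesheet without a manual CTRL+F5 is to put a content hash in the URL, so that a new file means a new address. The sketch below only illustrates that general technique. It is not what Tatoeba did, and the `versioned_url` helper and the `v` parameter name are assumptions.

```python
import hashlib


def versioned_url(url: str, content: bytes) -> str:
    """Append a short content hash so a changed file gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    # Use '&' if the URL already carries a query string, otherwise '?'.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}v={digest}"
```

A template would then link to something like `versioned_url("/css/layout.css", css_bytes)` (a hypothetical path). The URL stays stable while the file is unchanged and changes as soon as the CSS does.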

lilygilder, 6 March 2010 at 16:56:37 UTC

Yeah, me too. I work with Vista and Firefox 3.5.7 and can't submit any translations either.

blay_paul, 6 March 2010 at 16:57:15 UTC

I suspect some sort of database glitch. I'd give TRANG and sysko a while to fix it before worrying too much.

blay_paul, 28 February 2010 at 10:29:25 UTC

Favourite'd sentences not working.

I set a number of sentences as 'favourite', but when I use the link in my account page it says "This user does not have any favorites."

sysko, 6 March 2010 at 16:05:43 UTC

Tatoeba has been updated; this bug should be fixed now.

xtofu80, 9 March 2010 at 19:58:16 UTC

I can only see the first ten favourite sentences. The other sentences are not accessible, as far as I see.

sysko, 28 February 2010 at 12:19:44 UTC

Yep, the issue has been fixed and favorites will work again in the next release, which should not be long now. Sorry to make you wait.

xtofu80, 4 March 2010 at 12:21:03 UTC

I found some incorrect sentences linked together, but don't know how to resolve the problem:
Sentence nº313285 "no-smoking area" is correctly translated in
Sentence nº90428 as "禁煙区域" but incorrectly in
Sentence nº90427 as "禁猟区", which means no-fishing zone, or no-hunting zone.

The best solution would be to cut the link between nº313285 and nº90427. Or should I delete nº90427 and add it again as the correct translation of my German sentence about the hunting zone?
Greetings

blay_paul, 4 March 2010 at 12:35:11 UTC

I think the Japanese sentence with 禁猟区 is a kind of typo and I suggested it be deleted. I added a new pair of sentences for 禁猟区 to replace it.

If you think the German sentence is worth keeping then you can do so, but I think the Japanese version was a bit odd.

Nemo, 1 March 2010 at 03:54:10 UTC

I've looked through a lot of contributions, and I've come to the realization that there are a LOT of contributions in English made by non-native speakers. I assume the same is the case for other languages, especially Japanese. There needs to be some sort of indicator for each sentence on whether or not the last editor was a native speaker. I've seen a lot of English sentences that are perfectly grammatical, with no errors at all, that I have never in my entire life heard someone utter -- correct or not, a native speaker would never say them.

sysko, 1 March 2010 at 09:05:58 UTC

As the major part of both the Japanese and the English comes from the Tanaka Corpus, I can understand that the English is not really reliable. But for most other languages (Spanish, German, Polish, Chinese), I can say that 99% has been added by native speakers.

I agree we need a way to specify whether a sentence has been added or reviewed by a native speaker. We're currently thinking about a nice way to do that, maybe something to tag some sentences as "trust".

Anyway, for the moment one can assume that sentences which belong to someone are much more reliable than orphans.

blay_paul, 1 March 2010 at 11:32:44 UTC

> As the major part of both japanese and english come from
> the tanaka corpus, I can understand that the english is
> not really reliable.

The Tanaka Corpus was, initially, generated by students submitting pairs of sentences with the intent that the Japanese and English meant the same thing. So the Japanese is marginally more reliable than the English because the person entering it was Japanese.

However you cannot assume that the Japanese is correct and the English unreliable all the time. It's more complicated than that.

xtofu80, 6 March 2010 at 16:09:00 UTC

Being a native German speaker, I came across both Japanese and English sentences which I felt were not correct, although I was not 100% sure.
It would be a cool feature if non-natives could mark a sentence as "questionable", and then this sentence could be checked and corrected or verified by a native speaker. I suppose this would be rather easy to implement using the word list feature. So a non-native speaker would not correct a sentence which he is not 100% sure about, but put it into this list, and native speakers could occasionally go through the list and check for grammatical errors. This would drastically improve the quality of the sentences, if the feature is known and used by most users.

blay_paul, 6 March 2010 at 16:54:55 UTC

Just post saying that you're not sure they are correct. There are enough native speakers of English to check the English speakers (and, though I may be biased, I think I'm good enough at Japanese to usually have a good idea as to whether a sentence is OK).

blay_paul, 26 February 2010 at 14:15:19 UTC

Romaji generator.

Are the tool and data used available online anywhere?

sysko, 28 February 2010 at 00:03:05 UTC

yep it's from the kakasi project
http://kakasi.namazu.org/
as said before, the project seems more than dead :(

blay_paul, 28 February 2010 at 09:12:54 UTC

Oooh yes. I remember this now.

The bad news is that kakasi probably isn't really fixable. I think you'd need to rewrite the code in a major way, not just add a few lines to the dictionary, to fix it.

The good news? Removing the line
ぜつ 絶
from the file 'kakasidict' may correct one romaji error in generated romaji.
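
The dictionary edit suggested above could be scripted roughly as follows. This is only a sketch: the `remove_entry` helper is hypothetical, it assumes one whitespace-separated "reading kanji" entry per line as in the line quoted above, and it works on text that has already been decoded (the file's encoding is not addressed here).

```python
def remove_entry(dict_text: str, reading: str, kanji: str) -> str:
    """Drop every line whose first two fields are exactly (reading, kanji)."""
    kept = []
    for line in dict_text.splitlines(keepends=True):
        if line.split()[:2] == [reading, kanji]:
            continue  # this is the entry to remove
        kept.append(line)
    return "".join(kept)
```

Calling `remove_entry(text, "ぜつ", "絶")` leaves every other entry untouched, including longer words that merely contain 絶.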

sysko, 28 February 2010 at 12:21:50 UTC

I think we can also try to find out whether there are people motivated to start a project for automatic romanization of Japanese, or check whether the embryo of such a project already exists and see how we can help.

contour, 28 February 2010 at 17:54:22 UTC

For now, if there was the possibility to enter the romaji explicitly, and if manually entered and automatically generated romaji could be separated, that should make for a good test set for evaluating different methods for automatic generation.

I think that ideally one would start with a mature project, and automatically add corrections to the training set.

blay_paul, 28 February 2010 at 19:20:08 UTC

> I think that ideally one would start with a mature
> project, and automatically add corrections to the
> training set.

There are six main approaches that could be taken.
1. Drop romaji support.
2. Allow manual correction of romaji.
3. Develop romaji generation code that uses the WWWJDIC index line.
4. Further develop kakasi.
5. Look for alternative romaji conversion software.
6. Develop romaji conversion software from scratch.

I would recommend 1, 2, or 3.

Option 4 could be done, but I think you would soon reach limits on what is achievable.

sysko, 1 March 2010 at 00:09:20 UTC

(I don't speak Japanese at all, so excuse me if I'm talking nonsense.)

Nemo talked about JUMAN as a replacement for kakasi, which can output kana.
Isn't kana better? By restricting the "reading" part to kana characters, we're sure that people who can't write Japanese will not "accidentally" mess up the "romanization", and we're also sure that people use the same convention, as there's only one kana per "sound".
(Trang always talks about the different ways to write romaji.)

What do you think? Trang?

Nemo, 1 March 2010 at 03:11:15 UTC

I think I should give a little more information than I did in my past posts, because there seems to have been little progress. I don't want to come off as harsh, but the reality is that KAKASI is a lost cause. Whoever coded the program did so in a very naive way. Using sed to correct its errors would take an inordinate amount of both human and CPU time and, even in the best-case scenario, would put such undue load on the server as to make Tatoeba unusable.

I've gotten the impression that KAKASI was chosen with little to no consideration of other options (cf. below), despite the fact that there exist ways to accurately dissect Japanese text into parseable units, which could then be converted into romaji. KAKASI is nowhere near mature enough to produce accurate results, and as an abandoned project there is little hope of it reaching that maturity: its output will never get any better than it is.

In contrast, Juman seems to be near-perfect, though I will admit that I have not tried the other romanizers suggested in the blog post, nor have I done extensive testing of Juman. Regardless, Juman seems to be acceptable, even optimal. KAKASI falls so far short of the mark that I'm not sure why it is even in use. I would even go so far as to say that if KAKASI remains the method of conversion, then by the time Tatoeba becomes popular, Greasemonkey scripts will be produced that correct the romaji via some other means, if that's even feasible.

(Here's the blog post I referenced: http://blog.tatoeba.org/2009/02...anization.html )

TRANG, 1 March 2010 at 20:37:21 UTC

Yes, to be honest, KAKASI was chosen with no consideration of other options. It was the first one I found when I searched for a romaji converter, so I picked it.

Only later did I write this blog post, where I actually searched for and listed other solutions. Solutions that I should have explored but never had the time to =/
I completely agree with you that KAKASI is not the long-term solution.

Anyway, considering you have been taking the time to write all these posts, I will take a look at Juman ;). But if you can just tell me quickly what command to use to get a Japanese sentence parsed and converted into kana, that can save me some time from going through the documentation. Ah, but does JUMAN support UTF-8...?

contour, 1 March 2010 at 22:19:40 UTC

From a quick look, it looks like you have to convert to and from EUC-JP. Piping a sentence through "juman -b -c" gives one line per word, with readings in the second of the space-separated columns.
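
A rough Python sketch of the pipeline just described, for anyone who wants to try it. Only the `juman -b -c` invocation, the EUC-JP round trip and the "second space-separated column" rule come from the post above. The function names are hypothetical, and skipping lines with fewer than two fields (such as an end-of-sentence marker) is an assumption about the output.

```python
import subprocess


def parse_readings(juman_output: str) -> list[str]:
    """Collect the second space-separated column of each word line."""
    readings = []
    for line in juman_output.splitlines():
        fields = line.split(" ")
        # Assumption: lines with fewer than two fields (for example an
        # end-of-sentence marker) carry no reading and are skipped.
        if len(fields) < 2:
            continue
        readings.append(fields[1])
    return readings


def kana_reading(sentence: str) -> str:
    """Pipe one sentence through `juman -b -c`, via EUC-JP both ways."""
    result = subprocess.run(
        ["juman", "-b", "-c"],
        input=(sentence + "\n").encode("euc_jp"),
        capture_output=True,
        check=True,
    )
    return " ".join(parse_readings(result.stdout.decode("euc_jp")))
```

`kana_reading` needs the `juman` binary on the PATH. `parse_readings` can be tried on its own with any text laid out as one word per line.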

Nemo, 2 March 2010 at 02:03:07 UTC

There's a PowerPoint tutorial; I'll look at it when I have time and translate it. The translated user guide focuses on the whole idea behind the system and why and how it was developed, and then when it comes to the syntax, it's just a bunch of "I don't know this word" and "If you break this down, it would mean something like..."

Nemo, 1 March 2010 at 02:51:08 UTC

If Juman's kana/categorization output is accurate, it can produce 100% perfect romaji output. Kana give a representation of how something is said, along with its syntactical representation. There are ambiguities in kana, but JUMAN gives enough information that the pronunciation and syntax can be reconciled to provide a perfect, phonetic, romanization.

TRANG, 1 March 2010 at 22:00:51 UTC

My ideal approach would be using the WWWJDIC indices, combined with better software for conversion into romaji or kana.

As for making romaji editable, if we were to make anything editable, I'd rather it be kana, like what sysko suggested.

If the purpose is to provide something useful for learning, then it's obviously better to have a sentence in kana, with spaces so that the learner knows how the sentence is composed. And of course we can use the sentence in kana to generate correct romaji.
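
As a rough illustration of generating romaji from a spaced kana sentence, here is a minimal longest-match sketch in Python. The table is a tiny hypothetical sample, not a full kana chart, and it does not handle the ambiguities mentioned earlier in the thread (for example particles or long vowels). Spelling きょう as "kyou" is just one of several possible conventions.

```python
# Tiny illustrative table (hypothetical and far from complete).
KANA_TO_ROMAJI = {
    "きょ": "kyo", "しゃ": "sha",
    "き": "ki", "し": "shi", "う": "u", "に": "ni",
    "ほ": "ho", "ん": "n", "ご": "go", "を": "o",
}
MAX_KEY_LENGTH = max(len(key) for key in KANA_TO_ROMAJI)


def kana_to_romaji(text: str) -> str:
    """Convert kana to romaji, trying the longest table entry first."""
    out = []
    i = 0
    while i < len(text):
        for size in range(MAX_KEY_LENGTH, 0, -1):
            chunk = text[i:i + size]
            if chunk in KANA_TO_ROMAJI:
                out.append(KANA_TO_ROMAJI[chunk])
                i += len(chunk)
                break
        else:
            # Unknown characters (spaces, punctuation) pass through as-is.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Because spaces pass through unchanged, the word boundaries in the kana sentence carry over to the romaji, which is the point of keeping the kana spaced.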

TRANG, 23 February 2010 at 21:25:16 UTC

That took me forever to write but hopefully it will prevent us from explaining certain things over and over again: http://blog.tatoeba.org/2010/02...n-tatoeba.html

sysko, 24 February 2010 at 00:36:04 UTC

WOooOW, I haven't read it entirely yet, but you've done a really damn good job. Many thanks to Trang :)

lilygilder, 24 February 2010 at 01:15:26 UTC

Thanks Trang, this clears up a lot of problems. Very helpful. =) And kudos for writing all of this.