Wall (6,960 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

CK, June 10, 2010 at 1:29:24 AM UTC (edited October 25, 2019 at 8:10:02 AM UTC)

[not needed anymore- removed by CK]

TRANG, June 10, 2010 at 11:36:05 AM UTC

But that would imply that everyone uses Firefox (or has to) ^^

Also, since Tatoeba has a broader public than students in a classroom, it wouldn't necessarily be a good thing to drop the furigana. A lot of people would rather have something that is 80-90% accurate than not having anything at all, because it saves them time. And in the end, it's people's own responsibility to decide whether they want something perfect or not.

Generating furigana is not what slows down Tatoeba the most. You wouldn't see a difference in speed if we took it out, so speed alone wouldn't justify removing it.

But as a teacher, you can (and actually *must*) educate your students not to rely on the furigana line, and use Rikaichan instead. It's not incompatible. I use it myself (and always did) whenever I want to figure out the reading of a Japanese sentence in Tatoeba, despite the fact that the reading is already displayed.

CK, June 9, 2010 at 5:05:24 PM UTC (edited October 25, 2019 at 8:10:08 AM UTC)

[not needed anymore- removed by CK]

sysko, June 9, 2010 at 5:21:26 PM UTC

Once tags are added, we will be able to do the following:
use tags such as "unsuitable_for_children", etc., and maintain a list of tags that will not be accessible to non-users or to users who have not activated the "unsafe search" option.
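
A minimal sketch of the tag-based filtering described above. Everything except the tag name "unsuitable_for_children" is an assumption made for illustration: the `RESTRICTED_TAGS` set, the `User` and `Sentence` classes, and the `unsafe_search` flag are hypothetical and not taken from Tatoeba's code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical list of restricted tags; only this one tag is named in the post.
RESTRICTED_TAGS = {"unsuitable_for_children"}


@dataclass
class User:
    name: str
    unsafe_search: bool = False  # assumed opt-in setting, off by default


@dataclass
class Sentence:
    text: str
    tags: Set[str] = field(default_factory=set)


def visible_sentences(sentences: List[Sentence], user: Optional[User]) -> List[Sentence]:
    """Return the sentences a visitor may see; user=None means not logged in."""
    allow_restricted = user is not None and user.unsafe_search
    return [s for s in sentences if allow_restricted or not (s.tags & RESTRICTED_TAGS)]
```

With this shape, a visitor who is not logged in and a member who has not enabled the option get the same filtered view.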

Pharamp, June 10, 2010 at 10:24:18 AM UTC

Yes.
I am very, very small.

CK, June 10, 2010 at 9:21:49 AM UTC (edited October 25, 2019 at 8:09:55 AM UTC)

[not needed anymore- removed by CK]

sysko, June 10, 2010 at 9:33:26 AM UTC

Simply because we haven't yet found the time to rewrite the hint page :P

Demetrius, June 9, 2010 at 5:58:33 PM UTC

If okurigana is incorrect, can we just file this as bugs in MeCab?

blay_paul, June 9, 2010 at 6:57:14 PM UTC

In theory. However they are not really bugs in MeCab, but problems resulting from the dictionary used with MeCab. The dictionary used can both be selected (from a very short list ;-) and can be altered or aided by user-defined dictionaries.

So really what it needs is someone familiar with MeCab to find the best dictionary available and to add fixes for the problems noted.

However, it is probably never going to be possible to be 100% accurate, so to get the best results, manual corrections will be needed at some point.

JimBreen, June 10, 2010 at 8:44:49 AM UTC

It's not that simple.

Consider the sentence: 君たちの訳文と黒板の訳を比較しなさい。 MeCab suggests わけ as the reading of the solo 訳, whereas we all know it's やく. MeCab's usual dictionary (NAIST-JDIC) has both versions of 訳, and no amount of adding dictionaries is going to "fix" it. MeCab uses some very sophisticated AI to segment sentences, and the dictionaries have parameters derived from training on hand-segmented texts. The trouble is that ...の訳を... could be either, and you need the context of the whole sentence to decide which is which. In fact the weightings for 訳/わけ and 訳/やく as solo lexemes are the same. You could probably fiddle the weights on 訳 to make it produce やく, but most of the solo appearances of 訳 in Tatoeba are in fact わけ.

There is a whole research field of Word Sense Disambiguation (WSD) working on problems related to this, but I don't think there are any packaged solutions for Japanese that can be plugged into Tatoeba. Just be grateful we have MeCab - 20 years ago automatic Japanese segmenters were thought to be impossible to build.

Demetrius, June 9, 2010 at 6:43:11 PM UTC

*furigana

blay_paul, June 9, 2010 at 10:38:48 AM UTC

Help wanted!

I'm looking for someone to help with Tatoeba / WWWJDIC integration. Specifically I'd like someone with database / web experience to work on tools for validating / completing the index data needed to link WWWJDIC dictionary entries to Tatoeba example sentences. If you're interested post here for more details or PM me.

saeb, June 8, 2010 at 4:53:40 AM UTC

revisiting sentence variation...

I came across a sentence in Arabic (thanks to qahwa's comment), one of my own sentences :D. It shows a property of the Arabic script that I want to document with 3 or 4 variations:

هل استلمتَ الرسالة؟
Did you(male) receive the letter?

هل استلمتِ الرسالة؟
Did you(fem) receive the letter?

هل استلمَتْ الرسالة؟
Did she receive the letter?

Without the harakaat (vowel marks), they're all written the same.
Now if I do add them, I'm afraid they'll just get reported as similar sentences and get merged, deleted, etc. (I mean the English sentences, of course.)

What's Tatoeba's 'official' statement on how to deal with this?
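
The concern can be illustrated with a small sketch: the three sentences are distinct strings as typed, but a duplicate check that ignores vowel marks would treat them as one. The helper `strip_harakaat` is a hypothetical name, and nothing here is meant to describe how Tatoeba actually detects duplicates; it only assumes that the harakaat are Unicode combining marks (category `Mn`).

```python
import unicodedata


def strip_harakaat(text: str) -> str:
    """Drop combining marks (Unicode category Mn), which include the Arabic vowel marks."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


variants = [
    "هل استلمتَ الرسالة؟",   # Did you (male) receive the letter?
    "هل استلمتِ الرسالة؟",   # Did you (fem) receive the letter?
    "هل استلمَتْ الرسالة؟",  # Did she receive the letter?
]

# Number of distinct strings with the marks kept, then with the marks removed.
print(len(set(variants)))
print(len({strip_harakaat(v) for v in variants}))
```

A comparison done on the marked text keeps the variants apart; one done on the stripped text does not.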

brauliobezerra, June 8, 2010 at 3:01:09 PM UTC

Similar things happen in Portuguese

Esse é seu brinquedo.

can be

[Hey, you, ]
This is your toy.

[Bob likes to play.]
This is his toy.

[Mary likes to play.]
This is her toy.

[Tex the armadillo likes to play.]
This is its toy.

There are unambiguous ways to say these sentences in Portuguese, but they are not used that often.

MUIRIEL, June 8, 2010 at 8:41:18 PM UTC

I don't think that this is the same problem as in Arabic.
It doesn't cause problems when Portuguese duplicates like your example are merged. But in Arabic it does. The example that saeb posted is the same *without* vowel marks, but with vowel marks it's no longer the same, and the pronunciation isn't the same either. So it *looks* like a duplicate, but it isn't.

Demetrius, June 9, 2010 at 4:07:28 PM UTC

I don’t know about Arabic, but I personally prefer to differentiate sentences that are different in speech.

E.g. in Russian, Belarusian and Ukrainian one doesn’t normally mark stress, but when it’s important, I do (as in the case with зáмок/замóк in sentences No. 385729 and 385728).

brauliobezerra, June 8, 2010 at 2:31:12 PM UTC

New link for the typing game:

http://braulio.home.dyndns.org/

Now a little more user friendly and with longer texts.

xtofu80, June 7, 2010 at 3:04:50 PM UTC

A general remark: Given the huge amount of Japanese data, I think we need more Japanese contributors to help us with corrections and to keep the Japanese sentences more consistent. Talking to my Japanese language partner recently, I came to the conclusion that with its current features, Tatoeba is rather unattractive for Japanese natives who want to learn another language, because most sentences are already translated into Japanese. For example, what would be the benefit for my partner, who is learning German?

blay_paul, June 7, 2010 at 3:28:23 PM UTC

There are probably more sentences in German that aren't translated into Japanese than you realize.

What you can do is ask Trang or Sysko to add a list
"ger->jpn translations needed"

They can filter the data to see exactly what sentences in German aren't linked to a sentence in Japanese.
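
As a rough sketch of that kind of filter, assuming a hypothetical in-memory layout (a mapping from sentence id to language code, plus a set of direct translation links between ids). The ids, the data, and the function name are made up for illustration and do not reflect Tatoeba's database schema.

```python
# Hypothetical data: sentence id -> language code, and direct translation links.
languages = {1: "deu", 2: "deu", 3: "jpn", 4: "eng", 5: "jpn"}
links = {(1, 3), (2, 4), (4, 5)}


def untranslated(source_lang, target_lang, languages, links):
    """Ids of source_lang sentences with no direct link to any target_lang sentence."""
    translated = set()
    for a, b in links:
        for src, dst in ((a, b), (b, a)):  # treat each link as undirected
            if languages[src] == source_lang and languages[dst] == target_lang:
                translated.add(src)
    return sorted(
        i for i, lang in languages.items()
        if lang == source_lang and i not in translated
    )


print(untranslated("deu", "jpn", languages, links))  # [2]
```

Sentence 2 is reported because its only link goes to an English sentence, even though that English sentence is itself linked to a Japanese one.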

xtofu80, June 7, 2010 at 4:14:12 PM UTC

Dear Trang, dear Sysko:
Would it be possible to do this on the contribution webpage, e.g. when one selects "ger->jpn", to show example sentences in German which have no Japanese equivalent, or is this too heavy a burden for the database? As the current system shows only 15 sentences, maybe this might also reduce the complexity of the search.

TRANG, June 7, 2010 at 8:36:10 PM UTC

It's in our plans :)
...and has been for a long long time. But to be honest, I cannot tell you for sure when we can implement it. Perhaps in two weeks, perhaps in one month, perhaps in two months.

In the meantime, I can generate a list of German sentences that have no *direct* translation in Japanese. Most of them will have an indirect translation though.

saeb, June 8, 2010 at 12:17:10 AM UTC

...perhaps in two years :P

blay_paul, June 7, 2010 at 4:44:09 PM UTC

I've done some checking and there are 9570 German sentences that are not translated into Japanese. Starting with 123 and ending with 399357.

I can send you them in an email if you want.

xtofu80, June 7, 2010 at 8:21:01 PM UTC

Yes, please do so, though I think it would be preferable to have a list or even a search function for such sentences.

sysko, July 19, 2010 at 12:06:22 PM UTC

Just to say it's now possible to view all "German sentences not translated into Japanese" here: http://tatoeba.org/eng/sentence...pn/indifferent (it will display indirect translations in Japanese, as this can be a good way to spot sentences that can be linked).

TRANG, June 7, 2010 at 8:34:22 PM UTC

Here are a few things...


1) If your partner has a good level in German, he/she can practice translating from Japanese into German, and you would be checking his/her sentences.

Note that you can view someone's sentences by going to their profile, then clicking on "See this user's contribution" (below the link to send a private message). Then on the user's contributions page, you go to "See all" next to "Latest sentences" (yeah, it's a bit complicated, but we'll improve the profile someday). And you can select "German" to only keep German sentences.


2) Your partner can enter in Tatoeba Japanese sentences that he/she would like to know how to say in German, and you can try to translate his/her sentences. It would be good, of course, to first check whether the sentence already exists.

Basically, you could be chatting together and, as the conversation goes, your partner could add Japanese sentences he/she can't figure out how to say in German. Your partner would then send you the link to the sentence once it's added, and you would translate right after that. This is pretty much what most tandems do anyway, when they chat together "How do you say...?" But instead of keeping that knowledge in your private chat logs, you can share it with the rest of the world :)

It would be a good idea to also make a list out of it, so that you can keep track of the various sentences you learned.
sysko had made such a list: http://tatoeba.org/eng/sentences_lists/show/73/eng
It resulted from him and Dorenda chatting together in IRC.


3) Your partner can practice translating from German to Japanese, while also contributing. Because as I said, most of the German sentences do not have a DIRECT translation in Japanese, but they do have an INDIRECT translation. So your partner could take on the task of linking them.

But not just by reading and saying "okay, those translations match". I can create a deu->jpn list for him/her. He/she would then look at the German sentences from the list, and try to think of what the translation is. Then he/she can click on the sentence to view its details, and most of the time there should be an indirect Japanese translation. If that indirect translation matches what he/she thought of, then he/she can link the two sentences.

The only problem with this is that not everyone can link or unlink sentences yet. Only "trusted users" can (and even then, they can't link or unlink anything). The link/unlink feature requires you to understand very well the structure of Tatoeba, so it's not available to everyone because it's going to be confusing to new users more than it's going to be useful.
But I hope to make the link/unlink feature available to everyone through the concept I just explained.
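
To make the direct/indirect distinction concrete, here is a minimal sketch that treats translation links as an undirected graph: a direct translation is one link away, an indirect one is two links away through an intermediate sentence. The ids, data, and function name are hypothetical and only illustrate the idea of finding link candidates.

```python
from collections import defaultdict

# Hypothetical data: one German sentence linked to English, which is linked to two Japanese ones.
languages = {1: "deu", 2: "eng", 3: "jpn", 4: "jpn"}
links = [(1, 2), (2, 3), (2, 4)]


def indirect_candidates(sentence_id, target_lang, languages, links):
    """Sentences in target_lang reachable in exactly two links but not linked directly."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    direct = set(neighbours[sentence_id])
    two_hops = set()
    for mid in direct:
        two_hops |= neighbours[mid]
    two_hops -= direct | {sentence_id}  # keep only sentences that are not already linked
    return sorted(i for i in two_hops if languages[i] == target_lang)


print(indirect_candidates(1, "jpn", languages, links))  # [3, 4]
```

Each returned id is only a candidate: as described above, a person still has to confirm that the indirect translation really matches before linking.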


So those are the ideas I have in mind. There are certainly more creative ones, but I'd have to think harder ^^

brauliobezerra, June 5, 2010 at 5:05:25 PM UTC

I have a game for you: "Tatoeba Typing"

http://braulio.home.dyndns.org:1004/
or
http://braulio.home.dyndns.org:1004/?lang=fra
http://braulio.home.dyndns.org:1004/?lang=deu
http://braulio.home.dyndns.org:1004/?lang=por
http://braulio.home.dyndns.org:1004/?lang=cmn
etc...

Demetrius, June 7, 2010 at 4:06:11 PM UTC

Please perform the following substitutions to make it usable for Russian, Belarusian and Ukrainian:
« » (non-breaking space) to « » (space)
«—» (M-dash) to «-» (or «--»)
«–» (N-dash) to «-» (I don’t add these, but who knows?)
«, », „, “ (different quotes) to «"».
I prefer to add these when adding Russian sentences, but most people have no way to type them.

Perhaps «’» should be changed to «'», but I'm not sure. Some people have only the first, some only the second. (Microsoft didn't have this in their Ukrainian layouts up to Vista, so some people even use * instead!) The best approach would be to accept this as an alternative in the JavaScript.
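
The substitutions listed above could be sketched as a single translation table. This is written in Python for illustration only (the post suggests doing it in JavaScript), the function name `normalize_for_typing` is hypothetical, and the characters are given as Unicode escapes so that the non-breaking space stays visible.

```python
# Substitutions proposed in the post; the typographic apostrophe is the optional one.
SUBSTITUTIONS = str.maketrans({
    "\u00a0": " ",   # non-breaking space -> plain space
    "\u2014": "-",   # em dash -> hyphen
    "\u2013": "-",   # en dash -> hyphen
    "\u00ab": '"',   # left guillemet
    "\u00bb": '"',   # right guillemet
    "\u201e": '"',   # low double quote
    "\u201c": '"',   # left double quote
    "\u2019": "'",   # typographic apostrophe (optional substitution)
})


def normalize_for_typing(text: str) -> str:
    """Replace hard-to-type characters with keyboard-friendly equivalents."""
    return text.translate(SUBSTITUTIONS)


print(normalize_for_typing("\u00abПривет\u00bb\u00a0\u2014 сказал он."))
# "Привет" - сказал он.
```

Applying the same function to both the target text and the typed text would accept either form of a character, which is one way to treat the apostrophe as an alternative rather than a forced replacement.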

brauliobezerra, June 7, 2010 at 4:34:46 PM UTC

I had noticed this problem, mainly with smart quotes. First I thought it was a simple matter of making some substitutions. But I guess the best solution is to let the user choose what character he/she will use.

For now I will make the substitutions you suggested (including «’» for «'»).

CK, June 6, 2010 at 2:02:42 PM UTC (edited October 25, 2019 at 8:10:14 AM UTC)

[not needed anymore- removed by CK]

TRANG, June 6, 2010 at 3:11:52 PM UTC

Yes, actually, I've been planning to have a "links" section for a long time.

blay_paul, June 5, 2010 at 5:19:03 PM UTC

Huh, I thought a word counted as '8 characters'. It looks like I've suddenly got much faster. :-P

brauliobezerra, June 5, 2010 at 5:24:53 PM UTC

I counted it as five, since this is what I've seen elsewhere.
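
For reference, the effect of the word-length convention can be worked out with a short sketch. The function below is a hypothetical illustration of the usual words-per-minute formula, not the game's actual code.

```python
def words_per_minute(chars_typed: int, seconds: float, chars_per_word: int = 5) -> float:
    """Typing speed where one 'word' is a fixed number of characters."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (chars_typed / chars_per_word) / (seconds / 60)


print(words_per_minute(200, 60))                    # 40.0
print(words_per_minute(200, 60, chars_per_word=8))  # 25.0
```

The same 200 characters in one minute score 40 words per minute with five-character words but only 25 with eight-character words, which would explain the apparent jump in speed.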

contour, June 5, 2010 at 6:14:50 PM UTC

That's a nice utilization!
It matches the speed I get at http://www.typeonline.co.uk/typingspeed.php pretty closely - shorter sentences go a fair bit faster, probably because I can memorize them before the timer starts.

brauliobezerra, June 6, 2010 at 1:00:55 PM UTC

Typing game is working again. Database server wasn't running after Ubuntu update.

JimBreen, June 7, 2010 at 1:26:17 PM UTC

I was looking at the list: "English sentences in need of correction, confirmation from a native, etc..." and I notice I can no longer remove sentences from it. Some have already been corrected and are now OK, so how do they get removed from that list?

saeb, June 7, 2010 at 1:31:23 PM UTC

I think you need to press on 'edit this list' for the remove buttons to appear...

JimBreen, June 7, 2010 at 1:38:50 PM UTC

Ah. Thanks.