I have removed the links to the English sentences.
[#281504] Japanese people take three meals a day.
[#418502] Japanese eat three meals a day.
@JimBreen
This sentence has been unlinked from the English translations but it still appears on WWWJDIC. What's the correct way to handle this?
Two comments:
- the linking in the example sentence pairs is not tied to the linking in Tatoeba. To delink in the example pairs you need to go to the annotations page for the Japanese sentence and zero out the "meaning" sentence.
- In some cases a replacement English sentence is needed to keep the pair going. It can be added as a new translation and then added to the pairing via the annotations page.
I've added a new, rather literal (直訳) translation.
Also, can you please notify me when English sentences are delinked? Otherwise I won't know that the main Tatoeba set and the examples collection are not the same. Usually I'll add another English sentence.
@JimBreen
I would suggest switching to using a program that uses the weekly exported data to automatically check all Japanese/English sentences from Tanaka Corpus and monitor their changes, deletions, and unlinking. It's more reliable and less burdensome to the contributors.
Anyway, this Japanese sentence is unnatural. I'd recommend dropping this. I'll unlink the English.
It would be better to automatically drop sentences rated Unsure or Not OK.
Note that if you need a "three meals a day" example sentence, we have several. At least one has a Japanese translation.
https://tatoeba.org/en/sentence...eals+a+day&to=
Thanks. I have removed this sentence from the examples set.
I use the weekly "wwwjdic.csv" file from https://downloads.tatoeba.org/exports and check it for changes. Most weeks around 20-30 sentences have changed. Last week there were 85!
At present I don't check for deleted and unlinked sentences, but I can add that.
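A weekly check like the one described here could be sketched as follows. This is only an illustration, not the actual WWWJDIC processing: it assumes each snapshot is tab-separated text keyed by its first column, and `load_rows` and `diff_snapshots` are hypothetical helper names.

```python
import csv
import io


def load_rows(text, key_col=0, delimiter="\t"):
    """Map each row's key column to the full row (layout is an assumption)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quoting=csv.QUOTE_NONE)
    return {row[key_col]: row for row in reader if row}


def diff_snapshots(old, new):
    """Compare two snapshots and return (changed, removed, added) keys."""
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    removed = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())
    return changed, removed, added
```

Comparing last week's file with this week's then gives the edited rows in `changed` and the rows that disappeared in `removed`, which would cover deletions as well as edits.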
Is there a download file indicating Unsure/Not OK status? (@CK)
I don't want to automatically drop them as some are being used as "priority" examples for specific Japanese terms, and I need either to specify another sentence using the term, or arrange for a new sentence to be added. This has to be done manually.
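One way to keep that manual step while still automating the rest is to split flagged sentences into two groups. The sketch below assumes the flagged IDs and the "priority" IDs are already available as collections of integers; `triage` is a hypothetical name.

```python
def triage(flagged_ids, priority_ids):
    """Split flagged sentence IDs into those needing manual review
    (used as priority examples) and those that could be dropped."""
    flagged = set(flagged_ids)
    priority = set(priority_ids)
    needs_review = sorted(flagged & priority)
    safe_to_drop = sorted(flagged - priority)
    return needs_review, safe_to_drop
```

Only the `needs_review` list would then require finding or requesting a replacement sentence by hand.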
Here are all the weekly files.
https://downloads.tatoeba.org/exports/
The file with ratings is called...
users_sentences.csv
The 2nd field is the sentence ID
The 3rd field is the rating.
-1 = Not OK
0 = Unsure
1 = OK
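As a rough sketch of how those fields could be used, the function below collects every sentence ID that has at least one Unsure or Not OK rating. It assumes the file is tab-separated and reads only the 2nd and 3rd fields as described above; the other fields are ignored, and `flagged_sentences` is a hypothetical name.

```python
import csv
import io
from collections import defaultdict

RATING_LABELS = {-1: "Not OK", 0: "Unsure", 1: "OK"}


def flagged_sentences(text, delimiter="\t"):
    """Return {sentence_id: set of labels} for ratings of -1 or 0."""
    flagged = defaultdict(set)
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip short or malformed rows
        try:
            sentence_id = int(row[1])  # 2nd field: sentence ID
            rating = int(row[2])       # 3rd field: rating
        except ValueError:
            continue  # skip rows with non-numeric values
        if rating in (-1, 0):
            flagged[sentence_id].add(RATING_LABELS[rating])
    return dict(flagged)
```

The result can be intersected with the IDs in the examples set to see which example sentences have been questioned.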
@KK_kaku_
>It would be better to automatically drop sentences rated Unsure or Not OK.
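You mean dropping them from the Tatoeba database, right? It would be nice to have a feature like that. It might be difficult to implement, though. I don't think the system could tell whether the person who gave the Not OK rating is a trustworthy user.
It would be fine if it were people who take this seriously, like us (I usually write flippant comments, but I do take this quite seriously). But if someone showed up on Tatoeba one day saying "I'm a native speaker" and handed out Not OK ratings half for fun, that would hurt. Still, it would be nice to have a feature like that.
You could also submit a request here ↓↓ or on the message board.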
https://github.com/Tatoeba/tatoeba2/issues
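(I can only write in Japanese, so I can't file bug reports or enhancement requests myself.)
By the way, to change the subject: there have been requests and questions along the lines of "please rescue this Japanese sentence" and "what is the most natural way to say this in Japanese?", and the more I think about it, the less sure I am. What Japanese wording do you think would work?
I'm told the Spanish sentence has no subject and reads something like this: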
In Japan {they, we, you} eat three meals a day.
So, using "we", which seems the easiest to translate,
- In Japan, we eat three meals a day.
- 日本では、1日3回食事を摂ります。
- 日本では、1日3回食事をしています。
- 日本では、1日3食食べてるよ。
would something like these get an OK? I was hoping to get KK_kaku_さん's OK.
I'd appreciate it if you could bear with me.
Jim, if a sentence is unlinked, is it possible to pick an alternative Japanese sentence (from the list of existing translations) to be part of the JMdict examples set? The Tanaka Corpus sentences that KK_kaku_さん has been unlinking appear to be mistranslations or awkward translations of the original English sentences, and I don't think it's a good idea to write a new English translation for a poor Japanese sentence.
If there are no existing alternative sentences, can I suggest that a new one is added (at the time of unlinking), so that the English sentence isn't left without a Japanese translation?
> You mean dropping them from the Tatoeba database, right?
I think KK_kaku_さん is referring to the examples set used with JMdict.
>I think KK_kaku_さん is referring to the examples set used with JMdict.
I see. Thank you.
> I don't think it's a good idea to write a new English translation for a poor Japanese sentence.
+1
While our native Japanese speakers haven't yet had time to go through all the Tanaka Corpus sentences, they have "adopted" a number of them, so you can likely trust those.
It's likely a good idea not to add translations to Japanese sentences that are "unowned." One step further would be to also ignore all the Japanese sentences not owned by native Japanese speakers if there is a similar example that can be used that is owned by a native Japanese speaker.
@CK
>It's likely a good idea to not add translations to Japanese sentences that are "unowned." One step further, would be to also ignore all the Japanese sentences not owned by native Japanese speakers if there is a similar example that can be used that is owned by a native Japanese speaker.
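Is it really OK for a Tatoeba admin to say something like that in a public place?
>It might be difficult to implement, though. I don't think the system could tell whether the person who gave the Not OK rating is a trustworthy user.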
That's definitely something to consider. You guys have had several cases of sentences marked wrong, when they're not actually wrong.
One option would be to let only advanced contributors rate sentences, so that ratings are more likely to come from someone with good judgment. But that would make new contributors feel more powerless by taking a function away from them, when they already lack the ability to tag a sentence as needing a fix. I'm not sure it's worth it.
Would it be possible to add the name of the user who added the rating into the data files? That way we can check the profile and see for ourselves whether the rating is trustworthy?
>Is it really OK for a Tatoeba admin to say something like that in a public place?
We know his opinions already, and he won't change them. Do I fully agree with them? No. But he's doing it with the intention of making Tatoeba data useful for others.
@CK There are another 9 "three meals a day" sentence pairs, so we don't really need more.
Thanks for the details of extracting the Not OK/Unsure info. I'll dig into that eventually. I have quite a backlog - there are 100+ sentences that have been merged and now have duplicate indices, and during the last week 105 sentences were edited resulting in non-matching indices.
@RobinS I plan to add a check for unlinked sentences to the processing of the weekly download. It will take a little time. At this stage I don't have a measure of how significant the issue is.
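A check of that kind might look like the sketch below. It assumes the example pairs and the current Tatoeba links are both available as (sentence ID, sentence ID) tuples, for instance parsed from a weekly export; `find_unlinked` is a hypothetical name, and link direction is ignored.

```python
def find_unlinked(example_pairs, links):
    """Return the example pairs that no longer appear among the links.

    A pair (a, b) counts as linked if either (a, b) or (b, a) is present.
    """
    link_set = {frozenset(pair) for pair in links}
    return [pair for pair in example_pairs
            if frozenset(pair) not in link_set]
```

Counting the length of the returned list each week would also give a first measure of how significant the issue is.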
License: CC BY 2.0 FR
Logs
We cannot determine yet whether this sentence was initially derived from translation or not.
linked by an unknown member, date unknown
added by an unknown member, date unknown
linked by tatoerique, December 12, 2009
linked by CK, October 24, 2011
unlinked by KK_kaku_, February 15, 2023
unlinked by KK_kaku_, February 15, 2023
linked by JimBreen, March 2, 2023
unlinked by KK_kaku_, March 3, 2023
edited by Pfirsichbaeumchen, March 3, 2023
linked by Pfirsichbaeumchen, March 3, 2023
linked by Pfirsichbaeumchen, March 3, 2023
linked by Pfirsichbaeumchen, March 3, 2023
linked by Pfirsichbaeumchen, March 3, 2023
linked by Pfirsichbaeumchen, March 4, 2023
linked by Pfirsichbaeumchen, March 4, 2023