Tatoeba
Wall (7,124 threads)



blay_paul, May 11, 2010 at 11:04:43 AM UTC

Could we have an official decision on using -1 in the meaning field to mean 'not for WWWJDIC' ?

*bump*

TRANG, May 11, 2010 at 11:39:44 PM UTC

As I said in my email, I'm okay with it. It all depends on Jim :)

CK, May 11, 2010 at 11:44:10 AM UTC, edited October 25, 2019 at 8:06:36 AM UTC

[not needed anymore- removed by CK]

blay_paul, May 11, 2010 at 12:19:27 PM UTC

Every language includes a different set of ambiguities, omitted information and conventions. If you attempt to remove these ambiguities, etc. in the sentences being translated then you are going to create bias in the vocabulary used and will also be likely to end up with unnatural sentences.

The typical example for this is pronouns in Japanese. If you start adding あなた to sentences that, when translated, use 'you' in the English then there will be far more あなたs used in those examples than you would ever find in natural Japanese.

JimBreen, May 11, 2010 at 9:22:09 AM UTC

When I went to the "trusted" index edit page for 97078, I found there were two sets of indices! Is this a result of automerge? I can't think it is valid.

blay_paul, May 11, 2010 at 10:45:07 AM UTC

> I found there were two sets of indices!
> Is this a result of automerge?

Probably. It's not the first time that's happened.

> I can't think it is valid.

I'm not sure it would be an easy fix - if you've got two identical sentences with an index each the computer isn't going to know which one is right and which one is spurious. What you should do is delete the index of one of the sentences you make identical in advance - but I obviously didn't remember to do it for that one.

JimBreen, May 11, 2010 at 9:31:07 AM UTC

Something is screwy with a batch of indices. For example, last week there was:
地球は球の形をしている
地球 は 球 乃{の} 形(かたち)[01] を 為る(する){している}

The latest download has it changed to:
地球 は 玉(たま) 乃{の} 形(かたち)[01] を 為る(する){している}

Presumably that 玉(たま) was meant to be 球(たま)?

blay_paul, May 11, 2010 at 10:36:45 AM UTC

> Presumably that 玉(たま) was meant to be 球(たま)?

Actually it was meant to be 玉(たま){球}. (If it was supposed to be 球(きゅう) before then, 'Oops.')

As you know, where there are multiple headwords in WWWJDIC only one is used for the indexing so that there is only one [EX] link.
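
For readers unfamiliar with the index notation used in this thread, here is a minimal Python sketch that splits an index line into its parts. It assumes each space-separated token has the shape headword, optional (reading), optional [sense number], optional {form as it appears in the sentence}, optional trailing ~, which is inferred from the examples quoted here rather than taken from any official parser. The names `TOKEN_RE` and `parse_index_line` are made up for illustration.

```python
import re

# Assumed token shape: headword(reading)[sense]{surface}~
# Every part except the headword is optional.
TOKEN_RE = re.compile(
    r"^(?P<headword>[^(\[{~]+)"
    r"(?:\((?P<reading>[^)]*)\))?"
    r"(?:\[(?P<sense>\d+)\])?"
    r"(?:\{(?P<surface>[^}]*)\})?"
    r"(?P<checked>~)?$"
)


def parse_index_line(line):
    """Return one dict per token; parts that are absent are None."""
    tokens = []
    for raw in line.split():
        match = TOKEN_RE.match(raw)
        if match is None:
            raise ValueError(f"unrecognised index token: {raw!r}")
        tokens.append(match.groupdict())
    return tokens


if __name__ == "__main__":
    line = "地球 は 玉(たま){球} 乃{の} 形(かたち)[01] を 為る(する){している}"
    for token in parse_index_line(line):
        print(token)
```

With this reading, 玉(たま){球} means the dictionary headword is 玉, read たま, while the sentence itself contains 球.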

JimBreen, May 11, 2010 at 1:26:02 PM UTC

Well, I changed them to 球(たま) and 弾(たま). I'll now go and do a global replace with 玉(たま){球} and 玉(たま){弾}.

CK, May 10, 2010 at 4:18:52 PM UTC, edited October 25, 2019 at 8:06:58 AM UTC

[not needed anymore- removed by CK]

Demetrius, May 10, 2010 at 7:08:11 PM UTC

Actually I believe at least some sentences you’ve given don’t have any spelling mistakes. I can’t find any in 2263, 18732, 18867, 20594.

And please leave British forms like ‘decentralised’ intact. I like them. :)

saeb, May 10, 2010 at 8:30:33 PM UTC

I agree with Demetrius. Some of the 'mistakes' are just British spelling differences. But there are some real mistakes in there, like in 2263 where it should say 'sine and cosine' instead of 'sinus and cosinus'.

CK, May 11, 2010 at 12:41:28 AM UTC, edited October 25, 2019 at 8:06:43 AM UTC

[not needed anymore- removed by CK]

blay_paul, May 10, 2010 at 4:23:27 PM UTC

I don't see why not. As long as any corrections are advisory, not automatically made.

sysko, May 10, 2010 at 4:31:45 PM UTC

personnaly when inputing sentence I use a firefox plugin which acts as a spell checker

blay_paul, May 10, 2010 at 4:47:50 PM UTC

Ditto. We can't make that mandatory though, and it doesn't help the ones that have already got through.

Could you get a spell check to make up a list of sentences that are flagged as having spelling mistakes? If you could then it could be split into, say, batches of 500 and handed out to individual users.

One important note - we're again seeing how the 'adoption' system is greatly slowing down attempts to correct sentences on a wide scale. CK's got two or three dozen with '/alternatives' added by a well-meaning user *cough*human600*cough* and he can't fix any of them himself.

I think that there's a case for a few users to be granted higher access rights when it's been shown (and agreed) that they know the system here well and would use them responsibly.
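
The "flag, then split into batches of 500" idea above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: sentences are given as a dict of ID to text, the "spell checker" is a plain word-list lookup, and the names `flag_sentences` and `split_into_batches` are hypothetical. A word list that contains both American and British spellings would avoid flagging forms like 'decentralised'.

```python
import re


def flag_sentences(sentences, known_words):
    """Return the IDs of sentences containing a word not in known_words.

    sentences: dict mapping sentence ID -> sentence text (assumed layout).
    known_words: set of lowercase words accepted as correctly spelled.
    """
    flagged = []
    for sentence_id, text in sorted(sentences.items()):
        words = re.findall(r"[a-z']+", text.lower())
        if any(word not in known_words for word in words):
            flagged.append(sentence_id)
    return flagged


def split_into_batches(ids, size=500):
    """Split a list of IDs into consecutive batches of at most `size`."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [ids[start:start + size] for start in range(0, len(ids), size)]


if __name__ == "__main__":
    known = {"the", "sine", "and", "cosine"}
    sample = {1: "The sine and cosine", 2: "The sinus and cosinus"}
    flagged_ids = flag_sentences(sample, known)
    for number, batch in enumerate(split_into_batches(flagged_ids), start=1):
        print(f"batch {number}: {batch}")
```

The output is advisory only: each batch is a list of IDs for a person to review, and nothing is corrected automatically.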

TRANG, May 12, 2010 at 10:05:30 AM UTC

> One important note - we're again seeing how the 'adoption' system is greatly slowing down attempts to correct sentences on a wide scale.

You feel it's slowing down because the status of moderator hasn't been established yet. Probably also because you were used to having full access to the corpus. I can understand it somehow frustrates you, not to be able to correct a sentence because it already belongs to someone.

But I cannot imagine Tatoeba without this adoption/ownership system. On a larger scale, it would be a terrible mess if everyone could edit anyone's mistakes. Posting comments makes people come back to Tatoeba, it makes them learn from their mistakes, it makes them responsible for what they contribute. To me, it's a very important part of the collaborative aspect.

Of course, some people rather feel annoyed by it, some people will never come back, and it blocks certain sentences. But the problem doesn't come from adoption/ownership, it comes from the way permissions are set. As you pointed out, we need more people with higher access rights, people who can edit everything, and delete sentences (i.e., we need moderators).

sysko has asked me if we could integrate the 'moderator' status last week. I said no because there were too many things to test already. But we can have that for next week (or perhaps even this week, depending on how productive we are).

CK, May 10, 2010 at 4:50:07 PM UTC, edited October 25, 2019 at 8:06:50 AM UTC

[not needed anymore- removed by CK]

blay_paul, May 10, 2010 at 8:17:35 PM UTC

> For one of the admins to do it right now,
> shouldn't take too long.

I estimate 6 hours for the English. Less if they can find a dictionary that includes both American and British spellings.

blay_paul, May 10, 2010 at 4:48:57 PM UTC

> personnaly when inputing sentence I use a
> firefox plugin which acts as a spell checker

But not in that sentence, right? ;-)

CK, May 10, 2010 at 4:00:22 PM UTC, edited October 25, 2019 at 8:07:06 AM UTC

[not needed anymore- removed by CK]

blay_paul, May 10, 2010 at 4:14:48 PM UTC

Personally, I can live without !! but I'm holding out for the occasional !?

blay_paul, May 8, 2010 at 10:46:20 PM UTC

Duplicate removal script

Could this be run soonish? I see that sentences that were duplicates 9 days ago haven't been merged yet.

TRANG, May 9, 2010 at 11:09:10 PM UTC

Done.

blay_paul, May 9, 2010 at 11:21:15 PM UTC

Thanks!

blay_paul, May 9, 2010 at 3:33:33 PM UTC

I've seen that the index data log is working better now - recent changes are showing up in the right order with the actual time changed.

Could deletions be noted in the log as well?

TRANG, May 9, 2010 at 11:07:24 PM UTC

Yes, it is planned, but not soon... I can't guarantee it will be done within the next two months.

blay_paul, May 9, 2010 at 10:48:00 PM UTC

sentences.csv

I'm looking through this file output now. There are a number of spurious '\' symbols and line feeds in it. Possibly there are odd characters in the sentence text?

The suspect sentences are 208 and 5540. One of them is owned so I couldn't try editing it.
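
A quick way to locate records like these is to scan the export for lines that do not look like complete records. The Python sketch below assumes, purely for illustration, that each record is a single line with three tab-separated fields (numeric ID, language code, sentence text); the actual layout of sentences.csv may differ, and `find_suspect_lines` is a hypothetical helper.

```python
def find_suspect_lines(lines, expected_fields=3):
    """Return (line_number, reason) pairs for lines that look damaged.

    Assumes one record per line with tab-separated fields, the first
    of which is a numeric sentence ID.
    """
    suspects = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != expected_fields or not fields[0].isdigit():
            # A stray line feed inside a sentence splits one record in two,
            # so the continuation line has no ID and too few fields.
            suspects.append((lineno, "broken record (stray line feed?)"))
        elif "\\" in fields[-1]:
            suspects.append((lineno, "backslash in sentence text"))
    return suspects


if __name__ == "__main__":
    # Assumed file name and encoding; adjust to the real export.
    with open("sentences.csv", encoding="utf-8") as handle:
        for lineno, reason in find_suspect_lines(handle):
            print(f"line {lineno}: {reason}")
```

Reporting line numbers rather than fixing anything keeps the check read-only, so it can be run by someone who does not own the affected sentences.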

blay_paul, May 9, 2010 at 11:32:19 AM UTC

I looked at just one as a test ...

> 288755 He is delighted at your success. 彼女はあなたの成功を喜んでいます。

As I suspected this was not present in the last version of the Tanaka Corpus _I_ was maintaining. It was either corrected or deleted as a near duplicate. I shudder to think of how many corrections may have been 'lost' in this way. >_<

TRANG, May 9, 2010 at 10:30:10 PM UTC

Hmm, I'm not sure what you mean... I don't think there has been anything lost in Tatoeba.

For the sentence you indicated, it does not mismatch in Tatoeba.
He is delighted at your success.
彼はあなたの成功を喜んでいます。
(cf. http://tatoeba.org/eng/sentences/show/288755)

And I checked other lines from CK's file at random. All those I tried turned out to be safe in Tatoeba; they do match.

blay_paul, May 9, 2010 at 10:39:15 PM UTC

Well if it's not my stuff, and it's not your stuff then it must be CK's stuff where the problem is.

That's fine with me. :-P

CK, May 10, 2010 at 1:28:24 AM UTC, edited October 25, 2019 at 8:07:13 AM UTC

[not needed anymore- removed by CK]

blay_paul, May 10, 2010 at 5:59:33 AM UTC

I just opened that file and it has the line

A: 彼はあなたの成功を喜んでいます。 He is delighted at your success.#ID=288755
B: 彼(かれ) は 貴方(あなた)[01]{あなた} 乃{の} 成功 を 喜ぶ{喜んでいます}

JimBreen, May 10, 2010 at 2:35:53 PM UTC

I forgot to do a download at the weekend. Downloading now....