Duplicate removal script.
I don't know how the script works exactly, but I think it may be missing a step.
Suppose we have
100000 Hello.
100001 こんにちは。
100002 Hi.
100001 is linked to 100000
100001 has the meaning field of 100000
Now, suppose someone decides that 'Hello' and 'Hi' are close enough to not need both.
100000 Hello.
100001 こんにちは。
100002 Hi. ---> Hello.
Then suppose the script removes 100000.
100001 こんにちは。
100002 Hello.
Is 100001 still linked to 100000? It should be linked to the duplicate 100002 instead.
Does 100001 still have the meaning field of 100000? It should have the meaning field of 100002 instead.
In other words, if Sentence A is removed as a duplicate of Sentence B, then all the links that pointed to Sentence A should now point to Sentence B instead.
The duplicate removal script does the following:
It identifies all the sentences that have both the same language and the same text.
It then keeps the oldest sentence that is owned by someone (or simply the oldest one if none of the duplicates belongs to anyone) and relinks everything that pointed to the duplicates to this one
(comments, translations, lists, etc.).
Finally, it removes the duplicates and keeps only one sentence.
So the script will not produce any broken references to removed sentences.
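To make the steps above concrete, here is a rough Python sketch of that kind of merge. It is not the actual script: the `Sentence` class, the `merge_duplicates` function, the `eng`/`jpn` language codes, the owner name, and the idea that a lower id means an older sentence are all assumptions made for illustration. It reuses the Hello / こんにちは。 example from the first post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sentence:
    id: int
    lang: str
    text: str
    owner: Optional[str] = None  # None means nobody owns the sentence


def merge_duplicates(sentences, links):
    """Return (kept_sentences, relinked_links).

    sentences: list of Sentence; a lower id is ASSUMED to mean "older".
    links: set of (from_id, to_id) pairs.
    """
    # Step 1: group sentences that share both language and text.
    groups = {}
    for s in sentences:
        groups.setdefault((s.lang, s.text), []).append(s)

    # Step 2: in each group keep the oldest owned sentence,
    # or the oldest one if none of them is owned.
    replacement = {}  # sentence id -> id of the sentence that is kept
    for group in groups.values():
        keeper = min(group, key=lambda s: (s.owner is None, s.id))
        for s in group:
            replacement[s.id] = keeper.id

    # Step 3: repoint every link at the kept sentence. Using a set also
    # collapses links that become identical after the merge.
    relinked = set()
    for a, b in links:
        a, b = replacement.get(a, a), replacement.get(b, b)
        if a != b:  # skip self-links created by the merge
            relinked.add((a, b))

    # Step 4: drop the duplicates themselves.
    kept = [s for s in sentences if replacement[s.id] == s.id]
    return kept, relinked


# Example from the thread, after 100002 was edited to "Hello."
# The owner "alice" is hypothetical.
sentences = [
    Sentence(100000, "eng", "Hello."),
    Sentence(100001, "jpn", "こんにちは。"),
    Sentence(100002, "eng", "Hello.", owner="alice"),
]
links = {(100001, 100000), (100000, 100001)}

kept, relinked = merge_duplicates(sentences, links)
```

In this sketch 100000 is removed because 100002 is the owned one, and both links involving 100000 now point at 100002, so 100001 is not left with a reference to a sentence that no longer exists.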
> So the script will not produce any broken references to
> removed sentences.
There are, however, some broken references being produced. It's not clear how, though.
236727 あなたには姉妹がいますか。
was linked to 71123, which now no longer exists.
69566 Do you have any sisters?
does exist and was indirectly linked from 236727.
I don't know when 71123 was removed, why it was removed, or how it was removed, but something obviously went wrong somewhere. (It was one of the \N records last week - so it obviously isn't a recent deletion)
Hopefully these broken links are left over from earlier times and won't be reoccurring.
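Leftover broken links like this one could be found with a simple check along these lines. This is only a sketch: `find_broken_links` is a made-up name, and it assumes the links are available as (from_id, to_id) pairs next to a list of the sentence ids that still exist.

```python
def find_broken_links(sentence_ids, links):
    """Return the links whose source or target is not an existing sentence."""
    existing = set(sentence_ids)
    return sorted(
        (a, b) for a, b in links if a not in existing or b not in existing
    )


# The case described above: 236727 was linked to 71123, which no longer exists.
sentence_ids = [69566, 236727]
links = {(236727, 71123)}

broken = find_broken_links(sentence_ids, links)
```

Here `broken` contains the link from 236727 to 71123, since 71123 is not among the existing ids.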
OK, at least the duplicate removal script will not produce any more broken links.
> identify all the sentences which have both the same language and the same text
So it also merges duplicates that are not linked whatsoever?
Yep. That way, even if newcomers add
I love you
and translate it,
since "I love you" already exists, the script will delete the new "I love you" and link the translation to the old "I love you" (or also remove the translation, if it already exists too).