Wall (one thread)

When will the Tatoeba interface next be updated from the Launchpad translation data? I've noticed that some of the more obvious mistakes are still around although I corrected them in Launchpad some time ago.

Next time should be tomorrow, when we update Tatoeba for bug fixes and small changes.
But generally speaking, the interface translations can be updated any time... Since it's not entirely automated, it's not regular... You have to remind me to do it ^^'

I see the interface update has gone through. I think it looks a lot better now.

Notes on Tatoeba interface translation.
* Launchpad is no longer showing the full path to the source code. It doesn't include the website any more.
* Translation item 15
\controllers\user_controller.php:258
The original English is incorrect. It should be "Error" not "Erreur"
* Translation item 505
The original English has some minor errors.
"We really want to thanks" -> "We really want to thank"
"us a much more complete data files" -> "us much more complete data files"
"wouldn't have a so much complete IPA" -> "wouldn't have an IPA so complete."
Also, possibly "IPA table" rather than just IPA? (Not sure)

[not needed anymore- removed by CK]

So, this is the link that gets passed on from good contributor to good contributor:
http://blog.tatoeba.org/2010/02...n-tatoeba.html
I really need to put it somewhere so that more people will be inclined to read it. Anyway, as Paul explained, you should not change a sentence if it is valid.
But you have to know that only 'trusted users' can link/unlink sentences at the moment. The 'sentence_annotations' page can also only be accessed by trusted users. This is because these features require a deeper understanding of Tatoeba and are not dummy-safe.
If you are interested in becoming a trusted user, you can just ask me :)

this is the link that get passed on from a crazy cult member to another:
[tatoeba's bible]
I really need to put it as the site's background image.
But you have to know that only 'elite cult members' can perform rituals at the moment.
If you are interested in becoming one, just ask to be initiated :P
-------------------------------------------
btw, are slangy sentences allowed on tatoeba...u know gaming slang, 1337 speak, gang slang...
PS watch out for the lame battle b/w arabic n portuguese :P

which battle 0:-)?

Who gets more sentences. :)

Oh, Demetrius, I didn't know that you battle saeb ;).

oh, MUIRIEL, let's pretend that you don't keep on adding portuguese until it's one more than arabic xD....I'm smelling a conspiracy...Let me guess, Trang is in on this too right? You guys are so on!

You're lucky I'm full of homework in these weeks :P
But I guess Portuguese and Arabic will both lose to Dorenda.

Well, I still have over 600 sentences to go before I catch up with Portuguese, and even more for Arabic, so if you work hard, there is still a chance for you to stay ahead of Dutch. :P

Why so paranoiac, saeb ;)?
Great work, brauliobezerra =)!

I don’t battle anybody. :) I guess I have no chance of getting Russian higher than Dutch while Dorenda’s around. :)

Hmm... It seems I can declare war on Esperanto! :)))

> btw, are slangy sentences allowed on tatoeba...u know gaming slang, 1337 speak, gang slang...
They are allowed, but if you do add this kind of sentence, it would be useful if you put them in a list. Also, we prefer to avoid sentences that are "not safe for kids" until we have a way to tag them and filter them out (so that people who use our content for educational purposes can stick to the more "decent" type of content).
> PS watch out for the lame battle b/w arabic n portuguese :P
You guys make me laugh (in a good way :P) It's actually entertaining to watch x)
PS: I had nothing to do with it. Muiriel decided to pick on you on her own!

*Corrected typo*
This is the official method, AFAIK.
1. Do not change either the Japanese or the English (provided that both Japanese and English are valid sentences).
2. Add a new sentence, as a translation, of either the Japanese or English.*
* For practical reasons it is best to add a new English translation of the Japanese as the Japanese needs index information adding. Also it is preferable to add sentences in your native, not second, language.
3. After adding the new translation, unlink the old translation. You can do this by 'owning' either the sentence being translated (Japanese) or the incorrect old translation (English), refreshing the page, and then clicking the appropriate 'scissor icon'.
4. The 'meaning' field of the Japanese index data will need to be changed to the ID of the new sentence. You can do that from this page
http://tatoeba.org/sentence_annotations/
although it may be easier just to leave a note / PM for me or Jim to do so.

Linkage.
OK, I'm seeing a lot of cases where two sentences should be linked (or unlinked) but both are owned. Could we allow linking/unlinking between sentences even if we aren't the owners? It's slowing things down, especially as you can't count on people staying in Tatoeba.

Yes, but before I give more power to trusted users, I want to display the "latest links" somewhere (in the same way there's a page where you can see the latest sentences added/edited/deleted).

but isn't that too much power? technically trusted users can then 'disappear' a sentence, right?

They don't actually go anywhere, and can still be found by a whole bunch of methods.
Some sentences need to disappear, anyway ;-)
See
http://tatoeba.org/jpn/sentences/show/383895
It's linked to
http://tatoeba.org/jpn/sentences/show/336221
(which should be deleted)
It isn't directly linked to
http://tatoeba.org/jpn/sentences/show/383894
(but it should be)
At the moment I can't link it to the sentence it should be linked to, nor can I unlink it from the sentence it should be unlinked from.

I can see that you need it :), but generally speaking, if it gets implemented for trusted users...well it'll be possible for s.o. to unlink a sentence and cause it to be 'left behind' without the consent of its owner...which is pretty close to deleting a sentence imo.

Well, dude*, we're either trusted or we're not.
* Hopefully correct.

yet there aren't any real criteria for getting trusted or not mate*.
* Hopefully correct.

[not needed anymore- removed by CK]

[not needed anymore- removed by CK]

Actually both those cases, kimi and boku, are notorious for being very often used by the opposite gender to that expected by textbooks.

Those [M] and [F] refer to the Japanese.
I added them years ago at someone's suggestion as it seemed like a good idea at the time 8-)}. Ideally they should be part of metadata associated with the Japanese. It wouldn't break my heart if they were removed from the English sentences entirely.

[not needed anymore- removed by CK]

I'd forgotten they were originally on the Japanese sentences. All the more reason to remove them from the English ones.
I think moving them back to the Japanese is more than a global replacement. I think it would be better to remove them totally.

> I think moving them back to the Japanese is more than
> a global replacement. I think it would be better to
> remove them totally.
I suggest holding your horses on the second part, at
least until Tatoeba has a meta data handling system up.

JPN INDICES
I've sent* in a UTF-8 file (with BOM, unfortunately) containing the updated index and meaning field information I've been working on. Could Sysko or Trang update Tatoeba from it and post here when it's done?
* To the team@tatoeba.fr address.

Yes, I would like to know too. I have a change or two to get in before Saturday's dump.

I've just read an email from Trang saying it's done. Hope it all went well (fingers metaphorically crossed - would be really crossed but that makes typing difficult).

The new Japanese readings include both the kanji and their reading below the sentence. Is this an interim solution or would people be willing to reconsider it?
My view is that it's a little redundant and it could be better to either leave the kanji out or add the kanji readings as furigana.
I wasn't able to set MeCab up on this machine but I assume it isn't too difficult to format the output to create ruby code.
Speaking of formatting the MeCab output, can one disable the "readings" on punctuation and long vowel marks (see e.g. Sentence nº126252 and nº75484) and parse numbers together (see e.g. Sentence nº115312)?

[not needed anymore- removed by CK]

There's no need to use the semantically incorrect tt tag. There is already ruby character markup available and stylesheets to render them properly in modern browsers that can't handle them by default. For general inline and block level styling there are the span and div tags.
You have a good point that the redundancy in the kanji is useful though maybe not terribly aesthetically pleasing. I suppose this may be the best solution until we have furigana implemented.

[not needed anymore- removed by CK]

I sort of figured that. What I wanted to point out was that the semantically correct equivalent of your proposal would be to use a span inline element:
東京 <span class="reading">[とうきょう]</span>
with the style definition:
.reading { whatever-property:value; }
One could actually drop the brackets:
東京 <span class="reading">とうきょう</span>
and add them in CSS with something like:
.reading:before {content: "["}
.reading:after {content: "]"}
But we can just as well wait for furigana...
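
For anyone curious what the furigana route might look like, here is a minimal sketch in Python, assuming the readings arrive in the current "surface[reading]" form. The function name to_ruby and the regular expression are made up for illustration and are not part of Tatoeba's code. It wraps each annotated word in ruby markup, with rp elements so that browsers without ruby support fall back to the bracketed display.

```python
import re
from html import escape

# Matches "surface[reading]" pairs such as 東京[とうきょう].
# Assumption: the surface is a run of characters without spaces or brackets.
ANNOTATED = re.compile(r"([^\s\[\]]+)\[([^\[\]]+)\]")


def to_ruby(text: str) -> str:
    """Turn 'surface[reading]' pairs into <ruby> markup with <rp> fallbacks."""

    def repl(match: re.Match) -> str:
        surface, reading = match.group(1), match.group(2)
        return f"<ruby>{surface}<rp>[</rp><rt>{reading}</rt><rp>]</rp></ruby>"

    # Escape HTML special characters first; escape() leaves brackets untouched,
    # so the pattern still matches afterwards.
    return ANNOTATED.sub(repl, escape(text))


if __name__ == "__main__":
    print(to_ruby("東京[とうきょう] に 行く"))
```

Words without a bracketed reading pass through unchanged, so kana-only tokens stay as plain text.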

I like the new way too (not that I read it often.) MUCH better than the old romanization.

It's temporary. I'm aware that it is a bit redundant.
We actually have a tool that converts into furigana. It works at least on IE8, Firefox 3.5, Chrome 4.1 and Opera 10:
http://tatoeba.org/eng/tools/ro...&type=furigana
So we can display furigana but this is not in our priorities at the moment, so you will probably have to wait a couple of months before we get it done.

Good to hear that it's only temporary. I was aware of the furigana tool. Thanks for your work on this issue.
Any idea how simple it is to fix the readings for the punctuations, long vowel marks and numbers?

For the punctuations, it should not be too difficult.
For the long vowel marks, if you are talking about the romaji, we used to convert it into a hyphen. So we will leave it like it is now.
For the numbers, it depends what kind of fix you want...

Punctuations: Great.
Long vowel marks: Sentence nº75484 shows what I mean. The long vowel mark is repeated in the "reading" like the punctuation: "ー" becomes "ー[ー]".
Numbers: It would be good if "10時" got the reading "じゅうじ" rather than "1[いち] 0[ぜろ] 時[じ]" (see e.g. sentence nº115312).
Similarly, 10日 should get "とおか" and 10分 should get "じっぷん" (or "じゅっぷん" though sometimes it's "じゅうぶん" ... *sigh*).
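
To make the three requests concrete, here is a rough Python sketch of a post-processing step. It assumes the tokenizer output is available as (surface, reading) pairs; the OVERRIDES table, the function names and the token format are all hypothetical, and the real fix may well belong in the dictionary instead. Readings are shown only for tokens that contain kanji or digits, and a run of digits is joined with the following counter when the combination has a known reading.

```python
import re

# CJK ideographs plus the iteration mark 々.
KANJI = re.compile(r"[\u4e00-\u9fff\u3005]")

# Hypothetical hand-made table of number + counter readings.
OVERRIDES = {"10時": "じゅうじ", "10日": "とおか", "10分": "じっぷん"}


def merge_numbers(tokens):
    """Join a run of digit tokens with the next token if OVERRIDES knows the result."""
    out = []
    i = 0
    while i < len(tokens):
        j = i
        while j < len(tokens) and tokens[j][0].isdigit():
            j += 1
        if j > i:
            if j < len(tokens):
                surface = "".join(tok[0] for tok in tokens[i:j + 1])
                if surface in OVERRIDES:
                    out.append((surface, OVERRIDES[surface]))
                    i = j + 1
                    continue
            # No known reading: keep the digit tokens as they were.
            out.extend(tokens[i:j])
            i = j
        else:
            out.append(tokens[i])
            i += 1
    return out


def annotate(tokens):
    """Show a bracketed reading only for tokens containing kanji or digits."""
    parts = []
    for surface, reading in merge_numbers(tokens):
        needs_reading = KANJI.search(surface) or any(ch.isdigit() for ch in surface)
        parts.append(f"{surface}[{reading}]" if needs_reading else surface)
    return " ".join(parts)


if __name__ == "__main__":
    sample = [("1", "いち"), ("0", "ぜろ"), ("時", "じ"), ("に", "に"), ("ー", "ー"), ("。", "。")]
    print(annotate(sample))
```

A lookup table like this only covers the combinations someone has entered by hand, which is why the irregular cases (とおか, じっぷん) are listed explicitly.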

Re: Numbers: It would be good if "10時" got the reading "じゅうじ" rather than "1[いち] 0[ぜろ] 時[じ]" (see e.g. sentence nº115312).
The fix for that is to add to the IPADIC files used by MeCab. They look something like:
蹴込,1285,1285,5622,名詞,一般,*,*,*,*,蹴込,ケコミ,ケコミ
in their raw form. Adding 10時 ジュウジ is possible. It would be best to become familiar with the structure and weightings in those files before embarking on it.
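
As a small aid for reading those lines, here is a Python sketch that splits a raw entry into named fields. The field names are my own labels for the usual IPADIC column order (surface form, left and right context IDs, cost, part-of-speech columns, conjugation columns, base form, reading, pronunciation) and should be checked against the dictionary documentation before relying on them.

```python
import csv

# Assumed labels for the 13 columns of a raw IPADIC entry.
FIELDS = [
    "surface", "left_id", "right_id", "cost",
    "pos", "pos_sub1", "pos_sub2", "pos_sub3",
    "conj_type", "conj_form",
    "base_form", "reading", "pronunciation",
]


def parse_entry(line: str) -> dict:
    """Split one raw dictionary line into a dict keyed by the labels above."""
    row = next(csv.reader([line]))
    return dict(zip(FIELDS, row))


if __name__ == "__main__":
    entry = parse_entry("蹴込,1285,1285,5622,名詞,一般,*,*,*,*,蹴込,ケコミ,ケコミ")
    print(entry["surface"], entry["reading"], entry["cost"])
```

The context IDs and the cost are the "structure and weightings" mentioned above; a new entry needs sensible values for them, which this sketch does not attempt to choose.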

I'd like to see furigana. However, when I last checked a few years ago, support for the RUBY tag was rather limited and furigana displayed rather poorly in some browsers.
The 'kanji + readings' display doesn't really bother me, but then I rarely look at that line anyway.

Yes, it appears IE is the only browser with even limited ruby character support. The markup is, however, designed to fall back on something similar to what we currently have and the ruby characters can be implemented on modern browsers using CSS.
For more, see: http://en.wikipedia.org/wiki/Ruby_character

Quick fix idea.
This should be a relatively easy change, and it should make my (and Jim's) life easier.
1. Setting the 'meaning' field for a Japanese sentence automatically links that sentence to the English sentence identified.
2. On a standard sentence display of a Japanese sentence the link to the sentence identified in the meaning field looks different to the rest. (i.e. Red arrow instead of green, or something).
I also suggest that a meaning field entry of zero (0) could be used to identify Japanese sentences that are intentionally not to be used with WWWJDIC.

Duplicate removal script.
I don't know how the script works exactly, but I think it may be missing a step.
Suppose we have
100000 Hello.
100001 こんにちは。
100002 Hi.
100001 is linked to 100000
100001 has the meaning field of 100000
Now, suppose someone decides that 'Hello' and 'Hi' are close enough to not need both.
100000 Hello.
100001 こんにちは。
100002 Hi. ---> Hello.
Then suppose the script removes 100000.
100001 こんにちは。
100002 Hello.
Is 100001 still linked to 100000? It should be linked to the duplicate 100002 instead.
Does 100001 still have the meaning field of 100000? It should have the meaning field of 100002 instead.
In other words, if Sentence A is removed as a duplicate of Sentence B, then all the links that pointed to Sentence A should now point to Sentence B instead.
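
The rule can be sketched in a few lines of Python. This is only an illustration of the behaviour being asked for, not the actual script: the data structures (a set of ID pairs for links, a dict for the meaning fields) and the function name merge_duplicate are assumptions.

```python
def merge_duplicate(links, meanings, removed_id, kept_id):
    """Repoint links and meaning fields from removed_id to kept_id.

    links:    set of (id_a, id_b) pairs, treated as undirected
    meanings: dict mapping a Japanese sentence ID to its 'meaning' sentence ID
    Returns new (links, meanings); the inputs are left unmodified.
    """
    new_links = set()
    for a, b in links:
        a = kept_id if a == removed_id else a
        b = kept_id if b == removed_id else b
        if a != b:  # drop a link that would point from a sentence to itself
            new_links.add((min(a, b), max(a, b)))

    new_meanings = {
        jpn_id: (kept_id if eng_id == removed_id else eng_id)
        for jpn_id, eng_id in meanings.items()
        if jpn_id != removed_id
    }
    return new_links, new_meanings


if __name__ == "__main__":
    links = {(100000, 100001)}
    meanings = {100001: 100000}
    print(merge_duplicate(links, meanings, removed_id=100000, kept_id=100002))
```

With the example above, 100001 ends up linked to 100002 and its meaning field becomes 100002.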

The remove-duplicates script does the following:
it identifies all the sentences that have both the same language and the same text,
then it keeps the oldest sentence that is owned by someone (or the oldest one if none of the duplicates belongs to anyone) and relinks everything that pointed to the duplicates to this one
(comments, translations, lists, etc.),
and finally it removes the duplicates and keeps only one.
So the script will not produce any broken references to removed sentences.
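
The selection step described in this post could look roughly like the Python sketch below. It is not the real script: the tuple layout, the function name plan_merges, and the assumption that a lower ID means an older sentence are all mine.

```python
from collections import defaultdict


def plan_merges(sentences):
    """Decide which duplicates to remove and which sentence each merges into.

    sentences: iterable of (sentence_id, lang, text, owner) with owner None if unowned.
    Returns {removed_id: kept_id}. Assumes a lower ID means an older sentence.
    """
    groups = defaultdict(list)
    for sentence_id, lang, text, owner in sentences:
        groups[(lang, text)].append((sentence_id, owner))

    plan = {}
    for members in groups.values():
        if len(members) < 2:
            continue  # not a duplicate
        owned = [sid for sid, owner in members if owner is not None]
        # Prefer the oldest owned sentence; fall back to the oldest overall.
        keeper = min(owned) if owned else min(sid for sid, _ in members)
        for sid, _ in members:
            if sid != keeper:
                plan[sid] = keeper
    return plan


if __name__ == "__main__":
    data = [
        (1, "eng", "I love you", None),
        (5, "eng", "I love you", "alice"),
        (9, "eng", "I love you", "bob"),
        (2, "fra", "Je t'aime", None),
    ]
    print(plan_merges(data))
```

Each entry in the returned plan says where comments, translations and list memberships of the removed sentence would need to be repointed before it is deleted.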

> identify all the sentence which have both the same language and the same text
So it also merges duplicates that are not linked whatsoever?

Yep, that way even if newcomers add
I love you
and translate it,
as "I love you" already exists, the script will delete the new "I love you" and link the translation to the old "I love you" (or also remove it, if the translation already exists too)

> so the script will not produce any broken reference to
> a removed sentences
There are, however, some broken references being produced. It's not clear how though.
236727 あなたには姉妹がいますか。
was linked to 71123, which now no longer exists.
69566 Do you have any sisters?
does exist and was indirectly linked from 236727.
I don't know when 71123 was removed, why it was removed, or how it was removed, but something obviously went wrong somewhere. (It was one of the \N records last week - so it obviously isn't a recent deletion)
Hopefully these broken links are left over from earlier times and won't be reoccurring.

ok, at least the remove-duplicates script will not produce any more broken links

I just gotta say I love the modified japanese readings.
そこで 彼女[かのじょ] に 会お[あお] う と は 思い[おもい] も かけ なかっ た 。
I think I'm getting passive practice just being on tatoeba :)

who shall I thank for this delicacy?

Biptaste but he's not really often here ^^

btw sysko, any ideas on how to compile eclectus in a windows (7) environment...the wiki doesn't help at all and I have no experience with python (but some with c,c++,java,vb)

Hi saeb, as sysko said, the current dependency on KDE (more specifically, pykde) makes it impossible to run on Windows. I already ported the whole lot to Qt only (read: it runs on Windows :) but haven't committed anything to the repository yet. I'll be at it pretty soon, so why don't you send me an email? I'll be happy to have a beta-tester.
I'm OK with ppl emailing me, in fact I hardly get any mails asking for assistance but I need to visit walls on other projects to help out :)

thx a lot for the reply :)
beta-tester? count me in! I'll give it a go whenever it's up n ready ^^

Oh saeb, saeb. "Windows" is a word that should NEVER, EVER be said in front of sysko... But you didn't know so he will forgive you.
He won't help you to compile eclectus on Windows Seven. However, he gladly will help you install a Linux distribution and compile eclectus on it.

linux...you guys are gonna make me change my major soon enough :PP...must..take a break..from tatoeba

You know, you don't need to major in Computer Science to use Linux ^^
Also, how could you ever take a break from Tatoeba knowing that Portuguese is ahead of Arabic? :O
This is not the way to go. You have to fight harder!

5 hrs integrated exams are not cool at all I swear xD. Not even spanish will be safe once I get my break, mark my words :PP

python is an interpreted language so you don't need to compile it,
but i think it will not work for the moment on any OS other than linux, as eclectus still has some dependencies on KDE :( . but cburgmer knows better than me, since he's the author of eclectus :p

>Oh saeb, saeb. "Windows" is a word that should NEVER, EVER be said in front of sysko
oh somebody shoot me (apologies to all the cute babies who died in this incident :P)
>However, he gladly will help you install a Linux distribution and compile eclectus on it.
is that right sysko? anytime when u're ready (I just hope this doesn't take the whole weekend :P)

> is that right sysko? anytime when u're ready (I just hope this doesn't take the whole weekend :P)
I forgot to mention, "in a very distant future" :P Not that I want to prevent you from using eclectus saeb, but we (including sysko) actually have a lot of work at the moment :)
As sysko said, you can always contact the author of eclectus if you really want to use it (he's in Tatoeba under the username of cburgmer).
Also, note that eclectus is not something you compile, since Python is an interpreted language (I hadn't paid attention that it was in Python).
PS: If you are not a major in Computer Science, and do not know what the difference is between compiled and interpreted, Google is your friend :)
=> http://www.google.com/search?so...d+and+compiled

I know..."use common sense", "read the manual", "do your research", "less talk more work". got it :) *opens anatomy book*
[I know you guys are very busy :P...I just can't stop messing with you frenchmen...always so serious (oh I'm so getting slaps on the face in my inbox aren't I?)]

would be really nice if you could pass it on ^^ , tell'm he got fans on tatoeba :D