Wall (6,722 threads)

[not needed anymore- removed by CK]

1) Separate handling for Meta information in Tatoeba sentences is already in the todo list.
2) [M] and [F] tags were removed from the English sentences and applied to the Japanese in the last update done to the Tanaka Corpus before control was turned over to Tatoeba. Unfortunately recent events suggest the last update didn't make it to Tatoeba. This hasn't been fixed because of 1)
3) When the meta information system is redone, it is planned to re-evaluate the basis of the [M] and [F] tags. 僕 alone will almost certainly not be worth an [M] tag (due to developments in modern Japanese). That said, I would disagree that beginners of Japanese necessarily know that 'boku' and 'kimi' are used in masculine speech.

Spurious line feeds not removed from Index data input.
See the Index data for sentence 101622.

Also, could records with a 'meaning' field of -1 be excluded from the wwwjdic.csv file?

Yes.

Yes, there were a heap of them in the last wwwjdic.csv. There were also two blank lines. The first time that has happened. (Spurious line feeds?)

Also 315382 came through with "\N" as the Japanese and English. (Just an index...)

Actually, there were about 20 with \N for the English, Japanese or both.

I think they related to manual deletions or something. I think they were fixed in Tatoeba shortly after the index data download was updated.

Indeed... I'll trim the input before it gets saved.
Other than that there was another index with an extra new line. I corrected both.
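
For anyone following along, the clean-up discussed in this thread (trimming stray line feeds, leaving out records whose 'meaning' field is -1, and leaving out rows that came through as "\N") could be sketched roughly like this in Python. It is only an illustration: the function names and the dictionary keys `japanese`, `english`, `index` and `meaning` are assumptions made up for the example, not Tatoeba's actual schema or export code.

```python
NULL_MARKER = r"\N"  # how an empty database value showed up in the export


def clean_field(value):
    """Remove line feeds / carriage returns and surrounding whitespace."""
    return value.replace("\r", "").replace("\n", "").strip()


def keep_record(record):
    """Return True if a record should go into the exported file.

    `record` is a hypothetical dict with the keys
    'japanese', 'english', 'index' and 'meaning'.
    """
    if record["meaning"] == -1:
        return False
    if NULL_MARKER in (record["japanese"], record["english"]):
        return False
    return True


def clean_records(records):
    """Trim every text field, then drop the unwanted records."""
    cleaned = []
    for record in records:
        trimmed = dict(record)
        for key in ("japanese", "english", "index"):
            trimmed[key] = clean_field(record[key])
        if keep_record(trimmed):
            cleaned.append(trimmed)
    return cleaned
```

Trimming happens before filtering here, so a field that is "\N" followed by a stray line feed is still recognised and skipped.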

Great - thanks for both of those. They'll make my life easier, once a week. ;-)

Could we have a duplicate removal script run soon too, please?

Okay, it's done.
By the way, is there any reason why you add these "For duplicate removal script" comments?
When the sentences are merged, all the comments of the deleted sentence are moved to the remaining sentence...
If you are posting these comments to keep track, it is best to also indicate the IDs of the sentences that have to be merged, not just that a sentence has to be deleted ^^

> By the way, is there any reason why you add these
> "For duplicate removal script" comments?
Only so that Jim (and other users) can see what I have changed on the basis that it is a 'near duplicate' before the merge happens. Basically it's so people have a chance to complain.
I delete the comments post merge when I come across them (which is quite often, for technical reasons).

I thought you guys checked your Facebook group (OK, I'll shut up :P). Well, there are 2 issues I wanna raise (I know... others already brought them up... hehe, plagiarism):
http://bit.ly/cO4t8E
http://bit.ly/cg3rXJ

A long time ago I proposed making it possible to write something like @saeb in sentence comments, which would then notify that user (through a private message or a dedicated section). That way it would be easier to ask someone for help on a sentence, or to bring someone into a discussion about how to correct a sentence. (By the way, nice pics :p)
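
To make the @saeb idea a bit more concrete, here is a minimal Python sketch of how mentions might be picked out of a comment. The pattern assumes usernames are made of letters, digits and underscores, which is a guess for the sake of the example; how the notification itself would be delivered is left out entirely.

```python
import re

# Assumption: a username is a run of letters, digits and underscores.
# The lookbehind skips things like e-mail addresses ("a@b.com").
MENTION_PATTERN = re.compile(r"(?<!\w)@(\w+)")


def find_mentions(comment):
    """Return the distinct usernames mentioned in a comment, in order."""
    seen = []
    for name in MENTION_PATTERN.findall(comment):
        if name not in seen:
            seen.append(name)
    return seen
```

For example, `find_mentions("@saeb could you check this? cc @JimBreen")` returns `['saeb', 'JimBreen']`.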

BTW, personal messages don't attract much attention. Is it possible to change the design somehow when you have unread ones?

I agree. An email alert that there are messages would be good. And of course one for "@JimBreen" in a comment too.

It's planned for this release (I will try to do both). For users who fear "Tatoeba spam", there is an option in your profile to deactivate email (though we need to make it more fine-grained, so that email notifications can be deactivated for each kind: PMs, comments, etc.).
Along the same lines, Pharamp would like to be warned when a translation is added to one of the sentences she likes (the rest of you, tell us what you think). So once precise filtering is possible, maybe we could offer the option of getting an email when someone translates one of your favorite sentences? (That would give favorites a reason to exist ^^)

Tribute to sysko:
http://bit.ly/aJ13uT

yeah

glad to know you have plans :)

tribute to Trang:
http://bit.ly/9rtoRS

still need to think of one for sysko...and baptiste :P

I am learning Japanese, but I still don't know how to use Tatoeba.org!
Does it all depend on asking others questions!?

I'm not sure what idea you had of Tatoeba, but perhaps reading this will clear things up:
http://blog.tatoeba.org/2009/11...-language.html

OK, I read that, thanks.
So it's a project more than a learning tool!
So far I've found it interesting =)
I will contribute to Tatoeba as much as I can ^^

In fact, I would rather say it's a learning tool: a "learning by example" and "learning by producing" tool, rather than a complete language-learning method.
glad to see you like it :)

Do you plan on having a learning component to Tatoeba?

I would say that asking questions about the meanings of sentences / words included in Tatoeba is a perfectly valid means of learning, and taking part in the community.

Learning component?
Why complicate the project?
I just take the sentences and use them for the 10,000 sentences method. Does a hammer come with the nail? :p

IMHO what would be useful is advanced search capabilities.
And I'm looking forward to a tagging system. I feel it should be immensely useful.

But asking questions does really help. :)
I'm really grateful to blau_paul, for he answers most of mine.

What is the official position on punctuation?
Is ‘’ better than ''?

‘’ is sucky for WWWJDIC. Jim still uses EUC-JP by default for example display and I think it has issues with smart quotes. (Or maybe Jim just hates them ;-)

I've already changed ' to ’ somewhere... -_-
Isn't it a matter of two sed commands anyway?

WWWJDIC can display in UTF8, but files are in EUC-JP internally, and "smart quotes" are not supported there. I don't have a problem if they appear, as I can switch them to regular ones as I convert the wwwjdic.csv file.
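
As a rough illustration of the kind of conversion mentioned here, the following Python sketch swaps smart quotes for regular ones. It is not the script actually used for wwwjdic.csv; the function name is made up, and it covers only the four common curly quote characters.

```python
# Map typographic ("smart") quotes to their plain ASCII equivalents.
SMART_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def straighten_quotes(text):
    """Return `text` with smart quotes replaced by straight quotes."""
    return text.translate(SMART_TO_STRAIGHT)
```

Going the other way (straight to smart) is harder, because the script has to decide whether each quote opens or closes, so that direction is not attempted here.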

zMoo has also been using smart quotes in his sentences.
We don't have a policy on this yet, but chances are we'll end up using smart quotes. For now, you can do whatever. But it's probably simpler for you to use straight quotes. They'll be converted when the time comes.

Minor list glitch.
If you look at the list page it says 'DELETE ME!' has 34 entries, but it actually only has seven. I think that deleting a sentence does not remove it from the 'count' in lists. There are probably lots of 'null entries' against IDs in lists and possibly favourites.

I noticed something similar yesterday with my "Dutch sentences to be translated into any language" list. It now says it has 102 sentences, when actually there are only 20. There aren't over a hundred sentences that are or have been in the list, though, so I don't think it has something to do with deleting sentences.

I have updated the lists counts. We don't know why your list ended up with 102, dorenda... But if it happens again, let us know, and try to give as many details as possible ^^

Yes indeed, deleting sentences doesn't remove them from lists. We know this, but since it's not extremely important, we haven't fixed it yet.
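
For illustration only, recounting a list while ignoring entries that point at deleted sentences might look like the Python sketch below. The function and its arguments are hypothetical and say nothing about how Tatoeba actually stores lists.

```python
def recount_list(list_sentence_ids, existing_sentence_ids):
    """Drop list entries whose sentence no longer exists.

    Both arguments are hypothetical: the IDs stored in one list, and
    the IDs of all sentences that still exist.
    Returns the surviving entries and their count.
    """
    existing = set(existing_sentence_ids)
    kept = [sid for sid in list_sentence_ids if sid in existing]
    return kept, len(kept)
```

With a list holding IDs 1 to 4 and only sentences 2, 4 and 5 still existing, `recount_list([1, 2, 3, 4], [2, 4, 5])` gives `([2, 4], 2)`.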

There's a white? (invisible?) flag in the lang stats that has 4 sentences and gives no description when I hover over it.

white flag ? wha ..what? where ? I see nothing ? Are you sure Saeb ? :angel face:
More seriously, it was due to a little bug, but it's fixed :)

Yes, in the search engine. I can see example sentences in many languages (also languages I am not interested in). Therefore I want to choose only German, English, Japanese and Russian. :) Thanks for the fast reply.

(can you use the reply button rather than create a new thread :) )
OK, I see: you would like to be able to filter on a set of languages that you define. For the moment it's not possible. It will be implemented in an "advanced search" page, but to be honest, I don't know when we will find time to implement this.

Sorry, for now it's not possible.
The best you can do is specify a source language and a target language (for instance from German to English). You would then only have results in this pair of languages.
But there's nothing like searching from German and displaying translations only in English, Japanese and Russian. You'd have to search German->English, then German->Japanese, then German->Russian, if you really need to filter out unwanted languages.
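
The workaround of running one search per language pair can be pictured with a small Python sketch. The `search` argument is a stand-in for whatever performs a single source->target search; it is purely hypothetical and does not correspond to a real Tatoeba API.

```python
def search_many_targets(search, query, source, targets):
    """Run one source->target search per target language.

    `search` is a hypothetical callable taking (query, source, target)
    and returning a list of results for that language pair.
    Returns a dict mapping each target language to its results.
    """
    results = {}
    for target in targets:
        results[target] = search(query, source, target)
    return results
```

Each target language still costs one separate search, so this only automates the repetition; it is not a real multi-language filter.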