Wall (one thread)

World English Bible and 公協訳聖書
See http://c11n.net/ and http://ebible.org/
I think that special handling would be required to import them in full, but it would probably be no problem from a copyright point of view.
Because it is important to have the text in the right order and to be able to search through it by verse, there should probably be a special interface implemented. It would probably also benefit from a reserved number space.

> I think that special handling would be required to import
> them in full
Also Tatoeba is limited to (IIRC) a maximum of 500 characters for an example - some verses might exceed that.

For longer texts there will be a special section, so that one will be able to post a longer paragraph, a short novel, a speech, etc.

I have another question. Hopefully it will prove to be less controversial than the last one.
Are there any Japanese sentences that need to be translated into English? I might translate some if I have some time but all the sentences that I find with the serial translation tool already have an English translation. Thanks.

Trang can create a list of Japanese sentences which have no English translation.

Thanks for the offer. I don't know how much time I can spare really so no need to make that list just for me.
It might be a good idea to have a feature in the serial translation tool to find sentences in source language X with no translation in destination language Y. That would enable users to do mass translations of sentences.

Yep, that's something we plan to do ... when we have time (because in fact it's not so tricky to implement this in SQL in an efficient way)
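
As a rough illustration of the "source language X with no translation in language Y" query discussed above, here is a minimal sketch using Python's built-in sqlite3 module. The `sentences` and `links` tables, their columns, and the sample rows are assumptions made up for this example; they are not Tatoeba's actual schema. The query only looks at direct translation links.

```python
import sqlite3

# Hypothetical schema, for illustration only (not Tatoeba's real tables).
SCHEMA = """
CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT NOT NULL, text TEXT NOT NULL);
CREATE TABLE links (sentence_id INTEGER NOT NULL, translation_id INTEGER NOT NULL);
CREATE INDEX links_by_sentence ON links (sentence_id);
"""

# Anti-join: keep source sentences for which no linked sentence is in the target language.
UNTRANSLATED = """
SELECT s.id, s.text
FROM sentences AS s
WHERE s.lang = ?
  AND NOT EXISTS (
      SELECT 1
      FROM links AS l
      JOIN sentences AS t ON t.id = l.translation_id
      WHERE l.sentence_id = s.id AND t.lang = ?
  )
ORDER BY s.id
"""


def untranslated(conn, source_lang, target_lang):
    """Return (id, text) rows in source_lang with no direct translation in target_lang."""
    return conn.execute(UNTRANSLATED, (source_lang, target_lang)).fetchall()


def demo_connection():
    """Build an in-memory database with a few made-up sentences."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO sentences VALUES (?, ?, ?)", [
        (1, "jpn", "猫が好きです。"),
        (2, "eng", "I like cats."),
        (3, "jpn", "雨が降っています。"),
        (4, "fra", "Il pleut."),
    ])
    # Links are stored in both directions in this sketch.
    conn.executemany("INSERT INTO links VALUES (?, ?)", [(1, 2), (2, 1), (3, 4), (4, 3)])
    return conn


if __name__ == "__main__":
    print(untranslated(demo_connection(), "jpn", "eng"))
```

With the sample data, only sentence 3 is returned for Japanese to English, since its only link is to a French sentence. The index on `links.sentence_id` is what lets the `NOT EXISTS` probe stay cheap per source sentence.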

How about this list http://tatoeba.org/eng/sentences_lists/show/24
not really sure how up to date it is...

WWWJDIC is looking in particular for sentences that cover words or grammatical phrases that are poorly represented at present. (Tatoeba is not so restricted)
What I find works best is to read something and check interesting words in WWWJDIC. If you find one with no, or few, examples then that would be a good one to add. Note that where words have several senses in WWWJDIC then poorly represented senses are just as important as poorly represented words.

Nice job on improving the website. It looks much prettier and more functional than the last time I visited.

Quick fix idea.
This is a simple idea that should make things a little simpler. Each Japanese sentence should have zero, or one, set of index data. Spurious extra sets can be left over when duplicates are merged.
If you could generate the equivalent of the wwwjdic.csv file including only records with more than one set of index data the day _before_ you export the whole file (e.g. on Friday of each week) then me and Jim would have a chance to fix things before the weekly update.
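
A minimal sketch of the pre-export check proposed above: list only the sentences that carry more than one set of index data. The file layout here (tab-separated, sentence id in the first column, then a meaning id and the index text) is an assumption for illustration, as are the sample ids and placeholder index strings; the real wwwjdic.csv may differ.

```python
import csv
import io
from collections import defaultdict


def duplicated_index_rows(lines):
    """Group rows by sentence id and keep only ids with more than one row.

    Assumes the sentence id is the first tab-separated column (hypothetical layout).
    """
    by_id = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if row:  # skip blank lines
            by_id[row[0]].append(row)
    return {sid: rows for sid, rows in by_id.items() if len(rows) > 1}


# Made-up sample: sentence 101 has two index sets, sentence 102 has one.
SAMPLE = "101\t1\tindex-set-A\n101\t2\tindex-set-B\n102\t3\tindex-set-C\n"

if __name__ == "__main__":
    for sid, rows in duplicated_index_rows(io.StringIO(SAMPLE)).items():
        for row in rows:
            print("\t".join(row))
```

Run against the full export the day before the weekly update, this would produce the short review list rather than the whole file.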

Quick request with regard to sentence edit behaviour.
There's one thing that bothers me with how the sentence editing function works - if you look at another tab (Firefox) the edit in progress disappears.
So what happens is that I'm half way through translating a sentence when I decide to check something, and when I go back it's all gone and I have to remember it from scratch.

[not needed anymore- removed by CK]

I'm not sure they need correcting as such. It's just a sign that a typical English Bible uses 'him' more and a typical Japanese Bible uses 'イエス' more. What you can do is note "[Bible, Psalms 166:68]" (or whatever) if it is an actual quote and you can work out where it's from.

[not needed anymore- removed by CK]

> If you want to do something like "[Bible, Psalms
> 166:68]", then it would probably be faster to delete
> all the existing Bible sentences and find public domain
> version of the Bible in various languages and dump lots
> of those sentences into the database.
Using PD Bibles as sources sounds like a good idea to me, but the existing sentences have the advantage of already having index data so I wouldn't just delete them.

[not needed anymore- removed by CK]

> This Google search only gets "Tanaka Corpus" results.
Yeah, that's because it should be "いうところによれば"
> Doing similar searches might be an interesting
> approach to check the accuracy and/or naturalness
> of sentences in this database.
I think that's pretty much a standard approach here. Google has quite a few quirks that you need to know to get the best results, but overall it's pretty good.

Could we have an official decision on using -1 in the meaning field to mean 'not for WWWJDIC' ?
*bump*

As I said in my email, I'm okay with it. It all depends on Jim :)

[not needed anymore- removed by CK]

Every language includes a different set of ambiguities, omitted information and conventions. If you attempt to remove these ambiguities, etc. in the sentences being translated then you are going to create bias in the vocabulary used and will also likely end up with unnatural sentences.
The typical example for this is pronouns in Japanese. If you start adding あなた to sentences that, when translated, use 'you' in the English then there will be far more あなたs used in those examples than you would ever find in natural Japanese.

When I went to the "trusted" index edit page for 97078, I found there were two sets of indices! Is this a result of automerge? I can't think it is valid.

> I found there were two sets of indices!
> Is this a result of automerge?
Probably. It's not the first time that's happened.
> I can't think it is valid.
I'm not sure it would be an easy fix - if you've got two identical sentences with an index each the computer isn't going to know which one is right and which one is spurious. What you should do is delete the index of one of the sentences you make identical in advance - but I obviously didn't remember to do it for that one.
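
To make the point above concrete, here is a small sketch of a merge step that keeps an index when at most one of the duplicates has one, and sets the group aside for a human when two different indices collide. The record layout, the choice to keep the lowest sentence id, and the sample data are all assumptions for illustration, not how Tatoeba's automerge actually works.

```python
from collections import defaultdict


def merge_duplicates(records):
    """Merge sentences with identical text.

    records: iterable of (sentence_id, text, index_or_None).
    Returns (merged, needs_review):
      merged       maps the kept id to (text, index_or_None)
      needs_review maps the kept id to (text, [conflicting indices])
    """
    groups = defaultdict(list)
    for sentence_id, text, index in records:
        groups[text].append((sentence_id, index))

    merged, needs_review = {}, {}
    for text, members in groups.items():
        keep_id = min(sid for sid, _ in members)  # assumption: lowest id survives
        indices = sorted({idx for _, idx in members if idx is not None})
        if len(indices) <= 1:
            # No conflict: carry over the single index, if there is one.
            merged[keep_id] = (text, indices[0] if indices else None)
        else:
            # Two different indices: the code cannot tell which one is spurious.
            needs_review[keep_id] = (text, indices)
    return merged, needs_review


if __name__ == "__main__":
    sample = [
        (1, "same sentence", "index-set-A"),
        (2, "same sentence", "index-set-B"),
        (3, "other sentence", "index-set-C"),
        (4, "other sentence", None),
    ]
    print(merge_duplicates(sample))
```

Deleting one index in advance, as suggested above, corresponds to the `None` case here: the merge then goes through without ending up in the review pile.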