Wall (7,019 threads)
WWWJDIC index line.
I suggest adding links from words in the Japanese sentence to WWWJDIC entries using the information in the index line. That would be a useful 'first step' towards adding furigana to the sentence.
The basic set-up is relatively straightforward, but there is one complication, namely 'deliberately non-indexed text'. Punctuation, English words, place names and other proper nouns are not generally included in EDICT and so do not have entries in the index line. Jim Breen should have a 'no index' field that includes all non-indexed text (although it may not be up to date). In order to parse a sentence properly you need both the index line and the non-indexed text.
Adding furigana to place names etc. should probably be left for later.
Word-by-word links based on the Japanese index words would be good, and not too hard to implement, I think.
At present I am pulling the sentences and indices into WWWJDIC once a week, and I put them through a utility which matches the text and the index contents, and reports if there is a mismatch (which usually means that someone has changed a sentence). To get around the problem of "deliberately non-indexed text" I have a file of words which I ignore if they are not in the index. You can see this list of words at http://www.csse.monash.edu.au/~...amplestopwords (in EUC-JP). Most are names. Some look a bit odd as they are two or more names which had been separated by punctuation (which I ignore).
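For anyone who wants to try the same kind of check, here is a rough sketch of the idea in Python. It is not the actual WWWJDIC utility. It assumes a simplified index format (space-separated headwords, optionally followed by a (reading), a [sense number], a ~ marker, or the {form used in the sentence}), and the punctuation set, function names and sample sentences are all made up for illustration.

```python
import re

# Assumed punctuation to skip; the real utility's list is not given in this thread.
PUNCTUATION = set("、。！？「」『』・,.!? ")

def surface_form(token):
    """Return the text an index token is expected to match in the sentence."""
    braces = re.search(r"\{(.+?)\}", token)
    if braces:
        return braces.group(1)
    # Drop (reading), [sense number] and the ~ marker, keeping the headword.
    return re.sub(r"\(.*?\)|\[.*?\]|~", "", token)

def covers(sentence, index_line, stopwords):
    """True if index words, stopwords and punctuation account for the whole sentence."""
    forms = [surface_form(t) for t in index_line.split()]
    by_length = sorted(stopwords, key=len, reverse=True)  # prefer the longest stopword
    pos = i = 0
    while pos < len(sentence):
        if sentence[pos] in PUNCTUATION:
            pos += 1
        elif i < len(forms) and sentence.startswith(forms[i], pos):
            pos += len(forms[i])
            i += 1
        else:
            stop = next((w for w in by_length if w and sentence.startswith(w, pos)), None)
            if stop is None:
                return False  # text that is neither indexed nor deliberately skipped
            pos += len(stop)
    return i == len(forms)  # leftover index words also count as a mismatch
```

With a stopword list containing "トム", covers("トムは学生です。", "は 学生 です", {"トム"}) passes, while the same call with an empty stopword list reports a mismatch.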
Translation suggestions
There are now 100 translation suggestions waiting to be checked at
https://translations.launchpad..../ja/+translate
I would urge people who understand Japanese to check them and either confirm or correct them.
Source Code?
In order to better determine possible translations for
https://translations.launchpad..../ja/+translate
it would help if I could view the source code.
Also, some translation items require code reworking as well. For example, "linked to" should probably be "linked to » %s" (where %s is the sentence number) so that the Japanese could be something like "%s とつながる".
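To illustrate why the placeholder belongs inside the translatable string, here is a small Python sketch. The catalogue dictionary and the function names are invented for the example and are not how Tatoeba actually stores or loads its strings.

```python
# Hypothetical message catalogue; the real strings live in Launchpad.
CATALOGUE = {
    "ja": {"linked to » %s": "%s とつながる"},
}

def translate(msgid, lang):
    """Look up a translation, falling back to the original string."""
    return CATALOGUE.get(lang, {}).get(msgid, msgid)

def linked_to(sentence_id, lang):
    # The placeholder is filled in after translation, so each language
    # decides where the sentence number goes.
    return translate("linked to » %s", lang) % sentence_id
```

Because the number is substituted after the lookup, the Japanese string can put it first while the English string keeps it last.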
Normally the path of the file should be enough of a hint for you to figure out on which page of the website the string can be found.
If the path is something like /views/<something>/file.ctp, then you would usually (not always) need to go to http://tatoeba.org/<something>/file
If the path is /controllers/<something>_controller.php, then the string is a bit harder to find, but it can be found somewhere in the pages that start with http://tatoeba.org/<something>/
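The two rules of thumb above could be sketched in Python like this. It is only a guess at a starting page, since the mapping is "usually (not always)" right, and the function name is made up.

```python
import re

BASE_URL = "http://tatoeba.org"

def guess_page(path):
    """Guess where on the site a string from the given source file shows up."""
    view = re.search(r"/views/([^/]+)/([^/]+)\.ctp$", path)
    if view:
        # A view file usually corresponds to one page.
        return f"{BASE_URL}/{view.group(1)}/{view.group(2)}"
    controller = re.search(r"/controllers/([^/]+)_controller\.php$", path)
    if controller:
        # A controller only narrows it down to the pages under this prefix.
        return f"{BASE_URL}/{controller.group(1)}/"
    return None  # no rule of thumb for other paths
```

For example, a path such as /views/sentences/show.ctp (a made-up file name) would point to http://tatoeba.org/sentences/show.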
I don't know how comfortable you are looking at source code, but it could be simpler if you just translated what you can first. We have a "test" version of Tatoeba where we test things before we update the "real" version of Tatoeba. As soon as you have your translations done (even partially), we can update the "test" version and you can then browse around in there to check if the translations fit or not. I'll give you the link in a private message.
Other than that, the source code can be found here:
http://subversion.assembla.com/...ba2/trunk/app/
Just note that the strings in Launchpad are not always exactly synchronized with the source code.
> I don't know how comfortable you are looking at source code
Reasonably. I'm familiar with Visual Basic, JavaScript and Visual Basic - PHP is like the bastard offspring of all of those.
Without looking at the code it's very difficult to correctly translate things like
<b>Share</b> your knowledge.
because they are handled as _two strings_ and the order needs to be reversed in Japanese.
<b>知識</b>を共有する
Ah yes, forgot to mention, like you noted, there are some strings that we forgot to make more "compliant" for internationalization.
You can send me an email to list those you find. I'll fix it in the code and update the strings in Launchpad.
Question, which places more strain on the server: generating sentences using a keyword query or using the random sentence generator?
Random sentences for sure, MySQL doesn't like random at all ^^ We're working on making it faster.
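For the curious: a query like ORDER BY RAND() generally has to give every row a random value and sort the lot, which is why it gets slow on a big table. One common workaround (not necessarily what Tatoeba will end up doing) is to pick a random id and fetch the first row at or after it. Here is a sketch using Python's sqlite3 module as a stand-in for MySQL; the table and column names are assumptions.

```python
import random
import sqlite3

def random_sentence(conn: sqlite3.Connection, rng=random):
    """Return one (id, text) row chosen via a random id, or None if the table is empty."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM sentences").fetchone()
    if lo is None:
        return None
    target = rng.randint(lo, hi)
    # Uses the primary-key index instead of sorting the whole table.
    return conn.execute(
        "SELECT id, text FROM sentences WHERE id >= ? ORDER BY id LIMIT 1",
        (target,),
    ).fetchone()
```

The trade-off is that the choice is not perfectly uniform when ids have gaps: a row that follows a large gap gets picked more often.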
Would be really cool if we could add audio someday to the example sentences ;-)
We plan to do so, you will have more details and maybe a proof of concept at the beginning of April :)
Looking forward to that!
Did you change something with the database dump? This Saturday's jpn_indices contain invalid utf8 characters and the affected lines seem to be truncated.
The following sentence ids have problems: 83767, 91272, 140460, 146080, 152054, 190707, 195118, 199753, 205628, 211131, 213530, 235850
Ah, indeed, indeed. I had changed the 'text' field from varchar to varbinary, but kept the length at 500. That's why those entries were truncated. I've fixed it and done a new export of the jpn_indices.
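A small Python sketch of what presumably happened: in MySQL a varchar length counts characters while a varbinary length counts bytes, so a Japanese line well under 500 characters can still exceed 500 bytes in UTF-8 and get cut in the middle of a character. The function names and the sample line are made up for the example.

```python
def truncate_bytes(text, limit):
    """Mimic storing text in a byte-limited column: encode, then cut at `limit` bytes."""
    return text.encode("utf-8")[:limit]

def is_valid_utf8(raw):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# 200 kana characters fit in 500 characters but need 600 bytes in UTF-8.
line = "あ" * 200
stored = truncate_bytes(line, 500)
```

Here `stored` is 500 bytes long and ends partway through a three-byte character, so is_valid_utf8(stored) is False. Running the same check over each line of a dump is one way to find the affected sentence ids.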
> This Saturday's jpn_indices [...]
How do you know about that by the way? I don't remember making it official yet that the download files would be updated on Saturdays. (or did I? o.o)
1000+ sentences in Arabic.
I'd like 2 thank everyone that has ever thanked everyone. On behalf of all of us you've thanked I say thank u for thanking us.
lol dane cook is brilliant :D
:D thank you^^.
I believe in ghosts. I believe in aliens. But theres no way u will ever persuade me into believing in alien ghosts. Ridiculous.
I believe in the sentence method. I believe in language websites. But theres no way u will ever persuade me into believing in sentence websites. Ridiculous
yay! first tatoeba joke :P (hmm I wonder if I can consider this a wall abuse..)
TRANG says:
omg you're so funny, stop "abusing" the wall :D
I should just mention I never said that :P
But I do think it. Well, especially the "abusing the wall" part, because now I'm working on figuring out how to paginate this wall. Certainly there will be more abuse.
Just wanted to let everyone know, Tatoeba has been updated.
http://blog.tatoeba.org/2010/03...13th-2010.html
Enjoy :)
*shock* just figured out TRANG is a she, he he.
Ah, who told on me, that was supposed to be a secret.
lol, I never said this before but I actually stalked this website for quite a while before I finally decided to join, and I always imagined that you'd be like these programmers who like anime and have studied Japanese for 5 yrs in their university... you know... with a cool blog about every obsessive detail of their life... and eyeglasses... you know, the whole shebang... :D
P.S. guys like that do really exist :D