Wall (Tatoeba)

Wall (7183 threads)

Tips

Before asking a question, please read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

CK April 22, 2010 at 3:14:59 PM UTC, edited October 25, 2019 at 8:02:50 AM UTC

[not needed anymore- removed by CK]

JimBreen April 22, 2010 at 3:22:44 PM UTC

Hmmm. "He adopted a war orphan and is bringing her up as a foster daughter."

Comment on the sentence itself and Francis will get a message.

blay_paul April 18, 2010 at 3:48:07 PM UTC

Bump.

Examples.gz, how often is it updated?

On the Monash FTP Archive page it says:

[...] It is updated daily from the server site.
* examples.gz (8371412 bytes) the file.

But I think that is no longer accurate. The one I've just downloaded hasn't been updated since a week ago. Also, could the ID numbers given in examples.gz be those of the Japanese sentences, not the English sentences? The English ones aren't unique, so the IDs are pretty much useless.

TRANG April 18, 2010 at 5:06:17 PM UTC

You'll have to ask Jim about this, because I don't think anyone else here has the answer ^^
It may be faster to simply send him an email...

blay_paul April 18, 2010 at 5:14:21 PM UTC

Could you make available a download with the stuff Jim uses for WWWJDIC? i.e.
a) All Japanese sentences that have an Index field linked. (With sentence number of Japanese sentences)
b) All English fields that are mentioned in the 'Meaning' field. (With sentence number of English sentences)
c) All index fields. (With 'Meaning' field).

TRANG April 18, 2010 at 5:20:36 PM UTC

I haven't had time to update the "downloads" page yet, but the file data Jim uses can be downloaded here:

http://tatoeba.org/app/webroot/files/downloads/
(wwwjdic.csv)

The fields are:
jpn_sentence_id, eng_sentence_id, jpn_text, eng_text, jpn_index

blay_paul April 19, 2010 at 8:57:30 PM UTC

Just one more question - how often do you update the files on the download page?

TRANG April 20, 2010 at 9:46:02 AM UTC

On this page:
http://tatoeba.org/app/webroot/files/downloads/
Once a week. On Saturdays around 9AM France time.

On the download page that you can access from the link at the bottom, never. I have to update that page though, to link to the files in http://tatoeba.org/app/webroot/files/downloads/.

blay_paul April 18, 2010 at 6:05:52 PM UTC

That should do nicely. Thanks.

The other thing is that I'd like to do a complete check and revamp of the index field. To be certain of not losing any data I'd need you to lock the index field so I can download / fix up / upload without any changes happening on your side.

I'm still working on things so I probably won't be ready for a week or so.

JimBreen April 19, 2010 at 7:48:06 AM UTC

Can you keep me in the loop? I change the odd index when making a correction to a Japanese sentence. Also, when I do the weekly download I run it through a utility that checks that the index and sentence agree. That way I can detect when others have changed a sentence. I usually have to update the index and occasionally add to the list of names to be ignored (e.g. ムーリエル and 赤ずきん this week.)

In addition, I have a list of words from Collin McCulley which had mismatches between the index and dictionary. I have cleaned up most of them, but still have ~100. We need some way of tracking when they get out of kilter, mainly when dictionary entries need qualifying.
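
The kind of index-versus-sentence check described above might look roughly like the following. This is only a sketch, not Jim's actual utility: the token layout (headword, optional `|n`, `(reading)`, `[sense]`, `{form in sentence}`, optional `~`) is inferred from the index strings quoted elsewhere in this thread, and `find_mismatches` is a hypothetical name.

```python
import re

# Assumed token layout, inferred from index strings quoted in this thread:
# headword, optional |n, optional (reading), optional [sense],
# optional {form as it appears in the sentence}, optional trailing ~
TOKEN_RE = re.compile(
    r"^(?P<head>[^\s|(\[{~]+)"
    r"(?:\|\d+)?"
    r"(?:\((?P<reading>[^)]*)\))?"
    r"(?:\[(?P<sense>\d+)\])?"
    r"(?:\{(?P<form>[^}]*)\})?"
    r"~?$"
)


def find_mismatches(sentence: str, index: str) -> list[str]:
    """Return the index tokens whose surface form cannot be found,
    in left-to-right order, in the sentence."""
    mismatches = []
    pos = 0  # search resumes after the previous match, so order matters
    for token in index.split():
        m = TOKEN_RE.match(token)
        if m is None:
            mismatches.append(token)  # token does not fit the assumed layout
            continue
        # The {form} wins if present; otherwise the headword itself must appear.
        surface = m.group("form") or m.group("head")
        found = sentence.find(surface, pos)
        if found == -1:
            mismatches.append(token)
        else:
            pos = found + len(surface)
    return mismatches


index = "彼|2(かれ)[01] は|1[01] 叫ぶ{叫んだ}"
print(find_mismatches("彼は叫んだ。", index))  # sentence and index agree
print(find_mismatches("彼は笑った。", index))  # sentence was edited, index was not
```

A non-empty result would flag a sentence that someone changed without updating its index. Names that should be ignored would need a separate skip list.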

TRANG April 19, 2010 at 7:14:59 PM UTC

@Paul, yes, I can easily block the access to the indices to everyone.

JimBreen April 19, 2010 at 7:38:17 AM UTC

I download the file Trang sets up once a week. I check it over then set it up in the WWWJDIC system. At that stage the "examples.gz" file, etc. is rebuilt.

JimBreen April 19, 2010 at 7:59:36 AM UTC

I missed the bit about the ID numbers. I use the English sentence number because 90%+ of the corrections coming from WWWJDIC users are to the English sentence, so it makes sense for WWWJDIC to link there.

As discussed on another forum, I could put in both, e.g. #ID=375963_12345. I have a major change half done in WWWJDIC which is blocking other changes. Once it is clear (maybe a week or so) I can make that sentence number change. I'll enable WWWJDIC users to select whether they want to link to the Japanese or the English.

blay_paul April 18, 2010 at 9:22:46 PM UTC

Example export system.

Japanese sentences with multiple English translations are (sometimes?) being exported with both versions.

For example:

4924 73899 「これが探していたものだ」と彼は叫んだ。 "This is what I was looking for," he exclaimed. 此れ[01]{これ} が探す{探していた} 物|1(もの)[01]{もの} だと|1 彼|2(かれ)[01] は|1[01] 叫ぶ{叫んだ}
4924 1513 「これが探していたものだ」と彼は叫んだ。 "This is what I was looking for!" he exclaimed. 此れ[01]{これ} が探す{探していた} 物|1(もの)[01]{もの} だと|1 彼|2(かれ)[01] は|1[01] 叫ぶ{叫んだ}

On the sentence annotations page only 73899 is given as the 'meaning' field with the Index field. So either a) The meaning field in the sentence annotations page isn't used, or b) There can be two or more 'meaning' field values, but only one is shown on the sentence annotations page.
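
One way to list every Japanese sentence that is exported with more than one English translation is to group the export rows by Japanese sentence ID. This is a minimal sketch, assuming the first two columns are the Japanese and English sentence IDs as in the two rows quoted above; the function name is hypothetical.

```python
from collections import defaultdict


def japanese_ids_with_multiple_translations(rows):
    """Map each Japanese sentence ID to its English sentence IDs,
    keeping only the Japanese IDs that have more than one."""
    english_by_japanese = defaultdict(list)
    for jpn_id, eng_id in rows:
        if eng_id not in english_by_japanese[jpn_id]:
            english_by_japanese[jpn_id].append(eng_id)
    return {j: e for j, e in english_by_japanese.items() if len(e) > 1}


# (jpn_sentence_id, eng_sentence_id) pairs taken from lines quoted in this thread
rows = [(4924, 73899), (4924, 1513), (4923, 1512)]
print(japanese_ids_with_multiple_translations(rows))  # {4924: [73899, 1513]}
```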

JimBreen April 19, 2010 at 7:52:34 AM UTC

I've noticed that, and I have assumed it wasn't used. When I notice cases such as that one, I change the English so they become identical, and hope it will lead to the removal of one of them.

blay_paul April 19, 2010 at 9:13:38 AM UTC

That's OK for WWWJDIC, and for cases where one is mistaken or they are both very similar.

It's not good 'Tatoeba practice' though.

JimBreen April 19, 2010 at 9:27:20 AM UTC

In the case you quoted I thought it *was* the Tatoeba practice. They only differ by an exclamation mark.

Where they differ in more substantial ways, e.g. choice of personal pronoun where the Japanese has none, I guess there is a case for both being kept, and that's a situation where an index is best tied to a sentence-pair rather than just to the Japanese.

Tying to sentence pairs has another problem. The number of sentences in German and French is getting to the stage where it would be nice to include them in WWWJDIC. I'd be looking at having "examples_de" and "examples_fr" extracts along with indices.

blay_paul April 19, 2010 at 9:38:16 AM UTC

> In the case you quoted I thought it *was* the Tatoeba
> practice. They only differ by an exclamation mark.

The case I quoted, yes. There are, however, at least a handful where both alternatives are valid, significantly different, and illustrate something about the English language.

blay_paul April 18, 2010 at 8:22:37 PM UTC

.csv format in downloads.

Just a note for those using the downloads. You use \ as the escape character.

This is a line from your csv file:

"4923";"1512";"「信用して」と彼は言った。";"\"Trust me,\" he said.";"信用為る|1(する){して} と|1 彼|2(かれ)[01] は|1 言う{言った}"

This is how it appears when loaded into Excel.

4923 1512 「信用して」と彼は言った。 \Trust me,\" he said." 信用為る|1(する){して} と|1 彼|2(かれ)[01] は|1 言う{言った}

Excel uses double " marks when escaping quotes. The same line in csv for Excel would be...

"4923";"1512";"「信用して」と彼は言った。";"""Trust me,"" he said.";"信用為る|1(する){して} と|1 彼|2(かれ)[01] は|1 言う{言った}"

Which imports to Excel as follows:

4923 1512 「信用して」と彼は言った。 "Trust me," he said. 信用為る|1(する){して} と|1 彼|2(かれ)[01] は|1 言う{言った}

I think the 'escaping with extra quote mark' may be the more standard version ...
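
For anyone who wants to convert a download themselves, Python's standard `csv` module can read the backslash-escaped form and write the doubled-quote form that Excel expects. This is a minimal sketch, assuming the files use `;` as the delimiter and `\` as the escape character throughout, as in the sample line above; `backslash_to_excel` is a hypothetical helper name.

```python
import csv
import io


def backslash_to_excel(text: str) -> str:
    """Re-encode semicolon-separated CSV text from backslash-escaped
    quotes (\\") to doubled quotes ("")."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=";",
        quotechar='"',
        escapechar="\\",    # input style: \" inside a quoted field
        doublequote=False,
    )
    out = io.StringIO(newline="")
    # The writer's default is doublequote=True, i.e. "" inside a quoted field.
    writer = csv.writer(out, delimiter=";", quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


line = '"4923";"1512";"\\"Trust me,\\" he said."\n'
print(backslash_to_excel(line), end="")
# "4923";"1512";"""Trust me,"" he said."
```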

kellenparker April 18, 2010 at 3:29:03 PM UTC

Right. So. I'm not 13 years old. It was an honest mistake. Here's the problem:

I was wondering if Tatoeba had any sort of resistance to profanity. I thought something like "damnit" would be a common enough thing. So I MEANT to SEARCH for "damn". Turns out I added it as a sentence instead. Same for "fuck" because it took me two tries to realise I was using the wrong text box.

So those can be deleted outright. I didn't see any way to do it so I abandoned the sentences instead in case there is such a way and someone else wants to adopt them to get them deleted. 380292 and 380290.

Apologies.

TRANG April 18, 2010 at 3:40:42 PM UTC

Hahaha, it's fine. It's also our mistake; it means we need to change the form to make it clearer that it adds a new sentence.

I will delete your entries. There's currently no way for users to delete sentences, only admins can. The only solution when you want to "delete" a sentence is to replace it by a sentence that you actually want to keep.

As for profanity, we don't have anything against it, but we'd rather avoid it until we set up a mechanism to filter out sentences that are "not safe" for kids.

kellenparker April 18, 2010 at 3:44:33 PM UTC

Good to know. And actually it's still pretty much entirely my fault. I searched at the top but then I think I stopped paying attention so when it sent me to the page saying "Nope, but you can add a sentence:", I thought I was searching again. It's not the form's issue. It's my attention span's issue.

sysko April 18, 2010 at 4:48:26 PM UTC

And as for profanities, we have some "colorful" sentences (spoiler: "search XXX in the search engine")

sysko April 18, 2010 at 2:39:06 PM UTC

Stemming should be working again for most languages when using the search engine.
I.e. a search for "think" should also return "thinking", "thought", etc. The same goes for French / Spanish / Italian / Russian etc.

By the way, it will not work with Ukrainian, but I was wondering whether using the Russian stemmer would produce a "better than nothing" result? Demetrius, Dorenda?
Still looking for Arabic and Georgian stemmers.

Dorenda April 18, 2010 at 3:02:30 PM UTC

Probably it will. But maybe there is a way to adapt the Russian stemmer into a Ukrainian one (or at least something more fit to Ukrainian)? I have no idea how those things work or how much work it would be, but if it's feasible, I could help with that.

sysko April 18, 2010 at 3:19:34 PM UTC

How the stemmer works for Russian is explained in general terms here: http://snowball.tartarus.org/al...n/stemmer.html
I admit I haven't read it entirely, as I have no knowledge of Russian (and moreover they provide something that works out of the box for this).

So I dunno how "easy" it is to adapt this to Ukrainian.

Dorenda April 18, 2010 at 5:08:13 PM UTC

It looks doable. I'd just have to adapt it to the Ukrainian alphabet, change the endings into their Ukrainian counterparts and add/remove some endings that either of the two languages doesn't have.

So I'd have to just change that piece of script on the blue background, right?

sysko April 18, 2010 at 10:55:00 PM UTC

yep this one http://snowball.tartarus.org/al...em_Unicode.sbl to be more precise :) thanks

Dorenda April 24, 2010 at 3:55:48 PM UTC

Okay, I adapted it. The results won't always be right, though, cause sometimes it's just not possible to see from the form of a word what type of word it is and thus what belongs to the ending. For example, "koromyslo" is a noun, so only "o" should be removed, but the script will think it's a past tense verb and remove "lo". I tried to choose the least bad options...
Anyway, is there some way to test it? And where should I send it?

And one more question. How can I make the thing also remove the superlative prefix '{n}{a}{i'}' from the beginning of words?
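
The "koromyslo" ambiguity can be illustrated with a toy suffix stripper. This is a Python sketch for illustration only: it is not the Snowball script and says nothing about how Snowball itself handles prefixes. The function name and the ending lists are made up for the example.

```python
def toy_stem(word, suffixes, prefixes=()):
    """Strip the first matching prefix, then the longest matching suffix.
    It looks only at the form of the word, so it cannot tell a noun ending
    from a verb ending."""
    for prefix in prefixes:
        if word.startswith(prefix) and len(word) > len(prefix):
            word = word[len(prefix):]
            break
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)]
    return word


# With a past-tense ending "lo" in the list, the noun loses too much:
print(toy_stem("koromyslo", ["lo", "o"]))  # koromys
# With only the noun ending "o", the result is the intended one:
print(toy_stem("koromyslo", ["o"]))        # koromysl
# Stripping a superlative prefix before the suffix step:
print(toy_stem("найкращий", ["ий"], prefixes=("най",)))  # кращ
```

Since both endings are legitimate for some words, any form-only rule has to pick one, which is the "least bad option" trade-off mentioned above.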

sysko April 24, 2010 at 4:43:31 PM UTC

Send the file to our email address, team [at] tatoeba [dot] org, and I will see how to integrate it.
To be honest, I don't really know how it works (A). At least I will contact the people behind this project to see what we can do :)
But it's already great that you have adapted it to Ukrainian.

saeb April 17, 2010 at 8:48:46 PM UTC

Congrats on the new server! I can already feel the site is 100x faster. oh and I'm in love with the new inbox, great update!...now are we cool or are we cool :)

Dorenda April 17, 2010 at 11:05:49 PM UTC

You're cool. :)
It's so much faster, great! :D
(And I just loved that note we got while the site didn't work. :))

blay_paul April 15, 2010 at 7:32:59 PM UTC

*psst* Trang (or sysko)

I need to replace
それら<space>
with
其れ等{それら}<space>
but the <space> doesn't seem to work for the 'replace with' field. (At least there aren't any spaces in the preview).

There are 87 instances that need to be replaced in the index field so I don't really want to do it manually.

TRANG April 15, 2010 at 11:13:44 PM UTC

You have to use an actual space in the "Replace" field, not the <space> tag :)

The reason why you have to type <space> in the "Search" is because trailing spaces are not taken into account in the search, for some reason. But the "Replace" field accepts trailing spaces (normally...).
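
If the index strings are available offline, the same bulk change can be made without relying on trailing spaces by replacing whole space-separated tokens instead. This is a minimal sketch, not the site's search-and-replace tool; `replace_index_token` is a hypothetical helper.

```python
def replace_index_token(index: str, old: str, new: str) -> str:
    """Replace every space-separated token that is exactly `old`,
    leaving tokens that merely contain it untouched."""
    return " ".join(new if token == old else token for token in index.split(" "))


index = "それら は 其れ等{それら} それら"
print(replace_index_token(index, "それら", "其れ等{それら}"))
# 其れ等{それら} は 其れ等{それら} 其れ等{それら}
```

Matching whole tokens also catches an occurrence at the very end of the index, where there is no trailing space to search for.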

blay_paul April 16, 2010 at 5:40:46 AM UTC

> You have to use an actual space in the "Replace" field, not the <space> tag :)

I tried it both ways - no spaces in the preview display.

I've found what the problem is, though. The preview button only works ONCE. If the old preview is still displayed then it doesn't do anything when you click the preview button with a different string.

TRANG April 16, 2010 at 10:51:45 AM UTC

> The preview button only works ONCE.

Ah right, I forgot to warn you about this. The "preview" function may work more than once, but I have yet to figure out the conditions under which it does or doesn't work a second time. In your case, I'm guessing it didn't work because of the < and >...

brauliobezerra April 15, 2010 at 3:46:11 PM UTC

About names, can we translate them if there's an obvious correspondence? I'm talking about names like Peter, Mary, etc.

MUIRIEL April 15, 2010 at 5:23:55 PM UTC

Good question. Personally, I never translate them, because I think Ann should be called Ann, and not Anne, no matter if she is in France or in the UK at the moment ;).
But I often see translations of names on Tatoeba...

JimBreen April 16, 2010 at 2:39:11 PM UTC

Hmmm. So my younger sister should change her name from Anne to Ann?
My parents got it wrong? 8-)
Actually Anne is about as common as Ann among English-speaking people. Canonical spellings are a thing of the past, and we've always had Graham and Graeme, Roger and Rodger, etc.

MUIRIEL April 16, 2010 at 2:47:54 PM UTC

That's not what I meant.
I just meant that I would call your sister what your parents call her, and not translate her name into my language (or any other language).

blay_paul April 15, 2010 at 5:27:09 PM UTC

With Japanese you should 'transliterate' to katakana so Paul becomes ポール (for instance). When going from Japanese to English there are a number of variations to consider.

TRANG April 15, 2010 at 6:09:01 PM UTC

You can.

As far as I'm concerned, I have the same opinion as Muiriel. But we won't forbid translations of names. I don't see any good reason to forbid it anyway.

Dorenda April 15, 2010 at 7:02:21 PM UTC

In general I agree that a person should be called by his/her own name, no matter where he/she is, but some languages have more of a tendency to translate names (as I read somewhere lately, when they speak about George Bush in Scottish Gaelic, they call him Seòras Bush, for example, while in Dutch we would (nowadays) just leave his name the way it is), so I think you should also consider how common it is for the language you're translating into to translate names or to use the foreign version.
And then there is the next problem... Suppose an English sentence about Peter has been translated into Russian by someone who decided to translate the name. So now we have a Peter and a Pyotr. If someone translated the Russian sentence into Ukrainian, it would look silly not to make it Petro, since that's how they do it: Ukrainians use different versions of their name depending on what language they are speaking. Now if I wanted to translate any of these sentences without translating names, I'd have to make three translations. Or I could just choose one of them and link my translation to all three other sentences, but it would be strange to have a Dutch sentence with Pyotr as a translation of an English sentence about Peter, for example. So I would choose a name that is common in Dutch: Peter, or maybe Pieter or even Petrus.

Long story, but what I wanted to say is: it all depends on the situation and the language you're translating into. :)

MUIRIEL April 15, 2010 at 7:12:16 PM UTC

Same example for the French ^^: they pronounce George Bush as if it were a French name. Too strange for me as a German - we would never call him Georg Busch :D.

saeb April 15, 2010 at 7:29:46 PM UTC

Oh god, I would never translate Peter into Arabic. The Arabic version sounds awful :P

blay_paul April 14, 2010 at 2:31:43 PM UTC

Sentence Annotation page

Could you put up a "Changes saved" message on the page after you click the 'save' button? Otherwise it's easy to forget whether you've saved the work you've done or not.

JimBreen April 15, 2010 at 3:03:37 AM UTC

I second that request. Also a log of changes would be really good.

TRANG April 15, 2010 at 9:28:52 AM UTC

> Could you put up a "Changes saved" message on the page after you click the 'save' button?

Yes, I'll take care of this after we have moved to our new server.

> Also a log of changes would be really good.

I'll try to do that by the end of the month.
