
Wall (5,597 threads)


blay_paul, July 9, 2010 at 2:16 PM

Near duplicates - example case.

In accordance with the earlier debate on this forum, I have been removing near-duplicate sentences from those exported to WWWJDIC instead of deleting them. However, it was never, IMO, clear exactly when (if ever) deleting near duplicates from Tatoeba is recommended.

Here is one example:

http://tatoeba.org/eng/sentences/show/196748
べティは彼女を殺した。
Betty killed her.

http://tatoeba.org/eng/sentences/show/196749
べティは彼を殺した。
Betty killed him.

The only difference is whether 'her' or 'him' was the victim of Betty's crime of passion. Is it OK to delete one from Tatoeba or not?
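
A rough sketch of how pairs like the one above could be flagged automatically. The function name `differs_by_one_word` and the whitespace tokenisation are assumptions made for illustration only, not something Tatoeba or WWWJDIC is known to use. Splitting on whitespace would also not work for the Japanese side, which has no spaces between words.

```python
def differs_by_one_word(a: str, b: str) -> bool:
    """Return True when two sentences have the same number of
    whitespace-separated words and differ in exactly one position."""
    words_a, words_b = a.split(), b.split()
    if len(words_a) != len(words_b):
        return False
    mismatches = sum(1 for x, y in zip(words_a, words_b) if x != y)
    return mismatches == 1


# The English pair quoted in the post above.
pair = ("Betty killed her.", "Betty killed him.")
is_near_duplicate = differs_by_one_word(*pair)
print(is_near_duplicate)
```

A check like this only finds candidates; whether to keep, tag or delete them is still the policy question asked above.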

CK, July 9, 2010 at 3:03 PM (edited October 26, 2019 at 3:56 AM)

[not needed - removed by CK]

Demetrius, July 9, 2010 at 3:28 PM

Not all "My name"s are translated identically.

Belarusian (like Russian and Ukrainian) makes a clear distinction between imia (first name) and proz'višča (last name).

Other languages may have other unexpected things (like changing gender), so the more the better.

CK, July 9, 2010 at 3:24 PM (edited October 26, 2019 at 3:56 AM)

[not needed - removed by CK]

sysko, July 9, 2010 at 3:26 PM

no sentence "Betty is a serial killer" ?

sysko, July 9, 2010 at 2:26 PM

http://tatoeba.org/eng/wall/sho...7#message_1237

So no, we do not delete them, for two reasons:
1 - some language learners (me at least) like to have a sentence in a target language and to see how it changes depending on tense, etc.
2 - we will use them for natural language processing

Moreover, I think contributors will themselves limit the number of variations they add for a single sentence.

sysko, July 9, 2010 at 2:38 PM

But I agree that when sentences like these are found, it should be noted in the comments with links to each other, and maybe they should get a tag, to make it easier for @blay_paul not to integrate them into WWWJDIC?

CK, July 7, 2010 at 5:09 PM (edited October 26, 2019 at 3:57 AM)

[not needed - removed by CK]

blay_paul, July 7, 2010 at 5:48 PM

I don't think that's a job that can be rushed. A lot of them have translations in third languages or are phrases that it would be useful to keep in a longer sentence.

I expect that sooner or later we'll get a few more moderators and that should speed things up.

Pharamp, July 7, 2010 at 7:48 PM

I'm currently working on it :) please check my comments.

blay_paul, July 7, 2010 at 8:34 PM

I can't wait to see CK's reaction when he finds there's a moderator even less willing to delete examples than I am! ;-)

CK, July 8, 2010 at 1:45 AM (edited October 26, 2019 at 3:56 AM)

[not needed - removed by CK]

blay_paul, July 8, 2010 at 7:15 AM

I should point out that, despite the joking tone of my post, Pharamp being less willing to delete examples than me is another way of saying I'm more willing to delete examples than her.

As to the actual situation it was my rough estimate that, in the 2008-10-10 Tanaka Corpus examples I last worked on, around 1 in 30 of the English sentences was wrong. That's about 5,000 sentences.

Rushing through 130* 'delete tagged' sentences will not resolve that situation quickly. In the natural course of Tatoeba there are around 20-30 examples removed (from the Japanese-English pairs used by WWWJDIC) every week. So that makes it about 200 weeks, or about 4 years, until (theoretically) that situation will be resolved. It may be a depressing figure, but remember that it's taken eight years to get the data to this point.

* Probably actually around 70. Deleting sentences does not delete tags.
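
For anyone who wants to check the estimate above, here is a small sketch of the arithmetic. The figures come straight from the post; taking the midpoint of the "20-30 per week" range and assuming a constant removal rate are simplifications added here, as is the helper name `weeks_to_clear`.

```python
WEEKS_PER_YEAR = 52


def weeks_to_clear(backlog: int, removed_per_week: float) -> float:
    """Weeks needed to work through `backlog` at a constant weekly rate."""
    return backlog / removed_per_week


wrong_sentences = 5_000        # "about 5,000 sentences" from the post
slow_rate, fast_rate = 20, 30  # "around 20-30 examples removed ... every week"
mid_rate = (slow_rate + fast_rate) / 2  # assumption: use the midpoint

weeks_mid = weeks_to_clear(wrong_sentences, mid_rate)
years_mid = weeks_mid / WEEKS_PER_YEAR

print(f"{weeks_mid:.0f} weeks, about {years_mid:.1f} years at {mid_rate:.0f}/week")
print(
    f"range: {weeks_to_clear(wrong_sentences, fast_rate):.0f} to "
    f"{weeks_to_clear(wrong_sentences, slow_rate):.0f} weeks"
)
```

At the midpoint rate this matches the "about 200 weeks, or about 4 years" figure; the slower and faster rates move it by a year or so either way.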

CK, July 8, 2010 at 8:19 AM (edited October 26, 2019 at 3:56 AM)

[not needed - removed by CK]

blay_paul, July 8, 2010 at 9:19 AM

> which means that I felt that 5 out of 6 were a
> less than natural-sounding or would not be that
> useful for my students to learn.

Which just makes it more obvious that for your purposes the OK list (white listing) is a lot better than the removal of dubious entries (black listing).

CK, July 8, 2010 at 3:03 PM (edited October 26, 2019 at 3:56 AM)

[not needed - removed by CK]

blay_paul, July 8, 2010 at 3:37 PM

You missed out 'American'.

Demetrius, July 9, 2010 at 12:33 PM

What is a 'useful' sentence? What is an 'everyday' sentence?

If you don't like old sentences, please start with deleting those tagged By-Shakespeare. They are certainly not modern.

Demetrius, July 8, 2010 at 9:20 AM

> 1. How many of the sentences on any one of these pages
> would you actually use yourself?

IMHO "using yourself" can't always be a criterion.

There are some Russian sentences I wouldn't actually say, but a search engine shows that someone does use them.

In fact, sometimes I even feel I would never say a sentence added by another contributor. And probably sentences that are OK for me would feel weird to someone else.

I believe we just have to wait until 'voting' on sentences is available.

saeb, July 8, 2010 at 6:32 AM

my sentiments exactly: http://tatoeba.org/eng/wall/sho...44#message_344

and it's a real hold up when I'm translating because I'll have to pick out 'translatable' sentences first...which takes a while

Pharamp, July 9, 2010 at 3:16 PM

AHHHHHHH yesterday i thought they were finished!!
but it's only a bug in the first page T__T

so

let's work

again and again

blay_paul, July 8, 2010 at 7:23 PM

WWWJDIC - Thursday update.

Since last week there have been 35 records deleted and 20 new records.

blay_paul, July 8, 2010 at 11:25 AM

Re: DELETE THIS: It's correct elsewhere! -- The problem is worthy of being remember.

http://tatoeba.org/eng/sentences/show/43683

For future reference it is better _not_ to delete cases like these, but to change it to match the correct version and wait for the duplicate removal script to be run. Deleting it caused the WWWJDIC record to be lost from the export.

Also if you are unlinking English sentences that don't match the Japanese please note so in the comments as I have to fix the index data.

kolibet, July 8, 2010 at 11:13 AM

Hello - I've been learning Japanese for a year, and your site is a gold mine of vocabulary and expressions!
But: recently, something changed. Before, I could select a character, copy it, and look up its meaning in an online dictionary. Now the sentences form a block from which I can't pick out any single character. That's a shame! Can this be changed?
Thanks in any case. This project is fascinating; sadly, my skills are too limited to take part more actively...

sysko, July 8, 2010 at 11:21 AM

Hi

Yes, I admit it's not convenient. I think we can at least make the "romanization" part of the Japanese not a "block", so that you can easily copy/paste characters.

Otherwise, for now, the temporary workaround I can suggest is not very convenient but better than nothing: you can click on the Japanese sentence, which will take you to that sentence's page,
and then on the right you'll have "history/log" with the sentence, and there you can select it as if it were normal text.

Also, you speak French, so you can always add French sentences that you'd like to know how to say in Japanese or other languages, because we have now added a module to search for, for example, "French sentences not translated into Japanese". So even if you speak only one language, you can help us :), and since people will translate your sentences, you'll be able to improve your level in the foreign language at the same time :)

That's it

blay_paul, July 8, 2010 at 6:55 AM

Feature request: Export lists from tags

You can export lists (with IDs) from lists, so how about adding the same ability from tags?

KeEichi, July 7, 2010 at 9:49 PM

Hi! Newbie here.

I'm wondering, with languages that have feminine and masculine forms (e.g. Arabic), how do you write sentences? Do you put two sentences, one for a masculine speaker, one for a feminine one? And then again, how about if you're talking to a male or female? Or do you post a comment on the Arabic sentence's page saying how you could say it depending on your gender or the listener's gender?

Just wondering what would be more convenient.

blay_paul, July 7, 2010 at 10:03 PM

You can write two, or four(!), versions of the sentence if you want but it is not required. If your sentence is gender specific then please add the appropriate tag(s).

male (= spoken by male)
female (= spoken by female)
said to male
said to female

KeEichi, July 7, 2010 at 10:12 PM

Aha! Thanks.

Tags seem to only show up on individual sentence pages, right? Like if I'm looking at an English sentence, the only way to know the difference between the different translations would be to click on each individual sentence?

saeb, July 7, 2010 at 10:14 PM

try clicking on the tag ;)

KeEichi, July 7, 2010 at 10:22 PM

Now I'm overwhelmed. ^_^
I meant when on an English sentence's page, for example, translated sentences with tags don't show the tag on that page.

It feels like click, back, click, back... to check sentences would be not so nice. Or maybe having the tags next to the translated sentences would be too much clutter on the page?

saeb, July 7, 2010 at 11:06 PM

True, but I can't see how useful it would be for a translated sentence to carry the tags of the original... Tags that are linguistic in nature will not always be true for a translated sentence...

KeEichi, July 7, 2010 at 11:10 PM

I didn't mean the translated sentences should carry the original's tags. I meant to have translated sentences' tags visible (maybe not all of them), rather than have to go to their respective pages. Maybe some tags could be turned into a small icon, like the male or female tags.

I think tags aren't available for all users now. Is it ok if I write them in the comments on the Arabic sentences' page?

sysko, July 7, 2010 at 11:13 PM

welcome :)
Yep, sure, you can write them in comments; I'm sure saeb will be glad to add them as real tags.
In fact, for the moment adding tags is limited to members with the status "trusted_user", as the tag feature is recent and not "complete" yet.

blay_paul, July 7, 2010 at 5:44 PM

Could we have the duplicate removal script run?

Also I sent in an email with 600+ sentence IDs that can have "British English" as a tag - what's the status of that?

TRANG, July 7, 2010 at 7:58 PM

I'm taking care of the British tag right now.

blay_paul, July 7, 2010 at 8:14 PM

Got your email - thanks for taking care of that for me. :-)

sysko, July 7, 2010 at 5:51 PM

In fact I will need to adapt the duplicate removal script so that it handles tags and also audio, which it doesn't do yet.
Now that we have more and more audio, I really need to take care of this before re-running it, and as I've said, I have very little free time at the moment :(

tanzoniteblack, July 7, 2010 at 12:28 AM

Is it possible to download the lists as a tab separated file or xml file rather than a CSV?

TRANG, July 7, 2010 at 9:39 AM

Perhaps I'm misunderstanding the question, but aren't the lists already tab separated...?

If you click on download, it won't download right away, it will first lead to a page where you can choose what data to include in your list, and there's a short explanation about the structure.

=> sentence_id [tab] sentence_text [tab] translation_text

Unless there's a bug we missed, you will definitely be able to import them into Anki :)
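
As an illustration of the structure described above, a list export could be read along these lines. This is only a sketch under stated assumptions: that each line holds exactly the three tab-separated fields shown, with no header row and no quoting. `ExportRow` and `parse_export` are hypothetical names, and the sample rows reuse the two sentences quoted earlier on this wall.

```python
import csv
import io
from typing import NamedTuple


class ExportRow(NamedTuple):
    sentence_id: int
    sentence_text: str
    translation_text: str


def parse_export(text: str) -> list[ExportRow]:
    """Parse sentence_id [tab] sentence_text [tab] translation_text lines."""
    # QUOTE_NONE: assume the export never wraps fields in quotes.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for line_number, record in enumerate(reader, start=1):
        if len(record) != 3:
            raise ValueError(f"line {line_number}: expected 3 fields, got {len(record)}")
        sentence_id, sentence_text, translation_text = record
        rows.append(ExportRow(int(sentence_id), sentence_text, translation_text))
    return rows


sample = (
    "196748\tべティは彼女を殺した。\tBetty killed her.\n"
    "196749\tべティは彼を殺した。\tBetty killed him.\n"
)
rows = parse_export(sample)
for row in rows:
    print(row.sentence_id, row.translation_text)
```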

sysko, July 7, 2010 at 9:50 AM

you're faster than me ^^

sysko, July 7, 2010 at 12:34 AM

No, for now we only export CSV, as it's the most commonly used format and can easily be imported into Excel, etc.
(And anyway a CSV can be a "whatever-you-want separated file", so even with tabs it's still a CSV; only the separator changes, so with a little script you can easily convert from one kind of CSV to another.)

If you plan to reuse the data for a website, application or the like, don't forget that the CC-BY licence obliges you to notify us of the use (except for personal use only, of course) and to say where the data comes from, etc.
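
The "little script" for switching separators mentioned above might look something like this. It is a minimal sketch using Python's standard `csv` module; `convert_delimiter` is a hypothetical helper name, the sample row is made up, and the sketch assumes the input quotes any field that contains the old separator.

```python
import csv
import io


def convert_delimiter(text: str, old: str = ",", new: str = "\t") -> str:
    """Re-emit delimited text with a different field separator."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=old)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=new, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Made-up row: the sentences contain commas, so they are quoted in the input.
comma_separated = '1,"Yes, I know.","Oui, je sais."\n'
tab_separated = convert_delimiter(comma_separated)
print(repr(tab_separated))
```

Going through `csv.reader` and `csv.writer` rather than a plain string replace means commas inside a sentence are left alone and only the field separators change.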

tanzoniteblack, July 7, 2010 at 12:37 AM

The reason I ask is that I have issues with the CSV if the source sentences contain commas themselves: it appears that there are more columns in that row than there should be.
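
To make the problem concrete, here is a small sketch of what happens when a sentence containing commas is written without quoting, compared with letting a CSV writer quote it. The row is made up for illustration, and nothing here is taken from the actual export code.

```python
import csv
import io

# Made-up row: id, sentence, translation. Both sentences contain a comma.
fields = ["1", "Yes, I know.", "Oui, je sais."]

# Naive export: join with commas, no quoting.
naive_line = ",".join(fields)
naive_split = naive_line.split(",")
print(len(naive_split), "columns after a naive round trip")

# Quoted export: csv.writer wraps fields that contain the delimiter in quotes.
buffer = io.StringIO()
csv.writer(buffer).writerow(fields)
parsed = next(csv.reader(io.StringIO(buffer.getvalue())))
print(len(parsed), "columns after a quoted round trip")
```

The naive version ends up with more columns than fields, which is the symptom described above. Either quoting the fields or using a separator that never occurs in sentences, such as a tab, avoids it.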

sysko, July 7, 2010 at 12:43 AM

Yep, sure, it's a known issue, but it hasn't been a priority so far. I did have in mind to replace the comma with a tab to avoid the problem of commas in sentences.

The export was coded really quickly, as it wasn't something really vital for normal users.

We will let you know once it's fixed (just so you know, we're more or less only 2 students making/maintaining/promoting/debugging this website in our free time, so lack of time or low priority is the main reason for most missing features/bugs here).

tanzoniteblack, July 7, 2010 at 12:47 AM

How is the data for the linking of the sentences stored? Would I be able to access the source files used by the project's website itself, so that I can write my own export code for a format that's much more useful for my purposes?

For the record, my own purposes are creating lists while browsing and being able to download and import said lists into my Anki deck.

sysko, July 7, 2010 at 9:43 AM

We use a database, and our server is already too small to support many more features, so no, until we have another server, we will not permit a "total" export of the database.

But what you want (creating lists and exporting them to Anki) is already possible: create a list in the "list" section, and while browsing Tatoeba each sentence has an "add to list" icon. When you want to export it, go to the list section, click on your list, and then on the big green download button.

Moreover, if you check the Anki plugin list, there's a Tatoeba plugin for Japanese learners (made by one of our users).

Hope this helps

CK, July 6, 2010 at 1:11 AM (edited October 26, 2019 at 3:57 AM)

[not needed - removed by CK]

saeb, July 6, 2010 at 2:54 AM

I already do ;), keep up the great work!

sysko, July 6, 2010 at 1:45 PM

In the long, long to-do list, it's planned to have "all sentences of user X" displayed like a list, with the possibility to translate directly, etc.