Wall (6,959 threads)
[not needed - removed by CK]
Where's Pharamp???
I think that it takes a few days for new sentences to be searchable. Searching for =Pharamp doesn't yield any results.
Those aren't all the names that are on Tatoeba ;).
I wonder if Spanish should be broken down into more specific languages.
It's a difficult problem: maybe 85% of sentences are the same across all Spanish-speaking countries, but 15% may not be.
For instance, I came across "Is this a computer?"
Spanish from Spain would say: "¿Es esto un ordenador?"
Almost everywhere else would say: "¿Es esto una computadora?"
I think separating Spanish into different languages wouldn't be a good solution. Think about the 85% of the sentences that are identical. You would then have to add them several times with different flags.
I think this problem should be solved with tags (like "Spain only").
Yes, tags exist for this purpose too^^
For now their features are quite basic and the ability to tag sentences is restricted to trusted_users, but in a few weeks/months I think it will be extended to everyone!
So, when this happens, Jordi, feel free to tag every Spanish sentence with a regional tag :)
Yep, for English we already have the "british" flag, so for regional differences like these, tags will help. Maybe later we'll look at something more specific than tags, but I think they're a good way to mark the difference to start with.
btw, there is a similar problem with Portuguese.
ok, thinking about it, I think that this problem exists more or less in nearly all languages.
French from France vs French from Quebec
American English vs British English
German in Germany, in Austria, in Switzerland
and so on...
I've thought about it sometimes, and my conclusion is that, since the governments of all Portuguese-speaking countries are always making efforts to unify at least the grammar and the orthography, we should put all of them under a single flag. This works well on Wikipedia, IMHO.
Near duplicates - example case.
In accordance with earlier debate on this forum I have been removing near duplicate sentences from those exported to WWWJDIC instead of deleting them. However it was never, IMO, clear exactly when (if ever) deleting near duplicates from Tatoeba is recommended.
Here is one example:
http://tatoeba.org/eng/sentences/show/196748
べティは彼女を殺した。
Betty killed her.
http://tatoeba.org/eng/sentences/show/196749
べティは彼を殺した。
Betty killed him.
The only difference is whether 'her' or 'him' was the victim of Betty's crime of passion. Is it OK to delete one from Tatoeba or not?
[not needed - removed by CK]
Not all "My name"s are translated identically.
Belarusian (like Russian and Ukrainian) makes a clear distinction between imia (first name) and proz'višča (last name).
Other languages may have other unexpected things (like changing gender), so the more the better.
[not needed - removed by CK]
no sentence "Betty is a serial killer" ?
http://tatoeba.org/eng/wall/sho...7#message_1237
So no, we do not delete them, for 2 reasons:
1 - some language learners (me at least) like to have a sentence in a target language and to see how it changes depending on tense, etc.
2 - we will use them for natural language processing
Moreover, I think contributors will by themselves limit the number of variations they add on a single sentence.
But I agree that when sentences like these are found, it should be noted in the comments with links to each other, and maybe they should get a tag, to make it easier for @blay_paul to keep them out of WWWJDIC?
[not needed - removed by CK]
I don't think that's a job that can be rushed. A lot of them have translations in third languages or are phrases that it would be useful to keep in a longer sentence.
I expect that sooner or later we'll get a few more moderators and that should speed things up.
I'm currently working on it :) please check my comments.
I can't wait to see CK's reaction when he finds there's a moderator even less willing to delete examples than I am! ;-)
[not needed - removed by CK]
I should point out that, despite the joking tone of my post, Pharamp being less willing to delete examples than me is another way of saying I'm more willing to delete examples than her.
As to the actual situation it was my rough estimate that, in the 2008-10-10 Tanaka Corpus examples I last worked on, around 1 in 30 of the English sentences was wrong. That's about 5,000 sentences.
Rushing through 130* 'delete tagged' sentences will not resolve that situation quickly. In the natural course of Tatoeba there are around 20-30 examples removed (from the Japanese-English pairs used by WWWJDIC) every week. So that makes it about 200 weeks, or about 4 years, until (theoretically) that situation will be resolved. It may be a depressing figure, but remember that it's taken eight years to get the data to this point.
* Probably actually around 70. Deleting sentences does not delete tags.
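The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the post: the backlog of 5,000 and the 20-30 removals per week are the post's own rough figures, the function name is made up for illustration, and it assumes a constant removal rate with no new errors being added.

```python
def weeks_to_clear(backlog: int, removed_per_week: float) -> float:
    """Weeks needed to work through a backlog at a constant weekly removal rate."""
    if removed_per_week <= 0:
        raise ValueError("removed_per_week must be positive")
    return backlog / removed_per_week


backlog = 5000  # rough estimate from the post, not a measured count

# Low end, midpoint and high end of the "20-30 per week" range.
for rate in (20, 25, 30):
    weeks = weeks_to_clear(backlog, rate)
    print(f"{rate}/week: {weeks:.0f} weeks, about {weeks / 52:.1f} years")
```

At the midpoint of 25 removals per week this gives 200 weeks, which is where the "about 4 years" figure comes from.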
[not needed - removed by CK]
> which means that I felt that 5 out of 6 were a
> less than natural-sounding or would not be that
> useful for my students to learn.
Which just makes it more obvious that for your purposes the OK list (white listing) is a lot better than the removal of dubious entries (black listing).
[not needed - removed by CK]
You missed out 'American'.
What is a 'useful' sentence? What is an 'everyday' sentence?
If you don't like old sentences, please start with deleting those tagged By-Shakespeare. They are certainly not modern.
> 1. How many of the sentences on any one of these pages
> would you actually use yourself?
IMHO "using yourself" can't always be a criterion.
There are some Russian sentences I wouldn't actually say, but a search engine shows that someone does use them.
In fact, sometimes I even feel I would never say a sentence added by another contributor. And probably sentences that are OK for me would feel weird for someone else.
I believe we just have to wait until 'voting' on sentences is available.
my sentiments exactly: http://tatoeba.org/eng/wall/sho...44#message_344
and it's a real hold up when I'm translating because I'll have to pick out 'translatable' sentences first...which takes a while
AHHHHHHH yesterday i thought they were finished!!
but it's only a bug in the first page T__T
so
let's work
again and again
WWWJDIC - Thursday update.
Since last week there have been 35 records deleted and 20 new records.
Re: DELETE THIS: It's correct elsewhere! -- The problem is worthy of being remember.
http://tatoeba.org/eng/sentences/show/43683
For future reference it is better _not_ to delete cases like these, but to change it to match the correct version and wait for the duplicate removal script to be run. Deleting it caused the WWWJDIC record to be lost from the export.
Also if you are unlinking English sentences that don't match the Japanese please note so in the comments as I have to fix the index data.
Hello - I've been learning Japanese for a year, and your site is a gold mine of vocabulary and expressions!
But: recently, something changed - before, I could select a character, copy it, and look up its meaning in an online dictionary - now the sentences form a block from which I can't pick out any single character. That's a pity! Can this be changed?
Thanks in any case - this project is fascinating - sadly, my skills are too limited to take part more actively...
Hi
Yes, I admit that it's not practical. I think we can at least make the "romanization" part of the Japanese not a "block", so that you can easily copy/paste characters.
Otherwise, for now, the temporary solution I'm going to suggest isn't very convenient but is better than nothing: you can click on the Japanese sentence, which will take you to that sentence's page,
and then on the right you'll have the "history/log" with the sentence, and there you can select it as if it were normal text.
Also, you speak French, so you can always add French sentences that you'd like to know how to say in Japanese or other languages, because we have now added a module to search for, e.g., "French sentences not translated into Japanese". So even if you only speak one language, you can help us :), and since people will translate your sentences, you'll be able to improve your level in a foreign language at the same time :)
There you go.
Feature request: Export lists from tags
You can export lists (with IDs) from lists, so how about adding the same ability from tags?
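To make the request concrete, here is a minimal sketch of what such an export could produce. Everything in it is an assumption for illustration: the `(sentence_id, tag)` pairs, the tab-separated output and the function name `export_ids_for_tag` are hypothetical and do not reflect Tatoeba's real schema or export format.

```python
import csv
import io

# Hypothetical (sentence_id, tag) pairs; not Tatoeba's real schema or data.
TAGGED = [
    (101, "British English"),
    (102, "male"),
    (103, "British English"),
]


def export_ids_for_tag(pairs, tag):
    """Return tab-separated text with one row per sentence ID carrying the tag."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for sentence_id, current_tag in sorted(pairs):
        if current_tag == tag:
            writer.writerow([sentence_id, current_tag])
    return out.getvalue()


print(export_ids_for_tag(TAGGED, "British English"), end="")
```

The output has the same shape as a list export with IDs, just keyed on a tag instead of a list.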
Hi! Newbie here.
I'm wondering, with languages that have feminine and masculine forms (e.g. Arabic), how do you write sentences? Do you put two sentences, one for a masculine speaker, one for a feminine one? And then again, how about if you're talking to a male or female? Or do you post a comment on the Arabic sentence's page saying how you could say it depending on your gender or the listener's gender?
Just wondering what would be more convenient.
You can write two, or four(!), versions of the sentence if you want but it is not required. If your sentence is gender specific then please add the appropriate tag(s).
male (= spoken by male)
female (= spoken by female)
said to male
said to female
Aha! Thanks.
Tags seem to only show up on individual sentence pages, right? Like if I'm looking at an English sentence, the only way to know the difference between the different translations would be to click on each individual sentence?
try clicking on the tag ;)
Now I'm overwhelmed. ^_^
I meant when on an English sentence's page, for example, translated sentences with tags don't show the tag on that page.
It feels like click, back, click, back... to check sentences would be not so nice. Or maybe having the tags next to the translated sentences would be too much clutter on the page?
true, but I can't see how useful it is for a translated sentence to carry the tags of the original...tags that are linguistic in nature will not always be true for a translated sentence...
I didn't mean the translated sentences should carry the original's tags. I meant to have translated sentences' tags visible (maybe not all of them), rather than have to go to their respective pages. Maybe some tags could be turned into a small icon, like the male or female tags.
I think tags aren't available for all users now. Is it ok if I write them in the comments on the Arabic sentences' page?
welcome :)
Yep, sure, you can write them in comments. I'm sure saeb will be glad to add them as real tags.
In fact, for the moment adding tags is limited to members with the "trusted_user" status, as the tag feature is recent and not "complete" yet.
Could we have the duplicate removal script run?
Also I sent in an email with 600+ sentence IDs that can have "British English" as a tag - what's the status of that?
I'm taking care of the British tag right now.
Got your email - thanks for taking care of that for me. :-)
In fact I will need to adapt the duplicate removal script so that it handles tags and audio, which it doesn't do yet.
And now that we have more and more audio, I really need to take care of this before re-running it. As I've said, I have very little free time at the moment :(
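For readers wondering what "handling tags and audio" might involve, here is a rough sketch of a duplicate merge that folds links, tags and audio into the surviving sentence instead of losing them. It is not the actual Tatoeba script: the `Sentence` record, its field names and the boolean `has_audio` flag are all assumptions made for illustration (real audio would be a file that has to be moved or re-pointed).

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    # Hypothetical record; field names are illustrative only.
    id: int
    lang: str
    text: str
    links: set = field(default_factory=set)  # IDs of linked translations
    tags: set = field(default_factory=set)
    has_audio: bool = False


def merge_duplicates(sentences):
    """Keep the lowest-ID sentence per (lang, text); fold the rest into it."""
    kept = {}
    remap = {}  # removed ID -> surviving ID
    for s in sorted(sentences, key=lambda s: s.id):
        key = (s.lang, s.text)
        if key not in kept:
            kept[key] = s
            continue
        survivor = kept[key]
        survivor.links |= s.links
        survivor.tags |= s.tags
        survivor.has_audio = survivor.has_audio or s.has_audio
        remap[s.id] = survivor.id
    # Re-point links at survivors and drop any self-links this creates.
    for s in kept.values():
        s.links = {remap.get(i, i) for i in s.links} - {s.id}
    return list(kept.values()), remap


a = Sentence(1, "eng", "Hello.", links={10}, tags={"greeting"})
b = Sentence(2, "eng", "Hello.", links={11}, has_audio=True)
merged, remap = merge_duplicates([a, b])
print(merged[0].id, sorted(merged[0].links), merged[0].has_audio, remap)
```

Returning the `remap` table is a deliberate choice in this sketch: anything that refers to a removed ID from outside, such as an external export, can be updated rather than dropped.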