Wall (5,702 threads)

CK, May 9, 2010 at 6:14 AM, edited October 25, 2019 at 8:07 AM

[not needed anymore- removed by CK]

blay_paul, May 9, 2010 at 8:39 AM

I don't get this. The other languages _do_ show while you're typing in a translation. Are you sure you're not
http://content.pyzam.com/funnyp...ngItWrong6.jpg
?

Swift, May 9, 2010 at 2:17 PM

I see the same behaviour. When you click on the "あ->a" image it turns the list of translations into a form to enter the new translation.

Personally, I don't mind this behaviour. The idea is to translate that one sentence, which may or may not correspond to the other translations. I do admit that I often read the other translations to better understand the context.

blay_paul, May 9, 2010 at 2:24 PM

Oh, now I'm with you! Yes, when you're translating the only sentence you should really be paying attention to is the one you are translating from.

If you don't understand the sentence you're working from you shouldn't be translating it in the first place.

Swift, May 9, 2010 at 2:28 PM

I don't think CK meant he didn't understand the question, but just that he wanted to make the translation better match the others.

sysko, May 9, 2010 at 2:44 PM

In fact it was hidden on purpose. People often didn't understand that there is a "main sentence" and that the others are just translations, so most of the time they were translating one of the translations instead of the main one. Hiding the translations and showing a message in red was the simplest way we have found so far to make this more obvious.

Denizar, May 8, 2010 at 11:42 PM

Hello. I have a problem. I want to search for sentences in Japanese, but the search function seems to have changed. I used to type in 沸かす, for example, to see exactly when it is used completely as "wakasu" (not wakashite, not the 沸 kanji itself). In other words, I could do exact searches.

Right now, when I search for a number of characters together the engine seems to give me mixed results... Is this an issue or is this the way it is going to be from now on? I think it completely destroys the aim here, and I'd love to know if there is a way to make exact searches in Japanese.

sysko, May 9, 2010 at 1:37 AM

This is an issue caused by our switch from one search engine to another. We will try to fix it ASAP.

Denizar, May 9, 2010 at 2:36 AM

Oh great - I thought it would be permanent. Great to hear this!

blay_paul, May 8, 2010 at 10:12 PM

ENAMDICT ?

What about including names in the index information? I know Jim isn't going to want it in the output he uses - but I suppose it could be stripped from the WWWJDIC.csv file.

Possible advantages include
* Ability to link to ENAMDICT from sentences (when linking from words in sentences is developed).
* Might be useful in checking MeCab output.

blay_paul, May 8, 2010 at 6:35 PM

New icon?

What does the blue circle with an 'i' in it do?

TRANG, May 8, 2010 at 6:52 PM

It's actually not so new. You can also see it from here:
http://tatoeba.org/sentences/my_sentences

We used to display it in the sentences menu, so that people could go to the standard display (in Browse). Then we took it out and made the text of the sentence a link instead.

But then there was a problem: when you adopted a sentence, you couldn't browse to it because it would be editable (clicking on it would display the input field to edit the sentence).

So we added back this little icon, so that you can browse to the sentence even when it belongs to you.

blay_paul, May 7, 2010 at 8:20 AM

Suggestion box

Just thought of a quick idea for comments / wall postings.

Some sort of BBcode or tag system to mark sentence numbers in comment text so they can be turned automatically into links.

e.g. "See also ##250165"
would turn into
"See also <a href=tatoeba.org/eng/sentences/show/250165>250165</a>"

TRANG, May 8, 2010 at 4:26 PM

Yes, it's something I've been thinking about as well... But the question is, what format to use?

As far as I'm concerned, I'd tend to write it with only one #.
"See also #250165"

But perhaps it's too simple and there can be issues with this, I don't know.

Swift, May 8, 2010 at 4:42 PM

How about using the "nº"? It will hardly be used for anything else and is available on every page by the sentence number, ready to be copied along with it.

blay_paul, May 8, 2010 at 6:32 PM

Except that there is no "nº" displayed on 'the Wall' so it would only really work for comments there, not posts here as well.

Swift, May 8, 2010 at 8:06 PM

Why would there need to be one displayed on the Wall? It's where you look up the sentence number.
... well, unless you have it memorised.

blay_paul, May 8, 2010 at 8:38 PM

Sentence numbers are also included in the csv export files. (Which I often work from)

OK, I'll admit it - I just don't like nº's.

Swift, May 8, 2010 at 8:17 PM

Actually, perhaps it's just easiest to go with something dead-easy. The #<number> is nicer than the ##<number> unless the latter markup were trimmed somehow, but that would make the usage unclear to new users.

The occasional false positive wouldn't be obtrusive, and would be well balanced out by getting people to drop the use of the hash character in their writing.

Oh, and if we're building a bike-shed, I want it red and will fight anyone who thinks otherwise tooth and nail!

sysko, May 8, 2010 at 4:30 PM

I think ## is better

blay_paul, May 8, 2010 at 6:31 PM

I also recommend ## rather than #, because you will be less likely to get false positives.

contour, May 8, 2010 at 9:02 PM

A format I've seen other places is sentence #123, which is convenient because you can generalize it to list #123, wall #123, etc. Sentence links may be common enough to warrant a shorter special syntax, though.

I also think the syntax should be kept in the final post, e.g. if you use ##, it should turn into <a href=tatoeba.org/eng/sentences/show/123>##123</a>, so new users can immediately tell how it's done. Using the format other places, like in the message logs, also helps discoverability.
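
The linking scheme discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Tatoeba does or will do it: the function name `link_sentences`, its parameters, and the `http://` URL prefix are made up for the example (the prefix is modelled on the link blay_paul wrote). It supports both the `##` and `#` markers and contour's suggestion of keeping the marker in the final link text.

```python
import re

# Assumed URL prefix, modelled on the example link in the thread.
BASE_URL = "http://tatoeba.org/eng/sentences/show/"


def link_sentences(text, marker="##", keep_marker=True):
    """Turn marked sentence numbers (e.g. ##250165) into HTML links.

    marker      -- "##" or "#", the two candidates from the thread
    keep_marker -- keep the marker in the link text so new users can
                   see how the link was written
    """
    # The marker must not follow another '#' or a word character, and
    # the number must not run straight into letters.
    pattern = re.compile(r"(?<![#\w])" + re.escape(marker) + r"(\d+)\b")

    def to_link(match):
        number = match.group(1)
        label = marker + number if keep_marker else number
        return '<a href="{}{}">{}</a>'.format(BASE_URL, number, label)

    return pattern.sub(to_link, text)


print(link_sentences("See also ##250165"))
print(link_sentences("See also #250165", marker="#", keep_marker=False))
```

With the single `#` marker any "#<digits>" in ordinary writing becomes a link, which is the false-positive risk raised above. The sketch also assumes the text has already been HTML-escaped.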

Swift, May 7, 2010 at 10:34 AM

I was just about to suggest the same thing. This would be immensely useful.

Swift, May 4, 2010 at 11:21 PM

I've been wondering how best to make notes of genders when translating from languages such as English that don't always differentiate between the different genders.

So far I've just been using the comments, but I figure this might eventually go into metadata. There are several interesting ways to tackle that, but in case it might help me pick what to put in the comments and how to form it, has it been decided how issues like these will be tackled? Should I perhaps just be adding an extra sentence for each variation?

blay_paul, May 4, 2010 at 11:30 PM

I think this is one of a number of things that are 'on the backburner'. I don't think there's likely to be much done about it for some time.

I wouldn't go the route of adding extra sentences as that would just produce needless duplication of content.

I think what we need is
* Meta-data that is not included as annotations, but in a separate field.
* A method of showing / hiding meta data associated with a sentence.
* A format for entering the metadata, and translations for them (at least the most common ones).

Swift, May 5, 2010 at 10:11 PM

Extra sentences wouldn't actually be needless duplication. They're essentially equally valid (though partially identical) sentences that happen to translate to the same sentence in at least one language. The main reason I haven't created them is that I'm hoping for a much more elegant solution later on.

blay_paul, May 5, 2010 at 10:53 PM

Look, there are 151,909 Japanese sentences. Each could have a feminine and a masculine variant. Each could be plain form or (-masu) form, and many would also have one or more extra polite versions. That would take you up to over 600,000 sentences (and probably give Trang a heart attack). If that isn't needless duplication I don't know what you'd call it.

Swift, May 6, 2010 at 12:01 AM

I'd call it unhelpful.

Genders in Japanese are simple in that they don't affect conjugation. In other languages they do.

Take for example the Icelandic phrases:
o Hann var keyptur/seldur/gefinn. Þeir voru keyptir/seldir/gefnir.
o Hún var keypt/seld/gefin. Þær voru keyptar/seldar/gefnar.
o Það var keypt/selt/gefið. Þau voru keypt/seld/gefin.
that translate to English as:
o {{He, she, it} was, they were} bought/sold/given.
As you see they are highly irregular and therefore a student of the language would be well served with examples of each.

The only problem I see with adding all these sentences lies in storing their relationships to each other. Cluttering the search results or translations overview with multiple translations for every Finnish pronoun is just silly. Such translations wouldn't be unnecessary, but shown that way they wouldn't ultimately be helpful, which is why metadata would really come in handy.

As for proliferation of sentences, I'd imagine they'd give Trang great joy -- provided that they're useful, of course. That's the point of the project, after all.

At 500 sentences a week, I'll hit 600,000 in 23 years. :-)

The community at large has, however, contributed an average of 324 sentences a day since the end of the late March slump, which would add that many sentences in a little over five years, bringing the total number to 974,851 sentences.
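
For anyone who wants to check the numbers in this exchange, here is the arithmetic as a small Python sketch. All inputs come from the posts above; the variable names are made up, and the "implied current total" is only inferred from the 974,851 figure, so treat it as an assumption.

```python
# Figures quoted in the thread.
japanese_sentences = 151_909
target = 600_000

# Feminine/masculine times plain/-masu gives four variants per sentence.
variants = japanese_sentences * 2 * 2

# One contributor at 500 sentences a week.
years_alone = target / 500 / 52

# The whole community at 324 sentences a day.
years_community = target / 324 / 365

# Corpus size implied by "bringing the total number to 974,851".
implied_current_total = 974_851 - target

print(variants)                   # 607636
print(round(years_alone, 1))      # 23.1
print(round(years_community, 1))  # 5.1
print(implied_current_total)      # 374851
```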

saeb, May 7, 2010 at 5:15 AM

ugh...I've got the same issue in Arabic, sure other languages do have it to. What I did was keep adding different conjugations to different sentences until I felt like I covered them all, and if I did add another conjugation to the same base sentence, I make sure to use different synonyms and wordings to say the same idea.

Nonetheless, I do agree that some kind of metadata to indicate conjugation (or any other nuances) in the future would be nice.

saeb, May 7, 2010 at 5:16 AM

*too

Demetrius, May 5, 2010 at 1:39 PM

As for the metadata... I believe all that metadata should be attached to the original sentence, not to the translated one. I.e., [M] and [F] should be attached not to the Finnish sentence (as they currently are), but to the English sentence that has the gender distinction (he/she). And the Finnish should go unmarked.

Speaking of metadata, I believe we also need a metadata field for transcription, to fix it when it cannot be correctly generated automatically.

And also author and origin information for some sentences would be nice to have.

And a simple possibility to add IPA to sentences in any language would also be nice, though I'm not sure if this is really necessary.

Swift, May 5, 2010 at 9:31 PM

I'm not quite sure what you mean by translated versus original sentences. Surely the idea is that they're equivalent. I do agree with you that the metadata should be associated with the more specific language.

It would be nice if one could define, for example, the genders of pronouns and contribute variations on these. That way, if one came across a Finnish sentence such as nº354807:
He väittivät, että hän tappoi hänet.
one could choose from the various Icelandic versions:
{Þeir, Þær, Þau} kváðu {hann, hana} hafa drepið {hann, hana}.
though implementing that could be a bit tricky.

I reckon it might be wiser to focus on improving the automatic generation algorithms and encourage people to report errors than adding comments on the readings.

As more people are able to record their voices than transcribe into IPA, these are probably largely unnecessary here on this project.

It might however be useful to get readings for symbols such as numbers. I tend to write these out rather than use numerals.
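
As a rough illustration of the "choose from the various Icelandic versions" idea above, a template with `{a, b}` groups can be expanded into every concrete sentence. This is a minimal sketch, not a proposal for a storage format: `expand_variants` is a made-up name, and nested groups such as `{{He, she, it} was, they were}` are not handled.

```python
import itertools
import re


def expand_variants(template):
    """Expand "{a, b} text {c, d}" into all concrete sentences."""
    # re.split with a capturing group alternates literal text
    # (even indexes) and group contents (odd indexes).
    parts = re.split(r"\{([^{}]*)\}", template)
    choices = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            choices.append([part])
        else:
            choices.append([alt.strip() for alt in part.split(",")])
    return ["".join(combo) for combo in itertools.product(*choices)]


sentences = expand_variants(
    "{Þeir, Þær, Þau} kváðu {hann, hana} hafa drepið {hann, hana}."
)
print(len(sentences))  # 12
for sentence in sentences:
    print(sentence)
```

Three subject pronouns times two objects times two objects gives twelve Icelandic sentences for the one Finnish sentence, which is exactly the clutter that metadata would be meant to manage.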

Demetrius, May 6, 2010 at 1:10 PM

> Surely the idea is that they're equivalent.
Hmm... I thought that metadata is to be added by a translator to explain a sentence that contains information that was lost or added during his translation; that’s why I mentioned the original sentence.
It shouldn’t be obvious from a sentence that it’s the original/a translation, of course.


> I reckon it might be wiser to focus on improving the automatic
> generation algorithms and encourage people to report errors than
> adding comments on the readings.
I do agree that improving algorithms is important. But there are situations when it’s impossible to generate transcription automatically because it requires complex grammar analysis or even understanding of the situation in which the sentence can be said.

Moreover, adding metadata for transcription will help to easily find problems with our current transcription algorithm. If people write this in comments, they’ll get lost. If a special field is designated for a transcription, these can be easily found by a DB search by those who improve the algorithm.


> As more people are able to record their voices than transcribe
> into IPA, these are probably largely unnecessary here on this project.
I’m not sure about this. Most people have at least some understanding of transcription and IPA because it is used at school.

The problem with voice files is the high quality required. As for me, it’s certainly easier to transcribe something than to buy a microphone and learn how to use it.

Swift, May 6, 2010 at 6:32 PM

OK, now I understand what you mean by "original" sentence. Personally I'm not quite sure where the information would best be stored. I've been sort of leaning towards only storing information about added information (i.e. more specific sentences). One of the problems with the information is that a sentence may be connected to several others and indicating what metadata refers to which "original" sentence might be difficult.

There are several solutions to this problem, but the ones I've been thinking of may well be too complicated. Frankly, the current situation doesn't really bother me all that much.

While we're calling the solution different things, I completely agree with you on the transcription issue.

On the IPA, I'm not convinced of its usefulness, but if you are, there certainly will be others who'd agree and benefit from it.

blay_paul, May 5, 2010 at 9:37 PM

> I'm not quite sure what you mean by translated
> versus original sentences. Surely the idea is that
> they're equivalent.

Yeah, that's the _theory_, but it often isn't the practice. To take one obvious example, if one of the sentences is a quote then the original source was only in _one_ of the languages - all the rest must be translations.

blay_paul, May 3, 2010 at 8:52 PM

Japanese index update

I've sent in an update to 716 records for the Japanese index data to the team@tatoeba.fr address - hope you can sneak them in. ;-)

JimBreen, May 4, 2010 at 1:58 AM

I have been amending the indices in situ in a few places, so I hope we
don't overlap. Perhaps we need an RCS of some sort.

blay_paul, May 4, 2010 at 3:43 AM

If there was a 'last changed' date field (to the nearest day) the SQL on the update could be made to avoid changing entries last changed past a certain day.

The risk of overlap isn't going to be huge though, so anything too complicated or tricky to implement may actually reduce work efficiency.
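
The 'last changed' guard could look roughly like this. It is a minimal sketch using Python's built-in `sqlite3` with an in-memory database; the table `jpn_index`, its columns, and both function names are hypothetical, since the real schema isn't described in this thread. Dates are stored as ISO strings (YYYY-MM-DD) so that plain string comparison orders them correctly.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory table standing in for the index data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jpn_index ("
        "sentence_id INTEGER PRIMARY KEY, "
        "index_text TEXT, "
        "last_changed TEXT)"
    )
    conn.executemany(
        "INSERT INTO jpn_index VALUES (?, ?, ?)",
        [(1, "old A", "2010-04-20"), (2, "old B", "2010-05-03")],
    )
    return conn


def apply_index_updates(conn, updates, cutoff_day, today):
    """Apply (sentence_id, new_text) pairs, skipping any row that was
    last changed after cutoff_day (for example, edited in situ)."""
    applied = 0
    for sentence_id, new_text in updates:
        cursor = conn.execute(
            "UPDATE jpn_index SET index_text = ?, last_changed = ? "
            "WHERE sentence_id = ? AND last_changed <= ?",
            (new_text, today, sentence_id, cutoff_day),
        )
        applied += cursor.rowcount
    conn.commit()
    return applied


conn = make_demo_db()
count = apply_index_updates(
    conn, [(1, "new A"), (2, "new B")], "2010-05-01", "2010-05-04"
)
print(count)  # 1: row 2 was changed after the cutoff and is left alone
```

Here the cutoff would be the day the batch was exported, so anything amended on the site since then survives the import.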

nickyeow, May 3, 2010 at 11:27 AM

I'm planning to add some Cantonese sentences. This might be a stupid question, but should I include the jyutping in the sentences?

Demetrius, May 3, 2010 at 11:50 AM

As far as I can see, currently transcription isn't generated for Cantonese (eg. sent. No. 382502). But it's generated for Mandarin and Shanghainese, so I believe it'll be implemented in the future.

sysko, May 3, 2010 at 2:55 PM

Yep, you're right Demetrius, it will be added soon, adso as a beta support for Jyutping, but if you know of good free software for Cantonese romanization, tell us :)

@nickyeow thanks for contributing in Cantonese (too) :)

Demetrius, May 3, 2010 at 3:53 PM

I don't know any of these, unfortunately.

The only Cantonese wordlist with transcriptions I’ve seen is here: http://e-guidedog.sourceforge.net/cantonese.php , but it is inaccurate according to its creators.

Demetrius, May 3, 2010 at 11:45 AM

No, you shouldn’t. All transcription is generated automatically. Maybe it isn't generated for Cantonese (I don't know), but it should be generated in future.
But if you're inclined to, you can add transcription in comments. :)

It's good we'll have Cantonese sentences! :)

cburgmer, May 3, 2010 at 11:50 AM

Btw, is Jyutping really employed more often than Yale? In all books I read and the course I attended Cantonese Yale was used.

nickyeow, May 3, 2010 at 12:26 PM

In Hong Kong we usually use Jyutping, but you can also see Yale employed in some dictionaries.

cburgmer, May 3, 2010 at 8:07 PM

At HKUST they teach Yale :)

nickyeow, May 3, 2010 at 12:27 PM

Thanks for your answer! :)

blay_paul, May 2, 2010 at 11:08 AM

Member status

I think it would be a good idea to be able to tell who has 'trusted user' status and who has 'admin access'. I suggest:

* Icon next to name in comments posted and next to posts on the wall.
* Full title given in the user's profile "%s is a Trusted-User" or something. Could make Trusted-User link to an explanation of what that means and how you get it.

TRANG, May 2, 2010 at 2:03 PM

There is a way to tell, but it's not obvious. From the "Members" section, if you re-organize by status, you can easily see who the current trusted users are:
http://tatoeba.org/eng/users/al.../direction:asc

But I agree it would be nice to be able to tell a user's status right from the comments.

blay_paul, April 30, 2010 at 9:42 PM

I've read the latest Tatoeba Blog, but I'm going to comment here because I think more people read it. :-)

*******
I'm not 100% convinced by the 'adoption' approach, but I think it could work with a few adjustments. Here's one idea:

I think people (not the owners) should be able to issue an official 'call for action'. That call would only be able to be closed by the person who made it, or by a 'super user' (e.g. Trang, Sysko, etc.)

How it could work:

* User sees a sentence that is linked to a sentence that it is not a good translation of. User cannot unlink it because both sentences are owned.

* User posts a comment with the 'Call for action' checkbox ticked. "Please unlink this sentence from sentence 123XXX00 as it is not a good translation."

* Owner of sentence is notified. Owner of sentence has a link to a list of all currently open 'calls for action' on his sentences. The super users (Trang, Sysko) also have a similar link that works for all users.

* If, after one week, the call for action is _not_ closed, the ownership of the sentence is revoked and the person who posted the request is notified.

* The person who posted the request can close it at any time, when he is satisfied with the explanation given by the owner or action taken.

* Super-users (Trang, Sysko, etc.) can deal with the request themselves, and can also close the action item even if the person who made it is not satisfied (person making request might not come back to Tatoeba, it might have been a trivial or frivolous request).

This would give a formal and more easily trackable way of handling corrections needed to owned sentences, given that the owner may be away or may otherwise lose track of the comments made on his sentences.
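
The rules in the list above can be sketched as a small Python class. This is only an illustration of the proposal, not existing Tatoeba code: the class, the field names, and the `SUPER_USERS` set are all hypothetical, and notifications are left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

TIMEOUT = timedelta(weeks=1)      # "after one week" in the proposal
SUPER_USERS = {"TRANG", "sysko"}  # hypothetical list of super users


@dataclass
class CallForAction:
    sentence_id: int
    owner: str       # owner of the sentence, who gets notified
    requester: str   # user who ticked the 'Call for action' checkbox
    opened_at: datetime
    closed_by: Optional[str] = None

    def close(self, user):
        """Only the requester or a super user may close the call."""
        if user != self.requester and user not in SUPER_USERS:
            raise PermissionError(user + " may not close this call")
        self.closed_by = user

    def ownership_revoked(self, now):
        """True if the call has stayed open for a week or more."""
        return self.closed_by is None and now - self.opened_at >= TIMEOUT


call = CallForAction(1234, "owner_a", "user_b", datetime(2010, 5, 1))
print(call.ownership_revoked(datetime(2010, 5, 5)))  # False, still in time
print(call.ownership_revoked(datetime(2010, 5, 9)))  # True, open too long
```

Note that the owner cannot close the call; the owner can only answer it and wait for the requester or a super user to close it.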

TRANG, May 1, 2010 at 1:09 AM

> I've read the latest Tatoeba Blog, but I'm going to comment here because I think more people read it.

Actually I haven't posted about it here yet because it wasn't really official yet ;) I was still discussing it with sysko, and actually reviewed certain things, but the overall idea remains the same.

For now, we are focusing on correcting the sentences themselves, not the way they are linked, because there are many sentences in French that could be corrected, and quickly, if only we took the time to check them in an organized way.

The whole linkage problem is of course something we will have to deal with, but it actually belongs to what would be "phase 2". Not something we will work on yet... If we can already provide a "sentences.csv" that is not filled with spelling/grammar mistakes, it will be a good step forward.