Wall (6,005 threads)

xtofu80, August 16, 2010 at 1:36:19 PM UTC

I am not sure whether it is a bug or a curiosity ("feature"), but when you want to search for multiple Japanese words, the whitespaces have to be entered in English. Typing whitespace when using the Japanese IME leads to different results (always no results?)

sysko, August 16, 2010 at 1:47:45 PM UTC

Hmm, I think it's because the search engine handles only the "normal" space as a word separator. Can you give me this space (in an answer to this message, between [])? That way I will be able to convert it before sending the request to the search engine.

xtofu80, August 16, 2010 at 4:43:14 PM UTC

Sure. And here it is:[ ]. Lol, do you see the difference?

sysko, August 16, 2010 at 4:45:12 PM UTC

yep it's a full width space :)

Demetrius, August 16, 2010 at 1:55:49 PM UTC

Please do the same for non-breaking space [ ]. I sometimes use it to prevent dashes (—) from being moved to the next line; it should be treated in the same way as the ordinary space for search purposes.

There are lots of other spaces, but I’m not sure anyone has ever used these on Tatoeba: en quad [ ], em quad [ ], en space [ ], em space [ ], 3-per-em space [ ], 4-per-em space [ ], 6-per-em space [ ], figure space [ ], medium mathematical space [ ], punctuation space [ ], thin space [ ], hair space [ ], zero-width space [​]
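The conversion sysko describes (turning other space characters into the ordinary space before the query reaches the search engine) could look roughly like the sketch below. It is not Tatoeba's actual code: the function name `normalize_query` is hypothetical, and the set of code points is an assumption taken from the characters named in this thread (full-width space, no-break space, and the spaces Demetrius lists).

```python
import re

# Space-like code points mentioned in the thread. Treating all of them
# (including the zero-width space) as word separators is an assumption.
SPACE_LIKE = (
    "\u00a0"         # no-break space
    "\u2000-\u200a"  # en quad through hair space
    "\u200b"         # zero-width space
    "\u205f"         # medium mathematical space
    "\u3000"         # ideographic (full-width) space, typed by a Japanese IME
)
_SPACE_RE = re.compile(f"[{SPACE_LIKE}]+")


def normalize_query(query: str) -> str:
    """Replace space-like characters with an ordinary space and collapse runs."""
    replaced = _SPACE_RE.sub(" ", query)
    # split() with no argument also drops leading/trailing whitespace.
    return " ".join(replaced.split())


if __name__ == "__main__":
    print(normalize_query("猫\u3000犬"))  # two Japanese words separated by U+3000
```

Whether the zero-width space should become a separator or simply be dropped is a design choice; the sketch treats it like every other space in the list.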

sysko, August 16, 2010 at 4:12:32 PM UTC

OK, thanks. It will be in the next release.

CK, August 16, 2010 at 1:46:55 AM UTC, edited October 26, 2019 at 4:05:50 AM UTC

[not needed anymore- removed by CK]

TRANG, August 16, 2010 at 12:30:18 PM UTC

Yes it's sort of a bug, and don't worry I noticed it :) I'm just waiting until I'm done importing lukaszpp's sentences, then I'll fix the logs to indicate his name. I still have one batch and I import only when there's not a lot of new sentences.

In the case of lukaszpp, he was the one who compiled the sentences so I wouldn't be comfortable having them as my contributions anyway.

sysko, August 16, 2010 at 12:38:55 PM UTC

Trang you can tell us you did it to stay in the top 20 :p

TRANG, August 16, 2010 at 1:07:54 PM UTC

Ah, shoot, you caught me sysko :P

Demetrius, August 16, 2010 at 9:06:48 AM UTC

This seems to happen with all the batch imports. The Ukrainian proverbs from Shtoota are contributed by sysko and owned by me (422069). I don't see any problem with this.

After all, such things come from collections not necessarily compiled by the people who have suggested importing them, and IMO there’s nothing wrong with admins being higher in the contributors table. They contribute in other ways too.

g1itch, August 15, 2010 at 7:40:37 PM UTC

Is there any way to sort the sentences that show up based on "difficulty" or length, so that the easier, shorter sentences show up first, followed by increasingly more difficult/lengthy sentences?

TRANG, August 15, 2010 at 9:35:16 PM UTC

Indeed there is no way. I actually don't even know how results are sorted... But I agree it would be much better if search results could be sorted by length (even though shorter doesn't necessarily mean easier ^^).

Anyway, sysko will be able to tell you when this feature can be expected. He is the one in charge of the search engine.

Demetrius, August 16, 2010 at 9:11:29 AM UTC

Actually, I’m not sure it’s the best way of sorting sentences. Shorter sentences tend to be stranger, as there is little context.

TRANG, August 16, 2010 at 1:17:37 PM UTC

Yes, definitely not the best, but better than no sorting at all. Often I just want short sentences but I have to browse through many pages to find them.

Anyway, the best way of sorting sentences would have to take into consideration things like the number of people who thought the sentence was useful for their particular query, or the number of people who used the sentence as a learning material. But we don't have a way to measure these things right now.

blay_paul, August 15, 2010 at 7:53:27 PM UTC

Basically, not really, but if you ask nicely Trang or Sysko might put together a list of sentences in whatever language you choose of a certain length.

Alternatively you can go to the downloads page and use a database program to make your own lists.

Unfortunately there is not yet a way of 'uploading' batches of sentence IDs to create a Tatoeba list
@Trang/Sysko: *HINT HINT*
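For anyone taking the "make your own lists" route, here is a minimal sketch of filtering and sorting by length. It assumes the downloaded sentences file is tab-separated with the columns id, language code, and text; that layout and the function name `short_sentences` are assumptions for illustration, so check them against the file you actually download.

```python
import csv
import io


def short_sentences(tsv_file, lang, max_chars):
    """Return (id, text) pairs in one language, shortest first.

    tsv_file is any iterable of tab-separated lines: id, language, text.
    """
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        if len(row) < 3:
            continue  # skip malformed lines
        sentence_id, row_lang, text = row[0], row[1], row[2]
        if row_lang == lang and len(text) <= max_chars:
            rows.append((int(sentence_id), text))
    # Sort by length first, then by id so that ties come out in a stable order.
    rows.sort(key=lambda r: (len(r[1]), r[0]))
    return rows


if __name__ == "__main__":
    sample = io.StringIO(
        "1\teng\tHello.\n"
        "2\tjpn\tこんにちは。\n"
        "3\teng\tI went to the store.\n"
        "4\teng\tHi.\n"
    )
    for sentence_id, text in short_sentences(sample, "eng", 10):
        print(sentence_id, text)
```

With a real export, pass `open(path, encoding="utf-8", newline="")` instead of the in-memory sample. As noted in the thread, length is only a rough proxy for difficulty.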

sysko, August 14, 2010 at 10:01:15 PM UTC

The duplicate removal script has been updated and rerun, and will, as before, run once a week.

blay_paul, August 14, 2010 at 10:04:02 PM UTC

That's nice to know.

blay_paul, August 14, 2010 at 4:49:16 PM UTC

Unlinking Japanese / English pairs.

Just a reminder that unlinking a Japanese / English pair may well have repercussions for WWWJDIC example sentences. Specifically, if a Japanese / English pair does not match in Tatoeba, unlinking them and adding new translations will _NOT_ automatically fix it in WWWJDIC.

If you unlink a Japanese / English pair, at the least, please mention this in a comment! Adding a @change tag at the same time will help track it down later if I don't see the comment right away.

Trusted users can also correct the 'meaning' field of the Japanese index data by using the annotations page.
http://tatoeba.org/eng/sentence_annotations/index

I am currently working through 183 sentence pairs where the link is broken in Tatoeba but not in WWWJDIC. >_<

TRANG, August 14, 2010 at 8:29:15 PM UTC

I'm working on setting up what is necessary for all of this to be fixed automatically.

Posting comments or adding @change tags won't be needed.

CK, August 13, 2010 at 2:15:05 AM UTC, edited October 26, 2019 at 4:05:59 AM UTC

[not needed anymore- removed by CK]

sacredceltic, August 13, 2010 at 3:33:14 PM UTC

What I notice is that many duplicates are created in good faith for 2 main reasons:
1) there are no links between the sentences that are viewed and the desired translations because
a) the deduplication process has failed for some reason.
b) the sentences are not deduplicated because they are the same except for a different name or a unit (a problem which I mentioned earlier and which could be solved through conventions)
2) the desired translation is not visible.
This is the case, for example, if you view a list of sentences in L1, translated into L2 and for which no translation exists into L3. This list doesn't let you see the translations from L2 into L3 when they exist, so the temptation is great to believe that they don't and to recreate them.

xtofu80, August 13, 2010 at 10:41:37 PM UTC

I wonder if this whole duplicate business could not be solved by automatically merging a duplicate sentence at the time it is entered into tatoeba.

So if I add a new sentence, the server looks it up in the database, and if it is identical to an existing entry, both are merged, which is basically a link operation to the sentence I am translating from. That should be as demanding as a simple search in the database, plus one link operation.
As a consequence, the database would at no point in time contain duplicates.
(We have to consider multiple entries though if two sentences look identical in two languages.)
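The merge-on-entry idea above can be sketched as follows. This is an in-memory toy and not how Tatoeba stores sentences: the class `SentenceStore` and its method are hypothetical, and the dictionary stands in for whatever index the real database would need. Keying on the pair (language, text) covers the caveat about sentences that look identical in two languages.

```python
class SentenceStore:
    """Toy store that never holds two identical sentences in the same language."""

    def __init__(self):
        self._next_id = 1
        self._by_key = {}   # (lang, text) -> sentence id
        self.links = set()  # each link is a frozenset of two sentence ids

    def add(self, lang, text, source_id=None):
        """Add a sentence, or reuse the existing one, and link it to source_id."""
        key = (lang, text)
        sentence_id = self._by_key.get(key)
        if sentence_id is None:
            # Not seen before: create a new entry.
            sentence_id = self._next_id
            self._next_id += 1
            self._by_key[key] = sentence_id
        # New or reused, the entry is linked to the sentence being translated.
        if source_id is not None and source_id != sentence_id:
            self.links.add(frozenset((source_id, sentence_id)))
        return sentence_id


if __name__ == "__main__":
    store = SentenceStore()
    eng = store.add("eng", "I love you.")
    fra = store.add("fra", "Je t'aime.", source_id=eng)
    again = store.add("eng", "I love you.", source_id=fra)
    print(eng == again, len(store.links))
```

The sketch says nothing about speed on a database of real size, which is the concern raised in the reply.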

sysko, August 13, 2010 at 11:15:22 PM UTC

@xtofu80 But simple does not imply fast. In fact, exact sentence matching in a database of nearly 500,000 sentences is simple but not at all fast. Otherwise I would not have chosen the more complex but faster duplicate removal script ^^

@feuDRenais, yep, good idea. I will add it to the todo list, but to be honest, don't expect it for a looooooong time.

@CK it could be an idea, but the question is "which sentences to add to this smaller set?" Maybe basic sentences ("I love you", etc.), but it will not solve the problem entirely.

@sacredceltic, yep, I think so. Except for one user recently, since I've been on Tatoeba the only reasons for duplicates were the ones you give: either the existing sentence was too "indirect" to be viewed, or people did not search before adding (for not-so-common sentences, I can understand that one does not check the existence of every single sentence one adds).

So far the best solution I've found is the duplicate removal script, run once a week. It handles every case (even sentences that look identical but are not in the same language) and keeps links, tags, and audio.

The two problems are the following:
it's not real time
it depends on the database structure and so needs to be slightly modified each time we add a new feature linking to sentences (which happens every 6 months ^^)

Anyway, a non-real-time solution will always have the second drawback, because you can't know if people add tags / add to a list / add to [add here whatever future feature],
and a real-time solution will need to be really fast (under 0.1 s).

The fact is I was extremely busy this week, so I didn't find the time to readapt the duplicate removal script, but this is now my current priority.
Personally, I think real time is not really that important, and once a week (or, if you want, twice a week) is enough for now.
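To make the batch approach concrete, here is a rough sketch of what a periodic duplicate merge has to do: group sentences by language and text, keep one entry per group, and repoint links at the kept entry. It is not the actual script. The function name and data shapes are hypothetical, keeping the lowest id is an assumption, and tags, audio, and lists would each need the same remapping, which is the maintenance cost described above.

```python
from collections import defaultdict


def merge_duplicates(sentences, links):
    """Merge exact duplicates.

    sentences: dict mapping sentence id -> (lang, text)
    links: set of (id, id) pairs between sentences
    Returns (kept_sentences, remapped_links).
    """
    # Group ids by (lang, text): identical text in different languages stays separate.
    groups = defaultdict(list)
    for sentence_id, key in sentences.items():
        groups[key].append(sentence_id)

    # Map every id to the id that survives (assumption: the lowest one).
    canonical = {}
    for ids in groups.values():
        keep = min(ids)
        for sentence_id in ids:
            canonical[sentence_id] = keep

    kept = {sid: key for sid, key in sentences.items() if canonical[sid] == sid}

    # Repoint links at the surviving ids and drop links that collapse onto one sentence.
    remapped = set()
    for a, b in links:
        a, b = canonical[a], canonical[b]
        if a != b:
            remapped.add((min(a, b), max(a, b)))
    return kept, remapped


if __name__ == "__main__":
    sentences = {
        1: ("eng", "Hi."),
        2: ("fra", "Salut."),
        3: ("eng", "Hi."),
        4: ("deu", "Hallo."),
    }
    links = {(1, 2), (3, 4)}
    print(merge_duplicates(sentences, links))
```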

FeuDRenais, August 13, 2010 at 10:38:06 AM UTC

I think a similarity match would be really nice. Like in Google. E.g.:

Your sentence "A went to the store with B" was not found. Did you mean:

"C went to the store with B"
"A went to the hardware store with B"
"A and B went to the park"
etc.
etc...

It would at least let the searcher know what's out there. Better yet, it would be nice to have an automatic check before submitting a brand new sentence (NOT a translation). E.g.:

Your sentence "A went to the store with B" is already very similar to...

etc.
etc...
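A "did you mean" check along these lines can be sketched with Python's standard `difflib` module. This only illustrates the idea and is not a proposal for the real search engine: the helper name `suggest_similar` and the sample corpus are made up, and comparing a query against every stored sentence would be far too slow for a large database.

```python
import difflib


def suggest_similar(sentence, existing, limit=3, cutoff=0.6):
    """Return up to `limit` existing sentences that resemble `sentence`.

    Results are ordered from most to least similar; sentences scoring
    below `cutoff` (on a 0..1 scale) are left out.
    """
    return difflib.get_close_matches(sentence, existing, n=limit, cutoff=cutoff)


if __name__ == "__main__":
    corpus = [
        "C went to the store with B",
        "A went to the hardware store with B",
        "A and B went to the park",
        "The weather is nice today",
    ]
    query = "A went to the store with B"
    if query not in corpus:
        print(f'Your sentence "{query}" was not found. Did you mean:')
        for candidate in suggest_similar(query, corpus):
            print(f'  "{candidate}"')
```

The same helper could run before a brand-new sentence is submitted, warning the contributor when a close match already exists.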

blay_paul, August 12, 2010 at 4:03:34 PM UTC

Thursday WWWJDIC examples update summary

16 records deleted.
10 records added.

FeuDRenais, August 4, 2010 at 4:21:21 PM UTC

Regarding, again, sentence quality:

I would propose a much more liberal use of the "Needs Native Check" tag (or something similar, if it already exists). I see right now that it's been mostly used by myself (and somewhat by Demetrius), but otherwise has gotten very little exposure (unfortunately, we use it for languages where there are currently no natives chez Tatoeba...)

If its use was formally encouraged for sentences a foreign translator was not, say, 95+% sure on, corrections afterwards would be much easier. Native speakers could just check all the tagged sentences in their respective languages, and go through with the checks when they had a chance.

Demetrius, August 12, 2010 at 3:54:17 PM UTC

I suggest not putting the tag on other people's sentences, or at least writing something about it in the comments. I was very surprised to find my Russian sentence tagged 'needs native check' recently (http://tatoeba.org/sentences/show/451147).

FeuDRenais, August 12, 2010 at 5:10:30 PM UTC

Strange...

Just so you know, it wasn't me who put the tag ;-)

I have occasionally done it to other people's sentences, but I generally try to leave a "NNC-tagged" comment to indicate that I was the one who tagged it. I do think it's a powerful tool for people to use on their own sentences, though... (though only trusted users can tag)

FeuDRenais, August 12, 2010 at 5:20:40 PM UTC

(I think it could also be a powerful tool if you want native speakers to swarm to a specific sentence as well... Since it's very easy to miss comments, but the tag is easy to find. But yea, too liberal of a use would not be great either...)

Swift, August 12, 2010 at 7:19:13 AM UTC

How does one go about merging sentences? I just came across a duplicate:
http://tatoeba.org/eng/sentences/show/437144
http://tatoeba.org/eng/sentences/show/437146

blay_paul, August 12, 2010 at 10:17:43 AM UTC

Short answer: you don't.
Longer answer: if they are exactly identical they will be merged automatically the next time the duplicate removal script is run.
Final answer: ... but the script is currently under maintenance, so in the meantime, if you identify both sentences by number or link, a moderator can sort things out for you.

mbm, August 10, 2010 at 6:45:02 PM UTC

Hi everyone. I'm new here but I can already feel how addictive this project is. I just can't stop myself from translating another sentence... and another... and another...

Scott, August 10, 2010 at 6:47:20 PM UTC

Yes, it can become quite addictive... Welcome to Tatoeba!

landano, August 11, 2010 at 10:49:28 PM UTC

http://tatoeba.org/deu/sentences/show/460191 ;-)