Tatoeba
lilygilder, December 23, 2009 at 11:57:15 AM UTC

Hi there,

What can I do with repeated sentences? Is there a way to link one entry to the other or maybe even merge them?

Anyways, thank you for this wonderful project.

TRANG, December 23, 2009 at 12:24:16 PM UTC

You don't have to worry about them. We take care of merging them :) We actually already ran a loooong cleaning process a few weeks ago; it removed about 10,000 exact duplicate sentences.
We're going to run it again sometime, after we've cleaned up typos, stray extra spaces, and things like that in the sentences.

Anyways, thank you for your contributions. I'm happy to see German getting popular again :D It used to be the 4th language in Tatoeba, until extremely motivated contributors in Chinese and Spanish came along...

lilygilder, December 23, 2009 at 12:42:36 PM UTC

Does this cleaning program also remove nearly identical sentences? I found a pair where the only difference is the punctuation mark... I'm glad you don't have to do that manually...

I'd be happy if German took the fourth place again. I'll see what I can do and show some competitive spirit. =) This is a fun way to pass time and help other language learners. :)

TRANG, December 23, 2009 at 1:08:08 PM UTC

No, it doesn't remove nearly identical sentences. I've seen sentences that differ only in punctuation, but... Well, this is a bit tricky.

If you take Japanese, there is supposedly no question mark or exclamation mark (although I suppose it's changing). Instead you have particles to express a question or an exclamation.
The fact that you write "I'm cold." or "I'm cold!" can change something in the Japanese sentence (samui desu / samui desu yo).

So to be safe, I wouldn't delete a sentence that has a nearly identical twin, with only a difference of punctuation.
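
To make the distinction in this thread concrete, here is a minimal Python sketch. It is not Tatoeba's actual cleaning script; the function names are hypothetical and the rules are assumptions based only on what is described above. Sentences that become identical once extra spaces are collapsed are merged, while pairs that differ only in punctuation are just listed for a human to look at, never deleted.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_spaces(text):
    """Collapse runs of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def strip_punctuation(text):
    """Drop every character whose Unicode category is punctuation (P*)."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


def group_sentences(sentences):
    """Hypothetical helper: merge exact duplicates, flag punctuation-only twins.

    Returns (merged, review):
      merged - one entry per sentence after whitespace normalization
      review - groups of merged sentences that differ only in punctuation
    """
    # Step 1: exact duplicates, compared after whitespace cleanup.
    exact = defaultdict(list)
    for sentence in sentences:
        exact[normalize_spaces(sentence)].append(sentence)

    # Step 2: near duplicates share a key once punctuation is removed.
    near = defaultdict(set)
    for key in exact:
        near[normalize_spaces(strip_punctuation(key))].add(key)

    merged = sorted(exact)
    review = sorted(sorted(group) for group in near.values() if len(group) > 1)
    return merged, review


merged, review = group_sentences(
    ["I'm cold.", "I'm  cold.", "I'm cold!", "Hello."]
)
print(merged)
print(review)
```

In this sketch "I'm cold." and "I'm cold!" both survive the merge and only show up in the review list, which matches the advice above: a punctuation difference can matter in translation, so it is left for a person to judge.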