Vortarulo, 18 February 2013, 22:04:43 UTC

Forgive me if this has already been discussed.
I just happened to see these two French phrases:

http://tatoeba.org/deu/sentences/show/1048946 (Viens ici !)
http://tatoeba.org/deu/sentences/show/1275472 (Viens ici !)

The difference is the width of the space before the exclamation mark. In the first sentence it's a shorter space while in the second one it's longer. The visibility of this depends on the font.
The problem with these sentences is that the automatic merging script won't merge them, because to the script they are different. A similar problem occurs, of course, with typographical vs. simple apostrophes (’ vs. ') and between different kinds of quotation marks ("example" vs. „example“ in German). Or with dashes, as in Russian: the long — vs. the short -.

I have heard that French keyboards (or text editors?) sometimes automatically create the obligatory space before ! and ?, but not everyone writes French with a French keyboard or text editor, so this might explain why these spaces are sometimes wider, sometimes shorter.
I wonder whether it would be a good idea to have a script either automatically change one of those spaces to match the other, or to teach the merging script to treat them as the same.
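
To make the second option concrete, here is a minimal sketch of a comparison key that ignores the width of the space before ! and ?. It is not the actual Tatoeba merging script, whose internals are not described in this thread; the names `SPACE_VARIANTS` and `merge_key` are hypothetical, and the set of space characters is an assumption about which variants occur in practice.

```python
import re

# Assumed set of space-like characters that may precede ! or ? in French:
# regular space, no-break space, thin space, narrow no-break space.
SPACE_VARIANTS = "\u0020\u00a0\u2009\u202f"

# One or more of those spaces, directly followed by ! or ?.
_SPACE_BEFORE_PUNCT = re.compile(f"[{SPACE_VARIANTS}]+(?=[!?])")


def merge_key(sentence: str) -> str:
    """Return a key that is identical for sentences differing only in
    the kind of space placed before an exclamation or question mark."""
    return _SPACE_BEFORE_PUNCT.sub("\u00a0", sentence.strip())


# Two spellings of the same sentence: regular space vs. narrow no-break space.
a = "Viens ici !"
b = "Viens ici\u202f!"
print(merge_key(a) == merge_key(b))
```

The stored sentences are left untouched; only the key used for duplicate detection is normalized, so which space is "correct" can still be decided separately.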

For (German) "quotations" vs. „quotations“ it's perhaps more difficult to implement, also because some people might prefer to use the simple " keys, while others insist on using the orthographically correct „“ ones.
But I suspect that there are a bunch of sentence twins out there, with just this difference.

What's your opinion?

Shishir, 18 February 2013, 22:28:57 UTC

About replacing the spaces before French question marks with the standard ones: it was done once, and I guess it could be done again.

http://tatoeba.org/spa/wall/show_message/4586

In the case of punctuation in other languages... I guess the same thing could be done, although I'd recommend doing so only when the current punctuation is plainly wrong. For example, in Spanish it's allowed to use "" instead of «», but it isn't allowed to use - instead of the long — (although, since each of them has different uses and both are used in Spanish, in this case it wouldn't be possible to change them automatically).
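
One way to picture such per-language rules is a small table of punctuation marks that may be treated as equivalent when looking for duplicates. This is only a sketch: the table `EQUIVALENT_QUOTES`, the function `comparison_key`, and the use of the language codes `spa` and `deu` are assumptions for illustration, not part of any existing Tatoeba script. Dashes are deliberately left out, since - and — cannot be swapped automatically.

```python
# Hypothetical per-language table: quotation marks that could be treated
# as interchangeable when comparing sentences. Dashes are not listed.
EQUIVALENT_QUOTES = {
    "spa": {"«": '"', "»": '"'},
    "deu": {"„": '"', "“": '"'},
}


def comparison_key(sentence: str, lang: str) -> str:
    """Map equivalent quotation marks to a plain " for comparison only.
    Languages without an entry are returned unchanged."""
    table = str.maketrans(EQUIVALENT_QUOTES.get(lang, {}))
    return sentence.translate(table)


# Same Spanish sentence with guillemets and with plain quotes.
print(comparison_key("Dijo «hola».", "spa") == comparison_key('Dijo "hola".', "spa"))
```

Because the mapping is only applied to the comparison key, a contributor's choice of quotation marks stays as written.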

Vortarulo, 18 February 2013, 22:39:02 UTC

Ah, thanks!
Yes, I think for -, – and — it's a bit more difficult. One would perhaps have to decide separately for each case and each language.

As for French: nice, I didn't know such a script had already been used. The sentence pair above is actually the first time I've come across these different spaces in otherwise identical sentences.

By the way, a totally different question:
How often is the merger script used? And how long does it take for it to go through the whole database (or just through English or German or Klingon or Esperanto, for instance)? Also, how many sentence pairs get merged in one go, usually? Just asking out of curiosity.