kemushi69, 22 May 2016, 02:08:51 UTC

A question about punctuation ... not relevant to "Tatoeba day"

I actually have a few questions:

1. The "em dash"

In typesetting (and in HTML) there are two types of dash: the "en dash", which usually appears between numbers (eg, 1–9), and the "em dash", which is longer and usually appears between words, generally to indicate a pause (like a colon or semicolon) or a parenthetical remark. I mean, for example, "late againーwhat a surprise!".

When I use HTML/XML for this kind of thing, I just use the "&mdash;" HTML entity, but here I've ended up just switching over to Japanese input mode and entering a hyphen, which gets rendered as in my example above.

So my question is whether this is a good/correct way to get that symbol into a sentence?

2. Localisation of quotation marks

In English and Irish (which are my main languages), it's simply a case of using the double-quote key, whereas other languages (eg, Spanish, German) use different characters to delimit quotations.

My question is whether English-style quotes are acceptable in general, or whether the localised text should always follow local rules (eg, 「」 for Japanese)?

This is more a question about any automatic tools that process the data rather than what I should be typing in (ie, I guess I should always use the correct language-specific punctuation).

3. Double-width space characters

If I make a mistake and include a double-width space character (due to being in Japanese input mode) instead of the usual space, will it mess up the value stored in the corpus? Or are these automatically converted to regular spaces (and coalesced, should I hit space twice)?

TRANG, 22 May 2016, 18:05:01 UTC

1. I'd say this is not the correct way. Even though they look similar, they are different characters, so they are not detected as the same symbol and might look different in other fonts.

- http://unicode-table.com/en/30FC/ (U+30FC, the Japanese long-vowel mark that you've entered in "late againーwhat a surprise!")
- http://unicode-table.com/en/2014/ (the em dash)
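
As a rough illustration of the point above, here is a minimal Python sketch that prints the code point and official Unicode name of each look-alike character. The helper name `describe` is made up for this example; only the standard `unicodedata` module is used.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The characters are written as escapes so that fonts cannot hide the difference.
for ch in ("\u30fc", "\u2014", "\u2013", "-"):
    print(describe(ch))
```

Running it should show that the character typed in Japanese input mode and the em dash have different code points and names, which is why a search or a tool looking for one will not match the other.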


2. In theory each language uses its own standard quotes. In practice there could be English quotes in non-English sentences because we don't pre-process and standardize punctuation.


3. As said above, we currently don't standardize punctuation, so if you enter a double-width space it will be saved as a double-width space.
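
Since nothing is converted on the site side, a contributor could clean a sentence locally before submitting it. The following is only a sketch of that idea, not something Tatoeba does: the function name `normalize_spaces` is hypothetical, and it assumes the double-width space in question is U+3000 (IDEOGRAPHIC SPACE), which is what Japanese input mode normally produces.

```python
import re

# Assumption: the "double-width space" is U+3000 IDEOGRAPHIC SPACE.
IDEOGRAPHIC_SPACE = "\u3000"


def normalize_spaces(sentence: str) -> str:
    """Replace double-width spaces with regular spaces and coalesce runs."""
    # Turn every double-width space into an ordinary ASCII space.
    replaced = sentence.replace(IDEOGRAPHIC_SPACE, " ")
    # Collapse runs of two or more spaces into one, then trim both ends.
    return re.sub(r" {2,}", " ", replaced).strip()


print(normalize_spaces("late\u3000again,  what a surprise!"))
```

This should only be applied to sentences in languages that use regular spaces; in Japanese text a double-width space may be intentional.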

kemushi69, 22 May 2016, 18:36:04 UTC

Thanks, I'll have to be careful with points 1 and 3 then.