halfb1t's messages on the Wall (total 89)

halfb1t, January 13, 2013 at 6:26:57 PM UTC

I have the impression that the mobile version of Safari does not support user CSS. I have no iOS devices and have been unable to determine which browsers under iOS, if any, support user CSS. Possibilities seem to be Chrome and iCab.

halfb1t, January 13, 2013 at 4:47:55 AM UTC

If you mean for Tatoeba, you need to use a browser that supports user style sheets (as mentioned by liori). Then the following CSS may work for you:

.sentenceContent .text {
    font: 16pt/20pt "Charis SIL";
    text-align: center;
}

The text-align is just for testing. It will let you see that your CSS is being loaded and is effective, even if there are problems with your font spec.

If pt sizes don't work, try px.

halfb1t, January 13, 2013 at 3:06:59 AM UTC

>> What is irrelevant is the existence of correct sentences of limited interest and utility--particularly to less than expert students

You are right to disagree with this assertion, since it is an intentional overstatement intended to reflect and to parody your own hyperbolic reference to relevance. My true view is that many sentences in our corpus (and many of my own contributions) are of primary interest--and therefore relevance--only to specialists.

One of the attractions of any sort of (raw, not weighted) rating system is that, as the number of ratings grows, it will eventually be possible to compute statistically meaningful estimates of the degree to which individual raters are in step with Tatoeba's thousand users. As you correctly point out, being out of step with the mass may be a good thing. The dispersion of this measure will reflect Tatoeba's success at simultaneously serving a variety of disparate purposes.

Your views on the efficacy of academic norms are interesting; but you must expect that to persons knowledgeable about language, linguistics, and sociolinguistics they will appear overblown, quaint, and--frankly--simplistic. In particular they pay insufficient attention to the disparity between the largeness of the varieties of speech in communities of speakers numbering in the tens of millions and the smallness of all such academies, which were more effective in a bygone era characterized by a greater reverence for authority and a feebler understanding of the nature and the power of the engines of language change.

> Contrary to your belief, There are only 2 kinds of remaining languages that are not decreed : English, and all the languages that it is replacing and that are on the verge of extinction, at a rate of about a dozen each year...

To mention only the first counterexample that comes to mind: Navajo. Do you suppose there is an academy that attempts to dictate the usage of, say, Comorian? What about Low Saxon? Malagasy? What academy provides decrees for Modern Standard Arabic? No matter, the single example of Navajo suffices to show that your assertion is an unsupportable exaggeration.

> Mandarin is so little a shared convention, that its name initially means "Language of the Ministers"...which reveals it was certainly not initially the language of the people on whom it was subsequently imposed

This assertion seems to indicate a lack of understanding of the phrase "shared convention." In the present context the phrase has nothing to do with voluntarism, but simply indicates a learned agreement regarding the (arbitrary) meanings of words and the structures that organize them.

halfb1t, January 12, 2013 at 1:31:55 PM UTC

All languages are shared conventions: lexicons and grammars do not drop from heaven; and communication is successful just to the degree that speaker and hearer agree on the conventional meaning of the words and structures of the languages they share. This is a commonplace of linguistics and of language learning and teaching. It is for just this reason that languages, like all other conventions, must be learned.

Most languages have few speakers and no academies. When they are taught formally, it is by non-native speakers to non-native speakers, with occasionally a native informant in the wings. Correct is what the still surviving native speakers say to one another, understand when they hear, and accept as correct.

Prescription is effective only in the absence of substantial dissent: persistent deviation from academic prescriptions results in changed prescriptions. Numbers matter. Language existed before schools, government, or law; and solecisms have been around since the first language that was spoken by more than one group.

When languages--like French, English, Arabic, and Malay--are officially established in more than one country, usage varies; and what is correct in one may not be correct in another.

What I propose is not the reign of mediocrity, but the rational, economical, and effective use of crowd sourcing to identify the middle of the road. The premise is that there is utility in identifying sentences whose correctness is broadly acknowledged. What is irrelevant is the existence of correct sentences of limited interest and utility--particularly to less than expert students.

The pairing of correctness and utility is suggestive. Reasonable people care about correctness precisely because it is useful. Maybe we should be wise to rate sentences on utility, with correctness just a part of the mix.

halfb1t, January 12, 2013 at 5:59:43 AM UTC

I like per-sentence (and per-translation) thumbs-up/down. If the voter is logged (to prevent repeat votes and allow mind-changes), the accumulated votes could be used for a range of derivative calculations and applications, including ratings of contributors (and raters!) and apparent divisions of the corpus.

To the degree that languages are shared conventions, bad ratings for good sentences may accurately indicate sentences that are outside the main stream. If one of my sentences collects bad ratings, I shall be motivated to provide a comment that justifies the sentence. Those looking for main-line sentences will be happy to be spared both that sentence and its defense.

halfb1t, January 10, 2013 at 6:27:43 AM UTC

You are more than welcome: I love to be the hero almost as much as I hate (being caught) being wrong.

halfb1t, January 10, 2013 at 2:33:45 AM UTC

I am in general agreement with these helpful remarks--with reservations about 2) and 3) already detailed. As FeuDRenais has pointed out, there are also legacy issues.

I believe the most important point is that this problem is not going to go away anytime soon. I hope that what cognoscenti know is becoming clearer to all: some difficulties of this sort can only be dealt with by users.

Devising solutions to particular problems requires information about particular browsers, fonts, and OS's. A Wiki would be ideal for gathering/discussing this data. [Maybe. Getting one running would at least help with Sysko's excess free-time problem.] A Wiki would also be useful for posting/discussing the user style sheets that liori mentions. It seems likely that these could, with time, become very sophisticated.

Please correct me if I'm wrong: It seems that per-language font specifications need not be implemented wholesale, but could be instituted one language at a time.

The failure of sharptoothed's Uyghur sample page to display correctly in FeuDRenais's Uyghur netbar is probably due to IE: a search for "bulletproof @font" may be enlightening.
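For orientation, the pattern that search tends to turn up looks roughly like the sketch below. The family name and file names are hypothetical placeholders; nothing here is taken from sharptoothed's page. The idea is that older versions of IE use only the first src (an EOT file), while other browsers use the second src, which overrides the first.

```css
/* Sketch only: "Example Uyghur Web" and the file names are hypothetical. */
@font-face {
    font-family: "Example Uyghur Web";
    /* First src: an EOT file, the only web font format older IE reads. */
    src: url("example-uyghur.eot");
    /* Second src: older IE does not understand local() and is expected to
       skip this declaration; other browsers let it override the first src. */
    src: local("UKIJ Tuz Tom"),
         url("example-uyghur.woff") format("woff"),
         url("example-uyghur.ttf") format("truetype");
}
```

Whether this would cure the Uyghur netbar problem is untested; it only shows the shape of the rule.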

I have also a vague idea about jQuery. If Tatoeba's pages were to try to load a local script file with a well-known name, many system- and user-specific manoeuvres might become possible. It will be objected that using such a facility would demand considerable sophistication. The answer is "Only for coding. Code that works can be passed around for cut-and-paste."

halfb1t, January 9, 2013 at 11:18:22 AM UTC

My view is it's best to specify no fonts at all. That gives users the greatest freedom to find solutions to their own problems--or for the community to find solutions to their problems. Any font (face=family) spec at all--even a generic font spec like serif--restricts the range of solutions to particular problems; and a specific font spec, if a user has it installed but her browser fails to do the right thing, forces her to disable, hide, or delete that font.

It's possible, however, that specifying no fonts may cause some problem I haven't thought of. It wouldn't be the first time.

It's also more than likely that some users won't like what they see. In particular, everything will be in the same font; and I suspect most browsers will default to a serif font. Users will be able to change the font, but they won't be able to specify different fonts for, e.g., the main sentence and the translations; so the present look of the pages will be lost.

My belief is that it's still a good trade--for the foreseeable future: Tatoeba's pages are richer in glyphs than almost anything on the Web; and as more languages are added, that richness will only increase; and more problems like FeuDRenais's Uyghur will crop up. Specifying no fonts will make their solutions easier. His problem, e.g., will require him but to set his browser's default font to one that works on his system.

In the long run, per-language font specs may be needed. This will be the case when users find problems with two languages, and the pick-the-right-default-font solutions to the two problems are incompatible.
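A minimal sketch of what per-language font specs might look like in CSS. It assumes that each sentence's element carries a lang attribute with the language code (an assumption; the page's markup would need checking), and it reuses the .sentenceContent .text selector and font names mentioned elsewhere in this thread.

```css
/* Sketch only: assumes each sentence element has a lang attribute. */

/* French: Charis SIL is reported in this thread to have the
   Narrow No-Break Space glyph. */
.sentenceContent .text:lang(fr) {
    font-family: "Charis SIL";
}

/* Uyghur: a font name taken from the WordPress Uyghur font string. */
.sentenceContent .text:lang(ug) {
    font-family: "UKIJ Tuz Tom";
}

/* No rule for other languages: they keep the browser's default font. */
```

Because each rule names one language, such specs could be introduced one language at a time, leaving every other language on the user's default font.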

halfb1t, January 9, 2013 at 9:56:26 AM UTC

From a little Web research, it seems you're right. Although a few translation initiatives are in the works, they seem not to be progressing apace.

halfb1t, January 9, 2013 at 9:26:22 AM UTC

Here's the font string from the WordPress Uyghur translation effort: 'UKIJ Tuz Tom','Alpida Unicode System','Alkatip Tor','Alp Ekran','Microsoft Uighur',Tahoma,Verdana,Arial,Helvetica.

halfb1t, January 9, 2013 at 9:07:36 AM UTC

How do you enter those Narrow No-Break Spaces when you contribute sentences?

halfb1t, January 9, 2013 at 9:05:05 AM UTC

I think it's Trebuchet MS.

halfb1t, January 9, 2013 at 8:40:52 AM UTC

Getting the (right) glyphs.

The first step is to install a font that has the glyphs. The second step is to get your browser to use that font. If the page you want to display specifies fonts for the glyphs in question, you may have to disable, hide, or delete those fonts from your system. If the page specifies a generic font like serif, sans-serif, or monospace, then you need to find not just a font that has the glyphs, but a serif, sans-serif, or monospace font that has the glyphs.

Some browsers allow you to override pages' font specifications. If you use one of those, you need not disable, hide, or delete specified fonts.

In some browsers, font changes are displayed immediately. Other browsers may require you to reload the page or restart the browser.
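For browsers that support user style sheets, the override can be sketched as follows. This is a minimal sketch, not a tested recipe: the selector is the one quoted elsewhere in this thread for Tatoeba's sentence text, Charis SIL stands in for whatever installed font has the glyphs, and !important is there so that the user rule takes precedence over the page's own font specification.

```css
/* User style sheet sketch: the selector and font are assumptions. */
.sentenceContent .text {
    /* !important lets a user rule win over the page's font spec. */
    font-family: "Charis SIL" !important;
}
```

If nothing changes, temporarily adding a conspicuous declaration such as text-align: center shows whether the style sheet is being loaded at all.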

halfb1t, January 9, 2013 at 8:06:02 AM UTC

Thin Spaces in French: OS X + Chrome.

Microsoft Sans Serif and, for a wonder, Times New Roman have the glyph. Set either of these to be Chrome's Standard Font and disable Georgia.

halfb1t, January 9, 2013 at 7:55:40 AM UTC

Charis SIL, a serif font, also contains the needed glyph, and is freely downloadable from the Web. The procedure is the same: (1) set Chrome's Standard Font to Charis SIL, and (2) delete the Georgia font.

halfb1t, January 9, 2013 at 7:34:45 AM UTC

Thin spaces in French: XP + Chrome.

The spaces at issue are not Unicode Thin Spaces (U+2009) at all, but Narrow No-Break Spaces (U+202F). The distinction is important, because the latter are a relatively recent addition (version 3.0) to Unicode, and as shipped, XP has no fonts that contain that glyph. If your XP has a recent version of Word, you probably have Segoe UI installed, which contains the glyph. (Lucida Sans Unicode does not.) To get the glyph to appear you can delete the Georgia font and set Chrome's Standard Font to Segoe UI. Other options may be forthcoming.

halfb1t, January 9, 2013 at 7:28:29 AM UTC

My experience suggests that there are strict limits on what Tatoeba can accomplish from its end, because the renderings that appear in users' browsers are strongly dependent on the fonts they have installed and the settings they put into their browsers.

The question is "Is my experience too narrow?" Careful reports by others with accurate relevant detail contribute to answering that question; and when they appear here, they contribute to the community's understanding of the nature of the problem, which of course has important implications for rational solutions.

halfb1t, January 9, 2013 at 6:15:29 AM UTC

> I suspect

That's fine; but more helpful would be a little data. Do you have the font Georgia on your computer? What happens if you disable or rename it? What happens if you set your browser's serif font to sans-serif?

halfb1t, January 9, 2013 at 3:53:18 AM UTC

The questions that come to mind are: (1) Which of the five alphabets in the Russian Wikipedia article are to be chosen? (2) Is simple transliteration between all pairs accurate in every case? On the face of it, it seems that transliteration into the Arabic script involves at least the usual initial-medial-final-solitary glyph selection.

These are not insoluble problems, but they need careful identification before they have any chance of being solved. A related and more fundamental issue is support for multiple scripts for a single language in general. This issue has arisen before. I believe it had to do with an Indian language, but I forget which.

The important thing to recognize in this connection is that no solution is likely to be simple.

halfb1t, January 9, 2013 at 12:18:22 AM UTC

If you're thinking of transliteration you might want to start with the Wikipedia article on Uyghur alphabets.