Tatoeba
papabear, April 16, 2011 at 7:28:35 PM UTC

Someone talked about a "like" system a while back, but perhaps we can use it as a measure of how natural a sentence sounds.

sacredceltic, April 16, 2011 at 7:47:00 PM UTC

It will produce the same crap as search engines: after enough time, mistakes become the rule, simply because people who make them are more numerous than people who don't. The victory of ignorance over expertise!
OK, in English, usage rules, regardless of how it originated. But it isn't the case with other languages that have inner structures and rules. So the number of votes is irrelevant.

Zifre, April 16, 2011 at 9:30:14 PM UTC

> OK, in English, usage rules, regardless of how it originated. But it isn't the case with other languages that have inner structures and rules.

Actually, all languages (except some constructed ones) work like English. The only difference is that some languages are more resistant to change than others. English has rules just like French, but they change faster.

> So the number of votes is irrelevant.

I think you're misunderstanding papabear. The "liking" system would not be used to determine whether a sentence is correct, but only if it sounds natural to native speakers (and votes are relevant in all languages for this). Naturalness and correctness are distinct - it's very easy to write sentences that are perfectly correct but sound strange.

sacredceltic, April 16, 2011 at 9:52:03 PM UTC

>actually, all languages (except some constructed ones) work like English. The only difference is that some languages are more resistant to change than others. English has rules just like French, but they change faster.

No, you're wrong. Chinese, for instance, has a strong inner structure and usage is second to that.
But there are other differences. For instance, in French, Law is key in defining the language, because French Law is entirely written, and not based on cases, as in the Anglo-Saxon judicial system. Consequently, terms must be defined very precisely and for a very long time. That is why the French administration relies on an Academy to help it define terminology unambiguously. US or UK laws, so far, do not need such unambiguity, because they're mainly local and case-based, so they are always open to interpretation by local judges, which French Law is not. As a matter of fact, whenever there are 2 words used by the French, one being street-based and the other one being official terminology, the official terminology always ends up prevailing, because it percolates through contracts, businesses, professions, training, education, etc. At the end of the day, only official words subside, with very few exceptions.
This is the opposite in English.
In German, although most Germans would deny having an "official German" and deride the French Academy, there is nevertheless a written Law with a necessary official terminology and official writing rules, which have been recently reformed. So usage is not the law in German either.
Japanese has also been repeatedly reformed by state decrees. And so has Mandarin.
I think the idea that a language is defined by usage, as English is, is, contrary to English speakers' perception, actually a minority case among the world's languages. Most official languages are mainly states' constructions.

sacredceltic, April 16, 2011 at 10:07:16 PM UTC

*subside => prevail

U2FS, April 16, 2011 at 10:12:51 PM UTC

> No you're wrong. Chinese, for instance, has a strong inner structure and usage is second to that.

False. Chinese has a strong inner structure but usage is not second to inner structure. Practically, Chinese Mandarin works a lot like English (or English like Chinese Mandarin). Rules and usage are equally important.

sacredceltic, April 16, 2011 at 10:16:26 PM UTC

I bet 100€ you would contradict me here. I won.

U2FS, April 16, 2011 at 10:28:02 PM UTC

Cool. Yet what you just said about the Chinese language is bullsh*t; it needed to be corrected.

sacredceltic, April 16, 2011 at 11:00:05 PM UTC

Well, if you say Chinese is driven by usage, you must be deluded and I would question your ability to produce Chinese sentences here...
Mandarin, at least, has systematically been defined by the state and its educational system from very early on, including, more recently, by Mao himself. Written Mandarin, at least, is probably one of the most constructed languages of all, because unifying and standardising the written language served the state's unity. In China, even more than in France, the written language has always been an instrument of the administration, and I am curious to know how many words from usage in the extended Chinese diaspora have made it to official status in China. Probably none. So the bullshit is pretty much on the "usage legend's" side, I would think...

U2FS, April 16, 2011 at 11:21:26 PM UTC

I said "Rules and usage are equally important." Read it twice.
Now, would you like to prove your statement? You seem to know a lot more about the Chinese language as a whole than I do...?

sacredceltic, April 16, 2011 at 11:43:15 PM UTC

>You seem to know a lot more about Chinese language as a whole than I do... ?

Yeah, I probably read my first history of China and its language(s) several years before you were born...
I started learning Chinese ideograms when I was 20. I think you probably saw your first one a decade later...

sacredceltic, April 16, 2011 at 11:50:59 PM UTC

Besides, I had almost forgotten, but my final-year thesis was on Japan's adoption of the Chinese writing system... An absurdity that has lasted for more than 1000 years...

U2FS, April 16, 2011 at 11:55:32 PM UTC

Yeah, we all already know that you're an oldie and that anyone else below the age of 45 just can't put two words together. Now come on, show us how broad your knowledge is!

sacredceltic, April 17, 2011 at 12:50:30 AM UTC

I am not you, alas. I wish I had mastered 6 languages at the age of 22, as you have. For me, as for most of us poor souls, it was just plain hard work and reading hundreds of books and translating long texts. Not everybody gets to suck the world's literature from their feeding bottles... You should thank your parents!

Zifre, April 17, 2011 at 3:19:25 AM UTC

> No you're wrong. Chinese, for instance, has a strong inner structure and usage is second to that.

And English has no "strong inner structure"? You kind of give the impression that you think that English is inherently "bad" or illogical. It's really just like countless other languages. I'm annoyed with its overly large world influence just as much as you are... just don't blame it on the language itself.

> But there are other differences. For instance, in French, Law is key in defining the language, because French Law is entirely written, and not based on cases, such as in the anglo-saxon judicial system. ...

I don't think differences in judicial systems are very relevant. Also, have you ever read US laws? They're pretty meticulous about terminology...

> I think the idea that a language is defined by usage, such as English, contrary to English speakers perception, is actually a minority case in the world languages.

For someone who constantly talks about minority languages, the irony is amazing. There are ~7,000 languages in the world. (It depends a lot on how you count.) The ratio of languages to countries is ~35:1. So I'm guessing languages like French are in the minority there. Even English is much less based on usage than most of the languages in the world. (And language regulation is a fairly new concept in most languages, even French. Keep in mind that people have been speaking languages for thousands of years.)

sacredceltic, April 17, 2011 at 10:42:06 AM UTC

>And English has no "strong inner structure"?

No. English is the superposition of 4 different and completely inconsistent corpora and sets of rules, from Saxon, Latin, Franco-Norman and Danish...

>I don't think differences in judicial systems are very relevant. Also, have you ever read US laws? They're pretty meticulous about terminology...

Yes, they are, very much. Because the same word must be unequivocal from Lille to Marseille, whereas in the USA it is not always necessary, since most Law is local...
France is also a very centralised country with a central administration and a centrally administered public school system that receives instructions to strictly abide by official terminology. That doesn't exist in the USA, the UK or any other English-speaking country.
French has been mainly constructed by the state since the XVIIth century, whereas English is mainly constructed by the street. Deny it as you may, this is the way it is.

>There are ~7,000 languages in the world. (It depends a lot on how you count.) The ratio of languages to countries is ~35:1. So I'm guessing languages like French are in the minority there. Even English is much less based on usage than most of the languages in the world. (And language regulation is a fairly new concept in most languages, even French.

Yes, I was referring to "official languages" if you read me precisely.
Indeed, most languages are not official, and that is why they're exterminated by official ones. Several languages disappear every year from the surface of Earth, as a result.
English is solely based on usage, since there is no authority of the language.
Former British Prime Minister Gordon Brown solemnly declared that the UK was "offering English to the world". Well... the world uses it and transforms it... outside of any framework.
Language regulation was born with nations with a written Law, because Law describes the entire world and, more than any other institution, needs to name everything. More and more languages will be regulated because more and more Law is being written. Even British Law is now progressively moving from case-based to written, under the influence of EU Law, since EU Law must be transposed into national Law.
Because of that, I predict a growing tension between a Euro-Law-English and the street English(es) in the rest of the world. Maybe the internet will unify it, maybe not...

sacredceltic, April 16, 2011 at 10:01:09 PM UTC

> I think you're misunderstanding papabear. The "liking" system would not be used to determine whether a sentence is correct, but only if it sounds natural to native speakers (and votes are relevant in all languages for this). Naturalness and correctness are distinct - it's very easy to write sentences that are perfectly correct but sound strange.

To a French person, almost every turn of phrase by a French-speaking inhabitant of Québec doesn't sound natural, and I'm sure it's the same with Austrian German to a standard German speaker, Sichuan Chinese to a standard Chinese speaker, etc.
So the proposed system would end up being the law of the majority of the voters, regardless of the biases they have from their origin, age, social class, etc., although these biases rule, as I have shown earlier.
The number of votes to decide whether a phrase is "natural" (whatever that means) is utterly irrelevant. It is just an imperialist view of languages and it would result only in further marginalising minorities.

Zifre, April 17, 2011 at 2:54:27 AM UTC

Yes, I agree completely. Too often, I edit sentences from the Tanaka Corpus that are perfectly good British English, but sound strange to me, and then CK corrects me. I was only trying to make sure that you understood what I think papabear was suggesting. I don't think it's actually a good idea.

papabear, April 17, 2011 at 2:59:45 AM UTC

You've all convinced me that it's a bad idea, too. Let's move on.

Shishir, April 17, 2011 at 12:57:01 AM UTC

I agree with sacredceltic there: there are some sentences written in Tatoeba that sound quite odd to me, but that are completely natural for people from Mexico, Chile or Argentina, or any other Latin American variety of Spanish. So the only thing that would be useful would be to tag the sentence according to the place where it's said (if it is really natural somewhere), or to tag it as unnatural or modify it in case it isn't.

Swift, April 16, 2011 at 7:47:01 PM UTC

As I've said before when this has been raised, I have my doubts as to how representative Tatoebans are of the speakers of their language. I also see practical problems with setting up such a system, even to get data on the relative popularity of such sentences among Tatoebans.

sacredceltic, April 16, 2011 at 7:49:58 PM UTC

>I have my doubts as to how representative Tatoebans are of the speakers of their language.

At last, you notice!

Swift, April 16, 2011 at 8:06:20 PM UTC

“As I've said before when this has been raised …”

You have my vote for this year's Tatoeba award in both selective reading and selective memory categories.

sacredceltic, April 16, 2011 at 8:43:28 PM UTC

There are actually 3 different biases that explain why Tatoeba is NOT representative of languages' speakers:

1) Age bias: Most users are either younger than 25 or older than 60. The reason is that active people hardly dedicate time to such an activity.

2) Interest bias: Most users are people interested in languages, because they are multilingual/of mixed cultures/mixed backgrounds, but most speakers of languages are not multilingual, and not that interested in languages. In the UK, for instance, 77% of people do not speak a second language.

3) Linguistic bias: Most users are people who understand what this service is about at first glance, but in most cases, that implies that one has to understand English, because the chances are that if you find Tatoeba through a search engine, you will find its description in English. Then one will see wall messages, and most of them are, again, in English, because most active contributors and all moderators speak English on Tatoeba (and sometimes chase the people who don't, to the point that a message was introduced to tell people that they could use the language of their choice). A number of people who do not like or do not understand English will feel it is not for them and will subsequently ignore the service. I myself had that impression at the start and almost backed out.

As I repeated over and over again, the vast majority of multilingual people on this planet do not speak English.

I will give 4 examples from different areas of the world (but there are hundreds of others):
If you are, for instance, Kabyle from Algeria, the likelihood is that you speak 3 or 4 languages, and they would be: Algerian Arabic, Berber (which has nothing to do with the former), French and possibly Classical Arabic.
If you are Kazakh, the chances are that you speak Kazakh, Russian, and possibly Uzbek or another Central Asian language, but no English.
If you are Kurdish, you speak Kurdish, probably Turkish and/or Iraqi Arabic and/or Persian.
If you are an educated person from Kerala in India, you speak Malayalam, Hindi, and possibly Tamil because it is one of the linguae francae in Southern India...

There are billions of such people in the world who speak 2, 3 or more languages, without speaking English...

It is symptomatic, for instance, that we do not have any Swahili contributor, although Swahili is a lingua franca in a vast swathe of Eastern Africa. The same is also true of Quechua, Wolof, Urdu, Tamil and hundreds of languages for which these languages serve as an intermediary, when English doesn't.
In order to correct this bias, more versions of the interface should be created, as well as local descriptions of the service that can be retrieved locally through search engines, and the moderation should be decentralised, to create regional groups in which English should not necessarily be the main language of moderation. Otherwise Tatoeba will remain a rich Western teenagers' toy...

papabear, April 16, 2011 at 9:58:07 PM UTC

This gets me thinking: should Tatoeba reach out to, say, linguists around the world who specialize in minority languages? Or perhaps even to the people who edit the Wikipedia articles for these languages? (We do have some prominent Wikipedians among us.)

The problem I see with this is which languages we would use to contact such people, although I'm sure it depends on the person. Or perhaps we just have to keep waiting....

sacredceltic, April 16, 2011 at 10:06:30 PM UTC

>Or perhaps we just have to keep waiting....

I think the waiting has been going on for the last 4 years...

jakov, April 17, 2011 at 11:16:38 PM UTC

I have mentioned this here: http://tatoeba.org/eng/wall/sho...2#message_5572

But my idea was not to vote on whether one thinks a sentence is correct, but just to indicate that one likes it, because it is, e.g., intelligent, beautiful, etc.

Maybe we could add this to the "heart" feature. What I wanted to push was the social part: users who like a sentence should be (optionally) listed there. And maybe we should think about a publishing feature for Facebook, Twitter, etc., so one can share the most beautiful sentences with the world and attract new users.