OsoHombre, February 13, 2017 at 11:59:36 AM UTC

Tomorrow is Valentine's Day, when people celebrate love. Here are a few of my sentences about love for those who are interested in translating them:

http://tatoeba.org/eng/sentence...o=&sort=random

I would appreciate any effort to help me.

Ricardo14, February 13, 2017 at 12:23:41 PM UTC

Let's do it ;D

CK, February 14, 2017 at 2:12:14 AM UTC, edited February 14, 2017 at 2:46:00 AM UTC

Here are all the English sentences that I've proofread that have either the word "love" or the word "Valentine".

https://tatoeba.org/eng/sentenc...eated&list=907


The following is the same advanced search, limited to sentences that haven't yet been translated into any language.

https://tatoeba.org/eng/sentenc...=&sort=created


You can further fine-tune this advanced search by choosing your own native language under "Translations", after "Exclude". This will then show you all proofread English sentences that have not yet been translated into your own language.

For example, here is the same search to find those sentences that are not yet translated into Portuguese.

https://tatoeba.org/eng/sentenc...=&sort=created

This is the same search fine-tuned for native Arabic speakers.

https://tatoeba.org/eng/sentenc...=&sort=created

Ricardo14, February 14, 2017 at 12:08:18 PM UTC

Thanks, CK!

PaulP, February 14, 2017 at 7:52:29 AM UTC, edited February 14, 2017 at 7:56:34 AM UTC

I see many near-duplicates there. When we already have the sentence „Tom doesn't love me.”, what sense does it make to create „Fadil doesn't love me.” and start translating it into many languages?

Aiji, February 14, 2017 at 9:27:40 AM UTC

Unfortunately, I don't think that is a new problem.
In the future, I hope we may get a generic function for names, countries, etc.
I've recently had to adopt the "same" translation of "What is the average salary in [country name]" maybe twenty times. It made me think about the same thing you're mentioning, and I thought that in the future these sentences could all be merged into "What is the average salary in [generic country]".
Of course, such a feature involves many difficulties specific to each language but I think it was already discussed in the past.

deyta, February 14, 2017 at 10:00:09 PM UTC

+1

OsoHombre, February 15, 2017 at 8:49:37 AM UTC

Aiji:
I am personally against adopting what you called [generic country]. Why? Because the website already has 300 languages, and one could be interested in knowing what 'Kenya' is called in all 300 languages, and then what Russia, Australia, Egypt and South Africa are called. Therefore, for the sake of having a rich and diverse corpus in each language, I strongly recommend that we don't adopt standard names for persons, cities, administrative subdivisions and countries. After all, the world can't revolve around just one city. I need to contribute example sentences specific to cities like Istanbul, Jerusalem and Bangkok. In a city like Boston, you wouldn't find mosques and Buddhist temples like those you find in other parts of the world, and there are no bazaars or pyramids in or near Boston. In my opinion, people live in different parts of the world, they represent their own language communities, and they should be free to use the languages they know to express their own linguistic reality.

Ricardo14, February 15, 2017 at 11:16:34 AM UTC

+1

Also, there are some linguistic points here...
As everybody knows, not all languages use the Latin script, and even some that do use it do not "apply" it in the same way for proper names, for example.

some examples:

Tom - In Hungarian, there's also "Tomi" https://tatoeba.org/eng/sentenc...rom=eng&to=hun - https://tatoeba.org/eng/sentences/show/4381219

Τομ - Greek - https://tatoeba.org/eng/sentenc...rom=eng&to=ell (no change)

Том - Russian - https://tatoeba.org/eng/sentenc...rom=eng&to=rus (no change)

but

Ricardo

Greek - Ριχάρδος, Ριχαρδε, Ριχαρδον
Russian - Рикардо
Turkish - Ricardo, Ricardı

There are many more examples, but this shows us that sticking strictly to 'Tom' and 'Mary' might not give us all the examples or show us all the 'beauties' of each language. Besides, there's a practical question here. For example, how can I say 'Cyprus' in Greek, Russian, Hungarian and Turkish? Anime is from Japan, Russia will hold the next FIFA Confederations Cup, etc.

CK, February 15, 2017 at 12:19:21 AM UTC, edited February 15, 2017 at 12:52:13 AM UTC

Using standard names does help us more quickly get groups of translations all linked to the same pattern.

For example, someone writes a sentence in German with "Tom" that is translated into Spanish, and then someone translates that Spanish sentence into Japanese which is then translated into Polish, etc. Eventually, we get a lot of sentences that are indirectly translated which potentially can be linked together.
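To make the idea of "indirectly translated" sentences concrete, here is a minimal sketch that groups sentences connected by chains of direct translation links. The sentence IDs and the `translation_groups` helper are hypothetical illustrations, not Tatoeba's actual data model or code.

```python
from collections import defaultdict, deque

def translation_groups(links):
    """Group sentence IDs that are connected by direct translation links.

    `links` is an iterable of (sentence_a, sentence_b) pairs. Two sentences
    end up in the same group if a chain of direct links connects them.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        queue = deque([start])
        seen.add(start)
        group = []
        while queue:
            node = queue.popleft()
            group.append(node)
            for neighbour in sorted(graph[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(sorted(group))
    return groups

# Hypothetical IDs: German -> Spanish -> Japanese -> Polish, plus an unrelated pair.
links = [
    ("deu-1", "spa-1"),
    ("spa-1", "jpn-1"),
    ("jpn-1", "pol-1"),
    ("eng-9", "fra-9"),
]
groups = translation_groups(links)
```

In this sketch the German and Polish sentences land in the same group even though nobody translated one directly into the other.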

We will always have some newcomers contributing "non-standard names", but perhaps we should discourage people from just flooding the database with a new set of names like this.

There's a well-known proverb that's perhaps applicable.

https://tatoeba.org/eng/sentences/show/1422381
Don't change horses in midstream.

I assume that it's possible to talk about people named Tom, Mary, John and Alice in Arabic. We transcribe foreign names into Japanese all the time and it doesn't seem to be a problem here in Japan.

http://www.biography.com/people...mous-named-tom
http://www.biography.com/people...ous-named-mary
http://www.biography.com/people...ous-named-john
http://www.biography.com/people...us-named-alice

Perhaps some of you will find the following interesting.
This shows you how it's possible to change "wildcard" names.

Sentence Patterns: Substitutions
http://aitstudy.com/sub/

Of course, the computer programming would be a little more complex for languages that make grammatical changes to names.
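A minimal sketch of the substitution idea, for readers who want to see it in code. The placeholder labels and the `to_pattern` and `fill` helpers are assumptions made up for this illustration, not the code behind the page linked above. It also ignores exactly the complication just mentioned: names are swapped as plain strings, with no grammatical changes.

```python
import re

# Hypothetical placeholder labels for the four commonly used names.
WILDCARDS = {
    "Tom": "{male_1}",
    "Mary": "{female_1}",
    "John": "{male_2}",
    "Alice": "{female_2}",
}

def to_pattern(sentence):
    """Replace each known name with its placeholder, matching whole words only."""
    names = "|".join(re.escape(name) for name in WILDCARDS)
    return re.sub(rf"\b({names})\b", lambda m: WILDCARDS[m.group(1)], sentence)

def fill(pattern, names):
    """Fill the placeholders with other names (plain substitution only)."""
    return pattern.format(**names)

pattern = to_pattern("Tom loves Mary.")
refilled = fill(pattern, {"male_1": "Fadil", "female_1": "Layla"})
```

Sentences that contain literal curly braces would need escaping first, and a language that inflects names would need a per-language rule instead of `str.format`.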

I plan to “harvest” any of the good English sentences with names that aren’t the standard wildcard names and resubmit them with the wildcard names.

We already have 132,906 English sentences with these 4 standard wildcard names.
http://tatoeba.org/sentences/se...Alice&from=eng

119,675 of these are on my list of proofread English sentences (List 907).
https://tatoeba.org/eng/sentenc...io=&sort=words

94,067 of these have audio.
https://tatoeba.org/eng/sentenc...io=&sort=words


Reference:

Wildcards Used to Help Avoid Too Many Near Duplicates
http://bit.ly/tatoebawildcards

These are the guidelines that I try to follow and that many other members follow.

Aiji, February 15, 2017 at 3:02:57 AM UTC

The last link you showed is what I think would be the best for here. If we could have some generic <male name>, <female name>, etc., each language could have "local" names. For the four names you're giving for example, in French it would more naturally be Thomas/Tom, Marie, Jean, Alice.

Of course a name is a name, but whether to translate names is always open to question, so wildcards would be a good compromise in my opinion.
Having wildcards would also be a good occasion to show and read some names from different countries (for example, there aren't so many Johns in France...), which is a good and important cultural aspect of corpora of sentences.

deniko, February 15, 2017 at 6:00:01 AM UTC

While I think it's a good idea to MOSTLY use a few standard names/countries/nationalities, I don't think this should be formalized in Tatoeba's rules, or even that users should be discouraged from using different names.

It's often useful to see patterns of how different names/nationalities work in different cases. I'll use Ukrainian as an example, but this applies to other Slavic languages as well. I'll use transliteration instead of writing in Cyrillic.

Tom = Tom
Mary = Meri
Other names I'm going to use: Yulia (female), Andriy (male), Boria (male)

Tom loves Mary = Tom kohaye Meri.

Knowing this pattern, how can you come up with this:

Mary loves Tom = Meri kohaye Toma (note the ending, Tom->Toma)
Andriy loves Yulia = Andriy kohaye Yuliu (Yulia->Yuliu)
Yulia loves Andriy = Yulia kohaye Andriya (Andriy->Andriya)
Yulia loves Boria = Yulia kohaye Boriu (Boria->Boriu)

You do need more names than four if you want to understand how they work in different combinations.

That's more than true for countries and nationalities as well.

Even in French, if you know how to say:

I live in France = J'habite en France.

how can you come up with the following:

I live in Canada = J'habite au Canada.
I live in the United States = J'habite aux États-Unis.

So while this pair:
I live in France.
I live in Canada.

looks like a near-duplicate in English, it doesn't look like a near duplicate in French:
J'habite en France.
J'habite au Canada.

So please don't formalize anything regarding the use of names, countries, and nationalities. Part of the experience you get learning a language is looking at those patterns, when you throw different names into very similar sentences and see how they behave.

OsoHombre, February 15, 2017 at 9:11:41 AM UTC

Deniko:
> looks like a near-duplicate in English, it doesn't look like a near duplicate in French:
J'habite en France.
J'habite au Canada.

You have given a perfect example to illustrate the point. Have you noticed how the French preposition changes before each country name? Besides, I would like to add the following: online Arabic resources are not always practical and complete. In terms of names of countries, this website offers us a possibility to show Arabic learners how to properly use the name of a country or a city in different contexts. This is just one argument among many others that I have in favor of letting users enjoy the freedom of expressing themselves freely in the languages they use.

> So please don't formalize anything regarding using names, countries, and nationalities.

I personally would be surprised if Tatoeba formalized this. Opening the website to 300 different languages and formalizing such a thing are two very contradictory things, I think. It is not in Tatoeba's interest to do it. It is better for Tatoeba to be a good and huge collaborative project similar to Wikipedia than to formalize such things. That said, I believe that every member of the project is free to choose whatever they like, but I demand that my freedom of expression be respected on this website. As I have said many times already, as long as Tatoeba doesn't formally require the use of standard names, I will not use them and, once again, it is not in Tatoeba's interest to require them. Tatoeba looks like a great and ambitious project, and such a project needs to open up to this huge and diverse world where more than 7,000 languages are still spoken.

Guybrush88, February 15, 2017 at 4:03:38 PM UTC

This is also the case for Italian. With countries, one can also understand their gender: for example, Brazil and France are gender-neutral in English, but in Italian Brazil is masculine ("il Brasile") and France is feminine ("la Francia"). Having different possibilities may therefore be useful for learners, since they can see the different genders a country can have in Italian.

OsoHombre, February 15, 2017 at 6:38:12 PM UTC

Guybrush:
The world is so rich, and in some languages even a person's name can be modified by the syntactic environment via a demonstrative or a preposition. For example, Arabic nouns (including Arabic proper nouns) are affected by what is called 'huruuf al-jar' (or what may be referred to as prepositions). In a sentence like 'I saw Fadil' and a sentence like 'I told Fadil', the Arabic grammatical form of the noun Fadil is different:

رأيتُ فاضلاً
قلت لفاضلٍ
Transcription: Ra'aytu Fadilan. Qultu liFadilin.

A learner needs to observe such changes, and if we limit our standard-name choice to Tom and Mary, Arabic wouldn't be able to display such grammatical features. In fact, we shouldn't base our choices on just one language. Let the world be a natural world and not be affected by choices related to technical criteria. In fact, limiting standard names to two English names could affect both the educational and the scientific value of Tatoeba's corpus.

odexed, February 15, 2017 at 7:09:39 PM UTC

I may be wrong but I think huruf al-jar is when we use prepositions like "في البيت" but you gave some good examples of إعراب

OsoHombre, February 15, 2017 at 7:24:30 PM UTC

'Harf al-jarr' is part of the terminology of Arabic traditional grammar. In modern grammar, we may refer to it as a preposition. In traditional grammar, they call it 'harf al-jarr' because when it immediately precedes a noun, it causes it to end with an /i/ or the 'kasra' diacritic.

OsoHombre, February 15, 2017 at 8:56:46 AM UTC

CK:
> Using standard names does help us more quickly get groups of translations all linked to the same pattern.

This brings us back to an eternal debate between natural-language supporters and computational linguists who tend to think that everything should be standardized and mechanized for the purpose of developing language software. My opinion is that we should let people express themselves in a natural way. If my friend's name is Fadil, then I prefer to write about Fadil and not Tom. If my city is Cairo or Athens, then I prefer to use my city's name. Linguistically speaking, this is much more natural and interesting than recommending (thank God it's not imposing) that people use a limited set of standard names. This could even stifle people's imagination, I think. If I lived in a small Indian village, I would be writing about that village and all the surrounding villages and towns, not about Boston or New York, which I know nothing about.

Hybrid, February 16, 2017 at 11:15:48 PM UTC, edited February 17, 2017 at 1:07:49 AM UTC

>we should let people express themselves in a natural way.

I agree. We shouldn't be forced to use wildcards. I don't want Anne of Green Gables to become Mary of Boston :)

OsoHombre, February 17, 2017 at 9:40:56 AM UTC

Hybrid:
It was a good example. Thank you.

OsoHombre, February 15, 2017 at 7:35:25 PM UTC, edited February 15, 2017 at 7:39:35 PM UTC

CK:
> We will always have some newcomers contributing "non-standard names", but perhaps we should discourage people from just flooding the database with a new set of names like this.

Discourage people? I don't think that the words 'Tatoeba' and 'discourage' would make a good recipe. I have already explained my reasons, and what I expect you guys to do is respect my choices and opinions as long as I'm being correct and logical, and not use any coercive measures against me (as a motivated contributor) or any other user who wants to enjoy the freedom of being a member of their own language community. I think that this is a basic human right that's recognized by international institutions and that every member on Tatoeba should enjoy. To tell the truth, I was even a little bit shocked when I read the word 'discourage'.

> These are the guidelines that I try to follow and that many other members follow.

Yes, but I'm part of 7 billion other members of this planet Earth that maybe want to see things differently. Even if I'm the only one to think like that, I modestly think that 7,000 language communities should be represented in a much better way, not just with Tom, Mary and Boston.

gillux, February 17, 2017 at 3:50:17 AM UTC

I strongly believe that we should not change our way of writing sentences for technical reasons. Programs should adapt to languages, not the opposite.

How about relating near-duplicate sentences with a fuzzy matching algorithm? So that for example, on a given sentence page, one could see a list of near-duplicates, along with their translations. I believe such an algorithm could be quite effective, even if it can’t be perfect.
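As a rough illustration of what such fuzzy matching could look like, here is a minimal sketch using character-level similarity from Python's standard `difflib` module. The `near_duplicates` helper and the 0.75 threshold are assumptions chosen for this example. A real implementation would need tuning per language and something faster than comparing every pair of sentences.

```python
from difflib import SequenceMatcher

def near_duplicates(target, corpus, threshold=0.75):
    """Return sentences from `corpus` that are similar to `target`.

    Similarity is difflib's character-level ratio (0.0 to 1.0). Exact copies
    of `target` are skipped; the most similar sentences come first.
    """
    scored = []
    for sentence in corpus:
        if sentence == target:
            continue
        ratio = SequenceMatcher(None, target.lower(), sentence.lower()).ratio()
        if ratio >= threshold:
            scored.append((ratio, sentence))
    return [sentence for ratio, sentence in sorted(scored, reverse=True)]

corpus = [
    "Fadil doesn't love me.",
    "Tom doesn't love me.",
    "I live in Canada.",
]
matches = near_duplicates("Tom doesn't love me.", corpus)
```

With this threshold, the "Fadil" sentence is reported as a near-duplicate of the "Tom" sentence, while the unrelated sentence is not.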

OsoHombre, February 17, 2017 at 9:39:15 AM UTC

Gillux:
Although programming isn't my concern at all, I like your idea of developing an algorithm to solve the problem of near-identical sentences. My point here is: let programmers find solutions to technical problems and not "bother" normal contributors about that, because, frankly speaking, technological problems should not dictate requirements to natural people speaking natural languages in a natural way. Those who want to develop talking robots and AI should solve their challenging problems by copying nature. They should not shape nature (and simplify it) in such a way as to make it "easier" for them to develop their increasingly sophisticated programs. Just one more note: Gillux, I'm just expressing my view. I'm not being confrontational and I don't want this to turn into an argument, OK? It's just my frank opinion, which happens to be very different from some other people's opinions.

deyta, February 17, 2017 at 12:23:39 PM UTC

+1
I agree with you.

Something must be done.
If there are no limitations or solutions, Tatoeba can turn into garbage.

OsoHombre, February 17, 2017 at 1:12:54 PM UTC

Deyta:
In my opinion, Tatoeba can't turn into garbage. As long as there are many admins and what the site refers to as 'corpus maintainers', there is no risk of the site becoming a dump or a useless resource. In my opinion, it's up to Tatoeba to make sure that its open-to-the-public interface is used properly, but this should be done while preserving its friendly and inviting atmosphere. After all, I think that if the website aspires to grow (I'm sure it does) to become like Wikipedia, it needs to have the capacity to manage its quality, but it also needs to do so without undermining the freedom of its contributors.

OsoHombre, February 15, 2017 at 8:41:55 AM UTC

Paul:
I understand the issue with what you refer to as near-duplicates. I only have a human brain and I can't memorize and guess all the potential near-duplicates that exist on the website. I also guess that every user is free to choose what to translate and what not to translate. That's exactly what I do personally. Still, I will try to avoid what I think could become near-duplicates by avoiding what seem to be simple sentences that just anyone could make, like 'X loves Y' or 'X went home.'

OsoHombre, February 15, 2017 at 10:34:45 AM UTC

To Paul and everybody:
Oh, there is just one point I would like to warn about, if I may. I have noticed that CK re-wrote some of my 'Fadil-and-Layla' example sentences using the names Tom and Mary. I strongly object to this because, in a context where we all wish to avoid unnecessary near-duplicates and the unnecessary translation work they cause, re-writing 'Fadil-and-Layla' sentences as 'Tom-and-Mary' ones technically constitutes an intentional creation of near-duplicates. So I personally think that if near-duplicates are created unintentionally, this is OK, but if they are created intentionally just to 'standardize' the corpus no matter what, this would only aggravate the issue of near-duplicates.

OsoHombre, February 15, 2017 at 11:09:38 AM UTC

To Paul and everybody (2):

And what if...

...there were two or three Spanish speakers who decided to only use Spanish names like Pedro, Santiago and Carmen, and they contributed thousands of sentences using these names, and then some day a contributor came along who decided to translate all their sentences into English? Should the English translator be asked to only use Tom and Mary, thus making it impossible for them to find a sentence to translate among those Spanish sentences? Should the Spanish-speaking contributors be asked to re-write their thousands of sentences using Tom and Mary? Or should their thousands of sentences be ignored by any potential English translator altogether? And if the English translator re-submits translations using Tom and Mary and then the translations are back-translated into Spanish, wouldn't that result in a huge number of near-duplicates in Spanish? I honestly think that this choice of standardizing names is a non-viable solution. The world and human language are far too big and rich to be reduced to a pair of English names and an American city.

AlanF_US, February 15, 2017 at 11:25:30 PM UTC, edited February 15, 2017 at 11:38:32 PM UTC

OsoHombre, I think your points are solid. Although I understand that restricting names to a small set can have certain benefits, it's certainly true that it can have drawbacks as well. I agree that homogeneity makes working with the corpus boring, whether you're reading existing sentences or contributing new ones. There's only so much time that I can spend in a world composed of one man, one woman, and one city before I want to jump out a window (figuratively speaking). I feel the same way when I experience many sentences based on the same grammatical paradigm: "I smiled." "You smiled." "He smiled." "She smiled." But the same would apply to a paradigm that varied only the place name: "He went to Boston." "He went to Paris." "He went to London." Or the first name: "John went to Boston." "Violeta went to Boston." "Ivan went to Boston." And on and on.

In my personal view (emphasis on "personal"), Tatoeba achieves the most when it tries to do what isn't already being done (often more effectively) elsewhere. Basic grammar is well suited to a textbook, or to a website with a small number of authors that emulates a textbook. Lists of geographical names can be found in a dictionary or atlas or other reference. There are undoubtedly other references for personal names. (Naturally, these are harder to obtain for minority languages, but they are still likely to exist.) But a textbook cannot capture the sheer variety of sentences that you can get from thousands of contributors around the world -- as long as they are challenging themselves not to come up with the easiest type of sentences to produce. Variety can be hard to achieve, but it's rewarding. Maybe it means starting with an external list of vocabulary items that we are not already covering, or items that Tatoeba members have specifically requested, or items that the contributor has found missing. Or maybe it involves tapping some personal source of creativity that could be different for each individual.

What I am trying to say, in short, is that varying personal and geographical names is helpful in increasing the variety and usefulness of our sentences, though it can't do the job by itself.

Several of the administrators had a discussion about the subject of names recently. I don't think that Trang will mind if I quote from her e-mail:

"It's fine [to use your] own sets of names, as long as [you do] it with common sense. We don't actually have standard names. There was never an official statement that contributors should be urged to use Mary, Tom or John over other names. We just have names that are more commonly used than others. ...

[O]verall, everyone has a different opinion about what names to use and how to translate names. So we definitely cannot make any rules about this.

If a contributor feels it is more important to use names that are more connected to their language/culture/identity, than to try to make sentences that 'fit' better in the Tatoeba corpus and/or are more 'convenient', it is their decision, and it is okay."

Aiji, February 16, 2017 at 3:34:34 AM UTC

So much has been said above that I will reply to your post (which makes a lot of sense) so people may see it.

First of all, I am for neither one solution nor the other. I was just giving arguments for the wildcards option because CK mentioned it. Therefore I am certainly not for the option CK recommends and I am certainly not against it. But since so many counter-arguments came, let's play the game.
That being said, there was a point somewhere arguing the difference between linguists and computer scientists or something like that. Being part of the two worlds, I find this argument clearly irrelevant. And I would say that the problem comes from the opposite side, that is people who can't see in both worlds.

I have said that we could use wildcards for names, countries, cities, and I maintain my point. I have also said that this would imply a huge programming mess that I can easily imagine (that is the computer world part), as such programming would have to consider each language independently (that is the linguist world part), making it nearly impossible to extend to everything anyway.
However, not once have I said that we MUST use wildcards and ONLY wildcards for the sake of Tatoeba. As I said in the introduction, I am against that. That is a simplistic approach that could only work for simple languages. I did not say that we should use ONLY four names; I said the total opposite: "each language can have its set of local names", not "its set of four names".
So if you want to use Farid Layla Jamal Mohamad Kaori Hitomi Daisuke Daiki Pedro Carla Paul Ken Lin or whatever the hell you want, I WILL encourage you for sure (my first message was only nine lines long and I think I mentioned the cultural aspect, didn't I).

When I am told the difference between « la France » and « le Canada », thank you, I think I know the difference in my own mother tongue. But again, that is irrelevant. If I have a French country wildcard of ten countries (or more), five masculine and five feminine, problem solved (again, that is the computer part). Then we have to deal with another issue, which is that a corpus of sentences is not a dictionary. If you have a sentence with a representative set (that is, every possibility of the language is represented several times in the wildcard) and you don't find Kenya, you can 1) take a dictionary or 2) request the sentence with Kenya, just as somebody would request vocabulary. I'll stop here because the difference between corpora is a different debate.
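A minimal sketch of what a gender-aware French country wildcard could look like, using only the three countries quoted earlier in the thread. The `COUNTRIES` table and the `i_live_in` helper are hypothetical names invented for this illustration. The rule is deliberately simplified: masculine country names that begin with a vowel take "en" rather than "au" (for example "en Iran"), and that case is not handled here.

```python
# Hypothetical wildcard table: English name -> (French name with article, gender).
COUNTRIES = {
    "France": ("la France", "f"),
    "Canada": ("le Canada", "m"),
    "United States": ("les États-Unis", "pl"),
}

# Simplified rule: feminine -> "en", masculine -> "au", plural -> "aux".
PREPOSITION = {"f": "en", "m": "au", "pl": "aux"}

def i_live_in(country):
    """Build the French sentence for 'I live in <country>'."""
    name_with_article, gender = COUNTRIES[country]
    bare_name = name_with_article.split(" ", 1)[1]  # drop the article
    return f"J'habite {PREPOSITION[gender]} {bare_name}."

sentences = [i_live_in(country) for country in COUNTRIES]
```

The table has to be written separately for every language, which is the programming load mentioned above.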

And finally, there is another wonderful Tatoeba feature, which is that you can add as many translations as you want. So if a wildcard is appropriate, select the wildcard. If not, do not select the wildcard. I can't see any problem here. In French and several other Latin languages, the <name wildcard> is probably the most obvious. In Japanese, the <country wildcard>, the <citizen name wildcard>, etc. are also obvious wildcards. But in a tricky situation, I may find the wildcard is not well-adapted, so I will add another translation. And then we have 1 wildcarded sentence + 2 non-wildcarded sentences VS 18 non-wildcarded sentences.

Long story short, I am for you doing what you want to do and clearly against imposing the names of one country on every language. This would be an attack against the cultural aspect of languages. Personally, I think a name is a name, so if I write a translation, I do not translate the name. If I write the sentence myself, I use the name I want. A good wildcard option would NOT restrict anything; it would just be ONE MORE possibility (but again, programming load, etc.) that people could use, not must use. If somebody changes your sentences without your permission, you should strongly oppose that and tell people here.

OsoHombre, February 16, 2017 at 8:26:45 AM UTC

Aiji:
I will answer your post point by point:

> Therefore I am certainly not for the option CK recommends and I am certainly not against it.

Neither am I against CK's option. What I ask for is that other people and I be left alone if we want to use our own names. I want to enjoy my freedom as a native Arabic speaker. If I feel that I don't want to have anything to do with Tom, Mary, and Boston as standard names, then I should be free to do that and my choice should be respected. So the idea is simple: everyone should be absolutely free to choose their own path.

OsoHombre, February 16, 2017 at 8:29:50 AM UTC

Aiji:
"And I would say that the problem comes from the opposite side, that is people who can't see in both worlds."

A problem? Hahaha... I'm no computer programmer and that obviously puts me in the other category, i.e. people who want to contribute natural sentences without taking into account programming criteria, but I enjoy my freedom as a person who produces speech without worrying about whether a computer is going to understand it or not. That's my personal choice and as far as I remember, Tatoeba doesn't oblige its users to take programming criteria into account either.

OsoHombre, February 16, 2017 at 8:37:24 AM UTC

Aiji:
> I WILL encourage you for sure (my first message was only nine lines long and I think I mentioned the cultural aspect, didn't I).

Thank you. Encouragement in the good sense is always welcome. Besides, I think that there is no reason for you to get upset. In my reply, I wasn't 'reproaching you' personally for believing or wanting to impose something on somebody else. I took advantage of my reply to you to reply to the other people who kept messaging me at every opportunity, both publicly and in private, to ask me to adopt that limited set of standard names. So please calm down and be cool. My apologies if my message sounded too direct or personal towards you, but please try to understand that I use this wall to deal with general issues, ideas, and attitudes, not to quarrel personally with users.

OsoHombre, February 16, 2017 at 8:39:53 AM UTC

Aiji:
> When I am told the difference between « la France » et « le Canada », thank you I think I know the difference in my own mother tongue.

I don't even know French well. I am among the few people that were trained in English in my own country although French is the dominant foreign language here. The example sounded perfect to illustrate the point to other people, not to a French native speaker.

OsoHombre, February 16, 2017 at 8:46:57 AM UTC

Aiji:
One final remark:

I'm here as a friend and quarreling is the last thing I want on a website like this. You also need to understand that to me, a public message on a wall is not necessarily addressed only to the person I reply to, but to everybody as well. My apologies if there was any misunderstanding.

OsoHombre, February 16, 2017 at 8:01:08 AM UTC

Alan:
I will reply to your message point by point.

> Re: jumping out of the window.
I like your sense of humor.

> But the same would apply to a paradigm that varied only the place name: "He went to Boston." "He went to Paris." "He went to London." Or the first name: "John went to Boston." "Violeta went to Boston." "Ivan went to Boston." And on and on.

I understand your concern and Paul's. As I said before, I only have a human brain (as most of us do on this website). However, in order to avoid nearly identical sentences that are boring to translate, I will do everything I can to provide new sentences with new words. I need my own corpus to be lexically and grammatically rich, so that it offers interesting and challenging patterns for my co-workers, my students and myself to translate (I am already inviting all the folks around me to take part in translating the sentences into Arabic).

OsoHombre, February 16, 2017 at 8:21:30 AM UTC

Alan and Trang:
> If a contributor feels it is more important to use names that are more connected to their language/culture/identity, than to try to make sentences that 'fit' better in the Tatoeba corpus and/or are more 'convenient', it is their decision, and it is okay.

I just breathed a sigh of relief and I would like to thank Trang for writing these wonderful lines and Alan... thank you very much for publishing them. I think that today is my happiest day on Tatoeba although I've only been here for a couple of weeks.

Thank you both from the bottom of my heart. I am grateful to you for making this right and wise decision. This is my happiest day on your site.

My thanks also go to Admin Pfirsichbaeumchen who had informed me that the admins were talking about the matter.

I can now rest assured and continue to work in peace (at last) on this website that (thank goodness) guarantees the freedom of expression of people of different language communities, which is absolutely necessary for the promotion of every language.

Besides, this freedom would also increase the scientific value of Tatoeba's corpus, so long live language diversity and long live the freedom of expression of every language community on this earth.

Amastan, February 16, 2017 at 7:19:52 PM UTC

I've read everything and it was quite rich and long (whew!). I think that OsoHombre is right. If Tatoeba doesn't officially "urge" contributors to use wildcard names, then why should someone bother using them if they don't want to? A few years ago, I decided to stop using them, but I kind of backed down a little bit, although I wasn't really convinced that I had made the right decision and I somewhat knew that some day this issue would arise again. Now I'm no longer as active as before, but I have always wanted to express some ideas in that sense. Not only do I think that it's not necessary to urge contributors to strictly use those wildcard names and nothing but those names (even if 98% of "us" actually use them), but I also agree with OsoHombre that this might threaten linguistic diversity on Tatoeba.

As a member of the Berber-speaking community, I belong to a community that has its own names, which date back thousands of years. Many of our revived names are actually those of our kings in ancient times, like Massinissa and Jugurtha (who fought against the Romans), and we have many children who proudly bear those names today.

https://en.wikipedia.org/wiki/Masinissa
https://en.wikipedia.org/wiki/Jugurtha

My Tatoeba nickname itself is a Berber name borne by a famous Tuareg king.
https://en.wikipedia.org/wiki/Moussa_Ag_Amastan

Until recently, Algerian authorities banned Berber names and civil registrars refused to record them. Now, with the recognition of Berber as an official language in both Morocco and Algeria, the situation is changing little by little. I'm sure that there are many other communities that are facing the same problem around the world, so letting people pick the names they like on Tatoeba might be very helpful to these communities as well.

I too disagree with words such as "discourage" and also "flooding the site with X or Y". I think that Tatoeba is like a huge aquarium where every fish is free to swim as they like as long as they abide by the rules of the project. If some want to continue to use wildcard names (of their own free will), others should also be allowed to use whatever they want as long as it's OK with the rules.

To OsoHombre:
Welcome to our site. You can also write to me in Arabic.

OsoHombre, February 17, 2017 at 8:38:08 AM UTC

Amastan,
Thanks for your support.

OsoHombre, February 17, 2017 at 11:22:55 AM UTC

I hardly noticed the word 'flooding' in CK's message. I totally disagree with such a word, too. I think that if there is equality among this website's members, every member has the right to contribute sentences in 'big' numbers. If a person who refuses to use Tom and Mary has their work labeled as 'flooding', why shouldn't a person who uses them have their work labeled like that as well? This isn't a personal attack against anyone, but I like to dot my i's and cross my t's.

Hybrid, February 16, 2017 at 11:10:48 PM UTC

There's also a tag for love: http://tatoeba.org/eng/tags/sho..._with_tag/2692 and romance: http://tatoeba.org/eng/tags/sho...s_with_tag/140