Tatoeba
marioo, August 24, 2019 at 6:21:48 PM UTC

When a word or expression has several variants, each either officially attested or commonly accepted, what is the best way to handle them? When adding a sentence, should we create a separate sentence for each variant? Or should we create one sentence and document the variants in the comments?

For example, the English word "malware" has three official French translations:

1- programme malveillant
2- logiciel malveillant
3- maliciel

Thoughts please.

sabretou, August 24, 2019 at 7:10:50 PM UTC

I create a separate sentence for each variation. As long as the sentence is valid in that language, it's fine!

Thanuir, August 25, 2019 at 6:34:57 AM UTC

I do this, too. The alternative would be to make sure that all of the different variants are used every now and then, so that they all appear somewhere in the database. But for me it is easier to add the variations that happen to cross my mind when translating a sentence.

AlanF_US, August 24, 2019 at 7:34:17 PM UTC

The answer depends on the instance, the frequency of the variants, your energy level, and your philosophy. If one of the variants is quite rare (perhaps because it's only favored by a language academy but is never used in the real world), and if your philosophy is that Tatoeba should reflect real-world usage, then it clearly makes sense not to produce a version of it. Also, if you put a high value on diversity in the corpus, you may not want to make a practice of always producing multiple sentences that differ only in the variant used.

Aiji, August 25, 2019 at 12:03:55 AM UTC

I would add that this is particularly the case when the variant is a single word that can simply be looked up in a dictionary. That's different from the common Tu / Vous variants, which potentially affect many words in the sentence.