When a word or an expression has several variants, each either backed by an official reference or commonly accepted, what is the best way to deal with this? When creating a sentence, should we create a separate sentence for each variant? Or should we create one sentence and document the variants in the comments?
For example, the English word "malware" has three official French translations:
1- programme malveillant
2- logiciel malveillant
3- maliciel
Thoughts please.
I create a separate sentence for each variation. As long as the sentence is valid in that language, it's fine!
I do this, too. The alternative would be to make sure that all of the different variants are used every now and then, so that they all appear somewhere in the database. But for me it is easier to add the variations that happen to cross my mind when translating a sentence.
The answer depends on the instance, the frequency of the variants, your energy level, and your philosophy. If one of the variants is quite rare (perhaps because it's only favored by a language academy but is never used in the real world), and if your philosophy is that Tatoeba should reflect real-world usage, then it clearly makes sense not to produce a version of it. Also, if you put a high value on diversity in the corpus, you may not want to make a practice of always producing multiple sentences that differ only in the variant used.
I would add that this is particularly the case when the variant is a single word that can simply be checked in a dictionary. That's different from the common tu / vous variants, which potentially affect many words in the sentence.