Wall (5,666 threads)
Before asking a question, make sure to read the FAQ.
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.
Here is a list of English translations of the twenty most viral sentences on Tatoeba in 2019.
Bedouins live in the desert.
We live in a society.
This is not his handwriting.
Tom has friends in Germany.
My name is Omid.
The cat is big.
My name is Dilshad.
They will not pass!
The queen must die.
Can you understand Tom?
Is she Italian?
Marina is from Russia and Clarissa is from Sweden.
My bicycle is red.
Tom's cat is sick.
Kigali is the capital of Rwanda.
My cat meows a lot.
Tom works at a hospital.
India is my country.
My father didn't know her.
The big difference from YouTube is that on Tatoeba, Tom is even more popular than cats! No mention of Mary in the top 20 though...
These may not have really been viral, since almost all translations for the first sentence are by one member.
[#7697543] Bedouins live in the desert.
I didn't check the others.
'Tom' accounts for half of the personal names in the list, and for two-thirds of the typically male ones (Omid and Dilshad do not seem to be gender-specific). That is still far too many for the purposes of name diversity.
On the other hand, 'Mary' and other overused 'standard' words did not spread much, according to this list.
Tom is the most frequently used non-stopword in the French corpus.
Does anyone know if there are any machine translation services like DeepL or Google Translate that use Tatoeba's data? What I mean is: are there any translation services that generally use machine translation but fall back on Tatoeba's corpora when there's a perfect match?
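For what it's worth, the "perfect match first, machine translation otherwise" idea in the question could be sketched roughly like this. This is only a minimal illustration, not how any existing service works: the sentence pairs are made-up sample data (not pulled from Tatoeba's exports), and build_memory, translate and fake_mt are hypothetical names standing in for a real corpus loader and a real machine translation call.

```python
from typing import Callable, Dict, Iterable, Tuple


def normalize(sentence: str) -> str:
    """Collapse whitespace and case so trivial differences do not block a match."""
    return " ".join(sentence.split()).casefold()


def build_memory(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Index (source, translation) pairs by normalized source sentence."""
    memory: Dict[str, str] = {}
    for source, target in pairs:
        # Keep the first translation seen for each source sentence.
        memory.setdefault(normalize(source), target)
    return memory


def translate(
    sentence: str,
    memory: Dict[str, str],
    machine_translate: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (translation, origin): a corpus hit if one exists, else the MT output."""
    hit = memory.get(normalize(sentence))
    if hit is not None:
        return hit, "corpus"
    return machine_translate(sentence), "machine"


# Illustrative sample pairs (assumed, not taken from the actual corpus).
pairs = [
    ("Bedouins live in the desert.", "Les Bédouins vivent dans le désert."),
    ("My bicycle is red.", "Mon vélo est rouge."),
]
memory = build_memory(pairs)


def fake_mt(sentence: str) -> str:
    """Placeholder for a real machine translation call."""
    return f"[MT] {sentence}"


print(translate("My bicycle is red.", memory, fake_mt))
print(translate("The cat is big.", memory, fake_mt))
```

A real setup would also have to decide what to do when a sentence has several human translations; this sketch simply keeps the first one it sees.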
maybe this one
That's not quite what I had in mind, but thanks anyway.
There was a discussion about translation memory two weeks ago. https://tatoeba.org/fra/wall/sh...#message_33795
Just for your information...
** English Sentences Likely Said By Women **
Perhaps you might find some of these fun to translate, as a different way to find sentences to translate.
This is a list I add to when I find a sentence that would likely sound better with a female voice.
Some of these might be possible with a male voice, for example a gay man can have a husband, but I think a female voice for these would be less confusing for students, since most of the time they would hear these sentences, they would likely be spoken by women.
That's a big mess of sexism and heteronormativity to call attention to.
That's a new one to me. "Let's maintain and promote heteronormativity because it will be less confusing to students."
It's a "hidden list" of sentences that I don't plan to record, since I think it would be misleading for students to hear a male voice say many of these sentences. (I'm breast-feeding my baby. / When I was a young girl, ... / I'm a waitress. / I'm pregnant. / etc.) The intent isn't to promote heteronormativity, but to provide language examples as we would most likely hear them.
What is gendered about sentences like these?
- Did you really think I'd dance with you?
- Tom comes on strong.
- I'm truly horrified.
> What is gendered about sentences like these? ...
Nothing, but a large percentage of them are likely to sound better with a female voice, and unless I've made some errors I think none of them would sound strange with a female voice.
The main purpose of this list was to group together sentences that I could send to female volunteers to record, especially since many of these would require a female voice.
The reason I posted a link to this hidden list was because of gillux's wall post wondering about other ways to explore the corpus. I thought that some members might enjoy trying this way. https://tatoeba.org/eng/wall/sh...#message_33780
I didn't think it would be controversial, and I guess since nobody has commented that this was an interesting way to explore sentences, I should have left this as an unannounced hidden list.
I think the sentences are fine, in general. Some could very well be said by men, though they are significantly more often said by women. (The sentences that imply the speaker is married to or romantically loves men, for example.)
I think a collection of these sentences can seem sexist, but one should remember that the corpus presumably contains similar sentences about males, too, and if not, one should add them, rather than worry about their lack.
I wonder how many explicitly gay/asexual/etc. sentences there are in the corpus. Many Germanic and Romance languages have gendered pronouns, which make the gender issues very explicit. The contributors in these languages might want to add some diversity when translating to their language from a language without gendered pronouns, or when adding new sentences.
> a large percentage of them are likely to sound better with a female voice
Please be aware that this is your personal assessment. If you don't feel like recording these sentences, that is really no problem, but in the end each contributor should simply record whatever sentences they feel comfortable recording.
The controversy here is that you are trying to favor a female voice for certain sentences when there's really no need to.
> you are trying to favor a female voice for certain sentences when there's really no need to.
That's one way to look at it, I suppose, but what I'm doing is offering female speakers these sentences to record. I don't see that as a problem. There are many sentences that can be said by either a female voice or a male voice. Just because a sentence is read by a female voice doesn't mean that it necessarily would only be said by a female voice, and vice versa.
> That's one way to look at it, I suppose, but what I'm doing is offering female speakers these sentences to record. I don't see that as a problem.
If I were responding to someone who wanted to record audio, I would explain to them how to choose the sentences that THEY wanted to record. For instance, I'd tell them how they could do a search for sentences without existing audio, that favored short or long sentences, and so on. I'd even tell them how to write their own sentences, and then select them. This would make things far more interesting to them, and also increase the diversity of sentences for which we have audio.
Instead, you're taking one of your "leftovers" lists consisting of sentences that you think should have audio but that you don't want to record. In this case, you've rejected them for your own recording because they sound to you like something that a woman would say. So now that you're presented with a woman, your first thought is how she can help you achieve your goal of getting these sentences recorded. So you tell her, "Here's a list of sentences that you can record." Of course, you're not holding a gun to her head. But unless you've given her a complete list of fully-described alternatives, you've drastically limited her set of choices. She now has to decide whether to help you out by doing something she knows you want her to do, or potentially disappoint you by doing something else that she then has to figure out how to do on her own.
So then she looks at the list, which consists of sentences of simple vocabulary and grammatical structure whose subject matter is largely restricted to romance, marriage, and child-rearing. It's all too likely that she'll only end up recording a few sentences before she gives up out of boredom, and we'll never see her again. I suspect that this is what happens to many people who are presented with predigested lists of sentences to translate.
You mentioned gillux's post, in which he talked about grouping sentences. As I see it, he wanted people to be able to see which kinds of variety our corpus has, and which it lacks, and to fill in the gaps. His goal was undoubtedly not to urge each person to confine themselves to a corner of the corpus. In writing sentences, and in recording audio, we should make sure that we make our collections as interesting as possible. Monotony is our enemy.
> The intent isn't to promote heteronormativity
you don't need intent to do harm...
> but to provide language examples as we would
> most likely hear them.
Who are those 'we' you are talking about? Why are 'y'all' ('yous') more relevant than 'we', the people who want more feminine and queerer sentences?
There doesn't seem to be a "queer" tag yet: https://tatoeba.org/dan/tags/view_all/queer
Nor is there an LGBT tag.
There are a few sentences under the "gay" tag. I don't know whether they are offensive: https://tatoeba.org/dan/tags/sh...s_with_tag/923
There are a few feminism sentences: https://tatoeba.org/dan/tags/sh..._with_tag/7374
Judging by the translations, "feminine" refers to grammar: https://tatoeba.org/dan/tags/sh...s_with_tag/579
If you became an advanced contributor, you could start collecting sentences related to this topic under a suitable tag. That would make them easier to find and translate. https://en.wiki.tatoeba.org/art...-contributors#
I add sentences about women (#8306226 #8306180 #8306173 #8103915 #8103890 #8103875 #8103539) and about LGBT people (#5091994 #4008180 #2739458). But I have little time, so I can't add many sentences.
> If you became an advanced contributor, you could
> start collecting sentences related to this topic
> under a suitable tag. That would make them
> easier to find and translate.
I know, but I don't want to.
> I add sentences about women (#8306226 #8306180 #8306173 #8103915 #8103890 #8103875 #8103539) and about LGBT people (#5091994 #4008180 #2739458). But I have little time, so I can't add many sentences.
Good. I would translate the sentences, but my language skills aren't up to it.
> I know, but I don't want to.
All right. Perhaps someone else will take on managing the tags for this topic.
Can someone tell me why this group of sentences appears as deleted even though the deletion isn't mentioned in the sentences' logs?
It looks like a case where we found a user who had been copy-pasting massively, so we decided to simply delete all the sentences along with the account itself. Since several hundred sentences were involved, I assume the deletion was done directly in the database. This is why the deletion left no trace in the logs.
My friends, I'd like to know if a sentence like that is "allowed" on Tatoeba:
"I want coffee." "Eu acho que você quis dizer "I want some coffee"".
("Eu acho que você quis dizer" is like "I believe you mean")
As you can see, that's a dialogue. The first person says something in English and makes a mistake. The second one corrects it but says something in another language.
Why would they not? Of course, you can add sentences like those.
You can add a comment to clarify your intention.
If allowed in the language, different types of quotation marking might make it more readable. In English, for example, you might use single quotes 'like this', and then inside those, double quotes "like this". Or the other way around, depending on the dialect, I think.
He asked: 'How do you pronounce "Eyjafjallajökull" and other Icelandic names?'
I don't really have any personal opinion but some may argue about the usefulness of the first sentence, for example. Sure this dialogue is natural, but is it useful as such? If you think so, then go ahead.
I appreciate your asking before adding such sentences.
I don't think that adding this kind of sentence is going to be helpful. First of all, it's going to be very hard to translate. Not only does the potential translator need to know two sentences instead of one, but they need to figure out what to translate, and into which language(s). Should they leave the English as is, and only translate the Portuguese? Or should they translate both? If so, how are they going to represent the faulty English sentence as a faulty sentence in another language?
Even if you're not thinking in terms of translation, such an example is very hard to do well. It requires an advanced understanding of two languages, to the point where you'd have to know not only how to write both languages correctly, but which errors are common. As a smaller issue, when you're dealing with embedded conversation, it's hard to get the quotation marks correct (which your example does not).
Remember, too, that you are now adding intentional mistakes to a corpus that is going to be parsed by minds and tools that are not going to understand how to filter them out.
Furthermore, I think that while there is obviously a value in having one's own errors corrected, being presented with someone else's mistakes is not a good way to learn a language.
Also note that while you're presenting a supposedly bad sentence and a way to fix it, it's not always clear whether the sentence is wrong for the context or wrong in general. In order to present this information, you either need to work it into the text of the sentence (which makes it long and unnatural) or into a comment, which many people are not going to see if they encounter the sentence in a list.
Looking at your specific example:
"I want coffee." "Eu acho que você quis dizer "I want some coffee"".
it should actually be written as follows (using US English conventions):
"I want coffee." "Eu acho que você quis dizer 'I want some coffee.'"
But "I want coffee" is a correct sentence in some contexts. Someone who sees your sentence may falsely conclude that it's never correct.
That's a very good point, Alan! Thank you so very much for this! Well, I'm not going to add this kind of sentence because it may "harm" the corpus. Thanks so much for your explanation.
[EPO] Kandidato por iĝi bontenanto de la turka frazaro.
Soliloquist kandidatas por iĝi bontenanto de la turka frazaro kaj do por helpi la korektadon de eraraj frazoj de anoj ne plu aktivaj. Laŭ la kutimo ni invitas ĉiun komenti pri tio en privata mesaĝo al ni (simple alklaku la suban ligilon).
[DEU] Korpuspflegerkandidat für das Türkische.
Soliloquist kandidiert, um Korpuspfleger für das Türkische zu werden und bei der Korrektur fehlerhafter Sätze nicht mehr aktiver Mitglieder zu helfen. Wie immer ist jeder eingeladen, sich hierzu in einer Privatnachricht an uns zu äußern (einfach auf die Verknüpfung am Ende klicken).
[ENG] Corpus Maintainer Candidate for Turkish.
Soliloquist stands as a candidate to become a corpus maintainer for Turkish and help make necessary corrections in sentences owned by inactive members. As usual, we invite everyone to give us your comments in a private message (click on the link below).
Georgian also had a wrong flag.
I deployed a fix, should be correct now. Please let me know if you notice other languages that ended up with the wrong flag.
Hopefully it was just those three.
What was the reason for changing the colors on the flags?
- Are they now closer to what the real flag colors are?
- Or, does this change make the file sizes of the flags smaller?
- Do these colors display better on mobile screens?
The reason was to make it all more consistent. Older language icons were designed with color saturation. Newer ones were designed with the original flag colors. We decided to stick to the original color for all icons. And if there is any color adjustment or any effects, it should be applied in the CSS rather than in the image itself.
Not a wrong flag, but I noticed that the icon for Inuktitut looks a bit funky, as if it has some artificial cloth folds.
Can we replace it with a more standard Nunavut flag icon like this?
Yes, this should be updated at some point.
In general, you can contact sabretou for such requests. He is the one in charge of the language icons. Updating the icon for Nunavut may already be on his to-do list.
Ah, I missed this one. It's a good suggestion; I'll open a GitHub issue for it shortly.
** Stats - 2020-01-04 - Native Speaker Sentence Counts **
Find out what members' native languages are.
There is an error in the following French sentence, #5996743 (see my comment under the sentence). This sentence belongs to PERCE_NEIGE, who no longer seems to be active:
Could someone with the necessary access rights apply the corrections?
In that case, with a semicolon, there should also be a space before it, I think.
You can notify me directly in a comment with @Aiji if you need to; that will be more convenient than on the wall :)
OK, I'll send my requests directly to you next time.
By the way, a space really is needed before the semicolon... I understand that my national government is flexible, but Grévisse is clear on this ⟶ comma... please?
If a language has several accepted standards, a sentence that follows any one of them is acceptable. Alternatively, sentences following each of the standards can be added separately.