** Gender imbalance **
I was just translating a sentence and had the feeling that there might just be a bit of a gender imbalance in the use of personal pronouns in the corpus.
I ran a quick search for the following words in English sentences, allowing the first letter to be upper or lower case and requiring the word to be followed by either a space or a punctuation mark. The number of sentences each word was used in:
she 10427
her 6467
hers 34
he 23339
him 5240
his 10285
Talk about objectifying women! ;-)
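For anyone who wants to repeat the count, here is a minimal sketch of the kind of search described above. It is not the exact query I ran: the function names, the sample sentences, the leading word boundary and the particular set of punctuation marks are all assumptions made for illustration.

```python
import re

PRONOUNS = ["she", "her", "hers", "he", "him", "his"]

# Assumed set of characters that may follow the word: whitespace or
# common punctuation. The original search may have used a different set.
FOLLOWED_BY = r"""[\s.,;:!?"')]"""


def pronoun_pattern(word):
    """Build a regex for `word` with its first letter in either case.

    The leading \\b is an assumption, added so that "he" does not match
    inside "she" or "the".
    """
    first, rest = word[0], word[1:]
    either_case = "[" + first.upper() + first.lower() + "]"
    return re.compile(r"\b" + either_case + re.escape(rest) + FOLLOWED_BY)


def count_sentences(sentences, words=PRONOUNS):
    """Return {word: number of sentences that contain it at least once}."""
    counts = {}
    for word in words:
        pattern = pronoun_pattern(word)
        counts[word] = sum(1 for s in sentences if pattern.search(s))
    return counts


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the corpus.
    sample = [
        "She gave him her book.",
        "He is his own boss.",
        "The book is hers.",
        "Is that he?",
        "They left.",
    ]
    for word, n in count_sentences(sample).items():
        print(word, n)
```

Note that this counts sentences, not occurrences, so a sentence that uses "he" twice still counts once.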
Though I'm not a big fan of positive discrimination, I figured I'd bring this up here on the Wall. People might want to keep it in mind when thinking about areas where the corpus is lacking, so as to maximise the usefulness of their contributions, for example by contributing sentences with both pronouns.
There are a few tags, such as “lnk to alternative grammar”, that can be applied to a sentence that is linked to a version of itself in an alternative gender. This will help bind sentences together (and slightly reduce the likelihood of duplicates) until we get more descriptive links.
The “lnk to *” tags are few enough that they should all come up when you start typing in the tag field, but there is also a list at: http://martin.swift.is/tatoeba/tags.html#links and you can also search through the “Browse tags” page for them.
Trying to impress the girls, Martin Swift? It looks rather desperate...
I, for one, try to switch things up, but I don't have a precise algorithm for it.
A user suggested in a private message that one might also take into account persons referred to as a relation of someone referred to by a personal pronoun. I checked for:
[Hh]er (boyfriend|fiance|fiancé|husband|brother|father|grandfather|son|grandson)
and
[Hh]is (girlfriend|fiance|fiancé|wife|sister|mother|grandmother|daughter|granddaughter)
which gave 367 and 495 sentences, respectively. So, not a huge contribution, but I figured I'd mention it for the sake of completeness.
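The two relation searches can be sketched in the same way. Again, this is only an illustration: the helper names and sample sentences are hypothetical, and the word boundaries on both ends are an assumption that is not part of the patterns quoted above (they keep "her son" from matching "her song", at the cost of skipping plurals such as "his sisters").

```python
import re

# Relation words copied from the two patterns quoted above.
HER_RELATIONS = ["boyfriend", "fiance", "fiancé", "husband", "brother",
                 "father", "grandfather", "son", "grandson"]
HIS_RELATIONS = ["girlfriend", "fiance", "fiancé", "wife", "sister",
                 "mother", "grandmother", "daughter", "granddaughter"]


def relation_pattern(possessive, relations):
    """Match e.g. "her husband" or "Her husband" as whole words.

    The \\b anchors are an assumption added for this sketch.
    """
    first, rest = possessive[0], possessive[1:]
    either_case = "[" + first.upper() + first.lower() + "]"
    alternatives = "|".join(re.escape(r) for r in relations)
    return re.compile(r"\b" + either_case + rest + r" (?:" + alternatives + r")\b")


def count_matching(sentences, pattern):
    """Number of sentences in which the pattern occurs at least once."""
    return sum(1 for s in sentences if pattern.search(s))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the corpus.
    sample = [
        "Her husband is a doctor.",
        "His wife and his sister left.",
        "She likes her song.",
        "I met his grandmother.",
        "Another brother arrived.",
    ]
    print("her + male relation:", count_matching(sample, relation_pattern("her", HER_RELATIONS)))
    print("his + female relation:", count_matching(sample, relation_pattern("his", HIS_RELATIONS)))
```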