Tatoeba
pullnosemans, January 2, 2016 at 8:28:18 AM UTC (edited January 2, 2016 at 8:35:11 AM UTC)

**content and context in example sentences**

has there ever been any discussion about how to make example sentences in the tatoeba project more complete in terms of content and context? I remember sacredceltic raising the issue of the one-sidedness of the english corpus, in that a large portion of its sentences start with "tom", but has there ever been any thought about creating a guideline on what information to include in a sentence?

as a site that works with example sentences, tatoeba has to be aware that example sentences generally have the problem of existing outside of discourse, and can therefore easily become hard or impossible to interpret unless they provide a sufficient amount of context.

for example, I just changed a sentence of mine that read
"Sie teilen die Entfernung und Richtung des Futters durch einen Tanz mit.",
a more or less close equivalent of japanese "踊りによってその食糧までの距離や方角を伝える。"
and english "They communicate the distance and direction of the food by dancing."

these sentences do not contain actual content as to who communicates, with or to whom they communicate, what food we're talking about, and so on. if the reader happens to not know that this is the way bees communicate, the sentence would probably make little sense to them, and would therefore be hard to use. both sentences suggest that they are taken from a text, especially the japanese one, which contains an anaphoric "その".
I have now changed the german one to say that it is bees who communicate, and that it is each other they communicate with. with this information, it also becomes clear that it is general food sources, the kind that would be relevant for bees, that are meant: http://tatoeba.org/jpn/sentences/show/4418682

skimming through my example sentences, I have unfortunately noticed that many of them are sentences of this kind: they have no grammatical or idiomatic problems, but also no real semantic content, often because they're translations of sentences with the same problem. I would like to go over them and make them more contextual, but this would create the big problem that I would have to unlink them from their less contextual equivalents, maybe leaving comments encouraging those sentences' owners to change them accordingly and re-establish the links.

this would be very cumbersome for me and other people, and I don't think semantically thin sentences like "they shook hands" are per se bad; they can be useful in the early stages of language learning. so instead I propose that the site admonish its contributors to avoid creating too many "pronoun - verb - specific assertion without context" sentences, and to try instead to create more self-contained sentences that contain general assertions ("the sun is the center of our solar system") or that contain enough context for specific assertions to be interpretable ("whenever I go jogging, I listen to music" rather than "I listen to music").

I think this would greatly improve the usefulness of this site. especially if this message gets through to the small core group of people who are highly active in creating sentences, it could make a big difference.

Selena777, January 2, 2016 at 9:02:16 AM UTC

Probably the best way would be to create a new sentence with context (like "bees communicate" instead of "they communicate") and leave a link to it in a comment on the initial sentence, instead of changing and unlinking the initial sentence itself.

CK, January 2, 2016 at 11:19:08 AM UTC (edited October 30, 2019 at 10:36:23 AM UTC)

[not needed anymore - removed by CK]

pullnosemans, January 2, 2016 at 12:14:21 PM UTC (edited January 2, 2016 at 12:19:36 PM UTC)

the phenomenon is not at all limited to unowned sentences or sentences from the tanaka corpus. I just refreshed the home page about twenty to thirty times, and I would say that about 30% to 40% of the random sentences exhibit this kind of thing, mostly in the form of deictic expressions without any referent, such as "then", "there", "she", "this one", and so on.

however, it is true that bad tanaka sentences tend to be among the most puzzling ones. it's a real pity that the japanese corpus has this huge problem.