Wall (5,665 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

doemaar14 doemaar14 16 days ago January 11, 2020 at 11:03 PM link permalink

I think there's an overabundance of the general use vernacular ''American past simple'' and a serious shortage of the past perfect (the present perfect as well but that's less problematic). I mean, is everyone okay with that? Just wondering, because sometimes there's a subtle difference in meaning.

E.g. in the case of sentence nr. 6353363: Tom said he didn't have any idea why Mary did that.
With past perfect tense: ''Tom said he didn't have any idea why Mary had done that.''

In e.g. Dutch this would produce two different translations:
(if past simple) Tom zei dat hij geen idee had waarom Mary dat deed.
Mary was still doing it or did it as a habit when Tom said it.

(if past perfect) Tom zei dat hij geen idee had waarom Mary dat had gedaan.
Mary had finished doing it before Tom said what he said. It's similar to the past perfect in English, which is used to make it clear that one event happened before another.

Many translators solve this by providing two different translations, but this isn't being done consistently across Tatoeba.

Pfirsichbaeumchen Pfirsichbaeumchen 16 days ago, edited 16 days ago January 12, 2020 at 2:18 AM, edited January 12, 2020 at 4:01 AM link permalink

Yes, you are right that the "American simple past" is quite overabundant, but that is because most of those sentences belong to the same contributor, whose way of speaking that is. Other examples include

“Mary said she doesn’t have very much money.” (#6417800)

My recommendation would be to guess the most obvious meaning (if it seems ambiguous) and translate it accordingly into correct indirect speech in your target language. For example, I would assume that “Tom said he didn’t have any idea why Mary did that” means “Tom said he didn’t have any idea why Mary had done that” and not “Tom said he didn’t have any idea why Mary was doing that” or “Tom said he didn’t have any idea why Mary did that occasionally”. Thus, I would translate it as „Tom sagte, er habe keine Ahnung, warum Maria das getan habe“ into German. Maybe I would ask for confirmation as to what he was really trying to express in that particular case before translating it. The answer is indeed usually, “It can be either.”

AlanF_US AlanF_US 16 days ago January 12, 2020 at 2:28 AM link permalink

> I think there's an overabundance of the general use vernacular ''American past simple'' and a serious shortage of the past perfect (the present perfect as well but that's less problematic).

Do you mean that you're having trouble finding sentences in the past perfect, perhaps because you want to translate them? Or that you wish the distribution of sentences was different, maybe more diverse, or more reflective of the frequencies found in some external corpus (if so, where?)?

If you want to find sentences in the past perfect, you can do a search like this:

had << *bed|*ced|*ded|*ved|*ted|*zed

which you can expand to other combinations ending with "ed". It's not a perfect search (so to speak), but it does find some instances of the past perfect. Note that you can't simply write

*ed

because a wildcard like this must be accompanied by a string of at least three characters.
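As a rough sketch of how the longer version of that query could be generated instead of typed by hand, the snippet below builds one wildcard term per letter, so that every term keeps three literal characters. The function names are made up for illustration; only the shape of the query comes from the search shown above.

```python
import string

MIN_WILDCARD_CHARS = 3  # minimum literal characters next to "*", per the note above


def ed_wildcards(letters=string.ascii_lowercase):
    """Return wildcard terms such as '*bed', one per letter."""
    terms = ["*" + letter + "ed" for letter in letters]
    # Each term keeps three literal characters (letter + "ed").
    assert all(len(term.lstrip("*")) >= MIN_WILDCARD_CHARS for term in terms)
    return terms


def past_perfect_query(letters=string.ascii_lowercase):
    """Build a query of the form 'had << *aed|*bed|...'."""
    return "had << " + "|".join(ed_wildcards(letters))


print(past_perfect_query("bcdvtz"))
# had << *bed|*ced|*ded|*ved|*ted|*zed
```

Calling `past_perfect_query()` with no argument gives the same query expanded to all 26 letters.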

CK CK 16 days ago, edited 16 days ago January 12, 2020 at 2:47 AM, edited January 12, 2020 at 4:55 AM link permalink

You could try this to match "had" and contractions like "you'd" and "I'd" followed by commonly used irregular past participles or any word ending in "ed":

d|had NEAR/1 been|beaten|become|begun|bet|blown|broken|brought|built|burst|bought|caught|chosen|come|cost|cut|dealt|done|drawn|drunk|driven|eaten|fallen|fed|felt|fought|found|flown|forgotten|frozen|gotten|given|gone|grown|hung|had|heard|hidden|hit|held|hurt|kept|known|laid|led|left|lent|let|lain|lit|lost|made|meant|met|paid|put|read|ridden|rung|risen|run|seen|sold|sent|set|shaken|stolen|shone|shot|shown|shut|sung|sunk|sat|slept|slid|spoken|spent|sprung|stood|stuck|sworn|swept|swum|swung|taken|taught|torn|told|thought|thrown|understood|woken|worn|woven|won|written|*aed|*bed|*ced|*ded|*eed|*fed|*ged|*hed|*ied|*jed|*ked|*led|*med|*ned|*oed|*ped|*qed|*red|*sed|*ted|*ued|*ved|*wed|*xed|*yed|*zed

Fewest Words First
https://tatoeba.org/eng/sentenc...io=&sort=words

Newest First
https://tatoeba.org/eng/sentenc...=&sort=created

Sentences with the Most Words First
https://tatoeba.org/eng/sentenc...rt_reverse=yes

Oldest First
https://tatoeba.org/eng/sentenc...rt_reverse=yes

Relevance
https://tatoeba.org/eng/sentenc...sort=relevance
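For anyone working with a downloaded copy of the sentences rather than the site search, the same idea ("had" or a "'d" contraction followed by a participle) can be approximated with a regular expression. This is only a sketch: the participle list is a short illustrative subset, the function name is hypothetical, and the pattern requires the participle to follow directly, which is stricter than the NEAR/1 search above.

```python
import re

# A short, illustrative subset of irregular past participles; extend as needed.
IRREGULAR = ["been", "done", "gone", "seen", "taken", "written", "known", "given"]

# "had" or a "'d" contraction, then one word that is either in IRREGULAR
# or ends in "ed". Like the search above, this over-matches: "I'd" can
# also stand for "I would", and not every "-ed" word is a participle.
PAST_PERFECT = re.compile(
    r"(?:\bhad|\b\w+['’]d)\s+(?:" + "|".join(IRREGULAR) + r"|\w+ed)\b",
    re.IGNORECASE,
)


def looks_past_perfect(sentence):
    """Return True if the sentence contains a likely past perfect form."""
    return PAST_PERFECT.search(sentence) is not None


examples = [
    "Tom said he didn't have any idea why Mary had done that.",
    "Tom said he didn't have any idea why Mary did that.",
]
for s in examples:
    print(looks_past_perfect(s), s)
```

The first example is flagged and the second is not, which mirrors the distinction discussed at the top of this thread.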


To find sentences not yet translated into your own native language, you can fine-tune any of the above by choosing "exclude" and your own native language under "Translations" on the right side of the advanced search results.

For example, this is the first search above, limited to sentences not yet translated into Dutch.
https://tatoeba.org/eng/sentenc...io=&sort=words


** Some Ideas for Seeing More of These Sentences than the 1,000-Sentence Limit **

- Limit the searches to ones with audio first.
- And then, limit the search to ones without audio.
- Do both of the above for all the versions of the searches above.


For more sentences, you can also change the list from "proofread" to "unspecified."


doemaar14 doemaar14 16 days ago January 12, 2020 at 4:48 AM link permalink

>Do you mean that you're having trouble finding sentences in the past perfect, perhaps because you want to translate them?

Nope, but thanks, anyway. I just started wondering about it after seeing so many ambiguous occurrences of the ''past simple'' while translating random sentences. I know that CK uses American English so that's probably why. Maybe the past perfect tense isn't used as much there as it is in German, Italian, Dutch, etc.?
Sentence nr. 2790613 is another example; look at the German translation: ''was er getan hat''.
So, Pfirsichbaeumchen's reply is a good pointer.

AlanF_US AlanF_US 15 days ago, edited 15 days ago January 12, 2020 at 3:26 PM, edited January 12, 2020 at 3:27 PM link permalink

Just to be clear: The examples we've been talking about have dealt with indirect speech ("He|she said|thought that..."). And yes, at least in US English (and maybe in English from elsewhere), we don't think very much about the distinction between "Mary said she didn’t have very much money" and "Mary said she doesn’t have very much money." The assumption is that if she didn't have much money at the time that she made the statement, she probably doesn't have much money in general. If she had suddenly won the lottery since then, the sentence would emphasize that through other means, such as adverbial phrases ("At the time, Mary said she didn't have very much money"). But in the absence of such context, as Pfirsichbaeumchen said, you can generally translate it either way.

doemaar14 doemaar14 14 days ago January 14, 2020 at 1:57 AM link permalink

>"Mary said she didn’t have very much money" and "Mary said she doesn’t have very much money."

That's something else entirely, though. In Dutch it works the same way: ''Mary zei dat ze niet veel geld -had-'' and ''Mary zei dat ze niet veel geld -heeft-'' both, in the absence of adverbial phrases, mean the same thing.

Thanuir Thanuir 16 days ago January 12, 2020 at 7:27 AM link permalink

Since Tatoeba is a volunteer project, it's usually not worth complaining that there is too much of something. Instead, it's better to add more of whatever you see too little of.

In other words: write new Dutch sentences that use the underrepresented tense. When you translate sentences, systematically add every kind of translation, or alternate between them, or give priority to the underrepresented translations.

(The names "Tom" and "Mary" are also overrepresented in many languages. I recommend using other names as well, and especially adding sentences with names that are typical of Dutch.)

CK CK 16 days ago January 12, 2020 at 6:49 AM link permalink

** Updated **

http://www.manythings.org/bilingual
http://www.manythings.org/anki

Thanks to all of you who have translated proofread English sentences on List 907 into your own native languages.

marioo marioo 15 days ago January 12, 2020 at 4:41 PM link permalink

CK, I wonder why English-Esperanto doesn't show up in your list?
http://www.manythings.org/bilingual

Given the very nature of Esperanto as a constructed language, there are very few "native" speakers. (There is no "Esperantlando" country yet!) Still, with over 610,000 sentences--the fifth largest in the corpus--it is not negligible either. (Understandably, not all of them would make it to List 907.)

On your list, there are many languages with only a few hundred sentences (e.g. Albanian, Cebuano, Tamil, etc.). With millions of Esperanto speakers worldwide, it is quite possible that some speakers of those few-sentence languages know Esperanto. Making English-Esperanto available could give them an additional avenue to explore English further.

nwt nwt 16 days ago January 12, 2020 at 8:32 AM link permalink

A question regarding Tatoeba's export of Japanese-English sentence pairs, as hosted on Jim Breen's page ( ftp.monash.edu.au/pub/nihongo ):

Is the script that generates that file public?

Especially the part creating the accompanying "B" lines with furigana, dictionary forms and such.

I'm guessing there's some tokenizer like MeCab or Juman involved, but would it be possible to share the parameters used, so that the same output format can be recreated?

There's quite a lot of software available that reads it, so it would be excellent if one could use it with custom input.

Thanks!

Yorwba Yorwba 16 days ago January 12, 2020 at 11:47 AM link permalink

IIRC, the current tokenizer is Jim Breen himself, manually annotating sentences in this format. I guess it would be possible to take the tokenization e.g. Mecab outputs and massage it into something more like the "B lines" format, but it won't be of similar quality.

AlanF_US AlanF_US 15 days ago January 12, 2020 at 3:10 PM link permalink

You could ask Jim via a private message. His username is JimBreen.

Aiji Aiji 16 days ago, edited 16 days ago January 12, 2020 at 5:47 AM, edited January 12, 2020 at 5:47 AM link permalink

I was wondering what the policy on sentences containing URLs is, if there is one. I can easily imagine people abusing it, but I can also easily imagine how to control it. So I'm just curious.

CK CK 16 days ago, edited 16 days ago January 12, 2020 at 6:22 AM, edited January 12, 2020 at 6:23 AM link permalink

It's best to use example.com or http://example.com, a URL set up for this kind of thing.


We have some sentences using this.

https://tatoeba.org/eng/sentenc...ry=example.com

hamzah hamzah 16 days ago January 11, 2020 at 4:44 PM link permalink

Hello, everyone!
I have a question: when I search for something in German, I am shown only 1,000 results or sentences. How can I see all the hits, all 489,215 of them? Thanks.

Thanuir Thanuir 16 days ago January 11, 2020 at 8:15 PM link permalink

If you just want to see new sentences, you can search for random sentences. You probably haven't seen all of the ones shown. https://tatoeba.org/eng/sentenc...&sort_reverse=

If you want to translate sentences, tell the search not to show sentences that already have a translation. https://tatoeba.org/eng/sentenc...&sort_reverse=

In both cases, change the search word in the menu in the right sidebar, and also run the search from the right sidebar menu (in the desktop view; I don't know about other devices).

If you want to use the sentences for some other purpose, it is probably best to download the database and search for sentences there. Others can help with this better than I can.

rumpelstilzchen rumpelstilzchen 16 days ago January 11, 2020 at 8:17 PM link permalink

If you need all the German sentences, you have to download either http://downloads.tatoeba.org/ex...tences.tar.bz2 or http://downloads.tatoeba.org/ex...tailed.tar.bz2 and then pick out the German sentences (see https://tatoeba.org/deu/downloads for more information).
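A minimal sketch of that filtering step, assuming the extracted file is tab-separated with one sentence per row in the order id, language code, text, and that German is marked with the code "deu". The function name and the sample rows are made up for illustration; check the downloads page for the actual layout of the file you fetch.

```python
import csv
import io


def sentences_in_language(lines, lang_code):
    """Yield (sentence_id, text) for rows whose language column equals lang_code.

    Assumes each row is: id <TAB> language code <TAB> text.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) >= 3 and row[1] == lang_code:
            yield row[0], row[2]


# Tiny in-memory sample standing in for the extracted file.
sample = io.StringIO(
    "1\tdeu\tDas ist ein Satz.\n"
    "2\teng\tThis is a sentence.\n"
    "3\tdeu\tIch habe eine Frage.\n"
)
german = list(sentences_in_language(sample, "deu"))
print(german)
```

With a real export, the same function can be given an open file object instead of the sample, for example one opened with `open(path, encoding="utf-8", newline="")`.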

CK CK 17 days ago January 11, 2020 at 2:07 AM link permalink

** Members' Languages **

http://tatoeba.byethost3.com/20...anguages2.html

Sorted by native language; can be re-sorted by username.

JavaScript programmers can see how I generated the above data by viewing the source of the following page.

http://tatoeba.byethost3.com/20...languages.html

Sorted by username

CK CK 17 days ago January 11, 2020 at 4:28 AM link permalink

** Sentence Owners' Languages **

http://tatoeba.byethost3.com/20...anguages3.html

This will perhaps be a more useful page, since it's limited to members with declared native languages who own native language sentences.

Ricardo14 Ricardo14 17 days ago January 10, 2020 at 6:20 PM link permalink

My friends, is there a way to look for members whose native language is X and who also speak Y?

Example: native speakers of English who also speak Portuguese.

Thanks.

Ergulis Ergulis 17 days ago January 10, 2020 at 8:58 PM link permalink

I am afraid that at present, there is no way to do that other than checking their profiles.

deniko deniko 17 days ago January 10, 2020 at 9:04 PM link permalink

Not exactly what you're looking for, but you can find all profiles explicitly mentioning English and Portuguese:

https://www.google.co.uk/search...r%2Fprofile%2F

CK CK 17 days ago, edited 17 days ago January 10, 2020 at 11:36 PM, edited January 10, 2020 at 11:43 PM link permalink

Deniko's idea is nice. I'd never thought of doing it that way.

Another way is to download the user_languages.tar.bz2 from tatoeba.org and browse it offline.

Here is data from that file, sorted by username and then by claimed language level. You can download this and browse it. (5 = native language)

http://tatoeba.byethost3.com/us...2020-01-04.zip
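For the original question (native in X, also speaks Y), the offline file can be queried with a few lines of Python. This is only a sketch under assumptions: it takes the rows to be tab-separated in the order language code, level, username, details, and it treats level 5 as native, as noted above. The function name and the usernames in the sample are invented for illustration.

```python
import csv
import io
from collections import defaultdict

NATIVE_LEVEL = "5"  # per the note above, 5 = native language


def native_speakers_who_also_speak(lines, native_lang, other_lang):
    """Return sorted usernames with native_lang at level 5 and other_lang at any level.

    Assumes each row is: language code <TAB> level <TAB> username <TAB> details.
    """
    levels = defaultdict(dict)  # username -> {language code: level}
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) >= 3:
            lang, level, username = row[0], row[1], row[2]
            levels[username][lang] = level
    return sorted(
        user
        for user, langs in levels.items()
        if langs.get(native_lang) == NATIVE_LEVEL and other_lang in langs
    )


# Made-up usernames, for illustration only.
sample = io.StringIO(
    "eng\t5\talice\t\n"
    "por\t3\talice\t\n"
    "eng\t5\tbob\t\n"
    "por\t5\tcarla\t\n"
    "eng\t2\tcarla\t\n"
)
print(native_speakers_who_also_speak(sample, "eng", "por"))  # ['alice']
```

Swapping the two language codes answers the reverse question (native Portuguese speakers who also list English).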

Pfirsichbaeumchen Pfirsichbaeumchen 18 days ago January 10, 2020 at 9:20 AM link permalink

@Deniko

З Днем народження, Денисе! Happy birthday, Denis! 😊

deniko deniko 18 days ago January 10, 2020 at 9:30 AM link permalink

Thanks a lot :) It's amazing you haven't forgotten to use the vocative case, nice one!

marafon marafon 17 days ago January 10, 2020 at 3:55 PM link permalink

Denis, happy birthday!

deniko deniko 17 days ago January 10, 2020 at 3:58 PM link permalink

Marina, thanks!

Ricardo14 Ricardo14 17 days ago January 10, 2020 at 1:34 PM link permalink

Happy birthday, Deniko!!!!

deniko deniko 17 days ago January 10, 2020 at 3:58 PM link permalink

Thanks a lot Ricardo :)

BakirHamou BakirHamou 17 days ago January 10, 2020 at 2:30 PM link permalink

On the occasion of the Amazigh New Year 2970, which coincides with the second year since its constitutional recognition by the Algerian state, I send my wishes for happiness and prosperity, first to the activists of greater Tamazgha, then to the linguists who spare no effort to make Tamazight the instrument of North African unity, and finally to all of humanity committed to the values of living together with respect for our differences.

mellalamellal mellalamellal 17 days ago January 10, 2020 at 2:35 PM link permalink

Happy New Year to my brothers and sisters!

Happy Amazigh New Year to everyone in Algeria and the world.

lbdx lbdx 18 days ago January 9, 2020 at 8:54 PM link permalink

Here is a list of English translations of the twenty most viral sentences on Tatoeba in 2019.

Bedouins live in the desert.
We live in a society.
This is not his handwriting.
Tom has friends in Germany.
My name is Omid.
The cat is big.
My name is Dilshad.
They will not pass!
The queen must die.
Can you understand Tom?
Is she Italian?
Marina is from Russia and Clarissa is from Sweden.
My bicycle is red.
Tom's cat is sick.
Kigali is the capital of Rwanda.
Never.
My cat meows a lot.
Tom works at a hospital.
India is my country.
My father didn't know her.

The big difference from YouTube is that on Tatoeba, Tom is even more popular than cats! No mention of Mary in the top 20, though...

CK CK 18 days ago, edited 18 days ago January 9, 2020 at 9:29 PM, edited January 9, 2020 at 9:34 PM link permalink

These may not have really been viral, since almost all translations for the first sentence are by one member.

[#7697543] Bedouins live in the desert.

I didn't check the others.

Thanuir Thanuir 17 days ago January 10, 2020 at 1:22 PM link permalink

Half of the personal names in the list are 'Tom', as are two-thirds of the names that are often male (Omid and Dilshad do not seem to be gender-specific). That is still far too many for the sake of name diversity.

On the other hand, 'Mary' and other too often used 'standard' words did not propagate severely, according to this list.

Aiji Aiji 17 days ago January 10, 2020 at 2:04 PM link permalink

Tom is the most-used non-stopword in the French corpus.