Saney's messages on the Wall (total 6)

Saney, January 3, 2012 at 2:05:44 PM UTC

Are there any other vegans here? I started a list called "negotiation for a vegan meal"; negotiating one can be tiring even in your mother tongue. Feel free to add sentences and translate them.

http://tatoeba.org/deu/sentences_lists/show/862

Saney, December 27, 2011 at 3:07:42 PM UTC

Can you follow these connections on the website? I had a similar problem with finding translations; it turned out to be a bug in my script.

Saney, December 20, 2011 at 8:55:07 AM UTC

Sorry for pulling this up again, but I just wanted to say two things. First, thanks for all your comments! There are some points really worth considering. Also, I wasn't aware of universalsubtitles.org; I'll check that out sometime. Second, I guess I will eventually try to implement the proposed script, but it may not happen in the next few weeks. If I achieve something that may be of use to Tatoeba, I'll leave another comment. While I'm at it: thanks to everyone contributing here. This has become my one and only source of learning material. It's awesome to have native speaker audio and sentences that are batch-processable :-)

Saney, December 14, 2011 at 11:52:53 AM UTC

To get some sentences with interrelated context and audio, I plan to write a (Linux shell) script that takes a movie and two subtitle files as arguments. The idea is to link the sentences from the two subtitle files and have sox (an audio editing program) cut the matching audio out of the movie, using the times given in the subtitle files. All this information would be written to a file that can be imported into Anki.
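The timing step of that plan could look roughly like the following. This is a minimal sketch in Python rather than shell, assuming SRT-style subtitle files and an audio track that has already been extracted from the movie; the file names `movie.wav` and `clip_0001.wav` and all function names are made up for illustration. It only builds the sox command (using sox's `trim` effect, which takes a start position and a length) and does not run it.

```python
import re

# Matches an SRT timing line such as "00:01:02,500 --> 00:01:05,250".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)

def parse_timing(line):
    """Return (start, end) in seconds for one SRT timing line."""
    m = TIMING.search(line)
    if m is None:
        raise ValueError(f"not an SRT timing line: {line!r}")
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in m.groups())
    start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
    end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
    return start, end

def overlap(a, b):
    """Seconds shared by two (start, end) intervals; one way to link subtitle lines."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def sox_trim_command(audio_in, audio_out, start, end):
    """Build (but do not run) a sox call that cuts start..end out of audio_in."""
    duration = end - start
    return ["sox", audio_in, audio_out, "trim", f"{start:.3f}", f"{duration:.3f}"]

# Hypothetical file names, for illustration only.
start, end = parse_timing("00:01:02,500 --> 00:01:05,250")
print(sox_trim_command("movie.wav", "clip_0001.wav", start, end))
```

Linking lines by time overlap is only one possible heuristic; subtitle files in two languages often split sentences differently, so a real script would probably need some tolerance or manual review.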

Now, couldn't this be used for the Tatoeba corpus as well (text only, not the audio, due to obvious licensing problems)? I guess there's some kind of bulk import on the admin side. I quickly checked the first subtitle site I could come up with, opensubtitles.org, and found this on their homepage:

"As a name suggest OpenSubtitles.org is trying to be as open as possible. You can code application, script, utility or whatever you think is nice. First check which applications exists here [a link], then follow this [another link]."

What do you think? Or did someone have the idea already anyway?

Saney, November 30, 2011 at 7:27:34 AM UTC

Hey everyone, a few days (weeks?) ago I posted something about a script I wrote that helps me find Mandarin Chinese sentences that I can easily understand because I already know all the characters. I've finally written some documentation and put it on my blog:

http://www.der-patriotische-kap.../findsentences

Here's the abstract:

This script helps to find example sentences for Mandarin Chinese on tatoeba.org, based on a file with all the characters that the sentences may (exclusively) consist of. It outputs a file with the sentences it found, which can be more or less conveniently imported into Anki, a spaced repetition learning program.

In addition to the sentences, the script searches a copy of the Chinese-English CEDICT dictionary to fetch translations for words of two or three characters that occur in the sentences. It also searches tatoeba.org for native Mandarin audio and downloads the audio file if there is one.
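The dictionary lookup can be pictured roughly as follows. This is a minimal sketch in Python, not the script from the blog: it assumes the CC-CEDICT line format (`Traditional Simplified [pinyin] /gloss/gloss/`), assumes the sentences use simplified characters, and uses a tiny in-memory sample instead of the real dictionary file. The function names are made up for illustration.

```python
import re

# One CC-CEDICT entry: "Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/"
ENTRY = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/$")

def load_cedict(lines):
    """Map each simplified headword to its list of English glosses."""
    entries = {}
    for line in lines:
        if line.startswith("#"):  # comment lines in the dictionary file
            continue
        m = ENTRY.match(line.strip())
        if m:
            entries.setdefault(m.group(2), []).extend(m.group(4).split("/"))
    return entries

def compounds_in(sentence, entries, lengths=(2, 3)):
    """Look up every 2- and 3-character substring of the sentence in the dictionary."""
    found = {}
    for n in lengths:
        for i in range(len(sentence) - n + 1):
            word = sentence[i:i + n]
            if word in entries and word not in found:
                found[word] = entries[word]
    return found

# Tiny stand-in for the dictionary file, for illustration only.
sample = [
    "# comment line",
    "學生 学生 [xue2 sheng5] /student/schoolchild/",
    "圖書館 图书馆 [tu2 shu1 guan3] /library/",
]
entries = load_cedict(sample)
print(compounds_in("学生在图书馆看书。", entries))
```

Checking every substring is crude but cheap for short sentences; it will also report words that only appear by accident across a word boundary.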

Furthermore, it is possible to generate a set of audio files: if a sentence has native Mandarin audio, the script checks the German or English translation for audio files. If those have audio as well, it concatenates the audio files using the following pattern: Chinese - translation - Chinese - 1 second pause, preferring German over English. If there is no native audio file for the German or English translation, it synthesizes one using the text-to-speech program espeak. Put these files on your mp3 player.
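The selection logic behind that pattern might be sketched like this. It is an illustration in Python under stated assumptions, not the actual script: the data layout, the function names, and the file names are invented, and the rule "any native audio beats synthesized audio, German beats English within each group" is one reading of the description above. The sketch only plans the steps; it does not touch audio files or call espeak.

```python
PREFERRED = ("deu", "eng")                  # German before English
ESPEAK_VOICE = {"deu": "de", "eng": "en"}   # espeak voice names

def pick_translation(translations):
    """Return (lang, entry) for the translation to use, or None if there is none."""
    # First choice: a translation with native audio, German before English.
    for lang in PREFERRED:
        entry = translations.get(lang)
        if entry and entry.get("audio"):
            return lang, entry
    # Fallback: a translation with text only, to be synthesized.
    for lang in PREFERRED:
        entry = translations.get(lang)
        if entry and entry.get("text"):
            return lang, entry
    return None

def plan_track(cmn_audio, translations, pause_seconds=1.0):
    """Plan one track: Chinese, translation, Chinese, pause."""
    picked = pick_translation(translations)
    if picked is None:
        return []
    lang, entry = picked
    if entry.get("audio"):
        middle = ("play", entry["audio"])
    else:
        middle = ("synthesize", ESPEAK_VOICE[lang], entry["text"])
    return [("play", cmn_audio), middle, ("play", cmn_audio), ("silence", pause_seconds)]

# Hypothetical file names, for illustration only.
translations = {
    "deu": {"text": "Ich bin Student.", "audio": None},
    "eng": {"text": "I am a student.", "audio": "eng_42.mp3"},
}
print(plan_track("cmn_17.mp3", translations))
```

In this example the English recording wins because the German translation has no native audio; with no recordings at all, the German text would be handed to the synthesizer.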

Saney, October 26, 2011 at 4:41:22 AM UTC

Oh, Tatoeba is so helpful to me. Actually, I've never added a sentence but just profited from everyone's work!

I'm learning Mandarin Chinese. My primary tool is Anki, a spaced repetition program. With the help of a plugin, I keep track of which characters I already know (or, to be more precise, which characters have already been inserted into my deck). A couple of weeks ago I wrote a script that searches the Tatoeba corpus (which I downloaded) for sentences that a) do not include unknown characters, unless I explicitly allow it, and b) have a translation into English or German (my mother tongue). The script then looks for the audio tag, downloads the audio file if there is one, and checks the CEDICT dictionary (also downloadable) for any words that can be formed from the characters of the sentence. Finally, it produces a list of sentences, their translations, and compound words that can easily be imported into Anki.
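Conditions a) and b) could be sketched as follows. This is an illustration in Python, not the script itself: it works on small in-memory stand-ins for the downloaded corpus (sentences keyed by id, plus a list of translation links), treats only characters in the basic CJK block as "characters that must be known", and uses a hypothetical `max_unknown` parameter to model the option of allowing unknown characters. The sentence ids and the known-character set are made up.

```python
def is_hanzi(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"

def unknown_chars(text, known):
    """Chinese characters in the text that are not in the known set."""
    return {ch for ch in text if is_hanzi(ch) and ch not in known}

def find_sentences(sentences, links, known, targets=("eng", "deu"), max_unknown=0):
    """sentences: id -> (lang, text); links: iterable of (id, translation_id)."""
    translations = {}
    for src, dst in links:
        translations.setdefault(src, []).append(dst)
    results = []
    for sid, (lang, text) in sentences.items():
        # a) Mandarin sentences with at most max_unknown unknown characters
        if lang != "cmn" or len(unknown_chars(text, known)) > max_unknown:
            continue
        # b) at least one translation into a target language
        matches = [sentences[t] for t in translations.get(sid, [])
                   if t in sentences and sentences[t][0] in targets]
        if matches:
            results.append((text, matches))
    return results

# Made-up sample data, for illustration only.
sentences = {
    1: ("cmn", "我是学生。"),
    2: ("eng", "I am a student."),
    3: ("cmn", "他喜欢喝茶。"),
    4: ("deu", "Er trinkt gern Tee."),
}
links = [(1, 2), (2, 1), (3, 4), (4, 3)]
known = set("我是学生你好")
print(find_sentences(sentences, links, known))
```

With this sample only the first sentence is returned; the second Mandarin sentence is skipped because none of its characters are in the known set.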

Just an hour or so ago I had the idea of making a slightly different version: it would be awesome to have a script that looks for sentences with audio, searches for translations with audio, and concatenates the two audio files. This would be great to have on an mp3 player.

This is only a short summary. I was planning to upload these scripts to my blog anyway and write a more or less detailed explanation of them. I'll write here again once I've finished that :-) !