Tips

Here you can ask general questions like how to use Tatoeba, report bugs or strange behavior, or simply socialize with the rest of the community.

Before asking a question, make sure to read the FAQ.

Wall (4823 threads)

donjon
2018-01-22 17:47
Hi,
Could you please tell me how to download the whole Tatoeba project (the way the whole of Wikipedia can be downloaded)?
Or at least, how to download all translations between a pair of languages?
Selena777
2018-01-22 19:49
Hi,
you can download it here:
https://tatoeba.org/eng/downloads
Look for "sentences" to download the sentences, and "links" to download the translations between pairs.
donjon
2018-01-22 20:02
thank you!
CK
2018-01-22 23:20
You might find what you are looking for here.

http://www.manythings.org/anki/

These are selected bilingual pairs with one language being English and the other language being limited to sentences by native speakers.
donjon
30 days ago
thank you.
And where are the other pairs, like Russian-Swahili, German-Indonesian, etc.?
donjon
30 days ago
I.e., how can I obtain other dictionaries (Russian-Swahili, German-Indonesian, etc.)? What is the procedure?
CK
30 days ago
There are no pre-made sets of those. You will need to use the 2 files Selena777 mentioned and create the pairs yourself. If you aren't familiar with working with this kind of data, you may not be able to easily do so.
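
For anyone who wants to try building the pairs themselves, here is a minimal Python sketch. It rests on assumptions that are not stated in this thread: that the "sentences" file is tab-separated with columns id, language code, text; that the "links" file is tab-separated with columns sentence id, translation id; and that languages are identified by three-letter codes such as "rus" and "swh". The file names and the helper names `build_pairs` and `read_tsv` are hypothetical, so check them against the files you actually download.

```python
import csv
from typing import Dict, Iterable, Iterator, Sequence, Tuple


def read_tsv(path: str) -> Iterator[Sequence[str]]:
    """Stream rows from a tab-separated file without treating quotes specially."""
    with open(path, encoding="utf-8", newline="") as handle:
        yield from csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def build_pairs(
    sentence_rows: Iterable[Sequence[str]],
    link_rows: Iterable[Sequence[str]],
    src_lang: str,
    dst_lang: str,
) -> Iterator[Tuple[str, str]]:
    """Yield (source text, translation text) for each src_lang -> dst_lang link."""
    src: Dict[str, str] = {}
    dst: Dict[str, str] = {}
    # Assumed row layout: (sentence id, language code, text).
    for row in sentence_rows:
        if len(row) < 3:
            continue  # skip malformed rows
        sentence_id, lang, text = row[0], row[1], row[2]
        if lang == src_lang:
            src[sentence_id] = text
        if lang == dst_lang:
            dst[sentence_id] = text
    # Assumed row layout: (sentence id, translation id).
    for row in link_rows:
        if len(row) < 2:
            continue
        first, second = row[0], row[1]
        if first in src and second in dst:
            yield src[first], dst[second]


# Hypothetical usage, assuming the two downloads were saved under these names:
# for russian, swahili in build_pairs(read_tsv("sentences.csv"),
#                                     read_tsv("links.csv"), "rus", "swh"):
#     print(russian, swahili, sep="\t")
```

The sketch keeps only the two requested languages in memory, so it should cope with large sentence files better than loading everything at once.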
donjon
29 days ago
Selena777 - what is it?
Selena777
29 days ago
My username :)
lingvokrolik
2018-01-20 14:02
Why is the site loading so slowly?
Hybrid
2018-01-20 14:44
It's working normally for me now. Sometimes Tatoeba stops working but then it reboots or something. You just have to be patient.
CK
2018-01-11 03:10
** Many New Audio Files **

English Audio by CK
https://tatoeba.org/sentences_lists/show/4000/und

Spanish Audio by arh
https://tatoeba.org/sentences_lists/show/6685/und

Many of these are new sentences without any translations yet.
Hybrid
2018-01-14 21:29
Tatoeba was unavailable for a while.
Guybrush88
2018-01-14 21:57
Today Tatoeba seemed quite unstable
Hybrid
2018-01-16 00:19
I agree! It's fine now.
sharptoothed
2018-01-15 11:08
** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/
Guybrush88
2018-01-15 13:25
thanks
DostKaplan
2018-01-07 06:34
When I search for "of the experience" (without quotes), here are some of the results:

The result of the experiment was inconclusive.
Tom did many of the experiments himself.
This is Professor Oak, the director of this experiment.

While there were results containing the word "experience," I don't think the results containing "experiment" should have been included. Is the search algorithm trying to blindly match the first 6 characters of a word?
brauchinet
2018-01-07 08:34
Tatoeba uses a stem-based search algorithm. Before the comparison starts, search words are reduced to their "stems" by cutting off common suffixes, in this case -ment and -ence (leaving just "experi").
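
To make the effect concrete, here is a toy Python sketch of suffix stripping. It is not the stemmer the site's search engine uses; the suffix list and the names `toy_stem` and `matches` are invented purely for illustration. It only shows why "experience" and "experiment" can collide once both are cut down to "experi".

```python
import re

# Illustrative suffix list only; a real stemmer has many more rules.
# Longer suffixes come first so "ments" wins over "s".
SUFFIXES = ("ments", "ences", "ment", "ence", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query: str, sentence: str) -> bool:
    """True if every stemmed query word also occurs among the sentence's stems."""
    def stems(text: str) -> set:
        return {toy_stem(w) for w in re.findall(r"[a-z']+", text.lower())}

    return stems(query) <= stems(sentence)


# Both words reduce to the same stem, so the query matches the sentence.
print(toy_stem("experience"), toy_stem("experiment"))
print(matches("of the experience", "The result of the experiment was inconclusive."))
```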
Selena777
2018-01-07 16:14
Why doesn't this approach work for Serbian? If I search for a certain word, the search displays only exact matches, without the declined and conjugated forms.
AlanF_US
2018-01-07 18:17 - 2018-01-07 18:22
While the stemming algorithm for English is more sophisticated than just matching the first 6 characters of a word, no algorithm is perfect, and one is likely to find cases where any of the stemmers that our search engine provides fails to behave as one might expect.

As the wiki page "How to Search for Text" says ( https://en.wiki.tatoeba.org/art...w/text-search# ), our search engine supports stemming for the following languages: German, English, Finnish, French, Italian, Dutch, Portuguese, Russian, Spanish, Swedish and Turkish. Other languages are not supported. Writing a stemmer for a search engine is a nontrivial task. However, you can approximate stemming by using wildcards. For instance, "experim*" would find both "experiment" and "experiments".
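
As a rough illustration of what a trailing wildcard does, the Python sketch below uses the standard-library `fnmatch` module to test words against "experim*". This only mimics the idea of prefix matching; it is not how the site's search engine evaluates wildcards, and the helper name `wildcard_hits` is made up for this example.

```python
import re
from fnmatch import fnmatchcase
from typing import List


def wildcard_hits(pattern: str, sentence: str) -> List[str]:
    """Return the words of the sentence that match a shell-style wildcard pattern."""
    words = re.findall(r"\w+", sentence.lower())
    return [word for word in words if fnmatchcase(word, pattern.lower())]


# "experim*" keeps "experiment(s)" but no longer pulls in "experience".
print(wildcard_hits("experim*", "Tom did many of the experiments himself."))
print(wildcard_hits("experim*", "It was a great experience."))
```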
Selena777
2018-01-07 18:50
I see. Thanks a lot for the suggestion!
Btw, Serbian declension and conjugation are rather similar to Russian. Can the existing algorithm for Russian be modified for Serbian?
AlanF_US
2018-01-08 22:47
Theoretically, yes, someone could do that. Note, however, that we get our search engines and stemmers from a project called Sphinx. For someone to make this change, they'd have to:

- be part of the Sphinx community
- be familiar with both Russian and Serbian
- understand how to do all the configuration work necessary to create a new stemmer
- have a substantial chunk of time available to work on it, including testing

I also note that the list of stemmers offered by the site doesn't seem to change much over time.
Selena777
2018-01-09 08:26
I see. Is it necessary to be a programmer to work on it?
AlanF_US
2018-01-13 19:34
Well, you would need to be someone who is comfortable with configuration (writing files with a specific format, and so on).
Eccles17
2018-01-09 16:35 - 2018-01-10 17:27
https://en.wikipedia.org/wiki/M...rds_in_English

Can the Tatoeba devs add a metadata tag for native speakers to tag sentences that they have actually heard others utter?

Not their own inner narration, but timeworn sentences.

"What time is it?"

"Walk the dog please."

If people aren't saying it in real life, it's not relevant. The flip side is those sentences will bore ppl to death.
sharptoothed
2018-01-03 16:59
** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/
deniko
2018-01-04 13:30
As I learned from your stats, we now have a user with 10 native languages. That's impressive.
sharptoothed
2018-01-04 16:18 - 2018-01-05 09:05
Well, anyone can state something like that on Tatoeba. :-) (like in that joke: "well, you say so too") :-D
Ricardo14
2018-01-04 13:31
Thank you!
sharptoothed
2018-01-04 16:18
You're welcome! :-)
Guybrush88
2018-01-04 17:01
thanks
sharptoothed
2018-01-05 09:05
you're welcome :-)
mraz
2017-12-31 18:49
Selena777
2017-12-31 19:51
Thanks :)
Hybrid
2018-01-01 20:04
Happy New Year everyone!
Selena777
2018-01-01 20:42
Thanks!
maaster
2018-01-02 11:05
Ricardo14
2018-01-04 13:32
Happy New Year!
maaster
2018-01-03 11:44 - 2018-01-03 14:03
Can anyone write me a context with the sentence #6587303 ?
Or with this one: #6587294 ?