Wall (5495 threads)

2019-07-01 11:45


Doesn't this sentence only mean "It is said that it will rain." or "They say it will rain."? In Japanese, when そうだ follows the complete form, it gives the preceding sentence the sense of a reported message. If you'd like it to mean "It seems like it will rain.", 雨が降りそうだ。would be right. It seems that there are a lot of wrong translations there...
2019-07-01 11:48
These sentences are from the Tanaka Corpus. See Warning/Disclaimer here:
2019-07-01 11:54
So how can I fix or remove those wrong translations?
2019-07-01 16:41
Regarding wrong translations in general: we have a mechanism to "unlink" sentences. So when two sentences are correct on their own, but not good translations of each other, they should be unlinked. But that is not something regular contributors can do.

As a regular contributor, what you can do, however, is simply post a comment on the sentence. What you posted here on the Wall could have been posted directly on the sentence. Then someone with the proper permission (and the proper knowledge) will unlink.

I suspect however that the wrong translations you are talking about are the indirect translations. You will notice that some translations have a blue arrow and others have a grey arrow. The ones with a grey arrow are indirect translations (that is, translation of a translation) and it is normal that they will not always match the main sentence.
2019-07-01 14:21
* Tatoeba Top 30 Languages Interactive Graphs*

Tatoeba Top 30 Languages Interactive Graphs have been updated:
2019-07-01 10:16
How to Use the Tatoeba Corpus to Study English

Several members have already checked this out and I've made several changes based on their feedback.

I'd love to get some more feedback on this.
Are there any obvious things I've overlooked?
Are there any improvements you can suggest?
Did something not work as expected?

Note that this won't really be so useful for your friends who may not be logged in to Tatoeba, since the audio player there isn't yet shown to non-logged-in visitors in search results.

Please send comments and feedback to me via Tatoeba's private messaging system.
2019-07-01 08:13
#8012086 (maaster)

Sentences like this should not be allowed on Tatoeba!
2019-07-01 09:20
After Flat Earth, it's all the same anyway...
2019-06-30 08:01
Is there a way to make an advanced search on a list that would show the sentences added to the list first?

If not, I would like that feature because going through a list of someone's sentences with audio is a bit difficult when hundreds of sentences are added to the list every now and then, and it takes time for me to find the page where I left off.

Any ideas how to systematically go through a list that will probably keep growing in the near future, such as this?
2019-06-30 22:58 - 2019-06-30 23:42
You can view a list showing the oldest items on the list first. This way, you could bookmark the page where you leave off and continue from there the next time.

This needs to be at the end of the list's URL.
2019-06-30 00:10 - 2019-06-30 00:29
Here is a search that will find the 1,000 longest "sentences" with audio.

Some items are actually multiple-sentence entries.

Even if you jump to the last page, you'll notice that the sentences are still fairly long.
2019-06-28 08:17
Feature suggestion: Sort sentences by what should be translated first.

Rationale: When translating sentences from a given language, it would be nice to get them in an order such that translating the sentences highest up on the list would presumably be more useful than translating those that are farther down.

Ideally, this should be a one-button option, like "Suggest sentences to translate". Conceivably this might also make life easier for newcomers, as they would have a clear thing to start with.


The suggestion algorithm should be language-neutral.

I think the following priorities would be useful. They are not in any particular order.

1. The rated language competence of the sentence owner.
2. The rated language competence of the sentence author.
3. Original sentence is better than a translation. (Reasoning: Original sentences are more likely to be something used in the language, rather than an attempt to express a foreign concept or structure in the language.)
4. Sentence with audio is better than a sentence without audio.
5. Owned sentence is better than an orphan.
6. Normal sentence is better than one redded out.
7. Highly rated sentence is better than a poorly rated one. Order: the most "OK" ratings and no others; the most "OK" ratings relative to the number of "unsure" ratings, with no "not OK" ratings; no ratings; the best ratio of "OK" to "not OK" ratings; the fewest "unsure" ratings; the best ratio of "unsure" to "not OK" ratings; the fewest "not OK" ratings.
8. By quality tag, from best to worst: tagged OK; no quality tags; "check" or "needs native check"; "change"; "delete".
(Maybe I am forgetting some quality indicators.)

Sentences with many translations are likely to be higher quality or more useful, but translating sentences with no translations is likely to contribute more to the diversity of the corpus. One of these conditions might be added, too. (I would prefer "no translations".)

Or maybe the following would be a relevant hierarchy:

a. Sentences with no translations to any language.
b. Sentences without indirect or direct translations to the target language.
c. Sentences without direct translations to the target language.
d. Sentences with as few as possible direct translations to the target language. Maybe break ties with the number of indirect translations.
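
To make hierarchy a-d concrete, here is a minimal sketch of how it could be turned into a sort key. It is only an illustration of the idea, not how Tatoeba works: the `Sentence` fields and the `suggest` function are hypothetical names, and breaking ties in tier d by fewer indirect translations is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    # Hypothetical record; these field names are assumptions for illustration.
    text: str
    total_translations: int   # translations into any language
    direct_to_target: int     # direct translations into the target language
    indirect_to_target: int   # indirect translations into the target language


def translation_tier(s: Sentence) -> int:
    """Return 0-3 for tiers a-d; a lower number means translate sooner."""
    if s.total_translations == 0:
        return 0  # a. no translations into any language
    if s.direct_to_target == 0 and s.indirect_to_target == 0:
        return 1  # b. nothing in the target language, direct or indirect
    if s.direct_to_target == 0:
        return 2  # c. only indirect translations into the target language
    return 3      # d. already has direct translations


def sort_key(s: Sentence):
    # Within a tier, fewer direct translations come first; ties are broken
    # by fewer indirect translations (an assumed reading of point d).
    return (translation_tier(s), s.direct_to_target, s.indirect_to_target)


def suggest(sentences, limit=100):
    """Return up to `limit` sentences, most useful to translate first."""
    return sorted(sentences, key=sort_key)[:limit]
```

Because every sentence falls into one of the four tiers, a list built this way is empty only when the language has no sentences at all.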


With English, I can fairly easily find sentences that satisfy many of the quality criteria. CK also has several tools for this. With smaller languages it is more tricky, as there might, for example, not be any sentences with audio, there might be many non-native contributions, maybe most sentences are translated or maybe almost none are. Hence, if one wants to find high quality or useful sentences to translate, it takes lots of experimentation to find them. If there are only a few and one translates them, one needs more experimentation to find the next patch.


So, the idea would be to have a single button that gives, say, a hundred or a thousand high quality sentences in order of quality. The order does not have to include all the points above, but it should include a sufficient amount to prioritize useful sentences.

More importantly, the list should never be empty, if the language has even a single sentence, or maybe even a single untranslated sentence.


Making this would require value judgements from the community or whoever makes the thing. This might cause arguments about what kind of sentences are the most valuable. Maybe it would not be worth it.

This is just a suggestion; feel free to ignore.
2019-06-29 13:59
> Making this would require value judgements from the community or whoever makes the thing. This might cause arguments about what kind of sentences are the most valuable. Maybe it would not be worth it.

I think it's worthwhile for people to think about which sentences they find most valuable for their own purposes. But coming to a consensus or even a majority decision would be counterproductive. Not only would you not be able to get different people to agree, you wouldn't be able to get one person to agree with this ranking all the time. People use different criteria for different purposes. Is a sentence that has been translated fifty times better than one that has not been translated at all? Yes, if your desire is to see how a phrase (typically a very common one) differs across languages. No, if your desire is to see a more complicated sentence (which is likely to be translated less often) that provides more context for a particular word. Is a sentence with audio better than one without? Yes, if you want to listen to audio. But when I don't want to listen to audio (as is generally the case for me), a search that favors audio gives worse results because they're more homogeneous (since the sentences that received audio were generally chosen by one person) and often shorter than I would like.

As I have said before, I favor making default searches as free from assumptions as possible. Adding assumptions decreases the diversity of the sentences that people see (and translate), and increases processing time.

By the way, I have never found a problem with the overall quality of English sentences in the Tatoeba corpus as a whole, so the concern over low-quality sentences has never resonated with me. I certainly come across poor individual sentences, which I flag or correct, but I never get the sense that they predominate in our collection of sentences. Occasionally I've done informal experiments where I've measured the fraction of sentences with mistakes that come up in a default or random search, and I've generally been impressed by the low proportion.
2019-06-29 14:11
Note that I am not suggesting doing anything to the search function, but rather giving a simple way for someone to just get good and useful sentences to translate. This should not affect search results in any way.
2019-06-27 07:15 - 2019-06-27 09:23
Here's a list with ~2156 duplicates in the corpus (there are 4362 sentences in total). They differ only in spacing or in roughly equivalent punctuation, which means Horus can't merge them.

If you own any of these sentences, please consider changing the spacing or the punctuation if it does not change the meaning (or correctness) of the sentence.

This is what the file looks like: #id1 #id2 lang sentence1 lang sentence2


And here's the number of duplicates by language

fra (1419)
rus (252)
epo (61)
tur (54)
bre (54)
ber (46)
deu (39)
ukr (26)
eng (25)
kab (25)
hun (17)
ara (17)
ita (16)
fin (16)
spa (7)
heb (6)
pol (6)
ell (6)
cor (6)
por (4)
lat (4)
bul (4)
tat (4)
eus (4)
thv (4)
mar (3)
srp (3)
pes (3)
oci (3)
jpn (2)
swe (2)
hin (2)
vie (2)
uig (2)
khm (2)
nld (1)
dan (1)
ina (1)
tlh (1)
afr (1)
slk (1)
urd (1)
ceb (1)
swg (1)
cym (1)
2019-06-27 08:42
If a language has several ways of writing quotation marks or punctuation, I see no reason not to include all of those conventions in Tatoeba as well. On the other hand, usages that go against the norms of the language could be replaced with more correct ones.

A few Finnish sentences had quotation marks that violated the spelling rules; I corrected them or asked for them to be corrected. Others, however, exist because the language has three different ways of marking quotations or lines of dialogue. All of them are correct.


If the language has several different and correct ways of writing something - quotes in Finnish, space or no space before certain punctuation in French - then all of these should be fine in Tatoeba. On the other hand, incorrect use of punctuation should be corrected.
2019-06-27 08:50 - 2019-06-27 08:52
> ...then all of these should be fine in Tatoeba.

I don't think I agree with this, since this would lead to a lot of redundancy and also mean that we do not get as many languages linked to the same series of sentences.

For example, if we had the following, and the first one had a French translation, and the second one had a Japanese translation, there would be no indirect link between the French and the Japanese.

I can’t sing.
I can't sing.

(Two different apostrophes)
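
As a rough illustration of how such pairs could be detected, here is a minimal sketch that maps a few typographic variants to one form before comparing. It is not how Horus works; the `EQUIVALENTS` table, `normalize` and `near_duplicates` are hypothetical names, and which characters count as equivalent is an assumption that would need to be decided per language.

```python
import re

# Hypothetical equivalence table: typographic variants mapped to ASCII forms.
EQUIVALENTS = str.maketrans({
    "\u2019": "'",   # right single quotation mark, as in "can’t"
    "\u2018": "'",   # left single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u00a0": " ",   # no-break space
    "\u202f": " ",   # narrow no-break space
})


def normalize(sentence: str) -> str:
    """Map variant characters to one form and tidy up the spacing."""
    text = sentence.translate(EQUIVALENTS)
    text = re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace
    text = re.sub(r" ([?!:;])", r"\1", text)   # drop a space before ? ! : ;
    return text


def near_duplicates(a: str, b: str) -> bool:
    """True if the sentences differ, but only by spacing or variant punctuation."""
    return a != b and normalize(a) == normalize(b)


# The two sentences above differ only in the apostrophe character.
print(near_duplicates("I can\u2019t sing.", "I can't sing."))
```

A check like this only flags candidate pairs. Whether the two forms should actually be merged is still a decision for the sentence owners.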

2019-06-27 09:13
If the symbols look almost the same and are used in the same way, then it might not do too much damage to standardize them.

On the other hand, if the symbols (or the use of punctuation) look markedly different, or if the different uses follow national, socio-economic or other boundaries, then standardization is not a good idea.

If there are various different appearances, then knowing all of them is helpful.

If there is a regional or other difference between the conventions, then choosing one over the others will elevate one cultural practice over another. This is a highly political act and not something that should be done by Tatoeba.

For reference, here are the ways of quoting/expressing speech in Finnish, recovered from :
”Tule mukaan!” hän pyysi.
»Tule mukaan!» hän pyysi.
– Tule mukaan! hän pyysi.

All three ways are visually distinct and the third way works differently than the others. The first and the third are frequently used. The second is met in literature, but is rare elsewhere, as far as I know.
2019-06-27 13:49
This is a common question, but on reflection it is actually not such a big issue.
7+ million sentences, 4000+ duplicates.
That is not even 0.1% of the corpus.

Most of them are due to the French spaces. That discussion happened in the past. Numbers were even calculated and posted on GitHub. Even supposing that each of them had 50 direct translations (and they don't), that would still represent only around two percent of the corpus.

In my opinion, it is not worth putting more effort than human screening to address this point. And as Thanuir mentioned, some of them cannot be de-duplicated for cultural reasons (among others).
2019-06-23 00:14 - 2019-06-23 00:16
**Suggestion (Feature request?)**

I have run into some problems that I'd like to share with you.

From time to time I want to contact people who speak or understand a specific language, to have one or more sentences translated into other languages. It can be for my self-study, for my students, for other projects I have...

On "Language of members" and "Native speakers" I can find people to help me but...

1st - Are they active on Tatoeba? When did this user last log in? If I go to their profile I'll get an answer, but it's not that clear (it is to me, because I've been around for quite a long time).

2nd - Are they willing to help me? Do they reply to messages often? Always? Never? If they are not around and have also set their settings not to receive emails from Tatoeba, it'll be hard to contact them, don't you think?

OK, but don't we have the "My vocabulary" feature?
Yes, and it's great! However, there are some problems...
- People don't look at it very much. (Or rather: I don't see people looking at the requests. Maybe because they don't have time? Or interest? Or maybe they don't remember, or don't know, that there are requests over there?)

Perhaps it'd be good to have a kind of "advanced search" where we could look for
- active members in x language who understand y language
- people who speak languages x and y but haven't been around
- people who have agreed to receive emails from Tatoeba, such as PMs!

That's it. Thanks.
2019-06-23 07:43
You can ask me if you need to translate something from English into Russian.
2019-06-24 15:46
Thank you so very much! :D
2019-06-24 07:08
Ricardo, I agree wholeheartedly with you.

I also think that Tatoeba sorely lacks ways to allow members sharing an interest in the same language to reach out to one another. I remember in my last UX test, the first thing the user wanted to do was to find other German contributors. And she had such a hard time doing it. She even looked at the Wall and was disappointed to only see English messages.

I think it would be terribly awesome if we had separate spaces for each language. Members having interest in that language, be it natives or learners, could reach out to one another, follow what’s going on, exchange messages, actively *use* that language etc.

When I first joined Tatoeba, I remember I was happy to see other French active members, and happy to get Japanese messages from tommy_san about my translations. Looking back, it may have been a key point explaining why I’m still around now.
2019-06-24 14:04
> I think it would be terribly awesome if we had separate spaces for each language. Members having interest in that language, be it natives or learners, could reach out to one another, follow what’s going on, exchange messages, actively *use* that language etc.

How about a forum like this one?
2019-06-24 23:28 - 2019-06-24 23:29
> How about a forum ...

Perhaps it's time for Tatoeba to add a forum.
A forum could be used in addition to the Wall or perhaps instead of the Wall.
A forum could be used for what Ricardo14 is looking for.

Other advantages:
Most forums are searchable. The Wall isn't.
Forums could be sub-divided into categories. The Wall isn't.

There is apparently at least one CakePHP forum plugin available. Perhaps there are others.

Or, possibly use another open-source forum platform, after figuring out how to interface it using the same usernames and passwords. (However, our Wiki isn't interfaced with the main website yet.)

Related issue on GitHub:
2019-06-24 15:50
Thanks for your reply, Gillux.

It has been suggested to me to look for answers on other websites, but
1 - I really like and *trust* Tatoeba. Everyone here wants to have an awesome corpus;
2 - By helping me (and other members) we'd be feeding Tatoeba;
3 - I don't feel like going to other websites for now. I prefer Tatoeba.

> She even looked at the Wall and was disappointed to only see English messages.

How about "encouraging" members to post in 2+ languages, or having a "machine translation" that would be corrected, if necessary, by a native speaker?

> I think it would be terribly awesome if we had separate spaces for each language. Members having interest in that language, be it natives or learners, could reach out to one another, follow what’s going on, exchange messages, actively *use* that language etc.

+1,000. More sentences on Tatoeba! More hands to help each other!
2019-06-25 18:14
(Finnish) Writing in several languages is probably a good practice. But machine translation from Finnish does not work very well. It probably works better with larger languages.

(Norwegian) Writing in many languages is fine, I think. But machine translation does not work so well with Finnish. Maybe it is better with larger languages.

2019-06-24 09:00
** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
2019-06-24 12:14
2019-06-24 15:22
I dunno, looks like nonsense.