AmarMecheri, October 22, 2024 at 8:38:39 AM UTC

Who could pass this on to our bagpipe-loving music lover @AlanF_US? Here, Celts (Bretons) perform a famous Ḥnifa song.

Anwa izemren ad yazen taɣect-a n Ḥnifa i d-sɛeddan iKaltiyen iBruṭunen, i umeddakel-nneɣ @AlanF_US akken ad s-isell?

Qui pourrait transmettre à notre mélomane @AlanF_US de qui aime la cornemuse ? Ici, des Celtes (bretons) interprétant une chanson célèbre de Ḥnifa.

imalaqvayli, October 21, 2024 at 4:24:03 PM UTC, edited October 21, 2024 at 4:40:48 PM UTC

Hi guys, which URL filter can I use to see only the sentences in a specific language that do not have a translation in any other language?

Example: see Kabyle sentences without any translation at all.


DJ_Saidez, October 21, 2024 at 8:32:23 PM UTC

Randomly assorted sentences in Kabyle, excluding sentences that already have any translation into any language.

imalaqvayli, October 21, 2024 at 10:23:08 PM UTC

Great, thanks!

sharptoothed, October 13, 2024 at 4:04:24 PM UTC

Tatoeba Stats, Graphs & Charts have been updated:

sharptoothed, September 29, 2024 at 6:16:47 AM UTC

Tatoeba Stats, Graphs & Charts have been updated:

always, August 14, 2024 at 7:22:41 AM UTC, edited August 14, 2024 at 7:36:22 AM UTC

I may have breached the website's rules with my previous account, but this time, I just don't get it. Tatoeba certainly needs to clarify this both to me and the rest of the contributors and users of this website.

always, August 14, 2024 at 7:24:15 AM UTC

If you suspended me for posting sentences with the Amazigh name "Ziri," I find this totally unfair (and I'm sure others would too), given that others give themselves the right to post hundreds of thousands of sentences with the name "Tom."

always, August 14, 2024 at 7:27:20 AM UTC

Just so everyone knows: I was encouraged to settle this issue with the admins via private messages, but it turned out not to be worth it, since several of my messages were ignored or made fun of.

always, August 14, 2024 at 7:34:59 AM UTC, edited August 14, 2024 at 8:14:22 AM UTC

Again, this is a serious discussion and I'm here to try and find a solution (if there ever can be a solution), not to squabble with people who have nothing constructive to add to this discussion and whose only goal is to gloat over my suspension. Still, solutions should be reached through true, sincere, transparent, and serious dialogue, not through arbitrary and unjustified measures.

lbdx, August 14, 2024 at 7:43:35 AM UTC

The blocking of the Amastan/always accounts is excellent news! Many thanks to the admins for their moderation work.

always, August 14, 2024 at 7:53:34 AM UTC

Note: This is a serious discussion and I'm here to try and find a solution (if there ever can be a solution), not to squabble with people who have nothing constructive to add to this discussion and whose only goal is to gloat over my suspension.

always, August 14, 2024 at 7:56:56 AM UTC

You're only here to gloat. I want to discuss the issue with the admins and any sensible person who could explain to me and everyone what is actually wrong and how to fix it. Fixing this issue might prevent other issues in the future. I'd like to ask gloaters to keep away from this discussion. I'm sure that the admins as well as the general public here wouldn't be interested in wasting their time reading a lengthy argument and a long series of messages of people gloating over the suspension of a contributor. So please save your energy and go do something more useful.

soridsolid, September 17, 2024 at 10:02:26 PM UTC

Is that Amastan? His sentences were always brimming with political undertones and you could smell the bitterness from a mile away, especially in sentences pertaining to his beloved Algeria and Islam.

I'm probably one of the few who'd rather we stay away from political messages and keep sentences as generic as possible without using real names. That's probably what got another lady banned (I don't remember her name, but she knew Yiddish and even had sentences calling for the death of certain political figures).

That's not to say we should get rid of sentences about politics. Not at all. Sentences like "Tom is a left-wing politician." and "Tom disappointed the members of his party." are perfectly fine and useful in my opinion.

Thanuir, August 14, 2024 at 2:22:05 PM UTC

I don't know why you were blocked, but (over)using a particular name in sentences sounds like a very strange and insufficient reason, especially considering the Tom sentences. From an ordinary user's point of view, this looks odd. That is why I assume/suspect there is something else behind it.

sharptoothed, September 15, 2024 at 7:14:01 AM UTC

Tatoeba Stats, Graphs & Charts have been updated:

Ergulis, September 8, 2024 at 5:33:28 AM UTC

The Tatoeba website has been running very slowly since yesterday's shutdown. Have any of you had a similar experience?

DJ_Saidez, September 8, 2024 at 5:58:00 AM UTC

It is slow for me too.

LanguageExpert, September 8, 2024 at 2:17:01 PM UTC, edited September 8, 2024 at 2:19:20 PM UTC

Yes, it's been slow for me too.

PaulP, September 9, 2024 at 3:33:53 AM UTC

Not just slow. Mostly I get the message "Tatoeba is currently unavailable. We are sorry for the inconvenience. You can check our blog or Twitter for more information."
But neither the blog nor Twitter gives any information. Yesterday I did barely 10% of the work I usually do in a day.

gillux, September 9, 2024 at 9:19:17 AM UTC

There was a little problem indeed! Tatoeba should run smoothly now.

small_snow, September 10, 2024 at 10:41:08 AM UTC, edited September 10, 2024 at 10:45:10 AM UTC


maaster, September 6, 2024 at 1:18:04 PM UTC

I think adopting and unadopting sentences doesn't really work.
One has to beg to get one's unadopted sentence(s) adopted.
(I suppose that's why the sentences aren't checked.)
And I think unadopted sentences simply tend to remain unchecked.

(As for me, I don't like adopting sentences added by others.)

PaulP, September 8, 2024 at 5:01:41 AM UTC

I don't understand, Maaster. I regularly adopt and correct sentences in "my languages". If an author unadopts sentences, there is no need to beg anyone to change them. You can use a simple link to find them all:

(change "epo" to the code of your language).

superduperimpose, September 2, 2024 at 9:53:48 PM UTC

Some sentences carry this note: "This sentence is original and was not derived from translation."

Is this information available anywhere in the downloadable data?
Thank you!

Yorwba, September 4, 2024 at 6:54:50 PM UTC

It's in the sentences_base file.
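
For anyone wanting to filter on that flag, here is a minimal sketch. It assumes the file is tab-separated with the sentence id in the first column and a "base" field in the second, where `0` marks an original sentence, a positive number is the id of the sentence it was translated from, and `\N` means unknown. That layout is an assumption to check against the description on the downloads page; the function name `original_sentence_ids` is made up for this example.

```python
def original_sentence_ids(lines):
    """Yield ids of sentences whose base field is "0".

    Assumed layout (verify against the downloads page):
    <sentence id> TAB <base>, where base is 0 for an original sentence,
    a positive id for a translation, or \\N when unknown.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[1] == "0":
            yield int(fields[0])


# Tiny in-memory sample instead of the real file.
sample = ["1\t0\n", "2\t1\n", "3\t\\N\n"]
print(list(original_sentence_ids(sample)))  # [1]
```

With the real download, the same function can be fed an open file object, since it only iterates over lines.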

superduperimpose, September 4, 2024 at 7:07:22 PM UTC

You're right. It's right there. Sorry, I just didn't see it.

superduperimpose, August 31, 2024 at 11:54:06 AM UTC

Is the format of transcriptions (Japanese, if that makes any difference) explained anywhere? (Nothing in the Wiki, afaik.)

I found three different cases (there may be more):

A: [Kanji|Reading] which makes sense

B: [Kanji1Kanji2|Reading1|Reading2] which is probably short for [Kanji1|Reading1][Kanji2|Reading2]

C: [Kanji1Kanji2|Reading] which probably means the two Kanji combined have this reading

Is this correct?
And can I expect to always find something that fits either A, B or C?
That is, can I expect to *never* find something like [Kanji1Kanji2Kanji3|Reading1|reading2], i.e. an unequal number of Kanji and readings? (In that case, how would I know whether Reading1 belongs to Kanji1Kanji2 or just Kanji1?)

I hope my ad-hoc syntax makes sense.

Yorwba, August 31, 2024 at 1:38:45 PM UTC

I assume you're asking this question because you want to transform the data programmatically (otherwise you could just handle edge cases whenever you encounter them). If my assumption is correct, it might be easiest to look at Tatoeba's own code for Japanese transcriptions. (Note that Tatoeba is AGPL-licensed, in case that's an issue for you.)

The validation code for user-provided furigana is here: but I think it might not apply to those that are generated automatically using MeCab.

The testcases might also be helpful:

If you just want to display furigana using HTML <ruby> tags, our code for that is here: To be honest, it's not written in an easily readable manner, but I think what it does is basically to assume without validation that there are at least as many kanji as there are readings, and if there is a kanji without reading (|| or end of list) it will merge it with the preceding kanji until the numbers are equal.

So [Kanji1Kanji2Kanji3|Reading1|reading2] would be equivalent to [Kanji1|Reading1][Kanji2Kanji3|reading2], I think.
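
The merging rule described above can be sketched roughly as follows. This is not Tatoeba's code, only an illustration of the behaviour as I understand it: each kanji is paired with the reading at the same position, and a kanji with an empty or missing reading is appended to the preceding kanji. The names `pair_group` and `parse_transcription` and the regular expression are made up for this example, and surplus readings are simply ignored.

```python
import re

# One bracketed group: [kanji|reading1|reading2|...]
GROUP = re.compile(r"\[([^\[\]|]+)\|([^\[\]]*)\]")


def pair_group(kanji, readings):
    """Pair each kanji with the reading at the same index.

    A kanji whose reading is empty or missing is merged into the
    preceding kanji. Readings beyond the number of kanji are ignored.
    """
    pairs = []
    for i, ch in enumerate(kanji):
        reading = readings[i] if i < len(readings) else ""
        if reading or not pairs:
            pairs.append([ch, reading])
        else:
            pairs[-1][0] += ch  # no reading: attach to the previous kanji
    return [tuple(p) for p in pairs]


def parse_transcription(text):
    """Split a transcription into (text, reading) pairs.

    Text outside brackets is returned with reading None.
    """
    result = []
    pos = 0
    for m in GROUP.finditer(text):
        if m.start() > pos:
            result.append((text[pos:m.start()], None))
        result.extend(pair_group(m.group(1), m.group(2).split("|")))
        pos = m.end()
    if pos < len(text):
        result.append((text[pos:], None))
    return result


print(pair_group("ABC", ["x", "y"]))  # [('A', 'x'), ('BC', 'y')]
```

Under these assumptions, case C ([Kanji1Kanji2|Reading]) comes out as a single pair, because the second kanji has no reading of its own and is merged into the first.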

superduperimpose, August 31, 2024 at 3:07:10 PM UTC

Yes, ruby is a good example. This looks good, thanks!
I will take a look at the code, especially the part that handles unequal numbers of Kanji and readings.