Wall (7136 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

deniko, May 23, 2025 at 2:25:06 PM UTC

What are the odds of finding something like this locally—and for just £1.5?

https://i.imgur.com/v4s03fz.png

atitarev, April 24, 2025 at 7:24:00 AM UTC (edited April 24, 2025 at 7:27:52 AM UTC)

At https://tatoeba.org/en/sentences/show/13175064
喂! is incorrectly converted to "餵!". 餵 is only used in other senses (e.g. "to feed").

我餵了貓。/ 我喂了猫。(Wǒ wèi le māo.) - I fed the cat. Here 餵/喂 is correct.

But 喂 (wèi) "hello?" (on the phone) has only one form, shared by traditional and simplified Chinese. Please suppress the conversion or use 喂 for both simplified and traditional Chinese.

(In real life 喂 is pronounced with the second tone wéi but the nominal, dictionary pronunciation should be "wèi".)
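
One way the suppression requested above could work is an exception rule applied before the character mapping. The sketch below is purely illustrative: the tiny `S2T` table, the `to_traditional` function, and the punctuation heuristic are assumptions made for this example, not Tatoeba's actual converter.

```python
# Toy simplified-to-traditional table (assumption: a real converter uses a far larger one).
S2T = {"喂": "餵", "猫": "貓"}

# Punctuation that marks 喂 as a standalone interjection (assumed heuristic).
INTERJECTION_FOLLOWERS = set("!!??,,。")


def to_traditional(text: str) -> str:
    """Convert character by character, leaving the interjection 喂 untouched."""
    out = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "喂" and (nxt == "" or nxt in INTERJECTION_FOLLOWERS):
            # "hello?" sense: the same form is used in both scripts.
            out.append(ch)
        else:
            out.append(S2T.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    print(to_traditional("喂!"))
    print(to_traditional("我喂了猫。"))
```

A heuristic like this would still miss some cases, so a hand-maintained exception list per sentence may be the safer choice.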

atitarev, April 28, 2025 at 2:30:11 AM UTC

The sentence turned out to be a duplicate but the issue with a wrong conversion remains. 喂 is both traditional and simplified.

https://tatoeba.org/en/sentences/show/6401432

gillux, May 18, 2025 at 10:04:58 AM UTC

Thank you for reporting this issue, atitarev. We are currently working on a solution; progress is tracked here: https://github.com/Tatoeba/tatoeba2/issues/2007

atitarev, May 18, 2025 at 1:34:28 PM UTC

Thank you, @gillux.

tsunhua, April 30, 2025 at 9:49:57 AM UTC

I'm thrilled to have discovered such an amazing website. I would like to suggest adding the Teochew dialect (a branch of the Southern Min/Hokkien language) to the list of language options.

lbdx, April 30, 2025 at 11:58:52 AM UTC

Sorry, the Teochew dialect cannot be added to Tatoeba as it does not have an ISO 639-3 language identifier. But feel free to add your sentences to our Southern Min/Min Nan Chinese [nan] corpus.

https://en.wikipedia.org/wiki/Teochew_Min

gillux, May 18, 2025 at 7:20:21 AM UTC

As lbdx said, unfortunately we cannot add it, as we have a strict rule of following the ISO 639-3 standard.

Note that this standard evolves slowly as people request the addition of new languages, so Teochew could be added at some point, but that would be years in the future, if it ever happens.

In the past there have been several requests made to split Min Nan Chinese into different languages, mostly rejected: https://iso639-3.sil.org/code_c...t_cd_value=nan

The latest request tried to include Teochew ("Tio-Sua"), but it was rejected: https://iso639-3.sil.org/request/2021-045

rdgscratch, May 16, 2025 at 10:43:20 PM UTC

Can you do recordings of my sentences?

PaulP, May 18, 2025 at 4:07:46 AM UTC

You can do it yourself. Here's a short guide:
https://www.manythings.org/tatoeba/audacity.html

@CK will help you if you need assistance. But, btw, I see that you added sentences in about 60 languages. I don't suppose that you know how to pronounce them all, right?

I can do the Dutch and Esperanto sentences for you if they don't come from copyrighted sources.

sharptoothed, May 15, 2025 at 12:50:31 PM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Top 30 Languages Graphs since Tatoeba "epoch"
https://tatoeba.j-langtools.com/epoch/

May 15, 2025 at 12:45:39 PM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

frpzzd, May 12, 2025 at 10:53:24 PM UTC (edited May 12, 2025 at 11:02:57 PM UTC)

Just for funsies, I ran a script to list the languages that are least well represented on Tatoeba relative to their estimated speaker populations. (Specifically, the languages were restricted to those with >= 1mil speakers and sorted by the ratio of the number of sentences on Tatoeba to the speaker population size.)

As you might expect, many of the worst-represented languages by this metric are variants of Chinese. Aside from those, the 10 worst-represented languages are:

1. Sindhi (snd, 6 sentences vs. ~38.4mil speakers)
2. Sesotho (sot, 2 sentences vs. ~6.4mil speakers)
3. Maithili (mai, 8 sentences vs. ~19.3mil speakers)
4. Madurese (mad, 8 sentences vs. ~17.0mil speakers)
5. Libyan Arabic (ayl, 3 sentences vs. ~5.6mil speakers)
6. Western Punjabi (pnb, 72 sentences vs. ~113mil speakers)
7. Aymara (aym, 2 sentences vs. ~2.8mil speakers)
8. Pashto (pus, 47 sentences vs. ~53.0mil speakers)
9. Igbo (ibo, 35 sentences vs. ~28.0mil speakers)
10. Sundanese (sun, 40 sentences vs. ~32.0mil speakers)

If we restrict instead to languages with an estimated number of speakers >= 50mil, then here are the top 5 (excluding Chinese variants):

1. Western Punjabi (pnb, 72 sentences vs. ~113mil speakers)
2. Pashto (pus, 47 sentences vs. ~53mil speakers)
3. Punjabi (pan, 204 sentences vs. ~200mil speakers)
4. Gujarati (guj, 168 sentences vs. ~60mil speakers)
5. Telugu (tel, 271 sentences vs. ~95mil speakers)

On a more cheery note, here are the 5 *best* represented languages (that are not conlangs) with >= 1mil speakers, by the same metric:

1. Kabyle (kab, ~765k sentences vs. ~3.4mil speakers)
2. Macedonian (mkd, ~78k sentences vs. ~1.4mil speakers)
3. Lithuanian (lit, ~123k sentences vs. ~2.3mil speakers)
4. Hungarian (hun, ~420k sentences vs. ~11.8mil speakers)
5. Finnish (fin, ~151k sentences vs. ~5.2mil speakers)

And those with >= 50mil speakers:

1. Italian (ita, ~918k sentences vs. ~65mil speakers)
2. Turkish (tur, ~739k sentences vs. ~76mil speakers)
3. German (deu, ~721k sentences vs. ~92mil speakers)
4. Russian (rus, ~1.1mil sentences vs. ~170mil speakers)
5. French (fra, ~665k sentences vs. ~203mil speakers)
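
The original script was not posted, so the following is only a minimal sketch of the metric described above: filter by a speaker threshold, then sort by sentences per speaker. The `LanguageStats` type, the `worst_represented` function, and the `SAMPLE` list are hypothetical names; the figures are the approximate ones quoted in this post.

```python
from typing import NamedTuple


class LanguageStats(NamedTuple):
    code: str         # ISO 639-3 code
    sentences: int    # sentence count quoted in the post
    speakers: float   # estimated speaker population quoted in the post


# A few entries copied from the lists above (approximate figures).
SAMPLE = [
    LanguageStats("tel", 271, 95e6),
    LanguageStats("snd", 6, 38.4e6),
    LanguageStats("guj", 168, 60e6),
    LanguageStats("pnb", 72, 113e6),
    LanguageStats("sot", 2, 6.4e6),
    LanguageStats("pan", 204, 200e6),
    LanguageStats("pus", 47, 53e6),
]


def worst_represented(stats, min_speakers, top=5):
    """Return the `top` languages with the fewest sentences per speaker."""
    eligible = [s for s in stats if s.speakers >= min_speakers]
    return sorted(eligible, key=lambda s: s.sentences / s.speakers)[:top]


if __name__ == "__main__":
    for s in worst_represented(SAMPLE, min_speakers=50e6):
        print(f"{s.code}: {s.sentences / s.speakers:.2e} sentences per speaker")
```

Reversing the sort order (`reverse=True`) would give the best-represented languages instead.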

lbdx, May 13, 2025 at 4:13:08 PM UTC (edited May 14, 2025 at 10:18:31 AM UTC)

Thanks Franklin. It's interesting to see how Eurocentric the Tatoeba corpus still is.

Based on the 2025 edition of Ethnologue 200, I found that some of the world's 100 most widely spoken languages are still completely unavailable on Tatoeba:

- Nigerian Pidgin [pcm] → 120.7M speakers
- Dari [prs] → 33.4M speakers
- Magahi [mag] → 21.0M speakers
- Chhattisgarhi [hne] → 16.3M speakers
- Pedi [nso] → 13.7M speakers
- Chittagonian [ctg] → 13.0M speakers
- Dyula [dyu] → 12.8M speakers


All 7 of these languages are spoken either in Africa or South Asia.

May 12, 2025 at 3:38:40 PM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

sharptoothed, May 11, 2025 at 7:12:12 AM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

May 8, 2025 at 5:22:50 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.