Wall (7,067 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.


xilitante, April 24, 2025 at 1:51:05 PM UTC

Hello y'all,

I just wanted to ask for help contacting user tommg, whose websites Linguno and ListeningPractice are based on the Tatoeba corpus. I've seen him mention these projects here a couple of times, so I thought someone here might know him.

The thing is that the first site (Linguno) has been unreachable for the past couple of days, and the support e-mail address doesn't respond to any messages.

Obviously, if the post is against any rule in here, I'll delete it immediately.

AlanF_US, April 25, 2025 at 9:10:42 PM UTC (edited April 25, 2025 at 9:13:53 PM UTC)

I'm happy to say that Linguno is back in operation today.

April 25, 2025 at 2:01:54 PM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

atitarev, April 24, 2025 at 7:24:00 AM UTC (edited April 24, 2025 at 7:27:52 AM UTC)

At https://tatoeba.org/en/sentences/show/13175064
喂! is incorrectly converted to "餵!". 餵 is only used in other senses (e.g. "to feed").

我餵了貓。/ 我喂了猫。(Wǒ wèi le māo.) - I fed the cat. Here 餵/喂 is correct.

But 喂 (wèi) "hello?" (on the phone) has only one form, shared by traditional and simplified. Please suppress the conversion, or use 喂 for both simplified and traditional Chinese.

(In real life 喂 is pronounced with the second tone wéi but the nominal, dictionary pronunciation should be "wèi".)
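
The request above amounts to a context-sensitive exception in the simplified-to-traditional conversion. Below is a minimal Python sketch of that idea. The two-entry table, the punctuation heuristic, and the name `to_traditional` are illustrative assumptions and do not describe how Tatoeba's converter actually works.

```python
import re

# Hypothetical conversion table with only the characters needed here;
# a real converter would use a far larger table.
SIMP_TO_TRAD = {"喂": "餵", "猫": "貓"}

# Assumed heuristic: 喂 directly followed by punctuation (or the end of
# the text) is the interjection "hello?", which keeps the same form.
INTERJECTION = re.compile(r"喂(?=[!!??,,。]|$)")
PLACEHOLDER = "\uE000"  # private-use code point, not expected in real text


def to_traditional(text):
    """Convert character by character, but leave the interjection 喂 alone."""
    protected = INTERJECTION.sub(PLACEHOLDER, text)
    converted = "".join(SIMP_TO_TRAD.get(ch, ch) for ch in protected)
    return converted.replace(PLACEHOLDER, "喂")


print(to_traditional("喂!"))         # interjection: left unchanged
print(to_traditional("我喂了猫。"))   # verb "to feed": converted
```

A punctuation heuristic like this would miss cases such as 喂 followed by a name, so any real fix would still need review by someone who knows the language.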

atitarev, April 23, 2025 at 11:54:49 PM UTC (edited April 24, 2025 at 12:00:35 AM UTC)

I am not super fluent in Belarusian, but I found a few potential issues with sentences by a former self-identified Belarusian native speaker.

The verbs ведаць and знаць are near synonyms (both mean "to know"), but the difference is akin to French savoir vs connaître, German wissen vs kennen and, of course, Polish wiedzieć vs znać.

In Russian and Ukrainian, both Slavic verbs merged into знать (ru) and знати (uk) in modern usage, but it is a case of overcorrection to use ведаць (be) when знаць is more appropriate.

The phrases I refer to are "Do you know him?" and "Do you know her?"

https://tatoeba.org/en/sentences/show/69003
https://tatoeba.org/en/sentences/show/317616

deniko, April 24, 2025 at 6:30:11 AM UTC (edited April 24, 2025 at 6:42:10 AM UTC)

My intuition was the same, but like you, I’m influenced by Ukrainian and Polish.

I googled "ведаеш яго" and didn’t find much, but I did come across this example from the Bible:

https://www.bible.com/es/bible/...91%D0%91%D0%9B

Ты, Госпадзе, ужо ведаеш яго дасканала.

So it seems like a legit usage? It looks like you can use "ведаць" in Belarusian to mean “to know someone”, though it might be archaic (Biblical language often is), or maybe not. The best bet would be to ask a native speaker.

EDIT: sent PMs to kxadtccpgt, pavuk3 and ssvb, maybe they'll help us.

atitarev, April 24, 2025 at 6:47:25 AM UTC

Thanks, @deniko. This would be a legit usage. I'm not sure how many quotations would be required to verify it.

The English Wiktionary requires three quotations from solid sources for well-documented languages.

The common modern and not so modern usage is:

Do you know him?: Ты знаеш яго? / Вы знаеце яго?
Do you know her?: Ты знаеш яе? / Вы знаеце яе?

The initial "ці" is optional, similar in usage to Ukrainian "чи" or Polish "czy".

deniko, April 24, 2025 at 6:53:12 AM UTC

> Not sure how many quotations would be required to verify.

If the source is legit and authoritative (like the official Bible translation, which has been checked, double-checked, triple-checked, and probably blessed too 😄), then I’d say even one quote is enough to treat it as valid, at least as archaic or poetic usage. It doesn’t tell you if it’s used in modern speech, sure, but for Tatoeba I think that kind of usage is totally fine, as long as it's tagged accordingly.

atitarev, April 24, 2025 at 6:57:40 AM UTC

Thanks, @deniko. What is tagging at Tatoeba? I'm rather new as an active user.

I would probably need to tag my contributions for usage - masculine/feminine, plural, colloquial, formal, etc.

deniko, April 24, 2025 at 7:11:28 AM UTC

Tags are these guys:

https://i.imgur.com/w1VDVT3.png

You can search using them in Advanced Search, or just click on a tag when you're viewing a single sentence to see other sentences with that same tag.

For example, sentences tagged "Australian English":

https://tatoeba.org/en/tags/sho...h_tag/1611/eng

You don't have to tag sentences, but you might want to tag some of them, of course.

To be able to do it, you should be an advanced contributor. You're more than qualified to apply to become one, please do:

https://en.wiki.tatoeba.org/art...d-contributors

That will also help you to easily link your translations to multiple languages.

atitarev, April 24, 2025 at 7:14:10 AM UTC

Дякую!

April 22, 2025 at 11:52:23 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

April 21, 2025 at 6:04:06 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

mraz, April 20, 2025 at 11:51:13 AM UTC (edited April 20, 2025 at 11:54:38 AM UTC)

Plej varmajn paskajn bondezirojn al vi.
Ĝojan Paskon!

PaulP, April 20, 2025 at 2:47:18 PM UTC

Ĝojan Paskon ankaŭ al vi!!

ecorralest101, April 16, 2025 at 5:26:35 PM UTC

Hello, are any of the admins around? Some people have been posting unrelated things.

April 17, 2025 at 12:38:16 PM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

AlanF_US, April 19, 2025 at 1:53:44 PM UTC

If you want to report a spammer, please send a private message to TatoebaAdmins (or, if you can't remember that username, any individual admin). Please do not write Wall posts with links to spammers' profiles, messages, or sentences, since this will bring them more attention and encourage them to write more spam.

PaulP, April 19, 2025 at 2:09:30 PM UTC

Oh, I used team@tatoeba.org and didn't get an answer. So I used the wrong address. Thanks for letting me know!

AlanF_US, April 19, 2025 at 2:44:00 PM UTC

That address works, too. I see that you wrote an e-mail three days ago. The messages were hidden pretty soon after that. Thanks for reporting the problem.

LeviHighway, April 17, 2025 at 12:53:22 AM UTC

When you search for a Korean word on Tatoeba, such as 오늘, sentences including 오늘은 won't appear. Korean is a language that usually uses many kinds of suffixes. Is it possible for the search engine to recognize words with different suffixes?

Yorwba, April 17, 2025 at 6:53:43 AM UTC

You can use a * symbol to represent any number of characters. E.g. 오늘* will find 오늘 followed by any suffix: https://tatoeba.org/en/sentence...%EB%8A%98*&to=

You can also use it at the beginning, e.g. *십시오 for polite requests: https://tatoeba.org/en/sentence...C%EC%98%A4&to=

Or somewhere in the middle of a word if you want.

This and other search engine features are explained on the wiki: https://en.wiki.tatoeba.org/art...ow/text-search
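
To make the wildcard behaviour described above concrete, here is a small Python sketch that imitates it on whitespace-separated words. It uses the standard-library `fnmatch` module as a stand-in. The function name `matches` and the example sentences are assumptions for illustration, and Tatoeba's real search engine works differently under the hood.

```python
from fnmatch import fnmatchcase


def matches(sentence, pattern):
    """Return True if any whitespace-separated word fits the pattern.

    Assumption: words are split on spaces and trailing punctuation is
    stripped, which only approximates what a real search index does.
    """
    words = [w.strip(".,!?") for w in sentence.split()]
    return any(fnmatchcase(word, pattern) for word in words)


sentence = "오늘은 날씨가 좋아요."
print(matches(sentence, "오늘"))    # exact word only: 오늘은 is a different word
print(matches(sentence, "오늘*"))   # 오늘 followed by any suffix
print(matches("여기에 앉으십시오.", "*십시오"))  # any word ending in 십시오
```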

LeviHighway, April 17, 2025 at 11:22:28 AM UTC

Well, I'd rather Tatoeba treated Korean like Chinese and Japanese... Korean just has too many compounds and suffixes, so treating each space-separated chunk as an independent word is impractical. Also, it means the 'required vocabulary' has to include every form of the same word...

gillux, April 18, 2025 at 1:42:13 AM UTC

I understand, and I wish we had better support for Korean, too. But I would like to point out that the handling of Chinese and Japanese is different but not that great either. Chinese and Japanese characters are all considered independently and the search engine does not recognize word boundaries. This leads to the limitation described here: https://en.wiki.tatoeba.org/art...ord-boundaries

We could certainly enable the same behavior for Korean characters too, if you think it would be beneficial overall despite that limitation. If you'd like to help evaluate such a change, we could enable it on our testing server and get your feedback.
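
As a rough illustration of the trade-off described above, the following Python sketch treats every character as its own token and matches a query wherever its characters appear consecutively. The function names and example sentences are assumptions, and the real index is certainly more involved. The sketch only shows why suffixed Korean words become findable and why false matches across word boundaries appear.

```python
def char_tokens(text):
    """Split text into single-character tokens, ignoring whitespace."""
    return [ch for ch in text if not ch.isspace()]


def phrase_match(sentence, query):
    """True if the query's characters occur consecutively in the sentence."""
    s, q = char_tokens(sentence), char_tokens(query)
    return any(s[i:i + len(q)] == q for i in range(len(s) - len(q) + 1))


# Benefit: 오늘 is found inside 오늘은 without any wildcard.
print(phrase_match("오늘은 날씨가 좋아요.", "오늘"))

# Limitation: 京都 (Kyoto) also matches inside 東京都 (Tokyo Metropolis),
# because nothing marks where one word ends and the next begins.
print(phrase_match("東京都に住んでいます。", "京都"))
```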

LeviHighway, April 18, 2025 at 11:37:19 AM UTC (edited April 18, 2025 at 11:38:43 AM UTC)

Yes, that would be nice! Then "학생이에요." (I am a student) could be found by either "학생" (student) or "에요" (to be)? And "대한민국" (Republic of Korea) could be found with either "민국" (republic) or "대한" (Korea), right? Also, if I add "월" (month) or "술래잡기" (tag) to the required vocabulary, and a sentence includes "5월" (May) or "술래잡기와" (와 means "and"), will it also be considered to include that word? I also think this works better for Korean than for other languages, because Korean commonly uses 2,500+ separate Unicode characters. That would keep the results accurate (it's not like matching "apple" when you search for "a", which is what you would get if you enabled it for English).

gillux, April 18, 2025 at 4:27:25 PM UTC (edited April 18, 2025 at 4:28:47 PM UTC)

I have temporarily configured Korean to be treated like Chinese and Japanese on the testing server: https://dev.tatoeba.org/fr/sent...C%EA%B5%AD&to=

Note that the testing server only contains a subset of what is on tatoeba.org, and it is separate. Feel free to add whatever Korean sentences you want on the testing server, so you can test how the search behaves. (You’ll have to create a new account there.) Newly added sentences should appear in search results within 15 minutes.

Once it is confirmed that this change overall improves search in Korean, we can bring it to tatoeba.org, too.

atitarev, April 7, 2025 at 6:55:20 AM UTC (edited April 7, 2025 at 7:31:56 AM UTC)

Hi,

Do we provide transliteration or full vocalisation for languages such as Arabic?

I have only seen partial vocalisation (for disambiguations) and no transliteration.

If I understand correctly, Mandarin Chinese is the only language featuring transliterations and Japanese can also have furigana.

Waldelfe, April 9, 2025 at 8:38:31 PM UTC (edited April 9, 2025 at 8:54:17 PM UTC)

It would indeed be very nice if Arabic featured full vocalisation, just as Chinese and Japanese offer such aids.

atitarev, April 10, 2025 at 1:50:02 AM UTC (edited April 10, 2025 at 2:00:21 AM UTC)

Thanks for the support, @Waldelfe,

By "vocalisation" I actually mean providing vowels (diacritics) (Arabic حَرَكات ḥarakāt), which makes the pronunciation of Arabic words unambiguous, just in case you or someone else reading the thread doesn't know.

Example sentence:
How do I get to the train station?
Unvocalised Arabic text: كيف أصل إلى محطة القطار؟
Vocalised: كَيْفَ أَصِلُ إِلَى مَحَطَّةِ الْقِطَار؟
Transliteration: kayfa ʾaṣilu ʾilā maḥaṭṭati l-qiṭār(i)?

There is more than one way to transliterate Arabic, so the above is just one example.

The level of vocalisation can vary, especially on final vowels or their absence (ʾiʿrāb - إِعْرَاب).
E.g. مَحَطَّةُ الْقِطَارِ (maḥaṭṭatu l-qiṭāri) in the nominative case is more detailed than مَحَطَّة الْقِطار maḥaṭṭat al-qiṭār or when something is considered "obvious" and doesn't require consistent diacritics.

Let's have a discussion on this. I contribute Arabic translations on Wiktionary and the vast majority of Arabic word and sentence translations have diacritics and automated transliterations. It should be doable in Tatoeba as well, although it's not possible to do it automatically. It will require some effort (can also be error-prone!).

gillux, April 18, 2025 at 2:36:44 AM UTC (edited April 18, 2025 at 2:38:08 AM UTC)

Hello, thank you for bringing this up.
I did a quick search for Arabic vocalizers.
It looks like we could use some of these tools
https://github.com/linuxscout/mishkal
https://github.com/Barqawiz/Shakkala
to automatically generate vocalized versions of the sentences while allowing for manual correction when needed. This would also make it possible to find Arabic sentences by searching for their words in vocalized form, if that's of any use.
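
The reverse direction, going from a vocalized form back to the bare letters, is mechanical, so a search could in principle compare texts regardless of how fully they are vocalized. Here is a minimal Python sketch using only the standard library. The function name `strip_harakat` is an assumption, and it simply drops every nonspacing combining mark, which is a blunt rule that a real implementation might want to refine.

```python
import unicodedata


def strip_harakat(text):
    """Remove Arabic vowel diacritics (and any other nonspacing marks).

    NFC normalisation is applied first so that letters which have a
    precomposed form stay as a single code point rather than a base
    letter plus a combining mark.
    """
    normalized = unicodedata.normalize("NFC", text)
    return "".join(ch for ch in normalized
                   if unicodedata.category(ch) != "Mn")


vocalized = "\u0643\u0650\u062A\u064E\u0627\u0628"   # كِتَاب (kitāb) with kasra and fatha
print(strip_harakat(vocalized))                       # bare letters: كتاب
```

Partially and fully vocalized spellings of the same word reduce to the same string this way, so they could be matched against each other or against unvocalized text.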

I am also interested in the cultural background and contemporary usage of vocalized Arabic. For example, I know that Japanese reading aids (furigana) are used in signs aimed at kids, children's books, teenager books like manga, occasionally in "young adult" literature, and otherwise in any text that displays very rare or ambiguous Chinese characters. What is the current usage of vocalized Arabic?

Also, you mentioned that the vocalized form may vary. Does it only vary in terms of quantity of extra information (diacritics) added to the original word, or could it also vary as in different people use different "orthography" to vocalize?

atitarev, April 18, 2025 at 2:10:53 PM UTC

First of all, all tools may be helpful but not reliable. No tool can produce a reliable vocalised Arabic text from an unvocalised one. They can help to reduce the typing and manual input effort, but it would all have to be checked by a knowledgeable human. No harm in researching, though.

Vocalised Arabic is used in religious texts, especially the Qur'an; otherwise, the usage is very similar to Japan's furigana, mainland China's pinyin and Taiwan's bopomofo (zhuyin fuhao). I possess a few vocalised readers, dictionaries and textbooks. Notably, the Oxford and Larousse Arabic dictionaries, Russian author Ilya Frank's Arabic Joha adventures book, and Lingualism's bilingual PDF books in MSA have full vocalisation and audio.

On the various levels of vocalisation: let's say we want to write "the large book" - الكتاب الكبير (al-kitāb al-kabīr), also al-kitābu al-kabīr(u) or al-kitābu l-kabīr(u)

1. Full vocalisation in the nominative case can be الْكِتَابُ الْكَبِيرُ - al-kitābu al-kabīru or al-kitābu l-kabīru with markings for cases - ʾiʿrāb - https://en.wikipedia.org/wiki/ʾIʿrab. This can be even more pedantic with الْكِتَابُ ٱلْكَبِيرُ, marking the first alif in the second word as silent (a less common diacritic). The final vowels used in the ʾiʿrāb are often omitted for various reasons. Vowels are often unmarked when they are considered obvious by native speakers or were introduced earlier for learners. E.g. كِتَاب (kitāb) can be written as كِتاب.
2. Some diacritics are seldom used or even missing on Arabic keyboards, e.g. هٰذَا (hāḏā, “this”) with a dagger alif (a vertical stick). So, in a vocalized text you may see هَذا, which would be haḏā, an incorrect shortening caused by the lack of the diacritic (or unwillingness to use it).

atitarev, April 18, 2025 at 3:57:58 PM UTC

Oh, by the way @gillux. I don't have issues with using English personal or city names in Arabic and problems with transliterations are exaggerated.

So "Tom" is written as توم in Arabic, or تُوم with vocalisation, and no final case ending is added to foreign names like this. It would produce "tūm", but Arabs recognise foreign names and pronounce them imitating the foreign pronunciation for known personal or geographical names (sometimes adjusting to Arabic phonology). It depends on the speaker, of course, and there may be multiple ways of saying foreign names or loanwords.