Tatoeba

Wall (7,124 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.


(hidden message), April 21, 2025 at 6:04:06 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

mraz, April 20, 2025 at 11:51:13 AM UTC (edited April 20, 2025 at 11:54:38 AM UTC)

Warmest Easter wishes to you.
Happy Easter!

PaulP, April 20, 2025 at 2:47:18 PM UTC

Happy Easter to you, too!!

ecorralest101, April 16, 2025 at 5:26:35 PM UTC

Hello, are any of the admins around? Some people have been posting unrelated things.

(hidden message), April 17, 2025 at 12:38:16 PM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

AlanF_US, April 19, 2025 at 1:53:44 PM UTC

If you want to report a spammer, please send a private message to TatoebaAdmins (or, if you can't remember that username, any individual admin). Please do not write Wall posts with links to spammers' profiles, messages, or sentences, since this will bring them more attention and encourage them to write more spam.

PaulP, April 19, 2025 at 2:09:30 PM UTC

Oh, I used team@tatoeba.org and didn't get an answer. So I used the wrong address. Thanks for letting me know!

AlanF_US, April 19, 2025 at 2:44:00 PM UTC

That address works, too. I see that you wrote an e-mail three days ago. The messages were hidden pretty soon after that. Thanks for reporting the problem.

LeviHighway, April 17, 2025 at 12:53:22 AM UTC

When you search for a Korean word on Tatoeba, such as 오늘, sentences including 오늘은 won't appear. Korean is a language that usually uses many kinds of suffixes. Is it possible for the search engine to recognize words with different suffixes?

Yorwba, April 17, 2025 at 6:53:43 AM UTC

You can use a * symbol to represent any number of characters. E.g. 오늘* will find 오늘 followed by any suffix: https://tatoeba.org/en/sentence...%EB%8A%98*&to=

You can also use it at the beginning, e.g. *십시오 for polite requests: https://tatoeba.org/en/sentence...C%EC%98%A4&to=

Or somewhere in the middle of a word if you want.

This and other search engine features are explained on the wiki: https://en.wiki.tatoeba.org/art...ow/text-search
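
As a rough illustration of the wildcard behaviour described above, here is a minimal Python sketch. It does not use Tatoeba's actual search engine; it only imitates `*` matching on whitespace-separated words with the standard `fnmatch` module. The function name `find_sentences` and the sample sentences are made up for the example.

```python
from fnmatch import fnmatchcase


def find_sentences(pattern, sentences):
    """Return sentences in which at least one whitespace-separated word
    matches the wildcard pattern ('*' stands for any run of characters).

    Hypothetical helper: it only imitates the behaviour described in the
    post and says nothing about how the real search engine works.
    """
    hits = []
    for sentence in sentences:
        # Strip common punctuation so that "오늘은." still counts as "오늘은".
        words = [w.strip(".,!?") for w in sentence.split()]
        if any(fnmatchcase(word, pattern) for word in words):
            hits.append(sentence)
    return hits


sample = [
    "오늘은 날씨가 좋아요.",
    "오늘 뭐 해요?",
    "여기 앉으십시오.",
]

print(find_sentences("오늘", sample))     # exact word only
print(find_sentences("오늘*", sample))    # 오늘 plus any suffix
print(find_sentences("*십시오", sample))  # any word ending in 십시오
```

In this toy model the plain query misses 오늘은, which is the problem reported at the top of the thread, while the trailing `*` picks it up.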

LeviHighway, April 17, 2025 at 11:22:28 AM UTC

Well, I would rather Tatoeba treated Korean like Chinese and Japanese... Korean just has too many compounds and suffixes, so treating the parts between spaces as independent words is impractical. It also means the 'required vocabulary' has to include every form of the same word...

gillux, April 18, 2025 at 1:42:13 AM UTC

I understand, and I wish we had better support for Korean, too. But I would like to point out that the handling of Chinese and Japanese is different but not that great either. Chinese and Japanese characters are all considered independently, and the search engine does not recognize word boundaries. This leads to the limitation described here: https://en.wiki.tatoeba.org/art...ord-boundaries

Now, we could certainly enable the same behavior for Korean characters too, if you think it would be beneficial overall despite that limitation. If you'd like to help evaluate such a change, we could enable it on our testing server and get your feedback.
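
To make the trade-off concrete, here is a small Python sketch contrasting the two tokenisation styles. It is a simplification written for illustration, not the search engine's real code: `word_tokens`, `char_tokens` and `phrase_match` are hypothetical names, and a query is assumed to be a sequence of tokens that must appear consecutively.

```python
def word_tokens(text):
    """Whitespace tokenisation: each space-separated chunk is one token."""
    return [w.strip(".,!?") for w in text.split()]


def char_tokens(text):
    """Character tokenisation: every non-space character is its own token,
    so word boundaries are lost (assumed model of the CJK handling)."""
    return [ch for ch in text if not ch.isspace() and ch not in ".,!?"]


def phrase_match(query_tokens, sentence_tokens):
    """True if the query tokens occur consecutively in the sentence."""
    n = len(query_tokens)
    return n > 0 and any(
        sentence_tokens[i:i + n] == query_tokens
        for i in range(len(sentence_tokens) - n + 1)
    )


sentence = "학생이에요."

# Word tokens: "학생" is not a whole token of the sentence, so no match.
print(phrase_match(word_tokens("학생"), word_tokens(sentence)))   # False
# Character tokens: 학 followed by 생 occurs, so the sentence is found.
print(phrase_match(char_tokens("학생"), char_tokens(sentence)))   # True

# The limitation: character matching ignores word boundaries, so a query
# can match across two adjacent words.
spaced = "오늘 은행에 가요."
print(phrase_match(char_tokens("늘은"), char_tokens(spaced)))     # True
```

The last call shows the cost of the character-based approach: 늘은 is not a word, yet it matches because the end of one word and the start of the next happen to be adjacent.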

LeviHighway, April 18, 2025 at 11:37:19 AM UTC (edited April 18, 2025 at 11:38:43 AM UTC)

Yes, that would be nice! Then "학생이에요." (I am a student) could be found by either "학생" (student) or "에요" (to be)? And "대한민국" (Republic of Korea) could be found by either "민국" (republic) or "대한" (Korea), right? Also, if I add "월" (month) or "술래잡기" (tag) to my required vocabulary, and a sentence includes "5월" (May) or "술래잡기와" (와 means "and"), will it also be considered to include that word? I also think this approach suits Korean better than other languages, because Korean commonly uses 2,500+ separate Unicode characters. That would keep results accurate (it's not like searching for "a" and getting "apple", which is what would happen if you enabled it for English).

gillux, April 18, 2025 at 4:27:25 PM UTC (edited April 18, 2025 at 4:28:47 PM UTC)

I have temporarily configured Korean to be treated like Chinese and Japanese on the testing server: https://dev.tatoeba.org/fr/sent...C%EA%B5%AD&to=

Note that the testing server only contains a subset of what is on tatoeba.org, and it is separate. Feel free to add whatever Korean sentences you want on the testing server, so you can test how the search behaves. (You’ll have to create a new account there.) Newly added sentences should appear in search results within 15 minutes.

Once it is confirmed that this change overall improves search in Korean, we can bring it to tatoeba.org, too.

atitarev, April 7, 2025 at 6:55:20 AM UTC (edited April 7, 2025 at 7:31:56 AM UTC)

Hi,

Do we provide transliteration or full vocalisation for languages such as Arabic?

I have only seen partial vocalisation (for disambiguations) and no transliteration.

If I understand correctly, Mandarin Chinese is the only language featuring transliterations and Japanese can also have furigana.

Waldelfe, April 9, 2025 at 8:38:31 PM UTC (edited April 9, 2025 at 8:54:17 PM UTC)

It would indeed be very nice if Arabic featured full vocalisation, just as Chinese and Japanese offer such aids.

atitarev, April 10, 2025 at 1:50:02 AM UTC (edited April 10, 2025 at 2:00:21 AM UTC)

Thanks for the support, @Waldelfe,

By "vocalisation" I actually mean providing vowels (diacritics) (Arabic حَرَكات ḥarakāt), which makes the pronunciation of Arabic words unambiguous, just in case you or someone reading the thread doesn't know.

Example sentence:
How do I get to the train station?
Unvocalised Arabic text: كيف أصل إلى محطة القطار؟
Vocalised: كَيْفَ أَصِلُ إِلَى مَحَطَّةِ الْقِطَار؟
Transliteration: kayfa ʾaṣilu ʾilā maḥaṭṭati l-qiṭār(i)?

There is more than one way to transliterate Arabic, so the above is just one example.

The level of vocalisation can vary, especially on final vowels or their absence (ʾiʿrāb - إِعْرَاب).
E.g. مَحَطَّةُ الْقِطَارِ (maḥaṭṭatu l-qiṭāri) in the nominative case is more detailed than مَحَطَّة الْقِطار maḥaṭṭat al-qiṭār or when something is considered "obvious" and doesn't require consistent diacritics.

Let's have a discussion on this. I contribute Arabic translations on Wiktionary and the vast majority of Arabic word and sentence translations have diacritics and automated transliterations. It should be doable in Tatoeba as well, although it's not possible to do it automatically. It will require some effort (can also be error-prone!).

gillux, April 18, 2025 at 2:36:44 AM UTC (edited April 18, 2025 at 2:38:08 AM UTC)

Hello, thank you for bringing this up.
I did a quick search for Arabic vocalizers.
It looks like we could use some of these tools
https://github.com/linuxscout/mishkal
https://github.com/Barqawiz/Shakkala
to automatically generate vocalized versions of the sentences while allowing for manual correction when needed. This would also make it possible to find Arabic sentences by searching for their words in vocalized form, if that's of any use.

I am also interested in the cultural background and contemporary usage of vocalized Arabic. For example, I know that Japanese reading aids (furigana) are used in signs aimed at kids, children's books, teenager books like manga, occasionally in "young adult" literature, and otherwise in any text that displays very rare or ambiguous Chinese characters. What is the current usage of vocalized Arabic?

Also, you mentioned that the vocalized form may vary. Does it only vary in terms of quantity of extra information (diacritics) added to the original word, or could it also vary as in different people use different "orthography" to vocalize?

atitarev, April 18, 2025 at 2:10:53 PM UTC

First of all, all tools may be helpful but not reliable. No tool can produce a reliable vocalised Arabic text from an unvocalised one. They can help to reduce the typing and manual input effort, but it would all have to be checked by a knowledgeable human. No harm in researching, though.

Vocalised Arabic is used in religious texts, especially the Qur'an; otherwise, the usage is very similar to Japan's furigana, mainland China's pinyin and Taiwan's bopomofo (zhuyin fuhao). I possess a few vocalised readers, dictionaries and textbooks. Notably, Oxford or Larousse Arabic dictionaries, Russian author Ilya Frank's Arabic Joha adventures book, and Lingualism PDF bilingual books in MSA have full vocalisations and audio.

On the various levels of vocalisation: let's say we want to write "the large book" - الكتاب الكبير (al-kitāb al-kabīr), also al-kitābu al-kabīr(u) or al-kitābu l-kabīr(u).

1. Full vocalisation in the nominative case can be الْكِتَابُ الْكَبِيرُ - al-kitābu al-kabīru or al-kitābu l-kabīru with markings for cases - ʾiʿrāb - https://en.wikipedia.org/wiki/ʾIʿrab. This can be even more pedantic with الْكِتَابُ ٱلْكَبِيرُ, marking the first alif in the second word as silent (a less common diacritic). The final vowels used in the ʾiʿrāb are often omitted for various reasons. Vowels are often unmarked when they are considered obvious by native speakers or were introduced earlier for learners. E.g. كِتَاب (kitāb) can be written as كِتاب.
2. Some diacritics are seldom used or even missing on Arabic keyboards, e.g. هٰذَا (hāḏā, “this”) with a dagger alif (a vertical stick). So, in a vocalized text you may see هَذا, which would be haḏā, an incorrect shortening caused by the lack of the diacritic (or unwillingness to use it).
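
One practical consequence of these varying levels is that a search or comparison probably has to ignore the diacritics. The Python sketch below is only an illustration under that assumption, not anything Tatoeba implements: `HARAKAT`, `strip_harakat` and `same_skeleton` are hypothetical names, and the set of marks covers only the common harakat (U+064B to U+0652) plus the dagger alif (U+0670).

```python
# Arabic short-vowel and related marks (harakat): fathatan .. sukun,
# plus the dagger (superscript) alif. Letters such as alif wasla (U+0671)
# are deliberately left alone in this sketch.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def strip_harakat(text):
    """Return the text with the diacritics listed in HARAKAT removed."""
    return "".join(ch for ch in text if ch not in HARAKAT)


def same_skeleton(a, b):
    """True if two spellings differ only in their vocalisation marks."""
    return strip_harakat(a) == strip_harakat(b)


full = "\u0643\u0650\u062A\u064E\u0627\u0628"   # kitāb, fully vocalised
partial = "\u0643\u0650\u062A\u0627\u0628"      # kitāb, partly vocalised
bare = "\u0643\u062A\u0627\u0628"               # kitāb, unvocalised

print(strip_harakat(full) == bare)   # True
print(same_skeleton(full, partial))  # True

with_dagger = "\u0647\u0670\u0630\u064E\u0627"  # hāḏā with dagger alif
without = "\u0647\u064E\u0630\u0627"            # spelling without it
print(same_skeleton(with_dagger, without))      # True
```

Going the other way, from the bare skeleton to a vocalised form, is the hard and error-prone direction discussed above, and this sketch does not attempt it.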

atitarev, April 18, 2025 at 3:57:58 PM UTC

Oh, by the way, @gillux: I don't have issues with using English personal or city names in Arabic, and problems with transliterations are exaggerated.

So "Tom" is written as توم in Arabic, or تُوم with vocalisation, and no final case ending is added to foreign names like this. It would produce "tūm", but Arabs recognise foreign names and pronounce them imitating the foreign pronunciation for known personal or geographical names (sometimes adjusting to the Arabic phonology). It depends on the speaker, of course, and there may be multiple ways of saying foreign names or loanwords.

gillux, April 18, 2025 at 1:16:58 PM UTC

Hello, the search problem is solved now, thank you for reporting it and sorry for the inconvenience!

Beshir, April 18, 2025 at 11:33:52 AM UTC

Hi, the system has been throwing errors today; we are not able to search or translate. I even tried to log out and log in again.

GrizaLeono, April 18, 2025 at 11:17:06 AM UTC

An error occurred: "68023405f1d53"

SulemanRaka, April 18, 2025 at 10:55:41 AM UTC

Hello everyone! 🙂
I got an error while loading the Tatoeba page.
The error code is: 68022d82ce05a.
Can anyone help me with this?

TATAR1, April 18, 2025 at 7:14:15 AM UTC

An error occurred during the search. If it keeps happening, please contact us and give the error code "6801fb7d1ba2d".