Wall (1 discussion)

Tips

Before asking a question, make sure you have read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussion. Please read our rules against bad behavior.

atitarev, 7 April 2025 at 06:55:20 UTC (edited 7 April 2025 at 07:31:56 UTC)

Hi,

Do we provide transliteration or full vocalisation for languages such as Arabic?

I have only seen partial vocalisation (for disambiguations) and no transliteration.

If I understand correctly, Mandarin Chinese is the only language featuring transliterations and Japanese can also have furigana.

Waldelfe, 9 April 2025 at 20:38:31 UTC (edited 9 April 2025 at 20:54:17 UTC)

It would indeed be very nice if Arabic featured full vocalisation, just as Chinese and Japanese offer such aids as well.

atitarev, 10 April 2025 at 01:50:02 UTC (edited 10 April 2025 at 02:00:21 UTC)

Thanks for the support, @Waldelfe,

By "vocalisation" I actually mean providing vowels (diacritics) (Arabic حَرَكات ḥarakāt), which makes the pronunciation of Arabic words unambiguous. Just in case you or someone reading the thread doesn't know.

Example sentence:
How do I get to the train station?
Unvocalised Arabic text: كيف أصل إلى محطة القطار؟
Vocalised: كَيْفَ أَصِلُ إِلَى مَحَطَّةِ الْقِطَار؟
Transliteration: kayfa ʾaṣilu ʾilā maḥaṭṭati l-qiṭār(i)?

There is more than one way to transliterate Arabic, so this is just an example above.

The level of vocalisation can vary, especially on final vowels or their absence (ʾiʿrāb - إِعْرَاب).
E.g. مَحَطَّةُ الْقِطَارِ (maḥaṭṭatu l-qiṭāri) in the nominative case is more detailed than مَحَطَّة الْقِطار maḥaṭṭat al-qiṭār or when something is considered "obvious" and doesn't require consistent diacritics.

Let's have a discussion on this. I contribute Arabic translations on Wiktionary and the vast majority of Arabic word and sentence translations have diacritics and automated transliterations. It should be doable in Tatoeba as well, although it's not possible to do it automatically. It will require some effort (can also be error-prone!).

gillux, 18 April 2025 at 02:36:44 UTC (edited 18 April 2025 at 02:38:08 UTC)

Hello, thank you for bringing this up.
I did a quick search for Arabic vocalizers.
It looks like we could use some of these tools
https://github.com/linuxscout/mishkal
https://github.com/Barqawiz/Shakkala
to automatically generate vocalized versions of the sentences while allowing for manual correction when needed. This would also make it possible to find Arabic sentences by searching for their words in vocalized form, if that is of any use.

I am also interested in the cultural background and contemporary usage of vocalized Arabic. For example, I know that Japanese reading aids (furigana) are used in signs aimed at kids, children's books, teenager books like manga, occasionally in "young adult" literature, and otherwise in any text that displays very rare or ambiguous Chinese characters. What is the current usage of vocalized Arabic?
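To illustrate the search idea, here is a minimal sketch in Python. It assumes that vocalised and unvocalised spellings should match on the same stripped form. The names `strip_harakat` and `build_index` are hypothetical and not part of Tatoeba's code, mishkal, or Shakkala. The sketch goes in the easy direction only (removing diacritics); adding them is the hard problem those tools address.

```python
# Hypothetical helpers for illustration only; not Tatoeba's implementation.
# U+064B..U+0652: fathatan, dammatan, kasratan, fatha, damma, kasra,
# shadda, sukun. U+0670 is the dagger (superscript) alif.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def strip_harakat(text: str) -> str:
    """Return the text with vocalisation marks removed."""
    return "".join(ch for ch in text if ch not in HARAKAT)


def build_index(sentences):
    """Map each stripped (unvocalised) form to the original sentences."""
    index = {}
    for sentence in sentences:
        index.setdefault(strip_harakat(sentence), []).append(sentence)
    return index


# The word kitāb ("book") at three levels of vocalisation,
# written with escapes so the code points are unambiguous.
full = "\u0643\u0650\u062A\u064E\u0627\u0628"   # kaf+kasra, ta+fatha, alif, ba
partial = "\u0643\u0650\u062A\u0627\u0628"      # only the kasra is written
bare = "\u0643\u062A\u0627\u0628"               # no diacritics

index = build_index([full, partial, bare])
# A query in any of the three spellings finds all three entries.
print(len(index[strip_harakat(partial)]))  # prints 3
```

The sketch deliberately avoids Unicode normalisation to NFD, because that would decompose letters such as أ into a base letter plus a combining hamza, which is not a vowel mark.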

Also, you mentioned that the vocalized form may vary. Does it only vary in terms of quantity of extra information (diacritics) added to the original word, or could it also vary as in different people use different "orthography" to vocalize?

atitarev, 18 April 2025 at 14:10:53 UTC

First of all, all tools may be helpful but not reliable. No tool can produce a reliable vocalised Arabic text from an unvocalised one. They can help to reduce the typing and manual input effort, but it would all have to be checked by a knowledgeable human. No harm in researching, though.

Vocalised Arabic is used in religious texts, especially the Qur'an; otherwise, the usage is very similar to Japan's furigana, mainland China's pinyin and Taiwan's bopomofo (zhuyin fuhao). I possess a few vocalised readers, dictionaries and textbooks. Notably, Oxford or Larousse Arabic dictionaries, Russian author Ilya Frank's Arabic Joha adventures book, and Lingualism PDF bilingual books in MSA have full vocalisations and audio.

On the various levels of vocalisation. Let's say we want to write "the large book": الكتاب الكبير (al-kitāb al-kabīr), also al-kitābu al-kabīr(u) or al-kitābu l-kabīr(u).

1. Full vocalisation in the nominative case can be الْكِتَابُ الْكَبِيرُ - al-kitābu al-kabīru or al-kitābu l-kabīru with markings for cases - ʾiʿrāb - https://en.wikipedia.org/wiki/ʾIʿrab. This can be even more pedantic with الْكِتَابُ ٱلْكَبِيرُ, marking the first alif in the second word as silent (a less common diacritic). The final vowels used in the ʾiʿrāb are often omitted for various reasons. Vowels are often unmarked when they are considered obvious by native speakers or were introduced earlier for learners. E.g. كِتَاب (kitāb) can be written as كِتاب.
2. Some diacritics are seldom used or even missing on Arabic keyboards, e.g. هٰذَا (hāḏā, "this") with a dagger alif (a vertical stick). So, in a vocalised text you may see هَذا, which would be haḏā, an incorrect shortening caused by the lack of the diacritic (or unwillingness to use it).
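As a rough way to put a number on "level of vocalisation", one could count diacritics per letter. The sketch below is only an assumption about how such a measure might look: `harakat_ratio` is a hypothetical name, and the ratio cannot tell whether the marks are correct or whether a text is "fully" vocalised, since long vowels and omitted case endings legitimately carry no mark.

```python
# Hypothetical measure for illustration; not an established metric.
# U+064B..U+0652 are the common Arabic vowel marks (plus shadda and sukun);
# U+0670 is the dagger alif.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def harakat_ratio(text: str) -> float:
    """Return the number of vocalisation marks per letter (0.0 if no letters)."""
    marks = sum(1 for ch in text if ch in HARAKAT)
    # str.isalpha() is True for letters and False for combining marks.
    letters = sum(1 for ch in text if ch.isalpha())
    return marks / letters if letters else 0.0


# kitāb ("book") written with two marks, one mark, and none.
full = "\u0643\u0650\u062A\u064E\u0627\u0628"
partial = "\u0643\u0650\u062A\u0627\u0628"
bare = "\u0643\u062A\u0627\u0628"

for word in (full, partial, bare):
    print(harakat_ratio(word))
```

A measure like this could at most help sort sentences for human review, in line with the point above that any automatic result still needs checking by someone who knows the language.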

atitarev, 18 April 2025 at 15:57:58 UTC

Oh, by the way @gillux. I don't have issues with using English personal or city names in Arabic and problems with transliterations are exaggerated.

So "Tom" is written as توم in Arabic, or تُوم with a vocalisation and no final case ending is added to foreign names like this. It would produce "tūm" but Arabs recognise foreign names and pronounce them imitating the foreign pronunciation for known personal or geographical names (sometimes adjusting to the Arabic phonology). It depends on the speaker, of course and there may be multiple ways of saying foreign names or loanwords.

gillux, 18 April 2025 at 13:16:58 UTC

Hello, the search problem is solved now, thank you for reporting it and sorry for the inconvenience!

Beshir, 18 April 2025 at 11:33:52 UTC

Hi, the system has been throwing errors today; we are not able to search or translate. I even tried logging out and logging in again.

GrizaLeono, 18 April 2025 at 11:17:06 UTC

An error occurred: "68023405f1d53"

SulemanRaka, 18 April 2025 at 10:55:41 UTC

Hello everyone! 🙂
I got an error while loading the Tatoeba page.
The error code is: 68022d82ce05a.
Can anyone help me with this?

TATAR1, 18 April 2025 at 07:14:15 UTC

An error occurred during the search. If it happens again, please contact us and report the error code "6801fb7d1ba2d".

Rusydy, 7 April 2025 at 06:09:34 UTC

Hi,

I am Rusydy, and I would like to add my native language, Mandar, which is spoken in Indonesia, to Tatoeba. Following the instructions in this [wiki](https://en.wiki.tatoeba.org/art...age-request#), I created a [list in Tatoeba](https://tatoeba.org/en/sentence...s/show/173571) and sent a private message on the Tatoeba website about a month ago, but I have not received a response.

Today, I read the [wiki for adding a new language](https://github.com/Tatoeba/tato...-new-language) on GitHub. Therefore, I am wondering if I can open a new PR to add my endangered native language.

Thank you,
Rusydy

sacredceltic, 14 April 2025 at 20:42:36 UTC

Good job, Rusydy! It's so sad to see languages disappear along with all the specific cultural knowledge that goes with them. I'm sure Tatoeba will be the right place to preserve your memory.

sacredceltic, 14 April 2025 at 20:57:18 UTC (edited 14 April 2025 at 21:20:48 UTC)

If I may give you a warning, there's a hidden flaw in this type of collective global endeavour: "fitting to the norm/vibe". Each culture, and consequently each language, possesses its own cultural specificities. So please do not try to "fit" the "global culture"'s vibe by submitting sentences that are merely translations of known sentences from modern popular languages. Please also submit sentences that are entirely specific to your language and culture, so that translators will have to really dig into it in order to translate properly, if they ever can. That will pinpoint the specific value of your language and culture. Learning what Tom and Mary do in Boston in your language is not that interesting anyway and doesn't give credit to your unique cultural contribution.

Arkylie, 14 April 2025 at 22:53:59 UTC

Indeed. There's a very useful site for learning Kanji and Japanese terms (JPDB)... that uses Tom as its go-to name, and it bugs me so much. I could be learning various Japanese names, or even imprinting on a couple specific baseline Japanese names, but instead I get TOMU for what seems to be no good reason. I prefer to see native sentences with native names.

sacredceltic, 15 April 2025 at 07:38:40 UTC

You can filter out sentences based on such criteria or their authors. That’s what I do.

Arkylie, 15 April 2025 at 07:56:36 UTC

On JPDB, you can filter out example sentences? Or are you talking about this site here?

I mean the rest of the sentence typically teaches me something useful; it's just the out-of-place name that's jarring. "今井先生は教えている" is fully native and feels right; "スミス先生は教えている" feels off because it's injecting a non-Japanese character into the action for no useful reason. (And one whose name breaks the Japanese phonology to boot.)

On this site, it makes sense when it's a translation of a sentence where the characters are named natively to the original sentence. But I always like to see whichever characters are native to the sentence's original language.

hecko, 16 April 2025 at 18:23:58 UTC

you'll never guess where it gets its sentences from :p (it's this website)

the reasoning behind using one name for everything is that it helps avoid duplicates, e.g. "tom eats pears" vs "akira eats pears", which would otherwise have to be translated twice and would disconnect potentially useful indirect translations
but yeah it definitely has its flaws, and not just cultural; e.g. i've heard that the second wildcard "mary" doesn't decline at all in russian which reduces the information in russian translations

my preferred solution is to instead be creative enough with my sentences that there's no way they'll be duplicates of existing ones :p (which i know doesn't work for all types of contributors, but i like what i like)

ecorralest101, 16 April 2025 at 01:30:36 UTC (edited 16 April 2025 at 21:14:01 UTC)

Dear Rusydy, it is great that you are requesting a new language, but once it's approved here on Tatoeba, make sure to write sentences in that language, at least a thousand. There are many languages that have fewer than 20 sentences, which doesn't make sense to me.

16 April 2025 at 12:04:41 UTC

The content of this message violates our rules and has therefore been hidden. It is shown only to administrators and to the author of the message.

ecorralest101, 16 April 2025 at 01:28:13 UTC

Hello, CK told me that the Horus system takes care of exact duplicate sentences, but that hasn't happened so far. Please delete sentence #13151024; it was a typo.

sharptoothed, 13 April 2025 at 07:21:40 UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

sacredceltic, 14 April 2025 at 20:37:31 UTC

Most interesting