Tatoeba
belkacem77, July 9, 2018 at 9:55:49 PM UTC

Hi,
Kabyle is not yet available on Tatoeba. We hope to get it added so that we can contribute corpora and translate from other languages.

AlanF_US, July 10, 2018 at 3:12:28 AM UTC

Are you familiar with the process for requesting a new language?

https://en.wiki.tatoeba.org/art...guage-request#

cueyayotl, July 10, 2018 at 5:11:37 AM UTC

They have already requested the language properly and Kabyle is all set to be added in the next update. Belkacem77, we just need to be patient a LITTLE bit longer for the site to be updated.

belkacem77, July 10, 2018 at 8:14:52 AM UTC

Hello,
Thanks a lot!

Amastan, July 10, 2018 at 3:22:54 PM UTC

belkacem77

Kabyle has already been around here since 2012 with the generic title of Tamazight (Berber)

https://tatoeba.org/eng/sentenc...ne/indifferent

We have already published more than 100,000 sentences in Tamazight, most of them are in Kabyle, but there are also hundreds of sentences in Mozabite and the Nefusa dialect of Libya.

The language is considered as the same by its speakers and the North African nations that recognize Tamazight as an official language (Algeria and Morocco) refer to it as "Tamazight".

This is why I would recommend that we continue to follow this convention. In case we want to specify which dialect of Tamazight a sentence belongs to, then we could tag those sentences as "Kabyle."

All of my Tamazight sentences on this website are in Kabyle but I prefer to continue to work for the Tamazight language and not for a local dialect of the language.

Having a Kabyle language might eventually lead to a situation where countless sentences would be identical under two different language names.

It is just like having 7 different flags for English simply because US English is slightly different from UK English that's, in turn, different from Australian English that's in turn different from New Zealand English. I think we, as Tamazight-language activists, should rather work for the unification of our language.


belkacem77, July 10, 2018 at 5:10:56 PM UTC

@Amastan

thanks for your message.

We launched a series of localizations and NLP tools for the Kabyle language only, and we can't manage all the inflections of the whole set of Berber languages with one corpus. You know, every Berber dialect has its own functions, derivational morphemes, inflectional morphemes, and grammar system. These morphemes (derivational and inflectional) change the meaning of the stem and the grammatical information in each Berber language. Pronunciation differs as well.

Yes, I know that a lot of words are shared between them because they belong to the same family, like the Latin languages, but there is a gap in pronunciation, grammatical structure, and so on.

ISO 639-2 and ISO 639-3 list the codes for each language. Some of them are listed in ISO 639-2, such as Kabyle (kab), Tamasheq (tmh), and Standard Moroccan Tamazight (zgh); please see http://www.loc.gov/standards/is.../code_list.php . Others, such as Mozabite and Chawi, are listed in ISO 639-3; please see https://iso639-3.sil.org/code_tables/639/data . For example, the code page for Mozabite can be reached here: https://iso639-3.sil.org/code/mzb
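
The codes mentioned in this thread can be collected in a small lookup table. This is only an illustrative sketch: the names `BERBER_CODES` and `describe` are made up for the example, and the table covers only codes cited in this discussion, not the full ISO lists.

```python
# Codes cited in this thread; not an exhaustive list of Berber codes.
BERBER_CODES = {
    "ber": "Berber (collective code)",
    "kab": "Kabyle",
    "tmh": "Tamasheq",
    "zgh": "Standard Moroccan Tamazight",
    "mzb": "Mozabite",
}


def describe(code: str) -> str:
    """Return a readable name for a code, or a fallback for unknown codes."""
    return BERBER_CODES.get(code.lower(), f"unknown code: {code}")


print(describe("kab"))
print(describe("MZB"))
```

Anything not in the table falls through to the "unknown code" branch, so a consumer can spot sentences labeled with a code it has not been told about.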

We also launched the Kabyle language on CLDR (Common Locale Data Repository, a project of the Unicode Consortium; please see http://st.unicode.org/cldr-apps/v#locales/kab). The project is now maintained under the "kab" code (Kabyle language). Another team is maintaining the Kabyle language on Wikimedia (Wikipedia and Wiktionary; please see https://kab.wikipedia.org/ and https://kab.wiktionary.org ; Wiktionary is under incubation). 27 Mozilla projects are also localized into Kabyle: Firefox for all platforms, Thunderbird, and so on. I can't list all the projects we have launched for the Kabyle language (LibreOffice, OpenOffice, Linux Mint, and some others are coming in Kabyle).

Anyway, we are now launching a series of NLP tools dealing only with the Kabyle language. We launched Common Voice along with Mozilla and we have started to gather sentences and voices (please see https://voice.mozilla.org/kab), but we need more corpora to launch other kinds of processing (morpho-syntactic, grammar, semantics, sentiment analysis, and so on). That's why we want to launch our locale on Tatoeba: to gather specific corpora for these upcoming NLP tools. The tools are Python-based scripts, available for free on GitLab: https://gitlab.com/belkacem77/KabyleNLP . A Kabyle morpho-syntactic corpus based on a set of 48 tags is also available for free.

I'd like you to give us permission to reuse the Kabyle corpora you gathered here on Tatoeba, both to use them on Common Voice and to add them to our next corpora on Tatoeba. What do you think?

Thanks for your message

M. Belkacem
L10N kab Localizer
Admin of the kabyle locale (Mozilla, Evernote, LibreOffice, VK)

belkacem77, July 16, 2018 at 2:31:19 PM UTC

Hi again,

Could someone approve our request to join the localization project on Transifex?
We would like to move forward before going back to work, since we are currently on vacation. :) Thanks

TRANG, July 22, 2018 at 7:39:13 PM UTC

Kabyle has been added.

https://tatoeba.org/eng/sentenc...ne/indifferent

Ricardo14, July 22, 2018 at 10:06:32 PM UTC

That's such good news! Welcome aboard, Kabyle!
Thanks, Trang.

belkacem77, July 23, 2018 at 1:14:35 AM UTC

Thanks, Tatoeba.

There is an article about it in an online newspaper. We are also discussing it on our FB and VK pages. I plan to record some videos to help people understand the UI once the website is fully localized.

Amastan, July 23, 2018 at 6:46:55 AM UTC

I don't understand, and I am utterly surprised by the fact that some of my sentences are under the "Kabyle language" while I am here to write and support the Amazigh (Tamazight, Berber) language. I don't remember ever having intentionally contributed sentences under the name "Kabyle language".

Could you please stop doing that to my sentences, whoever is doing it?

Amastan, July 23, 2018 at 6:50:06 AM UTC

The flag of this sentence was apparently modified by someone who has the capacity to do so on Tatoeba:

https://tatoeba.org/eng/sentences/show/1597933

I will keep this sentence as an example. But since I am a corpus maintainer responsible for the Amazigh/Tamazight (Berber) corpus, I am not going to accept this. This is why I am immediately changing the flag for my Amazigh/Tamazight sentences back to the Amazigh flag/symbol.

Please don't change the flag of my own sentences.

Amastan, July 23, 2018 at 7:10:02 AM UTC

I have finished changing the flag of my own Amazigh/Berber sentences back to the Amazigh flag/symbol. Please don't change the flag of my sentences again. Like almost all North Africans (and particularly Amazigh speakers), I consider my language as Amazigh/Berber and not "Kabyle." Kabyle is the dialect of my home area in Algeria, but I can read, understand, and even write other dialects, just like many other Amazigh speakers. The difference between Amazigh dialects isn't at all as significant as the difference between French and Spanish or Italian. Besides, Amazigh/Berber is currently recognized as an official language in both Algeria and Morocco and it is clearly and unambiguously referred to as Tamazight (the Berber language), as one single language, and not as a "language family/family of languages/group of languages" or "the big Berber family". This is why I'd respectfully, but also quite formally, ask you not to touch my personal corpus on Tatoeba. I have spent years working on it, translating for countless days and months. People have joined my efforts, including Moroccans, Mozabites (Algerian Sahara), and Libyans, and they never questioned the unity of our language. Therefore, those who are willing to promote other dialects should just leave my corpus untouched and deal with their own work by themselves.

belkacem77, July 23, 2018 at 8:05:10 AM UTC

Hi Amastan,

You should contact the Admins for this. :)

As you know, segmentation is needed if you would like your corpus to be used in natural language processing (NLP). We can't put together sentences from several languages and assume that it is one language. I talked about Tatoeba to Shawi, Mozabite and Rifain speakers so that they would ask Tatoeba to open their locales here. More than 12 Berber languages are codified in ISO 639-3, and it's a chance for them to get their own corpora.

If we need to process these languages, we should define and codify some linguistic rules. Berber is not a standardized language and has a set of 22 dialects; each language has its own rules (phonology, pronouns, inflection and derivational schemes...).

We would like to use the corpus you gathered in Common Voice for the Kabyle language, as we have started to collect voices, but we can't, because your corpus contains sentences that Kabyle speakers can't read. Last but not least, I noticed that you used two scripts in your corpus: Tifinagh and Latin. As you know, Kabyle is written in a Latin-based script. Algorithms can't deal with such a mixture.
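
The mixed-script concern can be checked mechanically. The sketch below is an assumption about how one might do it, not code from the KabyleNLP repository: it labels a sentence by whether its letters fall in the Unicode Tifinagh block (U+2D30 to U+2D7F). The function name `detect_script` is hypothetical.

```python
TIFINAGH_START, TIFINAGH_END = 0x2D30, 0x2D7F  # Unicode Tifinagh block


def detect_script(sentence: str) -> str:
    """Classify a sentence as 'tifinagh', 'latin-or-other', 'mixed', or 'none'."""
    letters = [ch for ch in sentence if ch.isalpha()]
    if not letters:
        return "none"  # digits or punctuation only
    tifinagh = sum(TIFINAGH_START <= ord(ch) <= TIFINAGH_END for ch in letters)
    if tifinagh == len(letters):
        return "tifinagh"
    if tifinagh == 0:
        return "latin-or-other"
    return "mixed"


print(detect_script("D argaz."))
print(detect_script("\u2d30\u2d63\u2d53\u2d4d"))
```

A downstream tool could use such a check to keep only Latin-script sentences without requiring any change to the shared corpus.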

As a comparison: imagine you have a corpus named "Latin" where you put together French, Italian, Spanish, Portuguese, Romanian and other Romance languages. How could a machine apply the same rules to process such a corpus?

I suggest that you tag your sentences with the right Berber language to allow people to use them later in language processing algorithms.

amazigh84, July 23, 2018 at 9:59:27 AM UTC, edited July 23, 2018 at 10:42:51 AM UTC

Azul fellawen

I suggested this change of flag for Tatoeba, and it appears that most of the admins largely agree with the proposal; they even told me that the question had already come up with Amastan.

Without wanting to start any controversy here, allow me to say that the sentences are not ours. The translations come from us, but we cannot claim ownership of any sentence. Besides, I took great pleasure in adding audio to many of Amastan's sentences.

Tamazight is very rich in its various components, and Kabyle is only one of those components. As on Wikipedia, it seems wiser to me to have a Kabyle version of Tatoeba rather than one global Tamazight version.

Amastan, July 23, 2018 at 10:16:20 AM UTC

Amazigh84,

I will continue to write in English for the sake of the Admins so that they understand what is going on here, although I would have loved to discuss the whole thing in Tamazight/Berber, our common language. Thanks for adding audio files to my sentences. You have been doing a good job. There are a few errors but we will deal with them later because I am currently extremely busy working on something else.

However, I don't remember having said to any admin or any other person that it was OK to change the language flag of my sentences. What I tell people is that if they want to work for something, they need to put in their own energy and effort instead of simply using somebody else's work for the promotion of their own views.

Regarding Wikipedia, I am already personally involved in the setting up and promotion of a Tamazight version of the encyclopedia. Mozabite and Moroccan volunteers are already on that fledgling project and, to tell the truth, even on Wikipedia, I wouldn't like my articles to be copied and pasted into the Kabyle Wikipedia, which I don't endorse at all. I don't endorse Wikipedia in any individual Amazigh dialect. I work hard to unify the language, not to promote a dialect in particular.

Let it be clear that I am clearly, formally and unambiguously not taking part in a project for the Kabyle language. This is an irrevocable decision. I simply and clearly don't want to promote Kabyle as a separate language. I have been promoting it as part of the Amazigh language all my life but I consider Tamazight as my language, not Kabyle. I imagine that I have the right to say no. So others need to accept it too.

No hard feelings, but this is my stance as far as this issue is concerned.

belkacem77, July 23, 2018 at 10:33:30 AM UTC

Azul ay atmaten,

The ultimate goal is to move toward NLP. We have in fact launched several projects: speech recognition, morpho-syntactic analysis, segmentation, lemmatization...

We cannot process all the Berber languages in the same way; that was the problem.

If we manage to separate out each language, using these corpora for NLP purposes will be simpler for computer scientists (I am a software engineer and involved in NLP).

Moreover, it is easier to automatically bring together sentences labeled kab, shy, mzb, rif, etc. than to try to split apart a corpus labeled only with "ber".
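
The asymmetry described here can be made concrete. In this sketch the sample records and the function names `merge` and `split` are hypothetical: pooling sentences that carry specific codes is a single pass, while going the other way from a pool labeled only "ber" leaves nothing to split on.

```python
from collections import defaultdict

# Hypothetical records: (language code, sentence text).
labeled = [
    ("kab", "sentence A"),
    ("mzb", "sentence B"),
    ("kab", "sentence C"),
]


def merge(records):
    """Merging is trivial: replace each specific code with the shared one."""
    return [("ber", text) for _, text in records]


def split(records):
    """Group texts by code; only useful when records carry specific codes."""
    groups = defaultdict(list)
    for code, text in records:
        groups[code].append(text)
    return dict(groups)


print(split(labeled))         # two groups: kab and mzb
print(split(merge(labeled)))  # one group: everything is "ber"
```

Recovering the original groups from the merged pool would need extra information, such as tags or a language classifier, which is outside this sketch.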

I did not understand Amastan's approach or what it aims at, especially since it uses both the Tifinagh and the Latin scripts.

To bring our NLP projects on Kabyle to completion, we need corpora exclusively in Kabyle, written in Latin script (see Unicode's CLDR).

There is another important risk: Kabyle dominates the other varieties within the Berber corpus, which is one more threat to the other dialects. Yet these corpora should also serve to provide means of preservation for the other minority Berber languages, which lack the means to develop on their own. We must not reproduce the pattern seen in education, where Kabyle is taught disguised as Berber, thereby killing off all the other variants. All the Berber languages must be preserved, hence the need to separate them.


Amastan, July 23, 2018 at 10:50:47 AM UTC

belkacem77,

I am opposed to the use of the word "language" to refer to regional variants (dialects) of the same language, Tamazight. I have already said that, if I remember. The speakers of the language (all dialects included) refer to their language as Tamazight.

Apparently, you want to separate each dialect as a language on its own for technical reasons. I understand that, but I am not willing to participate in it. I prefer to view Tamazight, like millions of other Amazigh speakers and language activists, as one single language.

This is why, in case some proponents of the Kabyle language on this website want to use my corpus of over 100k sentences for their project, I would politely say no, since this goes against my aims as a language activist who has worked for more than 2 decades to unify the language. You could make your own corpora for each "language", I think.

As for the differentiation of the different dialects, what I suggest and I already use is tags, not a different language flag.
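
The tag-based alternative can be sketched the same way. Assuming each sentence keeps the shared language code and carries optional dialect tags (the `Sentence` class and its fields are hypothetical, not Tatoeba's data model), a dialect-specific subset is a simple filter:

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    text: str
    lang: str = "ber"                       # shared language code
    tags: set = field(default_factory=set)  # e.g. {"Kabyle"}


def with_tag(sentences, tag):
    """Return the sentences that carry the given dialect tag."""
    return [s for s in sentences if tag in s.tags]


corpus = [
    Sentence("sentence A", tags={"Kabyle"}),
    Sentence("sentence B", tags={"Mozabite"}),
    Sentence("sentence C"),                 # untagged
]
print([s.text for s in with_tag(corpus, "Kabyle")])
```

Untagged sentences drop out of every dialect subset, so this approach depends on how completely the corpus has been tagged.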

Meksems, July 23, 2018 at 6:14:59 PM UTC

We are not using and will not use your sentences. We are providing thousands of sentences in the Kabyle language. We did it before on GitHub and we continue to do it in parallel here. We feel, we consider, we know that Kabyle is a language, so we are doing whatever it takes to promote it as such.
As our administrator Belkacem77, who has been doing such a great job, said, it is more practical.
tanemmirt

Selyan, July 26, 2018 at 2:23:14 PM UTC

Thank you, I appreciate your decision, which was based on scientific arguments.
Kabyle is a language in its own right, and I do not understand the attitude of those who want to reduce it to a dialect. Let each one work for his own language.
We have our language and we will continue to work for it.

amazigh84, July 23, 2018 at 12:30:19 PM UTC

Thank you, belkacem77, for your technical explanations; I couldn't agree more.
Thanks to all the Tatoeba admins for their efforts.

https://github.com/Tatoeba/tatoeba2/issues/1529
https://github.com/Tatoeba/tatoeba2/pull/1531

Amastan, July 23, 2018 at 12:50:32 PM UTC

@amazigh84 @cueyayotl

This is what Cueyayotl said on the Github page:

"I discussed this issue with Amastan (the original contributor of sentences in "Berber") a long time ago and I agree with vujma's proposition. Amastan contributed mainly in the Kabyle language, with a few exceptions (which he DOES mention in their respective comment sections). I propose the addition of "Kabyle" as a language, and a FULL migration of all of Amastan's "Berber" sentences (and possibly ALL "Berber" sentences) into "Kabyle" and eventually individually changing the sentences that are not in Kabyle. Vujma, the eventual goal should be to not have a "Berber" language in our language list, but to have all Berber languages added individually instead. The issue is that, as you mention above, there are many of these languages, and we currently don't have the users to add sentences in all of them. We hopefully will, someday."

But he didn't mention that I agreed. He said that he agrees with somebody else that my sentences be "migrated" into "Kabyle". I am totally opposed to that, gentlemen. I never agreed in the past. My goal isn't the promotion of a "Kabyle language." Sorry, guys. Find someone else to give you 100k sentences in Kabyle, but this certainly won't be me.

amazigh84, July 23, 2018 at 12:59:33 PM UTC

Ulac fellas Amastan, I understand your point of view. I just said that you had discussed this possibility in the past; I didn't mean that you had agreed.

Anyway, we have both Berber (mainly the Kabyle variant) and now Kabyle, which, as belkacem mentioned, will be very helpful for our community, especially nowadays with information technology.

belkacem77, July 23, 2018 at 8:46:46 PM UTC

@Amastan
Please take a look at the tokenization algorithm on GitLab:

https://gitlab.com/belkacem77/K...kennization.py

The algorithm is Python-based. I'm using it to tokenize Kabyle texts before POS tagging (part-of-speech tagging). It's based on a set of affixes used in the Kabyle language.
The Kabyle affixes are listed in this file, based on the work of Dr Kamel Nait Zerrad (INALCO - France):
https://gitlab.com/belkacem77/K...ixescolles.txt

The question is: how would you tokenize all 22 Berber languages with this set of affixes?
The answer is: we could add all the affixes. How many? Who knows all these Berber affixes? Is there any study (grammar and syntax) covering all the Berber languages?
What about ambiguity?
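
To make the question concrete, here is a toy affix-driven tokenizer. It is not the script linked above: the clitic set is a placeholder containing only the hyphen-attached particle "d" (the example word "Yerwel-d" is quoted elsewhere in this thread), and `tokenize` is a hypothetical name. The point is that the output depends entirely on the affix list passed in.

```python
import re


def tokenize(text, clitics):
    """Split on whitespace/punctuation, then detach known hyphenated clitics."""
    tokens = []
    for word in re.findall(r"[^\s.,;:!?]+", text):
        stem, *rest = word.split("-")
        tokens.append(stem)
        for part in rest:
            if part.lower() in clitics:
                tokens.append("-" + part)  # known clitic: separate token
            else:
                tokens[-1] += "-" + part   # unknown: keep attached
    return tokens


KABYLE_CLITICS = {"d"}  # placeholder, not the real affix file
print(tokenize("Yerwel-d.", KABYLE_CLITICS))
print(tokenize("Yerwel-d.", set()))
```

With the placeholder set the particle becomes its own token; with an empty set the word stays whole. Whether one shared list or several per-variety lists should be supplied is exactly what is being debated here.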

This is just an example of the most basic processing. What about syllabification? Stop words? Derivation? Inflection? Semantics, and so on...

There is a lot of ambiguity within the Kabyle language itself to fix. We can't cover all the Berber languages with one corpus, one set of NLP algorithms, and so on.
I know that you have been doing a good job for years, but I think it's time to give every language its own corpus. There is a solution: all the Berber languages are codified in ISO 639-3, and Tatoeba supports it. We are going to give everyone a chance to progress and save their own heritage. With the current approach, the Kabyle language is going to become the standard, and that is not a good way to save the Berber languages. We need them kept separate.

Selyan, July 26, 2018 at 2:25:12 PM UTC

Kabyle is a language in its own right, and I do not understand the attitude of those who want to reduce it to a dialect. Let each one work for his own language.
We have our language and we will continue to work for it.

Amastan, July 23, 2018 at 12:58:54 PM UTC

I also found this message on this Github page:
https://github.com/Tatoeba/tatoeba2/issues/1529

"I've a question please, I noticed that there is another berber language defined in your database which is Tarifit from north of Morocco and of course it is "different" from mine (kabyle) but all belongs to the berber family which is in fact Tamazight, just like you may say french, italian, Spanish ... are belonging to Latin language so my question is: is it possible to change the Berber to Kabyle since all of the sentences i read are from kabyle variant ? I am asking you this because in Wikipedia this is what we have, kabyle version and other berber languages versions, which make by the end berber languages more rich i guess."

If you guys want to transfer Tamazight sentences to the "Kabyle language" corpus, please avoid touching my sentences. The author of the sentences (ie me) formally and publicly refuses this. Thank you very much for your understanding.


amazigh84, July 23, 2018 at 1:04:52 PM UTC

I sent you these messages myself because I wanted to share all the discussions and proposals with you, and of course you have all the rights to your sentences; I understand that.

belkacem77, July 23, 2018 at 5:35:41 PM UTC

Amazigh,

You can change the flag of your own sentences yourself, as I'm doing. But if you recorded sentences from another contributor, or if someone recorded your sentence under the ber tag, please don't change the flag.

We will build our corpus from scratch. I have thousands of sentences sent by academics and hosted on GitHub. We are correcting them, so feel free to copy from them. And don't forget to translate from/to other languages, as recommended by Tatoeba.

I will give training sessions next September for the students of the two departments of Amazigh languages (Tizi and Bgayet); they are interested in joining the kab Tatoeba team. So forget your old contributions if you can't change them, and let's begin new ones together.

Remember that the goal is to produce corpora for the next step, NLP algorithms. So we need only sentences in the Kabyle language.

Thanks for help

AlanF_US, July 25, 2018 at 1:23:26 AM UTC

@belkacem77, @cueyayotl, @Amastan, @amazigh84, @Meksems

I'm glad that we have people who care so deeply and are willing to put so much energy into contributing sentences to the corpus. However, there are a few things about this conversation that make me uneasy. Clearly, there's a basic disagreement about which language identifier should be used to identify sentences contributed to Tatoeba that could be considered either Kabyle or Berber (also known as Tamazight; I'm using "Berber" because it's in the current list of languages at Tatoeba). An argument can always be made in favor of using more specific language identifiers, and a counterargument can always be made in terms of using a smaller number. I urge you to tackle this issue in a conversation with the admin team (team at tatoeba dot org) rather than a long thread on the Wall, and I urge you to address it sooner rather than later, since the longer these disagreements remain unresolved, the more of a mess will result. Where possible, please think pragmatically rather than ideologically, and please think in terms of consistency with the rest of Tatoeba. I hope that we can come to an approach that will satisfy everyone involved, and not lead to people carving out individual territories within Tatoeba. I also urge you to make sure that you all respect the guidelines:

https://en.wiki.tatoeba.org/art...or-in-tatoeba#

https://en.wiki.tatoeba.org/art...ood-sentences#

Many thanks.

Aiji, July 25, 2018 at 1:38:02 PM UTC

We should have a "+1" system on this wall to avoid adding a comment just to say how much I agree with you!

> I hope that we can come to an approach that will satisfy everyone
> involved, and not lead to people carving out individual territories within
> Tatoeba.
It seems to be a recurrent thing recently...

I'm not well aware of the issues in the Berber language(s), but I'm more aware of the issues around NLP and related fields, and, ideology aside, I'd just like to point out that tools are better kept separate, in the sense that you don't modify the source providing data to fit whatever you want to develop, or to sneak around some messy issues. The source (here Tatoeba) stays a pure provider. If it's messy, your tool(s) clean it (or ignore it). Not the other way around.

And seems
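
The "tool cleans the data" idea can be sketched as a filter applied on the consumer's side. The sketch assumes a tab-separated export with three columns (id, language code, text); the sample data and the name `select_sentences` are hypothetical.

```python
import csv
import io


def select_sentences(tsv_file, wanted_langs):
    """Yield (id, lang, text) rows whose language code is in `wanted_langs`."""
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) == 3 and row[1] in wanted_langs:
            yield tuple(row)


# Hypothetical sample standing in for a downloaded export.
sample = io.StringIO(
    "1\tber\tsentence A\n"
    "2\tkab\tsentence B\n"
    "3\teng\tsentence C\n"
)
print(list(select_sentences(sample, {"kab"})))
```

The source file is read but never modified; any further cleaning (script checks, tag checks) would be chained after this step inside the tool.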

AlanF_US, July 26, 2018 at 1:49:10 AM UTC

Well put.

Amastan, July 26, 2018 at 1:12:42 PM UTC

@belkacem77, @cueyayotl, @Amastan, @amazigh84, @Meksems

In a private message that Trang sent me, she suggested that we continue to discuss this topic publicly. In fact, I too prefer this because several of my colleagues are already reading what we have posted here and they might be interested in joining the discussion.

Here is my point: I totally understand that Tatoeba doesn't want to move away from the ISO 639-3 categorization. However, the aspirations of Amazigh language activists (like myself and thousands of other people + millions of speakers) should be taken into consideration. The Algerian and Moroccan states are working on everything related to the standardization of the Amazigh language (requesting an ISO code, etc.), and I think that we had better keep the status quo before Tatoeba decides to do anything with the corpus currently available under the Amazigh/Berber flag.

I have spent more than 20 years of my miserable life studying, learning, and teaching the Amazigh language. I did research on various dialects (including the Blida Atlas dialect, Shawi, Amoucha dialect [near Kherrata in the Bejaia/Setif area], the Zenata dialect spoken in Timimoun [Adrar area], etc.) and I can confirm that there is an undeniable mutual intelligibility between the speakers of all those dialects that I can speak myself together with many of my colleagues. I have also studied linguistics for 4 years and taught basic linguistics for a year to would-be field researchers (2005-2006 [my wife took part in the training]).

I am well aware of the issues regarding the distinction between what is a language and what is a dialect (regional variant). Brilliant linguists like Salem Chaker have always claimed and continue to claim that, because of its unified grammatical structure, the Amazigh language should be considered as one language.

Please have a look at this article (in French) authored by Pr Salem Chaker regarding this issue:

https://www.facebook.com/notes/...0303391649292/

What I can confirm myself is that there is a true language continuum stretching, at least, from southern Tunisia to central northern Morocco, including ALL the northern Saharan oases (Figuig, Timimoun, Bechar, Boussemghoune, Mzab, Ouargla, etc.).

All the people of those areas speak the dialects of the very same language and it would take an Amazigh speaker only a week of direct linguistic exchange to learn how to speak any of those dialects.

Contrary to what Belkacem claims, the prefixes (there are not many) are the same for all the dialects (there could only be slight differences in the pronunciation) and the basic vocabulary is the same almost everywhere.

Last but not least: the conjugation is the same and it makes inter-dialectal communication amazingly easy. You know very well that even in Kabylie we have people that don't even deign to learn the expressions used in the next village, so how would we get them to listen and get familiar with dialects spoken some 200 miles away from their village? Yet other Kabyles can speak at least 1 other dialect, and many Shawis and Chenouis and even more Moroccans can understand Kabyle in addition to their own dialects, and they would use words of various dialects in their texts, poetry, or novels. It took me only 10 days to learn Shawi. I can also speak Spanish and it took me 9 months to learn it.

Conjugation markers are the same. The ventive particle is the same. The predicative particle is the same. The factitive verb prefix is the same. The passive verb prefix is the same. The marks of the feminine are the same. Almost all the structural grammatical elements are the same.

CONJUGATION:
All the dialects have the following conjugation markers:

-ɣ ----- Ssneɣ - I know
t-d ----- Tessned - You know
y- ----- Yessen - He knows
t- ----- Tessen - She knows
n- ----- Nessen - We know
-n ----- Ssnen - They know (masculine)
-nt ----- Ssnent - They know (feminine)

This conjugation is the same in virtually every dialect. Only minor phonological differences could exist and those phonological differences exist even inside Kabylie.
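
The marker list above can be written down as data and checked against the example forms. This only verifies the forms quoted in this post; it says nothing about other verbs or dialects, and the names `PARADIGM` and `carries_markers` are invented for the example.

```python
# ((prefix, suffix), example form of "ssen") as listed in the post above.
PARADIGM = {
    "I":            (("", "ɣ"), "Ssneɣ"),
    "you":          (("t", "d"), "Tessned"),
    "he":           (("y", ""), "Yessen"),
    "she":          (("t", ""), "Tessen"),
    "we":           (("n", ""), "Nessen"),
    "they (masc.)": (("", "n"), "Ssnen"),
    "they (fem.)":  (("", "nt"), "Ssnent"),
}


def carries_markers(form, prefix, suffix):
    """Check that a form starts and ends with the given person markers."""
    form = form.lower()
    return form.startswith(prefix) and form.endswith(suffix)


for person, ((prefix, suffix), form) in PARADIGM.items():
    print(person, form, carries_markers(form, prefix, suffix))
```

The same table could be filled in with forms from another dialect to test the claim on real data rather than by assertion.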

VENTIVE PARTICLE:
It indicates the direction of the action and it is "d" in every Amazigh dialect including in the Zenaga of Mauritania (that I am currently and painstakingly documenting):

Yerwel ---- He ran away.
Yerwel-d ---- He ran away (towards us/He came running).

PREDICATIVE PARTICLE:
It is used to introduce something in a sentence and it's the equivalent of "it is" or simply the verb "to be" in English. This particle is "d":

D argaz. ---- He is a man/It is a man.

This sentence is the same everywhere.

FACTITIVE VERB PREFIX (s-):

The factitive verb prefix is the same. In fact all the verb prefixes are the same, and some of them include:

"S-" The prefix of the factitive verb (to make sb do something):

Ečč (to eat)
Ssečč (to feed, to make sb eat sth)

You guys know very well that even in Kabylie, there are people who pronounce it "ccečč" but this is just a regional variant of the same prefix.

"TTW-" The prefix of the passive verb (to be done):

Ečč (to eat)
Ttwečč (to be eaten)

Ssen (to know)
Ttwassen (to be known)

This prefix is the same everywhere.

The word order is absolutely the same in all the dialects. That's what Salem Chaker calls the "deep unity" of the language.

Even my own mother can understand the Shawi and the Chenoua dialects although she was born and raised in Kabylie.

I have always wanted this website to be a place for all the dialects to flourish, yet I would love to see that take place under the same language flag. As I am documenting the Zenaga dialect, I am also anxious to start publishing the wonderful sentences of my main Mauritanian informant on this website (and on his behalf), thus making Tatoeba a website that could play a significant role in the documentation of dying dialects like Zenaga.

Complaining about not being able to unify the dialects, or considering each dialect a separate language, reminds me of some of my Kabyle students and colleagues who, in order to excuse themselves from speaking Kabyle/Tamazight (preferring to speak French or Arabic with Kabyles), would tell me that "the Kabyle of your village is quite different from that of my village, so let's just communicate in French or Arabic". The same excuse is also given by some Kabyle speakers from Tizi-Ouzou who would prefer to speak Arabic or French just to avoid speaking Kabyle with Kabyle speakers from Bejaia (this is very common in Algiers, the city where I live). This also reminds me of some of my English students who, before they even learn how to conjugate a verb in the present simple, would ask me whether what I'm teaching them is UK or US English, and who, before they even listen to anything in English, would complain that American English is "impossible to understand" (as if they had ever really tried to listen to it and understand it).

So I categorically refuse to endorse the division of the Amazigh language into various separate languages for the reasons I have stated above. My suggestion is that we keep things just as they are now, waiting for some upcoming development that I don't think would take a long time to come. As for how to differentiate between the dialects of the Amazigh corpus of which all my sentences are part, as I said, I'd use tags for each dialect.

I hope you understand that.

belkacem77 July 27, 2018 at 11:17:50 AM UTC

@Amastan, @cueyayotl, @amazigh84, @Meksems

Thanks for your reply.

Please refer to the publication by Pr Muhand Tilmatine (https://www.academia.edu/263714...s_dominantes).

Dr Mohand Tilmatine is a Kabyle sociolinguist from the Department of Philology, Universidad de Cádiz, Spain. He says that the word Tamazight or Amazigh refers to an abstract idea without any real existence. The word refers to a set of dialects/languages spread over a large territory (North Africa), coming from one language family, but which developed differences because of their history, geography, culture, human potential, economic, cultural and linguistic vitality, political action and social demand.

Another sociolinguist, Dr Dourari, the director of CNEPLET, also recommends that the Berber languages be considered separately, not as a whole, within the upcoming Algerian Berber Academy.

The same opinion is shared by linguists from the Amazigh Culture and Language Departments of Tizi-Ouzou, Béjaia and Bouira, among them Dr Mohand Akli Salhi, Dr Said Chemakh, Pr Kamel Bouamara, Dr Moussa Imarazène... We are working together with some of them on projects such as spellcheckers, morphosyntactic analyzers, syllabication tools... but only for the Kabyle language, not for the "Amazigh language", which is only an abstract term.

Except for intra-Berber comparative linguistics studies published within the DLCAs, all publications from these departments deal with the Kabyle language. You can refer to the websites of these universities to see the publications. Together, these publications served as raw material for our localization projects, programs dealing with human language (Kabyle), corpora…

Pr Kamel Nait Zerrad, a linguist from the INALCO institute, also published an advanced Kabyle grammar and a Kabyle conjugation book, both dealing only with the Kabyle language. Pr Kamel Bouamara also published the first Kabyle-Kabyle dictionary.
The father of Amazigh studies, Mouloud Maméri, who published the first grammar, entitled it Berber Grammar – Kabyle variant.

Until now, there has been no common (shared) grammar for all the Berber languages. I have four grammars at hand: Kabyle, Chawi, Mozabit and Rifan, and they are all different except for the metadata language. Each grammar describes its own pronouns, affixes… etc. There are many gaps. You can also refer to Dr Salem Chaker - Linguistique Berbère – Tome II – where he describes some differences in vocabulary (“taddart” means village in Kabyle, but home in Mozabit).

Suppose a tourist from Germany visits Aheggar in Algeria and wants to use Tatoeba to find out how to say in Berber: “Please, show me how to go home”. He will find a lot of translations and suggestions. Which one should he choose? And you know that speakers from different Berber linguistic groups can’t understand each other.

As I said before, you can maintain one corpus for the Berber languages, and we encourage you to do so, as it can be a good tool for saving our linguistic heritage, especially the dying languages. Our team wants to maintain a corpus for the Kabyle language because of some upcoming projects where we need to deal with only one language, not a family of languages.


belkacem77 July 26, 2018 at 1:26:00 PM UTC

Hi,

Please refer to the page https://en.wikipedia.org/wiki/Berber_languages , which has information that helps to understand the context.

Note that Berber studies are very recent. They grew out of the social demand that has existed since the 1980s (the Berber Spring).

There is no standard language called Berber (or Tamazight) that is spoken/learned by all people from the Siwa Oasis to the Canary Islands, and from Burkina Faso, Mali and Niger to the north of Africa.

There are more than 22 known languages belonging to the same family, called Berber (the ISO 639-2 code for Berber is ber).
The structure of these languages differs, and their speakers can't understand each other unless they have learned the other language.

The best way is to separate them, since most of them, such as Chawi, Mozabit, ..., have their own ISO 639-3 code.

We maintain our kab locale in CLDR, and we have localized into Kabyle Mozilla's projects (among them Common Voice), LibreOffice, OpenOffice, Evernote, Wikipedia, Wiktionary (under incubation), OpenStreetMap, Oppia, VK...
We have also started localizing two Linux-based OS distributions, and we are looking to localize FB (ongoing process). Kabyle has its own grammar rules and one script (a Latin-based script).

Our friend Amastan can keep working on one corpus for all the Berber languages, and we encourage him to contribute to such a corpus to save dying/threatened Berber languages. From time to time, we will also help with the Berber corpus, but we will focus on the Kabyle one.

I asked the contributors who were working under the ber locale and who are now working under the kab locale to keep their old contributions within the ber locale if they want.

Thanks

Selyan July 26, 2018 at 2:30:18 PM UTC

Hello Muhend Belkacem,
Let's start our own corpus and start working on Kabyle.
Let us pursue our work by taking only the scientific approach into consideration.