Wall - Tatoeba

Wall (6,960 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

doemaar14, February 3, 2024 at 5:52:25 PM UTC

Over the years I have witnessed, as many of you undoubtedly have, too, the hard work done and being done by all the Tamazight contributors on Tatoeba, and I commend them for it, ...but I have to ask this question:

Where in the real world can we actually find written content in Standard Berber/Tamazight?
I don't mean websites that just explain the grammar, but rather monolingual websites that are constantly updated, textbooks, novels (fiction), Wikipedia articles, news websites, blogs: real Berber-language content in the wild. That sort of thing.
(Of course the Berber dialects do have a rich history of spoken content, especially when it comes to music and movies, much of which can be found on YouTube).
I also know that some of you have tried to get Wikipedia to accept the language code for Berber, but they declined the request, didn't they? It seems to me the language merely exists as a spoken one.

doemaar14, February 5, 2024 at 4:53:36 AM UTC

I take it the Berber contributors have no answer to my question?
If so, what is the point of adding all these thousands of sentences, when people interested in these languages can't even find real-world, frequently updated content in said languages? I have scoured the web and found nothing: almost nothing in the way of fiction, news websites, science, technology, just zilch, except for a Bible translation and some random PDFs posted on French websites.

Cangarejo, February 5, 2024 at 11:07:34 AM UTC (edited February 5, 2024 at 3:14:52 PM UTC)

> What’s the point?

Language preservation.

imalaqvayli, February 5, 2024 at 7:59:48 PM UTC

Berber is a family of languages; we have been raising this with Tatoeba for several years now.
That is why Tatoeba has Kabyle, for example, which is a real language with rules, courses, books, articles, websites, songs, poems, etc.

Most of the "ber" sentences are Kabyle ones.

The flag shown for Kabyle is also the Berber one. We asked the admins to replace it with the correct Kabyle flag, but they refused and used the Berber one...

doemaar14, February 8, 2024 at 5:26:02 PM UTC

That makes sense. "Berber" is a unified, artificial language, after all.
Thank you.
To prove my assumptions wrong, could you point to any frequently updated websites that exist exclusively in Kabyle?

Igider, February 9, 2024 at 6:01:34 PM UTC (edited February 10, 2024 at 12:39:09 PM UTC)

There are several sites devoted to the Kabyle language, starting with Wikipedia. Almost 7000 articles.

https://kab.m.wikipedia.org/wiki/Asebter_agejdan

doemaar14, February 11, 2024 at 4:01:01 PM UTC

Wonderful. Kabyle in the real world! Tanemmirt! Hopefully the amount of Kabyle content keeps growing.

Igider, February 11, 2024 at 7:24:21 PM UTC (edited February 13, 2024 at 8:16:44 AM UTC)

Tanemmirt. It's very kind of you.

imalaqvayli, February 14, 2024 at 4:31:00 PM UTC

There are several Kabyle websites with articles in Kabyle and in French, such as:

https://isahliyen.com/
https://kabyle.com/kab/
https://vava-innova.com/
siwel.info
tamurt.info
https://apprendrelekabyle.com/

lbdx, February 5, 2024 at 4:02:02 PM UTC

According to linguists, Berber is not a language but a group of languages [1]. Consequently "ber" is an ISO 639-5 language code but not an ISO 639-3 language code. That is probably why Berber has been declined by Wikipedia.

Tatoeba also does not accept languages that do not have an ISO 639-3 code, but an exception was made for Berber. In hindsight, this was probably not a good idea. It creates overlap and harmful competition with other Berber languages' corpora such as Kabyle.

[1] https://en.wikipedia.org/wiki/Berber_languages

jan_Ketalija, February 6, 2024 at 6:47:35 AM UTC

I've also been wondering. Why is there such a large Kabyle speaking community on Tatoeba? I haven't seen Kabyle anywhere else on the internet really, even on linguistic-based websites. What about Tatoeba has such a draw for Kabyle speakers? (The same goes for other Berber languages, but Kabyle is the one I see the most.)

Thanuir, February 6, 2024 at 8:27:34 AM UTC

Tatoeba is a volunteer-based website. In practice, if someone gets enthusiastic about it and actively recruits others, a virtuous circle can form in which that language or culture seems over-represented. Likewise, many other languages can be heavily under-represented simply because nobody happened to get enthusiastic about the website, or because the enthusiasm has faded.

But this is entirely natural, because the numbers of participants are small. In that case random variation is relatively large.

Igider, February 6, 2024 at 9:19:41 AM UTC (edited February 6, 2024 at 9:19:58 AM UTC)

You created your profile very recently.

:-)

Cabo, February 8, 2024 at 6:26:13 PM UTC (edited February 8, 2024 at 6:26:21 PM UTC)

They found this place to conquer, and people let it happen.

Thanuir, February 9, 2024 at 5:07:44 PM UTC

Tatoeba cannot be conquered. Sentences in another language do not affect your life in any way unless you want them to.

February 9, 2024 at 6:54:32 PM UTC (edited February 9, 2024 at 6:58:58 PM UTC)

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

Cangarejo, February 10, 2024 at 11:07:10 AM UTC

On the Advanced Search page, there's an option that allows you to limit results to a single user. That allows you to only see sentences that belong to your friends.

Cabo, February 10, 2024 at 3:36:38 PM UTC

I know what the search query can do.
And what you said is false.
"...limit results to a single user."
As you said, to a SINGLE USER.
"That allows you to only see sentences that belong to your friends."
That doesn't allow me to see ONLY THOSE sentences.

And who talked about a single user???
I talked about a whole group of users.

What's more, ignoring something that made the corpus less balanced, as lbdx said: https://tatoeba.org/en/wall/sho...#message_40485
"The years 2017 and 2018 were years in which Tatoeba's main English-speaking contributor added hundreds of thousands of sentences in bulk. These sentences were mostly built according to syntactic patterns and used wildcards to avoid creating paraphrases that differ only in their named entities. These massive additions have greatly reduced the lexical diversity of the English corpus and increased the proportion of sentences containing pervasive words from 20% to 40%. This sudden change coincides with a sharp drop in the number of active contributors to Tatoeba."
... doesn't help.

Cangarejo, February 10, 2024 at 3:45:38 PM UTC (edited February 10, 2024 at 10:20:32 PM UTC)

You can do multiple searches if you want multiple users; and Tatoeba could improve the search engine in the future to allow restricting results to given groups of users or to allow excluding groups of users. Don’t you usually translate Pfirsichbaeumchen’s sentences?

I think any imbalance in the corpus can be fixed with improved search features and using lists. There’s room for everyone.

sundown, February 11, 2024 at 8:06:36 AM UTC

> I think any imbalance in the corpus can be fixed with improved search features and using lists. There’s room for everyone.

The idea that improved search will sort out the imbalance in the corpus I find overoptimistic, to say the least.

Cangarejo, February 12, 2024 at 8:58:07 AM UTC (edited February 12, 2024 at 3:48:41 PM UTC)

What about lbdx’s solution, or hand-crafted lists?

sundown, February 14, 2024 at 8:29:23 AM UTC

I agree with lbdx that a sentence limit should be introduced. It's the least that should be done.

Cabo, February 11, 2024 at 1:26:35 PM UTC

If you don't want to see it, it's still there.
Well, just wear blinders then.

"Don’t you usually translate Pfirsichbaeumchen’s sentences?"
Huh? So you think I'm maaster now. I'm Cabo; it's written at the top of the message block. And what if someone has a dedicated user whose sentences he/she likes to translate? Does he/she have no right to an opinion? Just translate and care about nothing else?

And thanks to whoever blocked my message.
Was it the word shite? Or were you just not happy with what I had to say?

Cangarejo, February 12, 2024 at 8:54:46 AM UTC (edited February 12, 2024 at 2:52:02 PM UTC)

> So you think I'm maaster now?

I did confuse you with maaster.

> And what if someone has a dedicated user whose sentences he/she likes to translate?

I wasn’t criticizing maaster. I was just saying that you can already restrict the sentences you see to those belonging to any one user you want.

February 12, 2024 at 8:49:48 AM UTC (edited February 12, 2024 at 8:50:05 AM UTC)

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

sharptoothed, February 11, 2024 at 4:38:40 PM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

CK, February 10, 2024 at 9:44:39 PM UTC

🍎 Stats : An attempt at counting the number of active contributing usernames per year

2024 (268) * in progress
2023 (1165)
2022 (1225)
2021 (1366)
2020 (1520)
2019 (1304)
2018 (1465)
2017 (1414) * the year the SSD died. https://blog.tatoeba.org/2017/
2016 (1899)
2015 (2248)
2014 (2040)
2013 (2289) * the peak
2012 (2179)
2011 (2032)
2010 (1512)
2009 (155)
2008 (85)
2007 (20)

Here is the number of usernames owning sentences without valid dates (early entries in the database).
0000 (1515)

** Notes

This is based on data harvested from the 2024-02-10 sentences_detailed.csv file.

Note that this is not actually the number of active usernames from each year.

The usernames counted are the ones who currently "own" sentences added in those years, not really the contributors, since "orphan" sentences can be adopted. This means that the year the Tanaka Corpus sentences were imported shows a lot more contributors than there actually were. Many of those who have adopted these sentences joined the project much later.
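For anyone who wants to reproduce this kind of count, here is a minimal Python sketch. It is not CK's actual script. It assumes that sentences_detailed.csv is tab-separated with the columns id, language, text, username, date added, date modified, and that missing values appear as `\N` or as an all-zero date; the function names are hypothetical. Like the figures above, it counts the usernames that currently own sentences added in each year, not the original contributors.

```python
import csv
from collections import defaultdict


def owners_per_year(rows):
    """Count distinct owner usernames per year of the 'date added' field.

    rows: iterable of lists laid out as
    [id, lang, text, username, date_added, date_modified] (assumed layout).
    """
    owners = defaultdict(set)
    for row in rows:
        if len(row) < 5:
            continue  # skip malformed lines
        username, date_added = row[3], row[4]
        if username in ("", r"\N"):
            continue  # orphan sentence: nobody currently owns it
        year = date_added[:4]
        if not year.isdigit():
            year = "0000"  # bucket for entries without a valid date
        owners[year].add(username)
    # Newest year first, as in the list above.
    return {y: len(names) for y, names in sorted(owners.items(), reverse=True)}


def owners_per_year_from_file(path):
    """Read the tab-separated export at `path` and count owners per year."""
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return owners_per_year(reader)
```

Calling `owners_per_year_from_file("sentences_detailed.csv")` on a downloaded export should give a year-to-count mapping comparable to the list above, provided the column layout matches the assumption.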

Yorwba, February 11, 2024 at 4:37:21 PM UTC

If you want to count contributors active in a given year, you should probably analyze the contributions.csv file instead.

February 7, 2024 at 11:30:18 AM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

Igider, February 5, 2024 at 11:22:47 PM UTC (edited February 9, 2024 at 5:59:23 PM UTC)

You've raised a very good question. By the way, the Berber language as defined in Tatoeba is a catch-all, because 80% of its phrases are Kabyle phrases but with a mixture of other Berber words. This mixture is not based on any linguistic reality, only ideology. The "Kabyle Berberists" who created this ideology think that by imposing their Kabyle language and mixing it with 20% of other languages, they will be able to unite all Berbers.

By the way, of everything you can find in the way of novels, poetry, theatre, websites, music... 70% is Kabyle, because the Kabyle language has been transcribed into the Latin script since the 18th century. And there are 12 million Kabyles. Even the flag that Tatoeba's admins have imposed on the Kabyle language is not the right one, but rather the flag of all Berbers. The Kabyle flag has been removed from Tatoeba since then. So the ideology feeds on another ideology. But the Kabyle language will progress, that's for sure. You only have to look at the digital fields in which the Kabyle language is used for localisation, learning and so on.

There are several sites devoted to the Kabyle language, starting with Wikipedia.

https://kab.m.wikipedia.org/wiki/Asebter_agejdan

sharptoothed, January 28, 2024 at 8:23:31 AM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

CK, January 27, 2024 at 8:49:45 AM UTC

🍎 Tatoeba.org Native Speakers with Native Language Sentences

http://tatoeba.ueuo.com/stats-2024-01-27.html

Find native speakers of languages you are studying and get links to their native language sentences.

The link above is to a page that shows only the 3,999 contributors with 20 or more native-speaker sentences.

These members have contributed 99.7% of the native-speaker sentences.

If you want to see all such contributors, try the following link.

There is a lot of data, so the page will be slower to load, and may possibly not work on some devices.

http://tatoeba.ueuo.com/stats-2024-01-27all.html

Updated: 2024-01-27

eeyinn, January 17, 2024 at 1:30:44 AM UTC

Hello,
I would like to offer translations in two cousin languages I am studying, of which I seem to be the only speaker or student on Tatoeba (L2) and whose way of constructing sentences is so different from the other language corpora I've contributed to on Tatoeba that I'm not sure how to proceed.
In Muscogee and in Hitchiti, a sentence like "Tom had an idea" isn't a sentence. Sentences start and end with entire "scenes", like in a movie. In a paragraph-length text, you don't have many sentences back to back, each conjugated for person and tense and such. They are meant to start and end with the "scene" you are describing, composed of many short thoughts stitched together, with barely conjugated verbs throughout a typically very long sentence; fully conjugated verbs only appear at the very end of the sentence, where the "scene" you're narrating finally concludes a paragraph or two later. So when I get a sentence like "Tom had a great idea", that is not a whole scene, and so it doesn't actually feel right to conjugate it as, let's say, "Tom-ke vkerrickv herēmahēn hayvtēs." (Tom made a very good idea, conjugated to be very long ago). But if I conjugate it as the "clause/fragment" that Muscogee would treat it as, it wouldn't be a complete sentence, and so it wouldn't really include the period at the end, which is always present at the end of Tatoeba sentences as a matter of policy.

So I guess what I'm asking is: what should I do? Should I conjugate these short sentences as all occurring in the recent past and constituting a full story, against the language's actual syntax, or should I write them as the clauses that the language would turn them into within a fuller sentence, without actually providing the complete, self-standing sentences that Tatoeba expects of all its entries?

Thankful for the Tatoeba project. 🙏

Objectivesea, January 17, 2024 at 3:53:06 AM UTC

By its nature, Tatoeba leans towards short, complete utterances, even though such short sentences are obviously removed, in most cases, from a larger context. It is possible to construct scenes by assembling Tatoeba sentences into a defined order, and there are sometimes paragraph-long Tatoeba entries consisting of ten or more sentences, but these are frowned on as being unlikely to ever be tackled for translation. Also in such cases, we can run into a limit on the number of characters to be represented, and this can make a Tatoeba translation into certain languages impossible.

As a practical matter, I like to think of Tatoeba as a tool for the language learner and not as a comprehensive multilingual encyclopædia of all possible utterances. For me, in such a tool, sentences should be as short as possible while still conveying some meaning.

For me, a minimal sentence has a noun and a verb. Then there are transitive sentences with a direct and/or indirect object. A few well-chosen adjectives and adverbs add some spice, and then coordinate and subordinate conjunctions link related thoughts together. This works well for most European languages. Different structures are likely necessary for agglutinative languages, and I understand that North American Indigenous languages make use of a fourth-person pronoun on occasion.

Your contributions in Muscogee and Hitchiti will be valuable to Tatoeba, as I think we are unlikely to find a native speaker, which would have been ideal. I would suggest that you begin with the simplest of sentences which can be unambiguously represented in those languages and which are grammatically correct, even though they may convey only a little of the flavour of an Indigenous Knowledge Keeper, Elder or Matriarch relating the oral history from his or her ancestors.

Thanuir, January 17, 2024 at 7:26:45 AM UTC (edited January 17, 2024 at 7:29:43 AM UTC)

With Kven, which I read but can not use fluently, I add simple sentences I am sufficiently certain of and then translate to Finnish, rather than trying to translate from other languages to Kven. Maybe this approach works for you, too - try to put in example sentences you have strong reasons to believe are correct, and then translate those to your native language(s).

As a bonus, you are likely to increase the diversity of the corpus by getting culture-specific sentences and not getting Tom everywhere.

brauchinet, January 19, 2024 at 9:53:33 AM UTC (edited January 19, 2024 at 9:56:32 AM UTC)

Very interesting.
I think you are the best person to decide how to do it, and there isn’t much to be done wrong.
Tatoeba states that sentences should be natural sounding. If you feel that a "sentence" would be natural sounding within a specific context, it might be okay to add it that way even when the verb isn’t conjugated and the context isn’t known. If it sounds like "…have an idea and …" and people would react with "huh, and what?", it would not be a suitable sentence.
The definition of a sentence will vary between languages. If grammar requires that you always conjugate verbs, utterances without such a verb might not be considered full sentences. If unconjugated verbs are just normal, such a definition doesn’t make much sense. It’s true that Tatoeba sentences require a full stop, mostly because most languages require a full stop at the end of sentences. So for the sake of consistency and to avoid comments and questions, couldn’t you simply add full stops? I wonder if there is really a Muscogee rule saying you mustn’t use a full stop.

Sometimes one has to “make up” some extra information in the target language that is in the source language. For example, I heard some languages require that speakers choose between different verb forms indicating the degree of certainty. Since this information isn’t in an English sentence you need to decide yourself when translating.
So in your case, you could also add more than one possible translation and annotate it in the comment section, one as whole scene in the distant past and one as part of an (imaginary) scene.
