I would like to start an initiative to have at least one audio recording
in every language on Tatoeba. At the moment there are sentences
for 359 languages, but only 29 languages have audio recordings!
Languages like Italian, ranking 3rd in the sentence list with over 3/4 of
a million sentences, do not have even a single little phrase recorded -
that’s a shame...
So I have two questions:
1. What would be a perfect everyday sentence to have in every language,
not biased (Bible phrase etc.), not too short but also not an entire paragraph,
just a little glimpse to get a nice idea of how the language sounds in an
2. How can I encourage as many people as possible to send me
their contributions, which I would happily volunteer to cut, edit and prepare
for handing over to CK for the Tatoeba audio database?
Is there - besides the Wall - some kind of newsletter etc. that reaches a bigger
number of contributors without having to assemble a list of users of different
languages myself and contacting them via private messages etc.?
I’d really like to have the audio portion of Tatoeba catch up a little bit with the
text module and shine a little more - in quality and quantity ;-)
Hoping for your input, thanks!
It's currently not easy to record audio for Tatoeba, and the currently-existing guidelines discourage people from doing so unless they have Windows and can use CK's program. Among other things this runs contrary to FOSS and free-content philosophy.
Also, existing guidelines require people to be native speakers, which doesn't make sense for languages like Brithenig or Latin or even Esperanto, which, despite its relatively high number of native speakers for a constructed language, does not have the sort of native speaker pool that most non-constructed languages have.
But the guidelines only exist because it would be impossible to cut and edit thousands and thousands of recordings in post-production.
However, almost everyone who has a computer, phone or tablet is perfectly able to record one ‘pre-cooked’ sentence without having to use Shtooka and without needing a pre-formatted list supplied by CK to match the input format for Shtooka lists.
And as I said already, I would volunteer to cut, edit and format these 359 contributions for Tatoeba. So it would take literally only a few seconds for somebody to record this phrase.
> Among other things this runs contrary to FOSS and free-content philosophy.
Sorry, I don’t know what you mean by that!
> Native speakers
I think it doesn’t take much imagination to accept that there are a handful of ‘constructed’ and ‘dead’ languages out there whose samples will have to be spoken by some living creature, even a non-native one ;-)
What are we talking about, five languages, or ten...?
And a pool of 359 samples of one generic poster sentence could easily be maintained: if, e.g., a rare language is recorded by a non-native speaker, the sample can be replaced later on if someone finds the contribution inappropriate.
And anyway, these are guidelines and, as far as I can tell, not set in stone yet.
I am just convinced that this would be a beautiful opportunity for Tatoeba to attract even more newcomers to get on the train - be it text or audio...
Unfortunately, there are too many languages without native speakers.
Sometimes it's difficult even if you're a native speaker.
I would love to contribute for Dutch and Gronings; however, I speak Dutch with a regional accent and I prefer pure Dutch (or Flemish) accents for Tatoeba.
The same goes for Gronings: I'm the only contributor, but my accent is again too heavily influenced by Dutch.
So in my opinion, I'm not even suitable for my own two native languages and there are probably more cases like this.
Well, by that standard you’d have to get rid of 80% of the German sentences (which are spoken by Austrians)! And even the German contributors sometimes have strong regional or dialectal influences, which make them ‘unusable’ for contributing samples in Standard German, which btw. doesn’t exist anyway, at least not as one variant across Austria, Switzerland and Germany, although we all agree that all our slightly different-sounding variants are pure GERMAN anyway.
There are IIRC thousands of sentences in the Argentine version of Spanish, and don’t even get me started on the English contributions of our fellow Australians, and even CK’s AE intonation compared to other AE contributors and British English!
So why bother with audio on Tatoeba at all unless it meets the very strict requirements of some catalog of restrictions - on a platform where ‘diversity’ is the most used word when talking about its ‘credo’...
These two things, IMHO, don’t go very well together. Of course, the subtlety of the spoken word is always more fragile than the bluntness of the written word, which can be twisted and interpreted much more easily to anyone’s liking. But that should not prevent us from trying to create a beautiful audio implementation in the Tatoeba corpus, despite the difficulties that may arise along the way...
Just my two cents ;-)
> What would be a perfect everyday sentence to have in every language
I don’t think such a sentence exists, because each language has its own culture. But that shouldn’t discourage you in your noble quest. I think it is up to the speakers themselves to find the ideal sentence for their language.
Failing that, perhaps we could rely on the "favorite sentences" to identify interesting sentences to have recorded, even without knowing their meaning. The most typical and most interesting sentences are often the hardest to translate.
> How can I encourage as many people as possible to send me their contributions, which I would happily volunteer to cut, edit and prepare for handing over to CK for the Tatoeba audio database?
> Is there - besides the Wall - some kind of newsletter etc. that reaches a bigger number of contributors without having to assemble a list of users of different languages myself and contacting them via private messages etc.?
No, but having individual per-language forums is a recurring need. Ideally, you would like to be able to send a message to all speakers of a given language, is that right?
Well, the more people know about this little quest, the better the chances of getting as close to 359 audio samples as possible ;-)
Just for the sake of simplicity in managing and maintaining the contributions, I was thinking of using one single phrase for all languages. The content doesn’t necessarily have to reflect specific cultural features or characteristics of the language.
Just seeing the list of languages with contributed audio samples be the same length as the list of languages with contributed sentences should be inspiring to speakers of every language. It would show that every contribution counts, even if the quantitative supremacy of languages like English, Spanish, Russian etc. is overwhelming.
And maybe by just speaking one example sentence for this little project people could end up being encouraged to contribute more in the long run.
Just a little show-off to remind people that language is all about speaking and the written word is just a by-product. Granted, a very useful one, but nevertheless second to the spoken word. In my opinion there shouldn’t be a single sentence on Tatoeba without an audio sample attached.
Ideally it should be all audio with the text attached as a little convenience ;-)))
I think by now you should get my point ;-)
Yes, I understand your approach, but it so happens that Tatoeba’s history followed the opposite logic: text sentences came first, and audio is secondary. This history explains why audio lags far behind in terms of usability and features. Incidentally, some time ago I hacked together a proof of concept for recording yourself from the browser: https://dev.tatoeba.org/tatorec/ Lots of things are possible with a bit of energy, knowledge and free time.
Either way, I’m glad to know there are enthusiastic contributors like you who want to improve things. If I may ask, could you tell us a bit more about your motivations? Why are you so keen for audio to be a more central part of Tatoeba? Would you personally want to use the audio to learn or teach languages, or is it more a matter of ideology than anything else?
By the way, you didn’t answer my question: ideally, you would like to be able to send a message to all speakers of a given language, is that right?
My answer is hidden in the first sentence ;-)
Ideally I would like to reach all Tatoeba members at once.
A public appeal, as it were ;-)
I don’t have a clue how many different languages we would have to write it in
to reach as many people as possible, assuming that not everybody
speaks English.
So is there actually a facility (at least for you as admin or programmer) to address
ALL people of a certain language at once?
No, but having better ways to get in touch with the community (or part of the community) is a recurring need that I have heard from other members. That’s why I wanted to know more about your use case.
No matter how good your intentions are, "The more people get the message, the better" is the kind of approach that can irritate members and make you look like a spammer. On the other hand, private messages and the Wall are certainly quite limited ways of getting in touch with the community.
Yes, I am gonna spam my ‘Audio Gospel’ all over Tatoeba ;-)))
All I wanna achieve is to have 359 audio samples in 359 languages publicly available, and a list of languages with contributed spoken sentences of the same length as the list of languages with contributed written sentences.
If I only had to ask 359 people to achieve this goal I’d be even happier...
And the very next time I get to meet someone who speaks Italian I will pull out whatever crappy audio recording device I have within reach and, if need be, even coerce him/her into speaking one Italian sentence for an upload to Tatoeba - also as a proof of concept, as it were... ;-)
> Just a little show-off to remind people that language is all about speaking and the written word is just a by-product. Granted, a very useful one, but nevertheless second to the spoken word. In my opinion there shouldn’t be a single sentence on Tatoeba without an audio sample attached.
I strongly disagree with this from the standpoint of Tatoeba and of language learning. There are very good reasons to learn to read languages that one cannot speak, and perhaps to write them too. Writing is superior in speed and clarity for conveying complex information and large amounts of information.
In a foreign country both are useful, and neither replaces the other.