TRANG May 13, 2018 at 5:09:58 PM UTC, edited September 22, 2018 at 8:30:46 PM UTC

**MOSS award for Tatoeba**

http://blog.tatoeba.org/2018/05...r-tatoeba.html

I can finally share some big news with you. Tatoeba will be receiving $25,000 via the Mozilla Open Source Support[1] (MOSS) program. This was a long process, but it's now finally official :)

A little bit of background story.

Back in October last year, folks from Mozilla got in touch with us to explore possible ways of collaborating. They're working on a project called Common Voice[2], and with this project they basically want to collect people's voices. A lot of them.

To achieve this, they need sentences for people to read. Someone told them about Tatoeba... And that's how it started.

But it's not that simple.

One of the requirements of Common Voice is to be able to release their data under CC0 (the Creative Commons version of public domain). Tatoeba's data is CC-BY. Common Voice cannot reuse CC-BY sentences to record audio that they'll publish as CC0. They can only reuse sentences that are in the public domain or CC0.

So there's quite a bit of work to do there if we want to let Common Voice reuse sentences from Tatoeba. This is what the MOSS award is for. We cannot change our CC-BY license for the data we've released so far. But we can evolve Tatoeba to handle more licenses than just CC-BY.

I'll explain in more detail later exactly what changes we plan to make. But until then, I would really like to have an idea of where the Tatoeba community stands on this matter.

Would you consider putting part (or all) of your sentences under CC0? Why, or why not? Let me know via this form: https://goo.gl/forms/Nd6FcAoyd1zkfB4I2

---

[1] https://www.mozilla.org/en-US/moss/
[2] https://voice.mozilla.org/en

Edit: I've been writing "CC-0" this whole time, but it seems CC0 (without the dash) is the correct spelling.

deniko May 14, 2018 at 8:18:53 AM UTC

> Would you consider putting part (or all) of your sentences under CC-0?

That's probably obvious from your post, so sorry for a dumb question, but do you mean written sentences or voice recordings?

TRANG May 14, 2018 at 3:10:28 PM UTC

It's not a dumb question, it could indeed be ambiguous.

I mean written sentences, because they are the source for the recordings. The recordings cannot be CC-0 unless the source (the text) is also CC-0.

AlanF_US May 14, 2018 at 8:09:57 PM UTC

For the benefit of those who are not familiar with Creative Commons licenses, here is my understanding of the difference between CC0 and CC-BY, based on a quick read of the "Creative Commons license" page on Wikipedia [1]: CC-BY requires users to provide attribution, while CC0 does not. In other words, if someone wants to use the contents of the Tatoeba corpus, which is covered by a CC-BY license, within a project of their own, they need to provide users of their project with the information that some of the content came from Tatoeba. If I understand it correctly, if some sentences are covered under CC0, anyone can use them for anything without explaining where they came from.

A move from CC-BY to CC0 does not change whether users can use a project for commercial purposes (they can in both cases), or whether they can make licensing for their own project more restrictive than Tatoeba's licensing (they can in both cases).

So basically, it all comes down to the difference between "I'm okay with someone taking my sentences and doing anything they want with them, as long as they say that they originally came from Tatoeba" and "I'm okay with someone taking my sentences and doing anything they want with them, period."

Is that correct?

[1] https://en.wikipedia.org/wiki/C...ommons_license

CK May 14, 2018 at 11:43:49 PM UTC, edited October 31, 2019 at 3:12:58 AM UTC

[not needed anymore- removed by CK]

TRANG May 15, 2018 at 8:31:49 AM UTC

We did indeed choose CC-BY because of the specifics of French law regarding the public domain. But we were not law experts, and still aren't. We chose CC-BY mainly because it felt safer to do so.

There may definitely be cases where CC0 is not an option, but I wouldn't say that using CC0 within Tatoeba will never be possible. We'll have to see.


I'll quote what is written in the Creative Commons page for CC0:

https://creativecommons.org/sha...ic-domain/cc0/

"while no tool, not even CC0, can guarantee a complete relinquishment of all copyright and database rights in every jurisdiction, we believe it provides the best and most complete alternative for contributing a work to the public domain given the many complex and diverse copyright and database systems around the world."

TRANG May 15, 2018 at 7:54:39 AM UTC

> So basically, it all comes down to the difference between "I'm okay with someone
> taking my sentences and doing anything they want with them, as long as they say
> that they originally came from Tatoeba" and "I'm okay with someone taking my
> sentences and doing anything they want with them, period."
>
> Is that correct?

Yes, that's pretty much it.

Ricardo14 May 16, 2018 at 6:33:58 AM UTC, edited May 16, 2018 at 6:36:39 AM UTC

> Would you consider putting part (or all) of your sentences under CC-0?

Sure I would. I mean, one of the main points of Tatoeba's existence is to be a kind of "multi-sentences-dictionary" that would be a great resource for people who need its sentences. In other words, I truly believe we are here to help other people, and we will do so if we allow all of our sentences to be put under CC-0. Besides, it's a good way to see them being used somewhere.

Aiji May 21, 2018 at 1:27:56 PM UTC

Although the reference to Tatoeba has some advantages (that's how I found the website, for example), on a personal level, I wouldn't mind putting my sentences under CC-0.
For audio contributions, however, it is a little bit different, as it means that my voice can be used by anyone, anywhere, out of context (the first two points already exist, but the last point seems to be a not-so-negligible difference).