sysko's messages on the Wall (total 1397)

sysko, October 16, 2011 at 6:15:46 PM UTC

When I was in high school, in Latin class we pronounced it /ˈkɪkɛroː/ or /ˈkikeroː/ (I'm not good at IPA). I don't know where you got the idea that French people would pronounce it /siseˈʀo/; maybe from some French people who have never learnt Latin?

sysko, October 16, 2011 at 6:13:29 PM UTC

As said in one of the topics below, the pinyin is produced by software I wrote myself for Tatoeba's needs. It's independent from Tatoeba itself, so it can be used separately. Though I plan to make it open source, I still haven't found time to put it on GitHub or anywhere public, and anyway the code has no documentation at all for the moment.

Romanizations in general (not only for Chinese and Japanese, but also Shanghainese, Cantonese, etc.) are not available for download because they are generated on the fly (I know that's not "efficient", but it's due to some legacy code).

In the future it will be possible to get them along with the sentence itself, but not yet.

sysko, October 16, 2011 at 6:09:55 PM UTC

I've used it, and still use it, for:
* helping gather data for a rare language I like (Shanghainese)
* building a basic Shanghainese-French word dictionary by analyzing the sentences manually
* being able to see the errors in my pinyin generator
* same for the Shanghainese pronunciation generator (though here I rely on the audio)

* replacing my paper notebook, where I used to write down interesting sentences I noted when chatting with my Chinese friends (so that it's easier to find them again, rather than searching through the thousands of pieces of paper I've written stuff on while on the bus, train, at the restaurant, etc.)
* pretending I speak German: I did learn it a little, more or less enough to chat casually on MSN in German by mixing already existing sentences from Tatoeba
* practicing my listening skills by listening to the audio
* learning a lot of interesting trivia when people ask us to add new languages, or by reading the comments/wall messages

In the future I plan to use Tatoeba for:
* giving me a reason to learn how to make an Android app
* making a smarter pinyin generator that will "learn" from the corrections users make once it is possible to edit the romanization
* it looks nice on my résumé
* trying to make a homebrew open source translation memory

Besides that, I know some people have reused the sentences in their own dictionary projects.

sysko, October 16, 2011 at 12:38:12 AM UTC

A Mandarin Chinese sentence is labelled "Mandarin Chinese" (the fact that we use the PRC flag is another issue); the script used doesn't change the fact that it's Mandarin. If you pay attention, there's an icon indicating the script used, either this one
http://flags.tatoeba.org/img/si...ed_chinese.png
or that one
http://flags.tatoeba.org/img/tr...al_chinese.png

We accept both scripts, because people in Taiwan or Hong Kong use the traditional script when writing Mandarin, books written before the simplification reform in mainland China are in traditional script, etc.

For Hokkien/Teochew, actually the problem is the same as with Shanghainese and Suzhouese.

For the moment we're using the international standard ISO 639-3 (three-letter codes) to draw the "limit" between languages. This way it's easier to keep an objective point of view because, as you may know, for a lot of Chinese people Cantonese is only a "dialect" of Mandarin, which is not correct linguistically speaking, and the same happens with a lot of languages around the world. According to the standard, Teochew and Hokkien are both under the code "nan" for the Min Nan language http://www.ethnologue.com/show_...e.asp?code=nan , so yes, it would have been more accurate to name it "Min Nan" in Tatoeba; we were not aware of that when someone proposed adding Teochew around a year and a half ago. For the moment it's the only "dialect" of Min Nan we have. A user offered to start Hokkien some months ago, but I was quite busy, we were discussing this very problem, and somehow we lost contact.

More generally, in Tatoeba we add a language only when we have somebody with a sufficient level in it (either a native speaker or someone with pretty strong knowledge of the language, to avoid the "I find this language cool and discovered 5 sentences on a website" case). We have no limit on the number of languages we can support (the goal is to have them all), but as there are more than 6000 languages, it would really clutter the interface if we were to add them all now, so we prefer to add each one when we start to have sentences in it. So for Hokkien, if you're a native speaker or have strong knowledge of it, we can add it. It would be a bit special, as we already have Teochew with the same ISO code, so maybe I can just rename it "Min Nan", and sentences in the Teochew dialect of Min Nan will be tagged as such; same for Hokkien. Soon a new ISO standard will be released, this time with a code for each dialect of each language; at that point we will be able to do something smarter, indicating not only the language but also the dialect a sentence belongs to.
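The renaming idea above can be sketched as a small lookup: one ISO 639-3 code per language, with the dialect kept as a separate tag. This is only an illustration, not Tatoeba's actual code; the `Sentence` class, the `LANGUAGES` table and the `label` function are hypothetical, and only the `nan` code for Min Nan comes from the discussion above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical table: ISO 639-3 code -> (display name, known dialect tags).
# Only the "nan" entry reflects the discussion; this is not Tatoeba's real schema.
LANGUAGES = {
    "nan": ("Min Nan", {"Teochew", "Hokkien"}),
}


@dataclass
class Sentence:
    text: str
    lang: str                      # ISO 639-3 code
    dialect: Optional[str] = None  # plain tag until dialects get their own codes


def label(sentence: Sentence) -> str:
    """Build the display label: language name, plus the dialect tag if any."""
    name, dialects = LANGUAGES[sentence.lang]
    if sentence.dialect is None:
        return name
    if sentence.dialect not in dialects:
        raise ValueError(f"{sentence.dialect!r} is not a known dialect of {name}")
    return f"{name} ({sentence.dialect})"
```

With this layout, Teochew and Hokkien sentences share the language code `nan` and differ only by tag, so a dialect-level code could later replace the tag without touching the language field.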

As for the pronunciations, they are added automatically and for the moment they can't be edited (but in the future it will be possible to do so).

sysko, October 14, 2011 at 9:08:37 PM UTC

Actually, it's more about improving the search engine's ability to recognize the infinitive form of verbs than about improving tagging, since right now an advanced user can already create any tag without restriction (though normal users can't). If we relied on tags for this, we would end up with hundreds of thousands of tags (number of languages we have * number of verbs in each language), which would uselessly clutter the tag list.

sysko, October 14, 2011 at 4:07:44 AM UTC

If you mean something that is the same as Tatoeba in being:

1 - multi-language, without a single "source" language
2 - able to interlink entries, so that people can easily create an A to C dictionary by validating translations if there is already an A to B and a B to C dictionary
3 - free as in free speech (bab.la is not, is it?), which permits reusing it in other projects instead of keeping it captive on one website
4 - built on a platform whose source code is also free, so that contributors can either help improve it or "fork" it if one day the admin goes crazy

No, there's no such platform yet (actually, not even one with only points 1, 2 and 3). The closest, if you think the "open culture" aspect is important, is Wiktionary, with all the drawbacks we know it has.
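Point 2 can be illustrated with a short sketch: given an A to B and a B to C dictionary, candidate A to C pairs are obtained by going through the shared B words, and a contributor then validates them. The function name, the data layout and the sample words are assumptions made for this example, not code from any existing platform.

```python
from collections import defaultdict


def propose_translations(a_to_b, b_to_c):
    """Return candidate A->C translations by pivoting through B.

    Both inputs map a word to a set of translations. The result is only a
    set of candidates: pivoting can pair unrelated senses of a B word, so
    each pair still needs validation by a contributor.
    """
    candidates = defaultdict(set)
    for a_word, b_words in a_to_b.items():
        for b_word in b_words:
            for c_word in b_to_c.get(b_word, ()):
                candidates[a_word].add(c_word)
    return dict(candidates)


# Hypothetical sample data: French -> English -> German.
fr_en = {"chat": {"cat"}, "avocat": {"lawyer", "avocado"}}
en_de = {"cat": {"Katze"}, "lawyer": {"Anwalt"}, "avocado": {"Avocado"}}

fr_de = propose_translations(fr_en, en_de)
```

Keeping the output as unvalidated candidates matches the "by validating translations" step: the pivot only suggests pairs, and people decide which ones are correct.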

So actually, last year, while I was trying to work out a code architecture for the new version of Tatoeba, I made a test at creating a project like the one you describe: http://redmine.sysko.fr/project...ict/repository (don't pay attention to the name "shanghainesedict"; at first I only wanted to make a Shanghainese dictionary website, and it soon turned out that it was actually able to offer the same features as Tatoeba for a dictionary).

Of course it will not be a simple "copy-paste" of the Tatoeba code with only a change of content, but for the moment I'm focusing on the new version of Tatoeba itself, so this new project is not going to appear for some months.

sysko, October 12, 2011 at 2:57:23 AM UTC

100,000 sentences in French \o/ Right, off to catch my plane to China now ;-)

sysko, October 11, 2011 at 10:30:53 AM UTC

fixed.

sysko, October 1, 2011 at 4:36:49 PM UTC

Yes, that's why I did it today. Right now my script is running to generate the changes to apply to the database, and then I will apply those changes.

sysko, October 1, 2011 at 3:37:24 PM UTC

[eng] I've started to import them; they will be included in this list: http://tatoeba.org/eng/sentences_lists/show/758

[fra] I've started importing them; they are in this list: http://tatoeba.org/fra/sentences_lists/show/758

sysko, September 30, 2011 at 10:58:55 PM UTC

[eng] A little reminder about copyright (I know it's not obvious, so I'm taking the time to explain it again here; I will see with Trang about putting it in the FAQ):

- By default, even if a resource has no copyright sign, that doesn't mean the resource is "public domain"; it means the default rule of copyright applies (actually, in France and in many countries the copyright sign has no legal value because, as said, copyright applies by default).

So from this:

Don't insert sentences from external sources into Tatoeba EXCEPT if you're 100% sure they are in the public domain.

I know some resources are widely available on the web and people reproduce them in Anki decks/blogs/websites without really caring. But we still can't do the same, because the risks are not the same. (We're no longer a small project, we're not hosted in a country that doesn't care about copyright violation, the data are not just on our platform but reused elsewhere, etc.)

So I think you understand that, for this reason, I will remove any sentence or set of sentences for which we're not 100% sure we have the right to include it in Tatoeba under the terms of the CC-BY licence.

In the meantime, I want to thank all the users who help me in this task; without you it would be impossible for me to do it all by myself.

sysko, September 30, 2011 at 6:23:12 PM UTC

We can always do it through several payments spread over the long term.
Along the same lines, I had also started talking with people from Global Voices http://fr.globalvoicesonline.org/ ; their content is also under CC-BY, and there could be a mutual interest in collaborating.

sysko, September 30, 2011 at 6:21:22 PM UTC

Thank you, Scott, for taking care of finding the terms of use.

sysko, September 30, 2011 at 5:54:32 PM UTC

Oh yes, the guy who made this thought of everything.

sysko, September 30, 2011 at 5:07:24 PM UTC

Yes, exactly. More a list than a tag?

sysko, September 30, 2011 at 12:17:49 AM UTC

Yes, when I have a bit of time, I will contact them.

sysko, September 29, 2011 at 10:56:28 PM UTC

Since my message was posted only 30 minutes ago, I think the fact that it hasn't been translated into Esperanto yet is more because the users who speak both French and Esperanto are probably busy than because they didn't think of it :)

sysko, September 29, 2011 at 10:21:26 PM UTC

thanks for sharing the link.

sysko, September 29, 2011 at 10:16:57 PM UTC

And thank you, Sacredceltic, for translating.

sysko, September 29, 2011 at 10:16:22 PM UTC

You need to send an email to team@tatoeba.org (or give me a link to the data) in the following format:

sentence[tab]translation
sentence2[tab]translation2

The sentence and its translation must always be in the same language order.

Also tell me which account should be given ownership of the sentences.

And that's it.
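A file in this format can be checked before sending with a few lines of Python. This is a minimal sketch, not the importer Tatoeba uses; the function name and the sample sentences are made up for the example.

```python
def parse_pairs(text):
    """Parse lines of the form sentence<TAB>translation.

    Returns a list of (sentence, translation) tuples and raises ValueError
    with the 1-based line number if a non-empty line is malformed.
    """
    pairs = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 2 or not all(field.strip() for field in fields):
            raise ValueError(f"line {number}: expected sentence<TAB>translation")
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


# Hypothetical sample: French sentence first, English translation second.
sample = "Bonjour.\tHello.\nMerci.\tThank you.\n"
pairs = parse_pairs(sample)
```

The check only verifies the two-column layout; whether every line keeps the same language order still has to be confirmed by reading the file.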