
Latest contributions

#6928185 Bonjour la compagnie.

added by gillux, 16 days ago

linked by gillux, 2018-01-01 10:57

linked by gillux, 2018-01-01 10:57

#6584136 J’ai appris le français, pas l’allemand.

added by gillux, 2018-01-01 10:57

linked by gillux, 2017-12-05 08:28

linked by gillux, 2017-12-05 08:28

linked by gillux, 2017-12-05 08:27

linked by gillux, 2017-12-05 08:27

#6539313 Le garçon n'a fait que pleurer toute la journée.

added by gillux, 2017-12-05 08:27

unlinked by gillux, 2017-12-05 08:27

Latest sentences

fra
Bonjour la compagnie.
fra
De quoi vous parlez ?
fra
J’ai appris le français, pas l’allemand.
fra
Le garçon n'a fait que pleurer toute la journée.
fra
C’est Noël avant l’heure !
fra
Personne n’est irremplaçable.
fra
Quels que soient tes malheurs, surtout, ne te suicide pas.
fra
Ne fais rien que tu regretterais après coup.
fra
Quelle est votre émission de télé préférée ?
fra
Il a plu des cordes toute la journée, aujourd’hui.

Latest comments

gillux 16 days ago

@CK this sentence looks like it has audio (recorded by you), but there is actually no audio file. Could you fix this?

gillux 2018-04-16 01:44

Yes it does.

gillux 2017-11-21 10:54

I didn’t mean that we should blindly translate tense for tense. It’s just that your translation seems wrong to me. I think something like "Tom is used to TV" is closer to the Japanese. Japanese speakers tend to use 慣れしてる (or 慣れてる) to mean "is used to" (and not "is getting used to"). See for example:
https://detail.chiebukuro.yahoo...l/q12157329247
http://www.njpw.co.jp/122567

gillux 2017-11-21 10:31

I don’t really understand what you’re trying to say. "I entered the cafe" is past tense. There is no doubt about that, it’s just basic grammar.

Japanese volitional is something different: https://en.wikipedia.org/wiki/J...C_or_hortative

Also note that your sentence has already been translated into Turkish, so now that you have changed it, it probably doesn’t match the Turkish sentence any more. The best thing to do in this situation is to revert the change so that it matches the Turkish again, and then unlink the sentence from the Japanese (I can do that).

gillux 2017-11-20 17:07

The Japanese sentence uses the present tense.

gillux 2017-11-20 17:04

The Japanese sentence doesn’t use the past tense.

gillux 2017-11-20 17:02

Small agreement error:

✘ bagages à mains
✔ bagages à main

gillux 2017-11-20 16:47

The Japanese sentence doesn’t use past tense.

gillux 2017-11-20 16:13

Yes, it obviously doesn’t match. I unlinked it. The meaning of this sentence is closer to #230290 or #80885.

gillux 2017-11-20 14:07

The furigana is wrong.

✘ 何{なに}
✔ 何{なん}

Latest Wall messages

gillux
2 days ago
This would definitely be a useful feature. Unfortunately, it’s not technically easy to implement, because of the way the search is currently performed. I created an issue on our bugtracker to keep track of your suggestion: https://github.com/Tatoeba/tatoeba2/issues/1576
gillux
18 days ago
Thanks Trang for hiring me! It’s an honor for me to be given the opportunity to work for Tatoeba.

For those who don’t know me, I volunteered to help improve the website, especially in 2014 and 2015. I mainly worked on improving the sentence search functionality and the integration of alternative scripts and transcriptions (for languages that have several writing systems, like Chinese, Japanese, Uzbek etc.). I also participated in the maintenance of the server, and to a lesser extent, I worked on the audio recordings side, especially to give more credit on the website to contributors who record themselves. Finally, as a member of the website, I am also a modest contributor to the French corpus.

At the beginning of 2016, I stopped contributing to Tatoeba to focus on other activities, and I’m now back as an employee. I was hired as part of a collaboration between Tatoeba and Mozilla for their Common Voice project[1]. Mozilla wants to be able to use Tatoeba's sentences, but a lot of work has to be done to make this technically and legally possible. I will work with Trang, who can now also devote more time to Tatoeba. Of course, any voluntary contribution is also welcome.

I think this collaboration with Mozilla can bring a lot to Tatoeba. Although a number of projects do use our corpus[2], in practice it’s rather difficult and impractical to use, because of numerous technical (and sometimes legal) obstacles. If we can make our corpus easier for Mozilla to use, then all other projects that use it or would like to use it will benefit from these improvements. Tatoeba could thus become a better-known and more widely used resource, and this would push us to be more demanding of ourselves. We also want to eventually improve the quality of the sentences, and I think this will make much more sense when we have more people asking for such quality.

1. https://tatoeba.org/wall/show_m...#message_29186
2. http://a4esl.org/temporary/tatoeba/links.html
gillux
2017-02-17 03:50
I strongly believe that we should not change our way of writing sentences for technical reasons. Programs should adapt to languages, not the opposite.

How about relating near-duplicate sentences with a fuzzy matching algorithm? So that for example, on a given sentence page, one could see a list of near-duplicates, along with their translations. I believe such an algorithm could be quite effective, even if it can’t be perfect.
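
To make the idea concrete, here is a minimal sketch of near-duplicate detection. It is only an illustration, not something that exists on Tatoeba: the function names and the 0.9 threshold are assumptions, and it uses the similarity ratio from Python's standard difflib module rather than any particular algorithm chosen for the site.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def near_duplicates(target: str, candidates: list[str],
                    threshold: float = 0.9) -> list[tuple[str, float]]:
    """Return (candidate, ratio) pairs whose similarity to target meets the
    threshold, most similar first. The threshold is an arbitrary assumption."""
    base = normalize(target)
    scored = []
    for candidate in candidates:
        ratio = SequenceMatcher(None, base, normalize(candidate)).ratio()
        if ratio >= threshold:
            scored.append((candidate, ratio))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Comparing every sentence against every other one this way would be far too slow for a large corpus, so a real implementation would probably need some kind of index to narrow down the candidates first.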
gillux
2017-02-13 16:48
Are you using the new address https://tatoeba.org/audio/import ?
gillux
2016-12-26 06:39
*** Improving search for Chinese (Mandarin), Cantonese and Uzbek sentences ***

TL;DR: If you are knowledgeable in Mandarin or Cantonese, I’d appreciate it if you could have a look at this ongoing work: https://github.com/Tatoeba/tatoeba2/pull/1379

On Tatoeba, Chinese sentences can be written using either simplified or traditional characters. While this allows members to use the characters they prefer, it makes it hard to look up sentences, because searching using traditional or simplified characters will only show sentences written as such. So currently, in order to find all the sentences, one has to perform one search using simplified characters and another search using the equivalent traditional characters. Uzbek, which can be written in either Latin or Cyrillic, suffers from the same problem.

Following my previous work on editable transcriptions, I am now trying to address this problem by making it possible to find Chinese and Uzbek sentences regardless of their script.

Additionally, this will make it possible to find sentences by their transcriptions. That is to say, Japanese sentences may be found using kanji readings in kana, Chinese sentences using Pinyin and Cantonese using Jyutping. Regarding this particular point, I’d like to hear the opinion of whoever’s knowledgeable in Chinese and Cantonese about the problems I mentioned there: https://github.com/Tatoeba/tatoeba2/pull/1379
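
As a rough illustration of searching regardless of script, one approach is to map both the query and the sentences to a common form before comparing them. The sketch below is not the code from the pull request: the function names are made up, and the four-character conversion table is a toy stand-in for a complete traditional-to-simplified table.

```python
# Toy traditional-to-simplified table (assumption: for illustration only;
# a real system would need a complete conversion table).
TRAD_TO_SIMP = str.maketrans({"學": "学", "國": "国", "語": "语", "們": "们"})


def search_key(text: str) -> str:
    """Map a sentence or a query to a script-neutral form (here: simplified)."""
    return text.translate(TRAD_TO_SIMP)


def search(query: str, sentences: list[str]) -> list[str]:
    """Return the sentences containing the query, whichever script either
    the query or the sentence was written in."""
    key = search_key(query)
    return [s for s in sentences if key in search_key(s)]
```

With this, searching for 学 would return both 我學中文。 and 我学中文。 in a single query. The same idea could apply to Uzbek with a Cyrillic-to-Latin table, though one-to-one character mapping is a simplification in both cases.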
gillux
2016-12-16 05:40
On dev.tatoeba.org, I can see 524 public lists, and the drop-down only includes these 524 lists, under the Other lists section.

I think you got that, but I’m just clarifying: unlisted lists don’t show up in the drop-down.

Selecting a list is just as bad as selecting a language, but people seem to live with that.
gillux
2016-12-16 05:28
> what about making the sentences in an unknown language searchable until a proper language for them has been implemented?

While I think it is technically possible, note that it won’t show any results for languages that use a script that is not yet used in any other language included in Tatoeba.
gillux
2016-12-11 08:05
Hello kamitoki,

> can i get the audio files in one download?

No, we don’t provide such functionality. But if you know a bit of scripting, it’s rather easy to automate the download of the files by using the list of sentences with audio from the Downloads page.
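
For example, a script along these lines could do it. This is only a sketch under assumptions: it supposes the list is a tab-separated file with the sentence ID in the first column, and the URL template is a placeholder you would have to replace with the real address of the audio files.

```python
from urllib.request import urlretrieve


def parse_sentence_ids(lines):
    """Extract sentence IDs from the first tab-separated column of each line.
    Assumption: the exported list has the numeric ID in its first column."""
    ids = []
    for line in lines:
        first = line.split("\t", 1)[0].strip()
        if first.isdigit():
            ids.append(int(first))
    return ids


def build_audio_urls(ids, url_template):
    """Fill a URL template such as "https://example.org/audio/{id}.mp3"
    (a hypothetical placeholder, not the real address) with each ID."""
    return [url_template.format(id=i) for i in ids]


def download_all(urls, directory="."):
    """Download each URL into the directory, named after the end of the URL."""
    for url in urls:
        filename = url.rsplit("/", 1)[-1]
        urlretrieve(url, f"{directory}/{filename}")
```

You would read the downloaded list line by line, pass the lines to parse_sentence_ids, and hand the resulting URLs to download_all, ideally with a pause between requests to spare the server.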

I’m curious, though, about the reason you wish to download all the audio files at once, despite them being in various languages. What do you want to accomplish? Maybe one could come up with a different solution for your problem if you describe it.
gillux
2016-12-02 12:54
When I say something is a feature, I mean it has been programmed this way on purpose. I’m not saying it’s good or bad. (The emoticon in my previous message was rather sarcastic, as the phrase “It’s not a bug, it’s a feature” is a popular saying among developers.)

The other problem you’re describing is a feature too. On Tatoeba, on any page you open, you’ll always see the latest keywords you looked up in the search bar. Apparently, this feature was implemented by Trang in the early days of Tatoeba (beginning of 2009) and it’s still there: https://github.com/Tatoeba/tato...c71c7e3ac31dd9
gillux
2016-11-29 12:46
> Problem:
> Tab #1 now shows From: Turkish To: Malay, taking the language settings from tab #2!

It’s not a bug, it’s a feature. :-)

You have a point, though. I like to assign a type of search for each tab, too. For the time being, you may use Firefox’s private browsing. It allows you to open one more session simultaneously with the non-private one.