gillux's messages on the Wall (total 371)

gillux
2 days ago
This would definitely be a useful feature. Unfortunately, it’s not technically easy to implement, because of the way the search is currently performed. I created an issue on our bugtracker to keep track of your suggestion: https://github.com/Tatoeba/tatoeba2/issues/1576
gillux
18 days ago
Thanks Trang for hiring me! It’s an honor for me to be given the opportunity to work for Tatoeba.

For those who don’t know me, I volunteered to help improve the website, especially in 2014 and 2015. I mainly worked on improving the sentence search functionality and the integration of alternative scripts and transcriptions (for languages that have several writing systems, like Chinese, Japanese, Uzbek, etc.). I also participated in the maintenance of the server, and to a lesser extent, I worked on the audio recordings side, especially to give more credit on the website to contributors who record themselves. Finally, as a member of the website, I am also a modest contributor to the French corpus.

At the beginning of 2016, I stopped contributing to Tatoeba to focus on other activities, and I’m now back as an employee. I was hired as part of a collaboration between Tatoeba and Mozilla for their Common Voice project[1]. Mozilla wants to be able to use Tatoeba's sentences, but a lot of work has to be done to make this technically and legally possible. I will work with Trang, who can now also devote more time to Tatoeba. Of course, any voluntary contribution is also welcome.

I think this collaboration with Mozilla can bring a lot to Tatoeba. Although some projects do use our corpus[2], in practice it’s rather difficult and impractical to use, because of numerous technical (and sometimes legal) obstacles. If we can make our corpus easier for Mozilla to use, then all other projects that use it or would like to use it will benefit from these improvements. Tatoeba could thus become a better-known and more widely used resource, and this would push us to be more demanding of ourselves. We also want to eventually improve the quality of the sentences, and I think this will make much more sense when we have more people asking for such quality.

1. https://tatoeba.org/wall/show_m...#message_29186
2. http://a4esl.org/temporary/tatoeba/links.html
gillux
2017-02-17 03:50
I strongly believe that we should not change our way of writing sentences for technical reasons. Programs should adapt to languages, not the opposite.

How about relating near-duplicate sentences with a fuzzy matching algorithm? So that for example, on a given sentence page, one could see a list of near-duplicates, along with their translations. I believe such an algorithm could be quite effective, even if it can’t be perfect.
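
To make the idea a bit more concrete, here is a minimal sketch using Python's standard `difflib`. The function names, the normalization step and the 0.85 threshold are assumptions chosen for the example; none of this exists in Tatoeba, and comparing every pair this way would probably be too slow for the whole corpus.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(sentence: str) -> str:
    """Casefold, drop punctuation and collapse whitespace before comparing."""
    kept = [
        ch for ch in sentence.casefold()
        if not unicodedata.category(ch).startswith("P")
    ]
    return " ".join("".join(kept).split())


def near_duplicates(target, candidates, threshold=0.85):
    """Return (ratio, sentence) pairs similar to target, best match first.

    The threshold is an arbitrary value picked for this sketch.
    """
    norm_target = normalize(target)
    matches = []
    for candidate in candidates:
        if candidate == target:
            continue  # skip the sentence itself
        ratio = SequenceMatcher(None, norm_target, normalize(candidate)).ratio()
        if ratio >= threshold:
            matches.append((ratio, candidate))
    return sorted(matches, key=lambda match: match[0], reverse=True)


if __name__ == "__main__":
    corpus = ["I do not know.", "I dont know", "Where is the cat?"]
    for ratio, sentence in near_duplicates("I don't know.", corpus):
        print(f"{ratio:.2f}  {sentence}")
```

A sentence page could then list the returned sentences together with their translations.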
gillux
2017-02-13 16:48
Are you using the new address https://tatoeba.org/audio/import ?
gillux
2016-12-26 06:39
*** Improving search for Chinese (Mandarin), Cantonese and Uzbek sentences ***

TL;DR: If you are knowledgeable in Mandarin or Cantonese, I’d appreciate it if you could have a look at this ongoing work: https://github.com/Tatoeba/tatoeba2/pull/1379

On Tatoeba, Chinese sentences can be written using either simplified or traditional characters. While this allows members to use the characters they prefer, it makes it hard to look up sentences, because a search using traditional or simplified characters will only show sentences written in that script. So currently, in order to find all the sentences, one has to perform one search using simplified characters and another search using the equivalent traditional characters. Uzbek, which can be written in either Latin or Cyrillic, suffers from the same problem.

Following my previous work on editable transcriptions, I am now trying to address this problem by making it possible to find Chinese and Uzbek sentences regardless of their script.

Additionally, this will make it possible to find sentences by their transcriptions. That is to say, Japanese sentences may be found using kanji readings in kana, Chinese sentences using Pinyin and Cantonese sentences using Jyutping. Regarding this particular point, I’d like to hear the opinion of anyone knowledgeable in Chinese and Cantonese about the problems I mentioned there: https://github.com/Tatoeba/tatoeba2/pull/1379
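
For readers wondering what "regardless of their script" can mean in practice, here is a simplified sketch of one possible approach: fold both the query and the sentences to a single script before comparing them. The four-character table and the function names are made up for this example, and this is not necessarily how the pull request above does it.

```python
# Tiny illustrative table; a usable one would cover thousands of characters.
TRADITIONAL_TO_SIMPLIFIED = str.maketrans({
    "學": "学",
    "說": "说",
    "語": "语",
    "們": "们",
})


def to_search_key(text: str) -> str:
    """Fold a Chinese string to simplified characters for comparison only."""
    return text.translate(TRADITIONAL_TO_SIMPLIFIED)


def search(query, sentences):
    """Return the sentences containing the query, whatever script either uses."""
    key = to_search_key(query)
    return [s for s in sentences if key in to_search_key(s)]


if __name__ == "__main__":
    corpus = ["我們學中文。", "我们学中文。", "他說日語。"]
    print(search("学", corpus))
    print(search("說", corpus))
```

The sentences themselves are returned unchanged; only the comparison key is folded. Folding toward simplified characters is used here because several traditional characters can share one simplified form, so that direction needs no disambiguation.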
gillux
2016-12-16 05:40
On dev.tatoeba.org, I can see 524 public lists, and the drop-down only includes these 524 lists, under the Other lists section.

I think you got that, but I’m just clarifying: unlisted lists don’t show up in the drop-down.

Selecting a list is just as bad as selecting a language, but people seem to live with that.
gillux
2016-12-16 05:28
> what about making the sentences in an unknown language searchable until a proper language for them has been implemented?

While I think it is technically possible, note that it won’t show any results for languages that use a script that is not yet used in any other language included in Tatoeba.
gillux
2016-12-11 08:05
Hello kamitoki,

> can i get the audio files in one download?

No, we don’t provide such functionality. But if you know a bit of scripting, it’s rather easy to automate the download of the files by using the list of sentences with audio from the Downloads page.
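
To give an idea of the kind of script I mean, here is a rough sketch. It assumes the list is a tab-separated file with the sentence id in the first column and that the files are MP3s; the URL template is a placeholder, not the real address, so check the Downloads page for the actual format before using it.

```python
import csv
import io


def read_sentence_ids(listing: str) -> list[int]:
    """Extract sentence ids from the first column of a tab-separated listing."""
    ids = []
    reader = csv.reader(io.StringIO(listing), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and anything whose first field is not a number.
        if row and row[0].strip().isdigit():
            ids.append(int(row[0]))
    return ids


def build_download_plan(ids, url_template):
    """Pair each id with (url, local filename) without touching the network."""
    return [(url_template.format(id=i), f"{i}.mp3") for i in ids]


if __name__ == "__main__":
    sample = "123\tsomeone\n456\tsomeone_else\n"
    # Placeholder template: replace it with the real audio URL pattern.
    template = "https://example.org/audio/{id}.mp3"
    for url, filename in build_download_plan(read_sentence_ids(sample), template):
        print(url, "->", filename)
```

Each pair can then be fetched with something like `urllib.request.urlretrieve(url, filename)`, ideally with a pause between requests to avoid hammering the server.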

I’m curious, though, about the reason you wish to download all the audio files at once, despite them being in various languages. What do you want to accomplish? Maybe one could come up with a different solution for your problem if you describe it.
gillux
2016-12-02 12:54
When I say something is a feature, I mean it has been programmed this way on purpose. I’m not saying it’s good or bad. (The emoticon in my previous message was rather sarcastic, as the sentence “It’s not a bug, it’s a feature” is a popular saying among developers.)

The other problem you’re describing is a feature too. On Tatoeba, on any page you open, you’ll always see the latest keywords you looked up in the search bar. Apparently, this feature was implemented by Trang in the early days of Tatoeba (beginning of 2009) and it’s still there: https://github.com/Tatoeba/tato...c71c7e3ac31dd9
gillux
2016-11-29 12:46
> Problem:
> Tab #1 now shows From: Turkish To: Malay, taking the language settings from tab #2!

It’s not a bug, it’s a feature. :-)

You have a point, though. I like to assign a type of search for each tab, too. For the time being, you may use Firefox’s private browsing. It allows you to open one more session simultaneously with the non-private one.
gillux
2016-08-26 10:39
Transifex lets you see who last edited the translation of any part of the interface. The one you’re mentioning is: https://www.transifex.com/tatoe...uages/89015239

This tells us that Iriep made the change. I don’t know who that user is on Tatoeba, but you can send him/her a message through Transifex.
gillux
2016-08-01 06:46 - 2016-08-01 06:47
Good job on the profile page!

I find it a bit weird that languages with unspecified level have the text "Unspecified" while others have stars (example: http://dev.tatoeba.org/jpn/user/profile/gillux ). I think we should either only use icons or only text. Mixing both feels inconsistent to me. What about not showing anything for unspecified levels (no stars, no text, just the language name and comment)?

Besides, if we switch to stars instead of bars, how will the levels be represented on http://dev.tatoeba.org/jpn/stats/users_languages ?
gillux
2016-07-28 12:01 - 2016-07-28 12:01
We reached 5 million sentences! Yay us!
gillux
2016-07-14 14:05
Can someone remove these invalid sentences, please? #5240259 #5240222
gillux
2016-03-14 05:18
Thanks for confirming, bill. I pushed the fix to the production site.

Note that a wide variety of problems may cause slowdowns, so among all the members who reported slowdowns, some may still experience lags.

As for the problem I just fixed, it’s actually a regression in Firefox 45, which happened to be released around the same time we last updated Tatoeba. I just avoided the problem by modifying Tatoeba to prevent Firefox from running into that particular bug.
gillux
2016-03-14 04:33
I think I found the problem and I tried to fix it. Can you confirm that there is no problem on https://dev.tatoeba.org/?
gillux
2016-03-14 03:10
Alright, I'm able to reproduce the problem after upgrading to Firefox 45.
gillux
2016-03-13 22:01
I’m sorry to hear that. Since none of the developers are able to reproduce the problem at the moment, we have no way to fix it. If you’re willing to help us diagnose the problem, please send me a PM so that we can arrange a call or something. It will be much easier than exchanging messages.
gillux
2016-03-13 02:51
Also, does the problem occur while running Firefox in safe mode? https://support.mozilla.org/en-...sing-safe-mode
gillux
2016-03-13 02:46
Can you (and other people affected by this problem) try to enable or disable the 'smooth scrolling' option, to see if it changes something? I’m talking about this: http://www.pcworld.com/article/221150/Firefox.html

I would like to know if scrolling is still slow when you change this option, regardless of smoothness. Try to compare with other sites.