gillux's Wall messages


gillux, May 9, 2021 at 9:37:50 PM UTC

I have been working on something to allow translations of the website for specific regions (interface translation, not sentences). For most languages, this is not necessary. However, when the website is displayed in "Chinese" (actually Mandarin), for example, it is in fact "Mandarin as used in mainland China" (with simplified characters), while there are other region-specific varieties of Mandarin, such as the one used in Taiwan (with traditional characters). Similarly, we have a "Portuguese (Brazil)" translation, but we cannot provide a version for Portuguese as it is used in Portugal. This feature is probably less needed when the regional varieties are mutually intelligible, but it could make the website friendlier to people from other regions.

I installed this new feature on https://dev.tatoeba.org/ and somebody started working on a "Mandarin as used in Taiwan" translation.
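To illustrate the general idea of region-specific interface translations, here is a minimal sketch of a locale fallback lookup. It is not how Tatoeba actually resolves translations: the locale codes, the `AVAILABLE` set, and the function names are all assumptions made up for this example.

```python
# Hypothetical set of interface translations that exist on the site.
# These codes are assumptions for illustration only.
AVAILABLE = {"en", "zh_CN", "zh_TW", "pt_BR"}


def fallback_chain(locale, default="en"):
    """Return candidate locales from most to least specific."""
    chain = [locale]
    if "_" in locale:
        # Drop the region part: "zh_TW" -> "zh"
        chain.append(locale.split("_", 1)[0])
    if default not in chain:
        chain.append(default)
    return chain


def pick_translation(locale, available=AVAILABLE, default="en"):
    """Pick the first candidate locale that has a translation."""
    for candidate in fallback_chain(locale, default):
        if candidate in available:
            return candidate
    return default
```

With this sketch, a request for `zh_TW` is served by the Taiwan translation once it exists, while a request for `pt_PT` falls back to the default because only the Brazilian translation is available.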

gillux, May 9, 2021 at 8:15:27 PM UTC

The transcription list should look better now. Thank you for reporting the problem.

I checked the code history, and it appears that the page had been broken for more than two years. When I created that page 5-6 years ago, I didn’t really think about how useful it would be for corpus maintainers. I think you are the first one to ask about it. If you have suggestions on how to make this page more useful, please let me know. :-)

gillux, May 9, 2021 at 7:47:34 PM UTC, edited May 9, 2021 at 7:53:29 PM UTC

This was a temporary error. Thank you for letting us know!

gillux, May 8, 2021 at 3:59:35 PM UTC

I see. If you are talking about the page https://tatoeba.org/jpn/transcr...of/small_snow, then I admit that there is a problem. This page used to display both transcriptions and sentences. Thank you for pointing this out, I’ll have a look at it.

gillux, May 8, 2021 at 2:09:38 PM UTC

I think you will get what you want if you enable the option "Always show unreviewed transcriptions and alternative scripts" in your settings ("未レビューのトランスクリプションおよびフリガナを常に表示する").

Note that it is normally possible to edit transcriptions directly from the page you mentioned, but it’s not possible at the moment because of a bug. The bug should be fixed soon. In the meantime, you will have to go to the sentence page if you want to edit something.

gillux, May 3, 2021 at 9:52:49 AM UTC, edited May 3, 2021 at 9:54:16 AM UTC

This message appears for a few sentences, especially the ones created in the early days of Tatoeba. The sentence origin ("added as translation" or "original") was added to Tatoeba in August 2018 [1]. Sentences added after this date won’t have this problem, but for sentences added before this date, we calculated the origin by analyzing the logs. However, in some cases, part of the logs is missing and we don’t have enough information to determine the origin. If somebody knows for sure the origin of one of these sentences, we can also add the information manually.

[1] https://github.com/Tatoeba/tatoeba2/pull/1623

gillux, May 3, 2021 at 9:29:47 AM UTC

As Thanuir said, tags are the proper way to add this kind of information to sentences. To add tags, you must become an "advanced contributor" first: https://en.wiki.tatoeba.org/art...d-contributors

gillux, May 3, 2021 at 9:22:34 AM UTC

No, but there are plans to implement this feature at some point. You can follow the progress here: https://github.com/Tatoeba/tatoeba2/issues/1762

gillux, April 26, 2021 at 7:41:00 PM UTC

These were closed because you did not contact the language team first.

gillux, April 26, 2021 at 7:37:50 PM UTC

I believe it’s easier to send an email than to open a GitHub issue (not to mention creating a GitHub account and figuring out how to open an issue).

Whether language requests are initially made through GitHub or by email, they need to be reviewed by the language team.

gillux, April 26, 2021 at 7:28:16 PM UTC

GitHub is primarily used by developers to write code. Developers don’t have the skills to process language requests; that’s why the language team should be contacted first. They make sure that everything is fine (language icon, sentences, naming, etc.), and only then do they create the relevant GitHub issue so that developers can implement the new language in the code.

gillux, April 24, 2021 at 1:31:13 AM UTC

Here is a theoretical lead: https://fr.wikipedia.org/wiki/Word_embedding

gillux, April 23, 2021 at 7:39:35 PM UTC

No, because I start from the assumption that people act in good faith. I think we have rather different opinions on this subject, and I don’t feel like continuing this debate with you. If you don’t mind, let’s get back to computing the similarity between languages. I think this computation would be interesting for building a kind of language cloud, where closely related languages would be grouped into clusters. It could also be used to suggest languages to learn to polyglots, along the lines of "if you know languages x and y, then the easiest language for you to learn is z".

That said, I think I am dreaming a bit because, as Thanuir says, the analysis will always say more about Tatoeba than about the languages themselves. Cabo also thinks the sample is too small.

Besides, if we limit ourselves to a naive comparison of characters, as you suggest in your initial message, the analysis would be very simple but very limited. Above all, we would also need to analyze other criteria such as syntax (word order), grammar (I am thinking of Korean and Japanese, which are grammatically very close but graphically distant), and vocabulary (if a word is transparent from one language to the other but the syntax is different, the link is not easy to make)…
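To make the limitation concrete, here is a minimal sketch of one possible naive character comparison: the Jaccard overlap between the sets of character bigrams of two sentences. The choice of bigrams, the Jaccard measure, the function names, and the example sentences are all assumptions for illustration, not something proposed in this thread.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def char_similarity(s1, s2, n=2):
    """Naive similarity: overlap of character n-grams only."""
    return jaccard(char_ngrams(s1, n), char_ngrams(s2, n))


# Same meaning ("I am a student"), similar grammar, different scripts.
korean = "나는 학생입니다"
japanese = "私は学生です"
print(char_similarity(korean, japanese))
```

Because the two scripts share no characters, this pair scores 0.0 even though the sentences are grammatically close, while pairs written in the same alphabet get a non-zero score regardless of how their grammar compares.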

gillux, April 23, 2021 at 7:06:40 PM UTC

No, I have no idea.

gillux, April 23, 2021 at 6:22:57 PM UTC

That would indeed be interesting. It might be feasible with Tatoeba-playground: https://github.com/agrodet/Tatoeba-playground

gillux, April 12, 2021 at 1:38:34 PM UTC

Kabardian and Sranan Tongo.

gillux, April 11, 2021 at 10:13:34 PM UTC

Tatoeba now supports more than 400 languages!

gillux, April 9, 2021 at 9:14:01 PM UTC

This is definitely a bug, thank you for letting us know. I added it to our bugtracker: https://github.com/Tatoeba/tatoeba2/issues/2685

gillux, March 21, 2021 at 9:40:16 PM UTC

#5268528 :-)

gillux, March 6, 2021 at 7:20:18 PM UTC

Note that the wiki links shown in the footer of https://dev.tatoeba.org now point to the articles of the dev wiki at https://wiki.dev.tatoeba.org. This effectively gives you a preview of what it will look like next week on tatoeba.org.


gillux's messages on the Wall (total 595)
