Wall (7,365 threads)

Tips

Before asking a question, make sure you have read the FAQ.

We aim to maintain a friendly atmosphere for civil discussion. Please read our rules against bad behavior.

LeviHighway, 17 April 2026 at 07:47:48 UTC

I have long had this idea for Tatoeba, though I understand that it would be hard to achieve and might not fit Tatoeba's purpose. I want to share it here anyway:
I wish Tatoeba had a "dictionary" function. It would not require adding definitions or etymologies to words; it would be a translation dictionary.
Here's how I think it should work:
When you search for "Chinese" via this dictionary function, you might get a few sentences like this:
I am Chinese. - 我是中國人。
I speak Chinese. - 我說中文。
I wish contributors could tag how the word "Chinese" is translated into the target language. For example, you could mark:
I am [Chinese]. - 我是[中國人]。
I speak [Chinese]. - 我說[中文]。
This way, we could gather statistics on how the word "Chinese" is translated into the target language. For example, when you look up "Chinese", it could tell you that "Chinese" is most often translated as "中國人", followed by "中文".
I think Tatoeba is the best place to do this. Wiktionary is too complicated, and it can never provide as many sentences as Tatoeba.
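
A minimal sketch of how such tag-based statistics could be computed, assuming contributors mark aligned words with square brackets as in the examples above. The function names, the sample data (including the third, made-up pair), and the rule that the n-th bracket in the source aligns with the n-th bracket in the translation are assumptions for illustration, not a description of any existing Tatoeba feature.

```python
import re
from collections import Counter, defaultdict

# Hypothetical tagged sentence pairs; square brackets mark the aligned words.
TAGGED_PAIRS = [
    ("I am [Chinese].", "我是[中國人]。"),
    ("I speak [Chinese].", "我說[中文]。"),
    ("She is [Chinese].", "她是[中國人]。"),  # made-up pair for illustration
]

BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def build_dictionary(pairs):
    """Map each tagged source word to a Counter of its tagged translations."""
    table = defaultdict(Counter)
    for source, target in pairs:
        source_tags = BRACKETED.findall(source)
        target_tags = BRACKETED.findall(target)
        # Assumption: the n-th bracket in the source aligns with the n-th in the target.
        for src, tgt in zip(source_tags, target_tags):
            table[src.lower()][tgt] += 1
    return table


def lookup(table, word):
    """Return translations of `word`, most frequent first."""
    return table.get(word.lower(), Counter()).most_common()


if __name__ == "__main__":
    table = build_dictionary(TAGGED_PAIRS)
    print(lookup(table, "Chinese"))  # [('中國人', 2), ('中文', 1)]
```

Positional alignment keeps the markup as light as in the proposal; sentences with several tagged words in a different order in each language would need explicit labels instead.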

zogwarg, 23 April 2026 at 08:04:08 UTC (edited 23 April 2026 at 08:04:54 UTC)

I think there are many obstacles to making this work with a "tag" system:
- It would require a lot of extra work from contributors.
- For most language pairs, word-for-word matching is very poor anyway, e.g. [#3378275] > [#4919050]

Pragmatically, I think that if you understand both the source and target languages, you can already use advanced search to get a good idea of how a word gets translated on Tatoeba (which is not necessarily representative):

https://tatoeba.org/en/sentence...rd_count_min=1

Interestingly, for "Chinese" the breakdown seems to be more like:
148 "中文"
63 "汉语"
58 "中国"
42 "中国人"
18 "漢語"
16 "汉字"
16 "中國人"
16 "中國"
14 "漢字"
11 "中"
10 "中餐"
7 "普通话"
4 "国"
2 "华人"
2 "中式"
1 "華"
1 "繁体字"
1 "简体字"
1 "汉"
1 "本地"
1 "國字"
1 "华"
1 "中菜"
1 "中华"
19 "other (mostly implicitly Chinese things)"
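
The thread does not say how this breakdown was produced. One plausible way to approximate it, sketched here under that caveat, is to take the translations returned by a search and tally which candidate rendering each one contains, trying longer candidates first so that "中国人" is not counted as "中国". The function name, the candidate list, and the sample sentences are hypothetical.

```python
from collections import Counter


def tally_renderings(translations, candidates):
    """Count, per translation, the longest candidate rendering it contains."""
    # Longer candidates first, so "中国人" wins over its substring "中国".
    ordered = sorted(candidates, key=len, reverse=True)
    counts = Counter()
    for text in translations:
        match = next((c for c in ordered if c in text), "other")
        counts[match] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical translations of English sentences containing "Chinese".
    translations = ["我是中国人。", "我说中文。", "我喜欢写字。"]
    candidates = ["中国", "中国人", "中文"]
    for rendering, count in tally_renderings(translations, candidates).most_common():
        print(count, rendering)
```

Substring matching of this kind only suggests how a word was rendered; sentences that fall under "other" still need a human reader.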

LeviHighway, 23 April 2026 at 15:01:50 UTC

For sentences with poor word-for-word matching, I think we could simply add a button to mark them as "implied" or something similar.

For the example of [#3378275] > [#4919050], we cannot match "Chinese" to anything, so we can mark it as "implied". However, we can still create matches like [Chinese characters] > [字] and [characters] > [字]

One reason I am proposing this is that some people may not understand both languages well. Showing how words are translated could help them understand the sentence structures more easily.

ssvb, 23 April 2026 at 12:19:42 UTC

> I think Tatoeba is the best place to do this. Wiktionary is too complicated,

If Tatoeba implements your suggestions, then it will become more complicated. Possibly even as complicated as Wiktionary.

BTW, you are not required to provide etymology when contributing to Wiktionary. A lot of other information is also optional. You can try to figure out what the minimal bare-bones entry for a Chinese word is and maybe discuss this with the other contributors if something is not clear. Start from here: https://en.wiktionary.org/wiki/...try_guidelines

I guess one of the challenges is that Wiktionary treats Chinese as a big family of many similar languages or subdialects (Mandarin, Cantonese, Wu and the others), each having its own distinctive features. This may cause friction, because some people may be in favour of eradicating the differences for the sake of simplification and unification, while others may be interested in preserving their local dialect as their precious cultural heritage.

> and it can never provide as many sentences as Tatoeba.

Wiktionary surely has stricter quality requirements for its content, so it indeed can't match Tatoeba's quantity of sentences. That said, Tatoeba's content is also categorized via tags, and it's possible to filter out dubious sentences (contributed by non-native speakers, etc.).

LeviHighway, 23 April 2026 at 14:48:16 UTC (edited 23 April 2026 at 14:50:59 UTC)

I don’t frequently contribute to the English Wiktionary, which seems more streamlined due to its large community of maintainers. However, as a contributor to the Chinese Wiktionary, I find the infrastructure much more challenging. Many templates and modules are incomplete or overly complex; editing a single entry often takes hours and is prone to errors.

Furthermore, Chinese grammar is often a subject of intense debate, frequently leading to edit wars. I’ve also noticed inaccuracies in Korean and Chinese entries, particularly when they are translated from English by contributors who may not be proficient in the target language.

My point is that we could effectively build a translation dictionary within Tatoeba. For instance, when looking up a word, seeing both the keyword and its translation in bold, similar to how some dictionaries highlight equivalents, would greatly benefit learners in understanding sentence structures.

Regarding the preservation of Chinese languages, Tatoeba handles this exceptionally well. By treating Mandarin, Cantonese, Wu, and others as distinct entities, it helps maintain the linguistic integrity and "purity" of each language.

23 April 2026 at 11:27:55 UTC

The content of this message goes against our rules and has therefore been hidden. It is shown only to administrators and to the author of the message.

23 April 2026 at 02:29:34 UTC

The content of this message goes against our rules and has therefore been hidden. It is shown only to administrators and to the author of the message.

Vortarulo, 14 April 2026 at 00:52:53 UTC

Is it acceptable for a member who is not fluent in a language (but understands it to a good extent) to use Google Translate (or similar) to translate sentences into that language?
For instance, could someone with an A2 knowledge of Polish use GT to translate sentences from English to Polish, proofread them for obvious mistakes, and then add them to the corpus?

LeviHighway, 14 April 2026 at 01:58:14 UTC

I think machine translation is acceptable when you can ensure its quality. Besides, many native speakers write bad translations that are not even as good as machine translations.

Thanuir, 14 April 2026 at 06:06:02 UTC

I don't see this adding significant value to the database. I would rather recommend translating from Polish into your own native language, or into another language you know well enough to vouch for your own sentences, and linking the Polish sentences you understand to sentences in other languages that you also understand.

EugeneGS, 14 April 2026 at 06:15:41 UTC

I think it is acceptable. Translators (such as GT or DeepL) are pretty good nowadays. Also, it is not such a problem if that person translates simple sentences or translates into a language that is quite similar to languages they know well.

CK, 14 April 2026 at 06:50:35 UTC

I would rather see people contributing in their own native languages.

Machine translation with AI has become much better than it used to be, so I can understand the temptation to use it.

Using AI to translate from a foreign language you know into your own native language might help you think of wordings you might not have otherwise thought of.

If you, as a native speaker, verify that it sounds natural, and if you know the source language well enough to verify that the meaning is the same, I think AI can be a useful tool.

ssvb, 14 April 2026 at 07:19:16 UTC

Verifying that the meaning is the same is highly problematic. That's why I started translating sentences that have the "by Arthur Conan Doyle" tag and were submitted by a native English speaker. The best part about them is that I can look up context (the text before and after them) by searching for them on the Internet. This is much better than many of the other short sentences on Tatoeba that are ambiguous regarding the gender of the speaker or other things.

ssvb, 14 April 2026 at 07:27:00 UTC

> I started translating sentences with the "by Arthur Conan Doyle" tag

Oh, and to make things perfect, I would like to have a CC0 license for both the English sentences and my translations. The works of Arthur Conan Doyle are in the public domain now. Is this possible?

marafon, 14 April 2026 at 20:58:08 UTC

I agree.

marafon, 15 April 2026 at 11:06:47 UTC

Contributing in a language that is not your strongest
https://en.wiki.tatoeba.org/art...ow/non-native#

AlanF_US, 15 April 2026 at 12:47:25 UTC (edited 15 April 2026 at 12:56:03 UTC)

Thanks for that link, @marafon. I also encourage people to consult the Rules and Guidelines ( https://en.wiki.tatoeba.org/art...ow/guidelines# ). To find it in the future, go to the bottom of any Tatoeba page and click on "Tatoeba Wiki". The first section on that page contains a link to the "Rules and Guidelines" page.

Tatoeba's mission is to serve as a source of high-quality sentences and high-quality translations. This is ensured by having them written, verified, and owned by humans who know the languages well enough to avoid introducing any mistakes, including subtle ones, not just the obvious ones mentioned by @Vortarulo.

If Tatoeba starts also acting as a consumer of sentences that do not come directly, transparently, and legally from humans, we run the very real risk of generating a cycle in which we pass these poor-quality or misappropriated sentences on to the people who get them from us.

Vortarulo, 15 April 2026 at 17:26:31 UTC

Thanks for your answers, everyone, especially the link to the Guidelines, AlanF_US. I had looked for them but didn't find them. This also goes with my impression.

The question wasn't about me, by the way, but about a user who contributes quite a lot here, but apparently with mostly(?) GT-translated sentences. Perhaps I should point them to the Guidelines.

AlanF_US, 16 April 2026 at 12:07:41 UTC

How do you know that the translations are from Google Translate?

ssvb, 16 April 2026 at 13:30:54 UTC

> How do you know that the translations are from Google Translate?

It's not rocket science. Sometimes a translation is obviously wrong. And when the original sentence is fed into Google Translate, the bad translation precisely matches the Google Translate output.

PaulP, 17 April 2026 at 05:23:59 UTC

> Sometimes a translation is obviously wrong. And when the original sentence is fed into Google Translate, the bad translation precisely matches the Google Translate output.

Yes, that's how I detect AI translations too.
And secondly, when a user adds 10 or more translations in different languages, it's hard to believe that they speak so many languages.

kumakyoo, 17 April 2026 at 06:57:00 UTC

Just as a sidenote: I recently translated a sentence containing "Tatoeba" with DeepL and in the translation "Tatoeba" was replaced by "wiktionary".

AlanF_US, 17 April 2026 at 13:43:11 UTC

First of all, if you see that a user is contributing a bad translation, then regardless of what you think the source of that translation is, you should be leaving a comment on the translation and tagging it for action (for instance, "@change") if you can. If you see that the user habitually contributes bad translations, and contacting the user is either not feasible or has no effect, you should send a private message or an email to a corpus maintainer, an admin, or the admin team.

There are several ways to come to the conclusion that someone is using AI. One is that the person says so outright. Others are along the lines of what @ssvb and @PaulP have mentioned, namely that a bad translation happens to match, say, Google Translate's output. While this is not proof, if the evidence is pretty strong (for instance, if there are a lot of similar incidents), it's also worth mentioning to the user and/or an administrator.

I urge people to read the Rules and Guidelines in general. I've submitted an issue to request that a link to the document be added to the footer on each Tatoeba page to make it easier to find.
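
The comparison described in this part of the thread can be sketched as follows, assuming the machine-translation output has already been collected by hand and stored per sentence ID. The function names, the tuple layout, and the sample data are hypothetical; no translation service is called, and an exact match remains a hint rather than proof, since short correct sentences often coincide with machine output.

```python
import unicodedata
from collections import Counter


def normalize(text):
    """Fold case, Unicode form and whitespace so trivial differences don't hide a match."""
    text = unicodedata.normalize("NFKC", text).casefold()
    return " ".join(text.split())


def count_mt_matches(contributions, mt_output):
    """Count, per user, contributed translations identical to the stored MT output.

    contributions: iterable of (user, sentence_id, translation) tuples (hypothetical layout).
    mt_output: dict mapping sentence_id to machine-translation text gathered manually.
    """
    matches = Counter()
    for user, sentence_id, translation in contributions:
        reference = mt_output.get(sentence_id)
        if reference is not None and normalize(translation) == normalize(reference):
            matches[user] += 1
    return matches


if __name__ == "__main__":
    # Made-up data for illustration only.
    contributions = [
        ("user_a", 1, "Ceci est  un exemple."),
        ("user_a", 2, "Une autre phrase."),
        ("user_b", 1, "Voici un exemple."),
    ]
    mt_output = {1: "Ceci est un exemple.", 2: "Une phrase différente."}
    print(count_mt_matches(contributions, mt_output))
```

A high count for one user across many languages is the kind of pattern worth raising with the user or an administrator; a single match is not.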

Vortarulo, 22 April 2026 at 22:50:46 UTC

I checked one of the more complicated sentences (it was in Latin, I believe), because it seemed a bit special. Then I saw that the Google Translate version was exactly the same... and so were Greek, Italic, Galician, Portuguese, Spanish, Turkish, etc. etc.
I asked the user, and they admitted using GT.

AlanF_US, 22 April 2026 at 23:37:38 UTC

> I asked the user, and they admitted using GT.

Then they should stop.

22 April 2026 at 07:53:32 UTC

The content of this message goes against our rules and has therefore been hidden. It is shown only to administrators and to the author of the message.

DostKaplan, 12 April 2026 at 14:21:20 UTC (edited 14 April 2026 at 09:57:27 UTC)

What is the regexp to search for sentences (Turkish to English) containing " ten" (with a leading space)?

xxxxxx ten xxxxxxx ✅
xxxxxx'ten xxxxxxx 👎🏼
xxxxxxten xxxxxxxx 👎🏼

AlanF_US, 17 April 2026 at 14:13:35 UTC (edited 17 April 2026 at 14:19:59 UTC)

I assume you're asking about an expression accepted by Tatoeba's integrated search function, which is similar but not identical to a regular expression ("regexp").

I don't believe there is a way to write an expression that finds sentences where "ten" is a standalone word but excludes sentences containing a word that ends with "ten" preceded by an apostrophe. The reason is that the tokenization and search split words not only at word boundaries indicated by spaces, but also at punctuation marks (including apostrophe), which are then discarded. One hacky way to get what you want would be to search for "=ten" and then use the browser's search function with "Whole Words" enabled to search through the results.

To get the full functionality you're looking for, I think you'd have to download the sentences you want and then use a tool (such as a text editor's search function with regular expression search enabled) that gives you this level of control.

DostKaplan, 17 April 2026 at 15:15:51 UTC

"=ten" would be great if it works as expected (return sentences with standalone "ten"). Unfortunately, it also returns sentences containing "'ten" (with a leading apostrophe).

I want Sentence #4940054:
Onun güzel bir ten rengi var.

I don't want Sentence #5512986:
2013'ten beri buradayız.

Luckily there are only 3 pages of results, so I can just scroll through and eyeball them. But it would have been nice to be able to get only "ten" and not "'ten".
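
Outside Tatoeba's search, the distinction between the two sentences above can be made with an ordinary regular expression, as suggested earlier in the thread for downloaded sentences. This is a minimal sketch, assuming the sentences are already available as plain strings; the function name is hypothetical, and the pattern treats both the straight and the curly apostrophe as disqualifying.

```python
import re

# "ten" not preceded by a letter, digit or apostrophe, and not followed by a word character.
# In Python 3, \w is Unicode-aware, so Turkish letters count as word characters.
STANDALONE_TEN = re.compile(r"(?<![\w'’])ten(?!\w)", re.IGNORECASE)


def standalone_ten(sentences):
    """Keep only the sentences in which "ten" occurs as a standalone word."""
    return [s for s in sentences if STANDALONE_TEN.search(s)]


if __name__ == "__main__":
    samples = [
        "Onun güzel bir ten rengi var.",  # wanted
        "2013'ten beri buradayız.",       # suffix after an apostrophe, not wanted
        "xxxxxxten xxxxxxxx",             # word ending in "ten", not wanted
    ]
    for sentence in standalone_ten(samples):
        print(sentence)
```

The negative lookbehind is what the integrated search cannot express, because apostrophes are discarded before matching.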

16 April 2026 at 05:55:23 UTC

The content of this message goes against our rules and has therefore been hidden. It is shown only to administrators and to the author of the message.

13 April 2026 at 10:47:28 UTC

The content of this message goes against our rules and has therefore been hidden. It is shown only to administrators and to the author of the message.

11 April 2026 at 02:11:03 UTC

The content of this message goes against our rules and has therefore been hidden. It is shown only to administrators and to the author of the message.

maaster, 9 April 2026 at 18:35:01 UTC

How can I easily view my old sentences, e.g. page 2000?

brauchinet, 9 April 2026 at 19:47:45 UTC

https://tatoeba.org/de/sentence...ster?page=2000

You can change the number in the URL.
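
Changing the page number by hand can also be scripted. Here is a small sketch using Python's standard `urllib.parse`, assuming only that the listing URL carries the page in a `page` query parameter as in the link above; the function name and the example URL are placeholders, not a real Tatoeba address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page(url, page):
    """Return `url` with its `page` query parameter set to `page`."""
    parts = urlsplit(url)
    # Keep every other query parameter and drop any existing page value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "page"]
    query.append(("page", str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder URL for illustration.
    print(with_page("https://example.org/list?sort=asc&page=2", 2000))
```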