Wall (7239 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

gillux, October 4, 2025 at 5:45:55 AM UTC

Improving search for German sentences, feedback needed

Recently, someone pointed out it is impossible to search for capitalized words on Tatoeba. For example, looking up "People" or "people" always yields the same results. It makes sense for English, but what about German where all nouns are capitalized? In German, the presence of a capital is a precious grammatical indicator that could be used to limit search to nouns only (or to exclude nouns).

Original thread: https://github.com/Tatoeba/tatoeba2/issues/3206

So I tweaked the search engine behavior for German to make the = (equal) prefix case-sensitive. The equal prefix currently means "exactly that word, but regardless of casing", and my goal is to change that meaning to "exactly that word in that exact casing". This tweak is only available for German, and currently only on https://dev.tatoeba.org for testing. Here is what changes:

1. Searching for laut: return sentences containing laut, lauter, Laut, Laute etc. (case-insensitive)
2. Searching for =laut: return sentences containing laut only, lowercase.
3. Searching for =Laut: return sentences containing Laut only, capitalized.

As a side-effect of this change, it could be that search 1 above (laut) now matches more unrelated words. I cannot evaluate the impact of this side-effect, so please try to do some searches in German on https://dev.tatoeba.org and let me know if anything strange comes up. Note that dev.tatoeba.org contains an entirely separate and smaller corpus, you are free to add any sentence there for testing and it will show up in search results within 15 minutes.
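To make cases 2 and 3 above concrete, here is a minimal Python sketch of the two meanings of the = prefix. It is not the actual search engine code: the tokenizer, the function names `exact_match` and `search_exact`, and the sample sentences are all assumptions made for illustration. Case 1 (the plain query that also matches lauter, Laute, etc.) is not modelled.

```python
import re

WORD_RE = re.compile(r"\w+")  # naive tokenizer, an assumption for this sketch


def exact_match(sentence, word, case_sensitive):
    """True if `word` occurs in `sentence` as a whole token."""
    tokens = WORD_RE.findall(sentence)
    if case_sensitive:
        return word in tokens
    folded = word.casefold()
    return any(t.casefold() == folded for t in tokens)


def search_exact(sentences, query, case_sensitive):
    """Emulate a '=word' query under the old or the new meaning."""
    word = query.lstrip("=")
    return [s for s in sentences if exact_match(s, word, case_sensitive)]


# Hypothetical German sentences, not taken from the corpus.
sentences = [
    "Er liest laut vor.",
    "Der Laut war kaum zu hören.",
    "Laut Plan fahren wir morgen.",  # capitalized only because it starts the sentence
]

old_meaning = search_exact(sentences, "=Laut", case_sensitive=False)
new_lower = search_exact(sentences, "=laut", case_sensitive=True)
new_upper = search_exact(sentences, "=Laut", case_sensitive=True)
```

With the old meaning, `=Laut` returns all three sentences. With the new meaning, `=laut` returns only the first one and `=Laut` returns the last two.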

brauchinet, October 5, 2025 at 9:52:34 AM UTC (edited October 5, 2025 at 9:57:47 AM UTC)

I think that's a useful feature.

Like in most languages, sentences always start with a capital letter in German, no matter whether the first word is a noun or not.
So in your "=Laut" example, the results will include some "laut" that are capitalized for being the first word.
When you search for the German equivalent of "=where", you only get sentences like "Do you know where my stockings are?" but not "Where are my stockings?"

So users should be aware that the search is case sensitive, else they might encounter unexpected or biased search results.

gillux, October 7, 2025 at 6:32:06 AM UTC

> When you search for the German equivalent of "=where", you only get sentences like "Do you know where my stockings are?" but not "Where are my stockings?"

That’s correct. I agree that this "improvement" makes it unclear whether the search is case-sensitive or not, and that it would be better to indicate this somehow. The wiki is probably the best option we have so far.

Actually, it is not the search that becomes case-sensitive, but German sentences that become case-sensitive. For example, if you search for =Tag in "Any language", it will be case-sensitive when matching German sentences but case-insensitive in other languages: https://dev.tatoeba.org/sentenc...h?query=%3DTag
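As a rough illustration of that last point, the sketch below applies case folding per sentence language rather than per query. It is only a model of the described behavior, not the real implementation: the language codes, the set `CASE_SENSITIVE_LANGS`, the function names and the sample corpus are assumptions.

```python
import re

# Assumption: only German keeps its casing; every other language is case-folded.
CASE_SENSITIVE_LANGS = {"deu"}


def normalize(token, lang):
    """Keep casing for case-sensitive languages, fold it otherwise."""
    return token if lang in CASE_SENSITIVE_LANGS else token.casefold()


def search_exact(corpus, word):
    """Emulate '=word' over (lang, sentence) pairs in 'Any language'."""
    hits = []
    for lang, sentence in corpus:
        tokens = {normalize(t, lang) for t in re.findall(r"\w+", sentence)}
        if normalize(word, lang) in tokens:
            hits.append((lang, sentence))
    return hits


# Hypothetical sentences, not taken from the corpus.
corpus = [
    ("deu", "Guten Tag!"),
    ("eng", "Check the price tag."),
    ("eng", "Tag, you're it!"),
]

upper_hits = search_exact(corpus, "Tag")
lower_hits = search_exact(corpus, "tag")
```

Here `=Tag` matches all three sentences, while `=tag` matches only the two English ones, because the German sentence is compared with its casing intact.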

Vortarulo, October 8, 2025 at 7:25:50 PM UTC

Sounds like a useful feature! Two other languages come to mind, which would also need that: Luxembourgish, with the same capitalization rules as German, and Klingon, which is also kind of case sensitive, e.g. 'enghal and 'engHal are different (pseudo-)words, and QoQ and qoq are indeed very different in meaning. So it's also useful there.

Potentially also some of the German dialects we have on Tatoeba.

Tartar, October 2, 2025 at 5:09:45 AM UTC

Forward, friends!

TATAR1, October 2, 2025 at 7:58:59 AM UTC

😊👍 Forward!

TATAR1, October 1, 2025 at 12:19:20 PM UTC (edited October 1, 2025 at 12:22:18 PM UTC)

Dear compatriots and Tatar speakers! I ask you to step up your efforts and actions to defend, preserve and develop our native language. It is quite possible to do this on this site as well.

Feniks, October 1, 2025 at 12:58:09 PM UTC

Bravo! I support this! A thousand thanks!

TATAR1, October 1, 2025 at 3:17:48 PM UTC

Thank you as well, brother. Let us stand together and united.

Tatar, October 1, 2025 at 1:20:14 PM UTC

I have no objection. I will get to work soon. I wish all of us success. May we all stay safe and well.

TATAR1, October 1, 2025 at 3:18:34 PM UTC

Thank you. I wish you success.

Rok, October 2, 2025 at 4:18:58 AM UTC

Come on, let's get to work.

TATAR1, October 2, 2025 at 4:46:37 AM UTC

👍😊 We will work hard, health permitting.

Pfirsichbaeumchen, October 1, 2025 at 10:02:47 AM UTC (edited October 1, 2025 at 10:03:19 AM UTC)

Corpus Maintainer Candidate for Gronings

Tom (Tom9358): https://tatoeba.org/user/profile/Tom9358

As usual, please feel free to send us a private message to share your opinion (click on the link below).

https://tatoeba.org/private_mes...rsichbaeumchen

sharptoothed, September 28, 2025 at 6:52:51 PM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

CK, September 26, 2025 at 6:18:33 AM UTC

🍎 Updated using last Saturday's exported data.

🥝 Bilingual Sentence Pairs

https://www.manythings.org/bilingual/

🥝 Tab-delimited Bilingual Sentence Pairs

https://www.manythings.org/anki/

StanJones, September 19, 2025 at 1:54:48 AM UTC

I am new to this site. It is fantastic. I am studying French and would like to know how to search for sentences containing two words in any order.

gillux, September 23, 2025 at 4:58:25 AM UTC

Welcome! When making a search, the results already include sentences containing keywords in any order. But, if a sentence has the keywords in the order you entered them, it will be ranked higher in the results. This is the default sort order called "relevance". All the other sort orders, such as "random", won’t prioritize results in this way. You can change the sort order from the right pane.
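The behavior described above can be pictured with a small Python sketch: keep every sentence that contains all the keywords, then list those with the keywords in the entered order first. This is only a toy model under stated assumptions. The real "relevance" ranking is not specified in this thread, and the function names and French sample sentences are hypothetical.

```python
import re


def tokenize(text):
    """Lowercased word tokens (naive tokenizer, an assumption for this sketch)."""
    return re.findall(r"\w+", text.casefold())


def appears_in_order(tokens, keywords):
    """True if the keywords occur in `tokens` in the given order."""
    remaining = iter(tokens)
    return all(k in remaining for k in keywords)


def search(sentences, query):
    """Match keywords in any order, then rank in-order matches first."""
    keywords = tokenize(query)
    hits = [s for s in sentences if set(keywords) <= set(tokenize(s))]
    # sorted() is stable, so ties keep their original order.
    return sorted(hits, key=lambda s: not appears_in_order(tokenize(s), keywords))


# Hypothetical sentences, not taken from the corpus.
sentences = [
    "Il fait noir et le chat dort.",
    "Le chat noir dort.",
    "Le chien dort.",
]

results = search(sentences, "chat noir")
```

For the query "chat noir", both sentences containing the two words are returned, with "Le chat noir dort." ranked first because the keywords appear in the order entered.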

atitarev, April 23, 2025 at 11:54:49 PM UTC (edited April 24, 2025 at 12:00:35 AM UTC)

I am not super fluent in Belarusian but I found a few potential issues with a former self-identified Belarusian native speaker.

The verbs ведаць and знаць are near synonyms (both mean "to know"), but the difference is close to that between French savoir and connaître, German wissen and kennen and, of course, Polish wiedzieć and znać.

In Russian and Ukrainian, both Slavic verbs merged into знать (ru) and знати (uk) in the modern usage but it's a case of overcorrection to use ведаць (be) when знаць is more appropriate.

The phrases I refer to are "Do you know him?" and "Do you know her?"

https://tatoeba.org/en/sentences/show/69003
https://tatoeba.org/en/sentences/show/317616

deniko, April 24, 2025 at 6:30:11 AM UTC (edited April 24, 2025 at 6:42:10 AM UTC)

My intuition was the same, but like you, I’m influenced by Ukrainian and Polish.

I googled "ведаеш яго" and didn’t find much, but I did come across this example from the Bible:

https://www.bible.com/es/bible/...91%D0%91%D0%9B

Ты, Госпадзе, ужо ведаеш яго дасканала.

So it seems like a legit usage? Looks like you can use "ведаць" in Belarusian to mean “to know someone” (though it might be archaic—Biblical language often is—but maybe not. Best bet would be to ask a native speaker).

EDIT: sent PMs to kxadtccpgt, pavuk3 and ssvb, maybe they'll help us.

atitarev, April 24, 2025 at 6:47:25 AM UTC

Thanks, @deniko. This would be a legit usage. Not sure how many quotations would be required to verify.

At the English Wiktionary, the requirement is three quotations from solid sources for well-documented languages.

The common modern and not so modern usage is:

Do you know him?: Ты знаеш яго? / Вы знаеце яго?
Do you know her?: Ты знаеш яе? / Вы знаеце яе?

The initial "ці" is optional, similar in usage to Ukrainian "чи" or Polish "czy".

deniko, April 24, 2025 at 6:53:12 AM UTC

> Not sure how many quotations would be required to verify.

If the source is legit and authoritative (like the official Bible translation, which has been checked, double-checked, triple-checked, and probably blessed too 😄), then I’d say even one quote is enough to treat it as valid, at least as archaic or poetic usage. It doesn’t tell you if it’s used in modern speech, sure, but for Tatoeba I think that kind of usage is totally fine, as long as it's tagged accordingly.

atitarev, April 24, 2025 at 6:57:40 AM UTC

Thanks, @deniko. What is tagging at Tatoeba? I'm rather new as an active user.

I would probably need to tag my contributions for usage - masculine/feminine, plural, colloquial, formal, etc.

deniko, April 24, 2025 at 7:11:28 AM UTC

Tags are these guys:

https://i.imgur.com/w1VDVT3.png

You can search using them in Advanced Search, or just click on a tag when you're viewing a single sentence to see other sentences with that same tag.

For example, sentences tagged "Australian English":

https://tatoeba.org/en/tags/sho...h_tag/1611/eng

You don't have to tag sentences, but you might want to tag some of them, of course.

To be able to do it, you should be an advanced contributor. You're more than qualified to apply to become one, please do:

https://en.wiki.tatoeba.org/art...d-contributors

That will also help you to easily link your translations to multiple languages.

atitarev, April 24, 2025 at 7:14:10 AM UTC

Дякую!

kxadtccpgt, September 16, 2025 at 8:41:50 PM UTC

> Дякую!
The correct spelling is: Дзякую!

kxadtccpgt, September 16, 2025 at 8:38:09 PM UTC

The verbs "ведаць" and "знаць" are two synonyms. There is no difference between them.
Here is the photo from a Russian-Belarusian dictionary https://imgbox.com/0O1reHkj

But in everyday conversations, the verb "ведаць" is almost always used.
I do not recall ever hearing or using the verb "знаць". For 40 years, I and all the people I know have used the verb "ведаць".

sacredceltic, September 22, 2025 at 7:46:23 PM UTC

You should flag the sentences, using the icons above the sentence. Either ?-icon if you think the sentence is doubtful, or !-icon if you think it's wrong.

sacredceltic, September 22, 2025 at 7:49:11 PM UTC

Of course the sentences you should flag are not the English ones. First click on the wrong translation of it, before you flag.

sharptoothed, September 14, 2025 at 6:22:54 AM UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

sacredceltic, September 12, 2025 at 7:45:56 PM UTC

It exasperates me to spend my time avoiding sentences containing "Tom". I can't take it anymore. Could we consider a flag in the profile for avoiding sentences containing "Tom"? Or even other keywords?
That would save me from over-consuming energy and causing excess global warming, because each time I have to skip to the next sentence, since I have NO intention of translating such sentences.

frpzzd, September 12, 2025 at 7:55:46 PM UTC

This page on the Tatoeba wiki gives examples of some more advanced search queries that are possible, besides what is listed in the Advanced Search interface:

https://en.wiki.tatoeba.org/art...ow/text-search

For instance, it shows how you can perform searches for sentences omitting certain words, such as "Tom":

https://tatoeba.org/en/sentence...rom=eng&to=und

Does this help you?

sacredceltic, September 12, 2025 at 8:07:13 PM UTC

No, it doesn't. It's too complicated because you must set it up each time. It should be a profile FEATURE.
My idea is that each contributor should be able to draw up a list of UNDESIRED keywords. Based on it, the display of sentences should be PERMANENTLY filtered accordingly.

frpzzd, September 12, 2025 at 8:20:27 PM UTC (edited September 12, 2025 at 8:24:17 PM UTC)

Well, I agree that it's very inconvenient to fill out the advanced search menu each time you want to perform a common search. However, you can also save advanced searches by copying the URL (and deleting the "rand_seed" argument, if you want it to re-randomize each time) and saving it for later. For example, I have several query URLs saved in my user bio for easy access, so that I can just copy-paste them instead of having to fill out the Advanced Search form.
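If you would rather strip the "rand_seed" argument programmatically than by hand, a short Python sketch using the standard library could look like this. The example URL and its other parameters are placeholders, not a real Tatoeba search URL, and the function name `drop_param` is made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_param(url, name="rand_seed"):
    """Return `url` with every occurrence of query parameter `name` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Placeholder URL: host, path and parameter values are assumptions.
saved = drop_param(
    "https://example.org/search?from=eng&query=-Tom&sort=random&rand_seed=Zk3p&to="
)
```

The resulting URL keeps all other parameters in their original order, including empty ones such as `to=`.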

Some sort of "global filter" would be nice too, though. A related feature has been requested in this GitHub issue, which asks about globally hiding sentences by specific users:

https://github.com/Tatoeba/tatoeba2/issues/3099

sacredceltic, September 12, 2025 at 8:29:46 PM UTC

I'm still unconvinced...

Seael, September 12, 2025 at 11:24:01 PM UTC

I also long for a bit more Tomless Tatoeba so I tried the how-to-exclude examples contained on https://en.wiki.tatoeba.org/art...w/text-search# but they don't seem to work:

https://en.wiki.tatoeba.org/%20...-small+the&to=

https://en.wiki.tatoeba.org/%20...C*%7Cz%5C*&to=

AlanF_US, September 13, 2025 at 12:31:57 PM UTC (edited September 13, 2025 at 12:34:02 PM UTC)

For some reason, a space was inserted into those links. I've fixed them on the wiki page. Here are the correct links:

https://tatoeba.org/en/sentence...-small+the&to=

https://tatoeba.org/en/sentence...C*%7Cz%5C*&to=

sacredceltic, September 13, 2025 at 8:48:01 PM UTC

It's very unnatural.
It should be much easier for users to exclude words and names.
Why not make a user's list of undesired words and expressions available?

lbdx, September 13, 2025 at 7:43:19 AM UTC

There are lists such as "Rebalanced English ✂️" that prohibit a word from occurring more than 10 times as often as in a reference corpus.
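To give an idea of what the 10x criterion means, here is a minimal Python sketch that flags words whose relative frequency in a set of sentences exceeds 10 times their relative frequency in a reference corpus. It does not reproduce how the "Rebalanced" lists are actually built. The function names, the sample sentences and the reference frequencies are all made up for illustration.

```python
import re
from collections import Counter


def relative_freq(sentences):
    """Relative frequency of each lowercased word token in `sentences`."""
    counts = Counter(
        word for s in sentences for word in re.findall(r"\w+", s.casefold())
    )
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def overrepresented(sentences, reference_freq, factor=10.0):
    """Words occurring more than `factor` times as often as in the reference."""
    freq = relative_freq(sentences)
    return {
        word
        for word, f in freq.items()
        if word in reference_freq and f > factor * reference_freq[word]
    }


# Hypothetical data: both the sentences and the reference frequencies are invented.
sentences = ["Tom is here.", "Tom left.", "Tom sleeps.", "Mary is here."]
reference_freq = {"tom": 0.001, "is": 0.05, "here": 0.03, "mary": 0.02}

flagged = overrepresented(sentences, reference_freq)
```

In this toy data "tom" makes up 3 of 10 tokens, far more than 10 times its invented reference frequency, so it is the only word flagged. Words absent from the reference are simply skipped here, which is a simplification.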

You can browse these lists at https://tatoeba.org/en/sentence...direction=desc

sacredceltic, September 13, 2025 at 11:11:40 AM UTC

That’s not the solution since Tom appears across languages.
Well, I guess we’ll have to waste more data-center fuel to dodge these sentences…

sacredceltic, September 13, 2025 at 1:26:34 PM UTC

I just opened your "rebalanced French" list and it's very odd:

1) most of the sentences are written by non-natives.
2) there are lots of mistakes, probably a consequence of 1)