Thanuir, 3 April 2020, 7:00:52 AM UTC

I would like to have some example sentences using the words 'good-man' and 'good-wife'. Based on context the meaning might differ from that of 'a good man/wife'.

1. I added them to my vocabulary. However, the search engine seems to consider 'good man' and 'good-man' as identical expressions. Is there a way to add 'good-man' to my vocabulary?

Remark: In Finnish, when writing a compound word, if the first word ends and the last word begins with the same vowel, we write it thusly: 'ala-aste', 'linja-auto'. Writing 'ala aste' would mean two separate words (and typically be a mistake), though there might be cases where it would be correct language and would have a different meaning. As such, it would be nice if the search engine did not confuse these kinds of expressions, which mean different things.

2. I would appreciate examples of 'good-wife' and 'good-man' in the corpus. Examples illustrating whether these are the same as 'a good wife' (or man) would be particularly appreciated.

gillux, 3 April 2020, 7:20:13 AM UTC

As you discovered, the search engine currently doesn’t make any distinction between 'good-man' and 'good man'. If the hyphen were treated like a normal character, sentences with 'good-man' wouldn’t show up when searching for 'good', 'man' or any word sharing the same stem.

That said, our search engine (Manticore) has a feature I believe is exactly what we need: blended characters [1]. This feature would allow a sentence containing 'good-man' to be found by searching for 'good-man' as well as 'good' or 'man'.
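To make the three options concrete, here is a toy sketch in Python. It is not Manticore's implementation and does not use its API: the `tokenize` and `matches` helpers and the mode names are made up for illustration, matching is a plain bag-of-words check, and stemming is ignored. It only models the idea that a blended character is indexed both as part of the word and as a separator.

```python
import re

def tokenize(text, hyphen="separator"):
    """Toy tokenizer (hypothetical helper, not Manticore code).

    hyphen="separator": 'good-man' -> {'good', 'man'}   (current behaviour)
    hyphen="character": 'good-man' -> {'good-man'}
    hyphen="blended":   'good-man' -> {'good-man', 'good', 'man'}
    """
    tokens = set()
    # A chunk is a run of word characters, optionally joined by single hyphens.
    for chunk in re.findall(r"\w+(?:-\w+)*", text.lower()):
        parts = chunk.split("-")
        if hyphen == "separator":
            tokens.update(parts)
        elif hyphen == "character":
            tokens.add(chunk)
        elif hyphen == "blended":
            tokens.add(chunk)
            tokens.update(parts)
        else:
            raise ValueError(f"unknown hyphen mode: {hyphen}")
    return tokens

def matches(query, sentence, hyphen):
    """True if every query token appears among the sentence tokens."""
    return tokenize(query, hyphen) <= tokenize(sentence, hyphen)

if __name__ == "__main__":
    for mode in ("separator", "character", "blended"):
        print(mode,
              matches("good-man", "He is a good man.", mode),
              matches("good", "The good-man came home.", mode))
```

In this model, only the blended mode both keeps 'good-man' from matching 'a good man' and still lets a search for 'good' find sentences containing 'good-man'.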

Now my question is: should we treat the hyphen character as a blended character in all languages by default, or only in Finnish? I feel like other languages such as French or English could benefit from it, but I wonder if it could cause any harm in other languages I don’t know.

By the way, in Finnish, isn’t the colon character also used in a similar way, when you want to decline an abbreviation or something? I’m talking about this: https://en.wikipedia.org/wiki/C...ffix_separator

[1] https://docs.manticoresearch.co...ml#blend-chars

Thanuir, 3 April 2020, 11:04:28 AM UTC

The Wikipedia article seems correct. For the definitive source, see https://www.kielikello.fi/-/kaksoispiste- . Note that the colon is not used to combine separate words. (The colon also has other uses, such as with quotes.)

Example of the genitive:
metri – metrin
m – m:n

Norwegian (Bokmål) uses a hyphen to the same effect, so the definite singular of 'tv' is 'tv-en'.
It looks like the same hyphen is also used to join words in some compounds: https://no.wikipedia.org/wiki/Bindestrek

I think that blended characters would not help me here, but it might otherwise be a fine idea. I am not sufficiently knowledgeable about Manticore to have a strong opinion here.

Thanuir, 3 April 2020, 11:08:26 AM UTC, edited 3 April 2020, 11:08:46 AM UTC

On an unrelated note: Gmail indicated the email notification of your response as suspect. It was sent from the address trang.dictionary.project@gmail.com .

from: noreply <trang.dictionary.project@gmail.com>
to: *@gmail.com
date: 3 Apr 2020, 09:20
subject: Tatoeba - gillux has replied to you on the Wall
mailed-by: gmail.com
signed-by: gmail.com
security: Standard encryption (TLS)

Ricardo14, 4 April 2020, 1:04:21 AM UTC

>On an unrelated note: Gmail indicated the email notification of your response as suspect. It was sent from the address trang.dictionary.project@gmail.com .

That happened to me too.

gillux, 5 April 2020, 5:55:37 PM UTC

I recorded the issue: https://github.com/Tatoeba/tatoeba2/issues/2255

Thanuir, 5 April 2020, 6:07:18 PM UTC

Thank you very much.