Wall (5241 threads)

DostKaplan
8 days ago
Searches not responding..... (for most of the day)
Aiji
17 days ago
Can the admins explain why bots are authorized now?

I believe that past experience has shown that it is generally not a good idea, as the process is more often harmful to corpora than helpful. The reason is simple: the creators of such bots are less motivated by making a good contribution to the project than by their own interest. A respectful way to use bot-generated sentences would be to generate them locally, proofread them, and upload only the correct ones.

It makes me sad that when people invest time and energy to increase the global quality of a corpus, suddenly a random Johnny comes and inserts thousands of sentences, beating that down with zero effort, without showing even the minimum respect of proofreading them. (Voltaire + Maxence = 11K, although one of them is supposed to be human)

Now I see that Voltaire has been red-marked. I guess that is CK's doing after one of my comments yesterday. That still does not explain why it was accepted at first :)
PaulP
16 days ago
I agree. There were hundreds of wrong sentences. Even just putting the @change flag on them costs other people a lot of wasted time.
TRANG
14 days ago
> Can the admins explain why bots are authorized now?

Bots were never forbidden on Tatoeba. We actually use a bot ourselves: Horus.

As long as the contributions comply with our quality standards and as long as our server can support the load, it shouldn't matter whether these contributions are from a bot or from a human.

Voltaire was red-marked by myself. CK usually won't meddle with sentences in languages other than English.

Voltaire was also "accepted" by myself, but "accepted" is not exactly the correct word. We technically cannot know if an account is run by a bot or by a human, so we cannot exactly give our approval, nor stop people, from using bots.

In the case of Voltaire, the author was transparent and contacted us to ask about our policies on bots. I told him what I told you above: it doesn't matter if it's a bot or a human who adds the sentence as long as the quality is good and the quantity doesn't slow down the website.

He initially wanted to run a translation bot, but then shifted to a bot that simply adds sentences in French, probably because he was not confident in the quality of the translation bot.

Since he was transparent with us and had acknowledged the quantity and quality requirements, I did not suspend his bot right off the bat. I, of course, reported to him that his bot was not adding good sentences, and after a couple of days, asked him to stop the bot and start cleaning up the sentences instead of adding more of them. He did stop the bot but did not do much about the cleanup.

> It makes me sad that when people invest time and energy to increase the global
> quality of a corpus, suddenly a random Johnny comes and inserts thousands of
> sentences, beating that down with zero effort, without showing even the minimum
> respect of proofreading them.

Note that it equally takes zero effort to delete all sentences from a specific account. There is, in my opinion, no reason to be upset about massive addition of bad sentences.
deniko
14 days ago
> There is, in my opinion, no reason to be upset about massive addition of bad sentences.

Depends on the person, I guess.

I would be rather upset if someone started adding a lot of bad Ukrainian sentences. I feel like the quality of the Ukrainian corpus is my responsibility, and that would definitely hurt, especially if I felt like there were so many of them I can't really proofread them.

TRANG
14 days ago
Then perhaps there is a misconception about the role and responsibility of a corpus maintainer.

Keep in mind that every contributor is responsible for the quality of the corpus, not just corpus maintainers. You have a little bit more power to maintain a good level of quality, but that doesn't mean everything is on your shoulders.

You are in no way obligated to proofread all the Ukrainian sentences that are added to Tatoeba. You proofread what you can, what you want, and at your own pace. If there's a user who has added too many bad contributions, your first reflex should be to report the user, not to try to fix what they did wrong.
deniko
14 days ago
> You are in no way obligated to proofread all the Ukrainian sentences

I know that, thank you.
marafon
14 days ago
> He initially wanted to run a translation bot

Unfortunately, he did run it and we got hundreds of bad links and sentences like this:

https://tatoeba.org/rus/sentences/show/7701109
Quel Pachuca pour Toluca ?
https://tatoeba.org/rus/sentences/show/7702274
C'est la nouvelle secrétaire et elle fait tout pour le chingadazo.
https://tatoeba.org/rus/sentences/show/7701955
Où nous neiges en janvier.
https://tatoeba.org/rus/sentences/show/7702300
Nous avons souvent paviamos.
https://tatoeba.org/rus/sentences/show/7699888
Prenez cette pilule. Ça t'aidera à dormir.
https://tatoeba.org/rus/sentences/show/7698693
Viens avoir soif, s'il te plaît.
https://tatoeba.org/rus/sentences/show/7701273
DepuDepuis quand avez-vous remarqué le mouvement du fœtus ?is quanDepuis quand avez-vous remarqué le mouvement du fœtus ?d avez-vous remarqué le mouvement du fœtus ?
etc.

P.S. Thanks, @Aiji.
AlanF_US
14 days ago
> Note that it equally takes zero effort to delete all sentences from a specific account.
> There is, in my opinion, no reason to be upset about massive addition of bad sentences.

It may take a small amount of effort for an admin with the proper knowledge and permissions to execute a command that deletes all sentences from a specific account. (Maybe there are two or three such admins.) However, it takes a significant amount of effort to:
- identify the problem
- contact an admin to report it
- decide how to go about solving it
- figure out how to stop additional people from wasting their time reporting or trying to resolve the problem
- figure out how to ignore the bad sentences and look for good ones to translate (or answer a query)

Even more effort is involved if you want to selectively delete only some of the sentences from an account.

I think it would be a good idea for someone to come up with guidelines about:
- the maximum number of sentences that should be contained in a single mass-import spreadsheet
- the maximum rate at which sentences should be added (by a human or bot)
- the maximum number of egregiously bad sentences from a contributor (human or bot) before we temporarily suspend them

And I propose that unless and until we have a quick-response team, we should go on record as having a policy of encouraging mass import (preceded by proofreading) and discouraging or even disallowing bots other than official Tatoeba ones like Horus. I realize that at present, our mass import function is not working, so that means no batch introduction of sentences. But that seems reasonable to me until we're in a better state.
Aiji
13 days ago
> And I propose that unless and until we have a quick-response team, we should go on
> record as having a policy of encouraging mass import (preceded by proofreading) and
> discouraging or even disallowing bots other than official Tatoeba ones like Horus.

Alleluia.


My problem is that I tend to go with the last option that you proposed (cleaning the evil to keep the good). While we could simply massively cleanse everything...
TRANG
12 days ago
On the topic of bots.

We cannot reliably distinguish between a bot and a human. There is no easy programmatic way to block bots from the website.

We can rely on some clues, such as the speed at which a user adds sentences (it's unlikely that a human can add more than 1 sentence per second), and we can notice patterns in sentences that are likely to be from bots, but those are human assessments and require some initial inputs from the user.
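
As a rough illustration of the speed clue, here is a minimal sketch that flags an account when several of its additions arrive less than one second apart. The function name and both thresholds are assumptions made up for this example; this is not something Tatoeba runs, and a flag would only be a hint for a human to look closer.

```python
from datetime import datetime, timedelta


def looks_automated(timestamps, min_gap=timedelta(seconds=1), min_hits=3):
    """Return True if at least `min_hits` gaps between successive additions
    are shorter than `min_gap`. Both thresholds are illustrative guesses."""
    ordered = sorted(timestamps)
    fast_gaps = sum(
        1
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier < min_gap
    )
    return fast_gaps >= min_hits


# Hypothetical data: ten additions 200 ms apart versus ten additions 20 s apart.
start = datetime(2019, 2, 1, 12, 0, 0)
burst = [start + timedelta(milliseconds=200 * i) for i in range(10)]
steady = [start + timedelta(seconds=20 * i) for i in range(10)]
```

With this sketch, `looks_automated(burst)` is true and `looks_automated(steady)` is false, but a bot that simply waits between requests would pass, which is the limitation described above.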

We could rely on some technical clues (such as the user agent), but if bot creators want to bypass our restrictions, they can always figure something out.

Trying to restrict access to the website based on the nature of the user (bot or human) is not a productive approach. It would be a time-consuming and never-ending war.

One thing we can and should do, regardless of whether the user is a bot or a human, is to regulate the number of sentences added within a period of time. For instance no more than 10 sentences per minute and no more than 1000 sentences per day. These numbers could be lower for newer contributors.
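
To make the idea concrete, here is a minimal sketch of such a limit using the example numbers above (10 per minute, 1000 per day). The class name and design are assumptions for illustration only; how this will actually be implemented has not been decided.

```python
from collections import deque


class AdditionLimiter:
    """Per-user limiter; one instance would be kept for each contributor."""

    def __init__(self, limits=((60, 10), (86400, 1000))):
        # Each entry is (window length in seconds, max additions in that window).
        self.limits = limits
        self.history = deque()  # timestamps (in seconds) of accepted additions

    def allow(self, now):
        # Forget additions older than the longest window.
        longest = max(window for window, _ in self.limits)
        while self.history and now - self.history[0] >= longest:
            self.history.popleft()
        # Refuse if any window is already full.
        for window, cap in self.limits:
            recent = sum(1 for t in self.history if now - t < window)
            if recent >= cap:
                return False
        self.history.append(now)
        return True
```

Lower limits for newer contributors would just mean constructing the limiter with smaller numbers, for example `AdditionLimiter(limits=((60, 3), (86400, 100)))`.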

This is what you suggested (maximum rate at which sentences should be added) and has been suggested as well in https://github.com/Tatoeba/tatoeba2/issues/1492. There is not yet a final decision on how we'll implement this or what the limits will be, so feel free to comment over there if you have more specific ideas.

You also mentioned other limitations:

> maximum number of sentences that should be contained in a single
> mass-import spreadsheet

This should rather be limited by the maximum size of the file that can be uploaded (1 MB for instance).

> the maximum number of egregiously bad sentences from a contributor
> (human or bot) before we temporarily suspend them

I don't think deciding on such a number would change our current processes much. We usually suspend users when they're reported to us, and I think it's fine to keep relying on our intuition for that.
TRANG
12 days ago
On the topic of how we deal with bad contributions.

You've mentioned that it takes effort to go from the identification of bad contributors to the resolution of the problem.

I don't think the effort really grows proportionally with the number of contributions. It can happen that the effort spent on dealing with a user who has 100 contributions is higher than the effort spent on dealing with a user who has 1000 contributions.

Now if I go through specifically each of your points where effort is needed:

- to identify the problem → Normally, a quick glance at a user's sentences should be enough to identify the problem. If it takes more than two minutes to decide if a user should be reported, then they are probably not that bad.
- to contact an admin to report it → Here as well it shouldn't take too much effort from the reporter. Admins would just need three pieces of information: the username, the languages affected, and a rough estimate of the % of bad sentences in each language.
- to decide how to go about solving it → I'd say we already have a process for this: we suspend the user and ask them to correct their sentences. If the user doesn't do anything, mark all sentences as unapproved. If the user corrects their sentences, unsuspend them.
- to figure out how to stop additional people from wasting their time reporting or trying to resolve the problem → Suspending a user should be enough to let people know that a user has already been reported. One improvement we could add is to display a warning somewhere on the sentence's page, so that people are more easily aware the user has been reported.
- to figure out how to ignore the bad sentences and look for good ones to translate (or answer a query) → That's more of a problem for people who translate the latest sentences, but shouldn't be a huge issue for people who look for random sentences or for older sentences first. Here as well we can implement some improvement to let people know how old a sentence is, and from there they can decide if they want to risk translating it.
- to selectively delete only some of the sentences from an account → If we ever wanted to selectively delete only some sentences, it would mean that the contributions are somewhat worthy and we have mixed feelings about their uselessness. If we consider that the contributions are only adding extra work that is not worth our time, we would not consider selectively deleting sentences.

When it comes to bots, I don't think we should have much remorse wiping out all sentences. But I must say (again) that I don't think deleting sentences will always help us.
If we keep the sentences and just mark them as unapproved, they cannot be re-added (at least as long as the deduplication feature works properly). If we delete the sentences, there's always a risk that the bot owner runs their bot again to re-add the same sentences. We could then delete them again, but the cycle can continue as long as the bot owner finds ways to bypass the restrictive measures we may put in place.
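
The difference between the two options can be shown with a toy model. This is only a sketch under the assumption that a duplicate is any sentence with the same language and text; the class and its methods are invented for illustration and do not reflect how Tatoeba's deduplication is actually implemented.

```python
class ToyCorpus:
    """Toy model: a sentence is identified by its (language, text) pair."""

    def __init__(self):
        self.sentences = {}  # (lang, text) -> "ok" or "unapproved"

    def add(self, lang, text):
        key = (lang, text)
        if key in self.sentences:
            return False  # treated as a duplicate, whatever its status
        self.sentences[key] = "ok"
        return True

    def mark_unapproved(self, lang, text):
        self.sentences[(lang, text)] = "unapproved"

    def delete(self, lang, text):
        del self.sentences[(lang, text)]


corpus = ToyCorpus()
corpus.add("fra", "Où nous neiges en janvier.")
corpus.mark_unapproved("fra", "Où nous neiges en janvier.")
# The unapproved sentence still occupies its slot, so a re-run of the bot is refused.
blocked = not corpus.add("fra", "Où nous neiges en janvier.")
corpus.delete("fra", "Où nous neiges en janvier.")
# Once deleted, nothing stops the same sentence from being added again.
readded = corpus.add("fra", "Où nous neiges en janvier.")
```

In this model both `blocked` and `readded` end up true, which is the cycle described above.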
CK
12 days ago
> If the user doesn't do anything, mark all sentences as unapproved.
> ...
> But I must say (again) that I don't think deleting sentences will always help us. If we keep the sentences and just mark them as unapproved, they cannot be re-added (at least as long as the deduplication feature works properly)

In my opinion, it would help a lot to delete all the sentences by such a member for the following reason.

Having a "good" sentence from an untrustworthy source prevents a trustworthy member from adding the same sentence. This is also one good reason to ask members to limit themselves to contributing sentences in their own native languages. It's a lot easier to trust that a sentence is correct and natural-sounding if it's from a native speaker.


TRANG
12 days ago
On the topic of quality in general.

First, I wrote the following article on the wiki: https://en.wiki.tatoeba.org/art...rove-sentences
I hope it can help contributors become more aware of the quality aspect and how to help more efficiently on that front.

Second, I hope this year we can start shifting our mindset on how we should handle quality. I hope we can stop asking ourselves "how do we prevent users like Maxence from running bot experiments", or "how do we stop contributors from adding sentences in languages they are not native speakers of", and instead actively try to answer the question of how we let these people contribute. What do we need to change in Tatoeba so that these people can contribute without being an annoyance or a burden?

We already have the beginnings of solutions -- this isn't exactly a new problem -- but I hope we can make them more concrete.
Aiji
8 days ago
Let's hope that in time "Using a bot to split and add whole classical novels without any added value into the corpus." will not rise to "contribution of quality" :)
But of course, everybody is free as long as they follow the rules.
CK
14 days ago
> There is, in my opinion, no reason to be upset about massive addition of bad sentences.

There may be no reason to be "upset", but bad sentences shouldn't be tolerated if there is an easy way to avoid them. Having bad sentences in the corpus does a disservice to anyone using the data. While it may be impossible to be 100% error-free, that should be what we strive for.

[#6106141] It's bad enough to learn something correctly, and then get it wrong. It's a lot worse to spend time learning something that is incorrect, thinking it's correct. (CK)

AlanF_US
13 days ago
I agree.
CK
9 days ago
** Browsing List 907 **

For those of you who like to browse the latest additions to List 907 (English sentences I've proofread), you will need to do the following. Things have changed a bit since tatoeba.org was updated to the latest version of CakePHP.


1. Change your "Number of sentences per page" setting to 50 or fewer. If set at 100, you will get an error message. I think you also need to make this change in order to use the advanced search setting to limit searches to the 710,000+ sentences on List 907.

You can access your settings here.
https://tatoeba.org/eng/user/settings


2. Instead of using the old (plain) URL, you will need to sort by created with the newest at the top.

This can be done by visiting the old URL, clicking "date added to list", and then, after that page loads, clicking "date added to list" once again.

The faster and easier way to do this is to click the following URL and bookmark it for future use.

https://tatoeba.org/eng/sentenc...direction=desc
The end of the URL will look like this.
/907/und?sort=created&direction=desc
AlanF_US
13 days ago
When I click on a link that contains a language other than the one I've selected for my interface, my interface is displayed according to the language in the URL. For example, when I click on this link:

https://tatoeba.org/rus/sentenc...45016#comments

the interface on that page is displayed in Russian. I think that before the CakePHP upgrade, it would have been displayed in my interface language (English).
Thanuir
12 days ago
The interface remains in Finnish for me.
TRANG
12 days ago
Perhaps because you tried to view the dev website in Russian? There might be some cookie interference between the dev and prod websites...
AlanF_US
11 days ago
When I visit this link:

https://tatoeba.org/fin/wall/sh...#message_31306

the prod interface shows up in Finnish, even though I've never chosen Finnish for the interface for either the prod or dev website.
AlanF_US
10 days ago
I deleted my browser cookies for tatoeba.org. Now I see the correct behavior.
CK
13 days ago - 13 days ago
** Browsing by Language, Limited to Sentences with Audio **

The option for browsing by language, limited to audio wasn't kept during the recent code update. However, you can still sort of duplicate the previous URL (tatoeba.org/eng/sentences/show_all_in/eng/und/none/only-with-audio) with the following advanced search URL.

https://tatoeba.org/sentences/s...o=yes&from=eng
(Browse English Sentences with Audio)

You just need to include the following 3 items in the URL, changing the last one to match the 3-letter code of any of the languages with audio files.

sort=created&has_audio=yes&from=eng
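
If you want to generate these for several languages, the three items can be produced with a few lines of Python. This is only a sketch: the function name is made up, and it builds just the part of the query string shown above, which you then append to the advanced search URL.

```python
from urllib.parse import urlencode


def audio_browse_query(lang_code):
    """Build the three query items for a given 3-letter language code."""
    return urlencode({"sort": "created", "has_audio": "yes", "from": lang_code})


print(audio_browse_query("eng"))  # sort=created&has_audio=yes&from=eng
```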


[EDIT]

I created a page of links showing all languages with audio files.
You can just click the one you're interested in and then bookmark the resulting page.

http://tatoeba.ueuo.com/browse_..._language.html
sharptoothed
14 days ago
* Tatoeba Top 30 Languages Interactive Graphs*

Tatoeba Top 30 Languages Interactive Graphs have been updated:
https://tatoeba.j-langtools.com/igraph/
https://tatoeba.j-langtools.com/igraph/share.html
CK
15 days ago
Today's "sentences_by_language" page compared with the same page one year ago.

https://prnt.sc/mg86fj
Pfirsichbaeumchen
15 days ago - 15 days ago
An interesting comparison. There have been only 1545 new Japanese sentences and 268 new Hebrew sentences in one year. Some languages are sadly stagnating.
CK
15 days ago
Here is a table showing all languages, sorted by the change since last year.

http://tatoeba.ueuo.com/190204_...last_year.html

This doesn't represent the audio contributions, though.
Hybrid
16 days ago - 15 days ago
Tatoeba is slow and I keep getting logged out. Sometimes it can take 30 seconds to 1 minute to add a sentence. Sometimes the sentence is never added. Also, sometimes Tatoeba asks me to log in, but when I refresh, I'm logged in again. I often get "An Internal Error Has Occurred" when trying to load the homepage. I think Tatoeba is very close to not being usable right now.
Pfirsichbaeumchen
16 days ago - 16 days ago
Tatoeba has been slow for me for some time. Searches can be extremely slow. Entering sentences or translations regularly takes a long time, too. So does loading the home page. Pages of particular sentences do not seem to be affected by that issue so often.
TRANG
15 days ago
Tatoeba has been slow these past few days mostly due to the search. The "internal error" you've got on the homepage is because the random sentence relies on the search. I assume the search was failing and as a result the homepage was failing as well.

When issues with the search happened last time[1], the problem was due to too many changes happening in too short a time. This created a sudden spike in workload for the search engine, which couldn't handle it. This time I'm not 100% sure, but it's likely for the same reason.

Note that "too many changes" doesn't necessarily mean too many sentences added or edited. It can be any change on data that the search needs to handle: tags, lists, owner of sentences... So it's not necessarily obvious what causes the search engine to break down.

The current "quick and dirty" solution is to disable the search from the website, run some script so that the search finishes its usual background tasks in peace, then re-enable the search once the background tasks are done.

This makes the website usable again for some time, but slowness will come back as soon as the search is being overwhelmed.

The longer term solution would be to optimize the search feature. That would include migrating to another version of the search engine[2]. But it's a pretty big effort and we'll unfortunately need to wait at least another month before anything can be done about it.

---

[1] https://tatoeba.org/eng/wall/sh...#message_30960
[2] https://github.com/Tatoeba/tatoeba2/issues/1518
Thanuir
22 days ago
What are best practices for linking similar sentences that only differ in punctuation?

A particularly common case is pairs like "Hello!" and "Hello."; to me, they have different meanings, as signaled by the exclamation mark or the full stop.

Should these be linked as synonyms or not?

The same question happens across languages; should one link "Hei." only with "Hello." or also with "Hello!"?

(The exclamation mark has slightly different uses in different languages, but in many cases the meaning seems to be the same.)
PaulP
21 days ago
Good question. I started to link them as synonyms a while ago. But it would be good to have a general agreement on this. An advantage of linking them is that people will not add the Hello! sentence if they clearly see that the Hello sentence exists.
Thanuir
20 days ago
Looking into it a bit more, the Wikipedia page is useful, as it often is: https://en.wikipedia.org/wiki/Exclamation_mark .

There are languages without an exclamation mark (old European languages, presumably many non-European languages unless they all have adopted it, which I doubt) and it is used in different ways in different languages. Communicating these differences is only possible if the sentences are not made synonyms unless the meanings do overlap.

As such, it seems a good idea to not link for example "Hey." and "Hey!", because if they are not linked, it is meaningful to link a word in another language to both of them, if that language for example does not use the exclamation mark at all or uses it a lot or uses it very rarely, when compared to English. These distinctions are lost if all sentences are linked irrespective of punctuation.
Objectivesea
20 days ago - 20 days ago
I am employed as a professional editor at the British Columbia Legislative Assembly. Our house style guide takes account of the current regrettable tendency among some people to overuse exclamation marks. Our style guide mandates that in nearly all cases the exclamation marks be changed to periods, reserving exclamation marks for truly unusual occasions.

I have sometimes received emails in which EVERY sentence, even the most mundane, ended with an exclamation mark; other writers adopt bizarre and idiosyncratic conventions like ending all sentences with an ellipsis, as if they were unable to complete a rational thought.

A Member of the Legislative Assembly might seem, in his or her rhetorical excitement, to be saying: "Imagine that!" or "Shameful!" or "That Minister should have known better!" — even using the sort of intonation in speaking that might make it seem that an exclamation mark is somehow warranted. Regardless, our parliamentary reports style these sentences as: "Imagine that." or "Shameful." or "That Minister should have known better."

People who are interested could take a look at one of our Hansard (verbatim record of debates) issues, accessible from this menu, and just do a search for the "!" character:

https://www.leg.bc.ca/documents...nt/3rd-session

It would be a very, very rare transcript that has even a single exclamation mark.

My recommendation would be to simply change all Tatoeba sentences ending in an exclamation mark to end with a period instead. Then let the automatic duplicate removal tool purge the unneeded duplicates.
Thanuir
20 days ago
I disagree with replacing all exclamation marks with full stops, as the exclamation mark is part of many languages and communicates meaning. There are valid sentences with exclamation marks.

I agree that it is rarely useful in formal writing (factorials aside), but that does not seem relevant for Tatoeba.
CK
20 days ago - 20 days ago
> My recommendation would be to simply change all Tatoeba sentences ending in an exclamation mark to end with a period instead.

I don't agree with this. However, I do agree that most English sentences with exclamation marks might be just as good or better without the exclamation marks. Non-native English speakers seem to overuse them here, so I assume that exclamation marks are more commonly used in some other languages.

I think some languages use exclamation marks naturally for imperative sentences, and maybe other uses.

Even in English, we more often than not see certain sentences with exclamation marks.

[#1557994] How annoying! (melospawn) *audio*
[#1915781] How arrogant! (donkirkby)
[#1913083] How beautiful! (CK) *audio*
[#729024] How romantic! (Shazback)
[#2173907] What a bad movie! (Hybrid)
[#1460433] What a beautiful city! (piksea)
[#1108267] What a beautiful day! (nata23)
[#528258] What a big dog! (fanty) *audio*
[#7357078] What a bizarre idea! (Eccles17)
[#2824472] What a great party! (eirik174)

For more, see sentences tagged as "exclamative".
https://tatoeba.org/eng/Tags/sh...h_tag/3827/eng

Aiji
20 days ago - 20 days ago
I'm not sure that I completely understood what sentences you're linking together so I wouldn't say that I disagree but I think I might disagree^^
Languages that possess an exclamation mark, or whatever punctuation sign it may be, possess it for a reason. The fact that people don't know how to use them or overuse them is certainly sad, but « Mange. » and « Mange ! » are two completely different sentences.

Therefore, « linking them as synonyms » is, for me, a mistake. And I actually unlink this kind of pair when I see them. We wouldn't use the one instead of the other. And if we translate these two examples into Japanese, for example, they will give us two quite distinct sentences.

On a more global scale, I am against linking sentences of the same language together, except in the case of expressions/sayings/regionalisms/not dictionary-friendly. Doing so could lead users / learners of the said corpus to unfortunate mistakes, believing two words have the same meaning out of all context.
AlanF_US
19 days ago
> « Mange. » and « Mange ! » are two completely different sentences.

"Completely different" overstates the case. Obviously they're not identical. But they contain the same vocabulary, the same grammatical structure, the same word order, and so on. It's just the punctuation that's different. A language learner who reads « Mange. » will know that they can produce the sentence « Mange ! » through a trivial transformation. So in the context of language learning (probably the major purpose for which Tatoeba is used), the two sentences have a very strong connection.

> And I actually unlink this kind of pair when I see them.

This makes me uncomfortable, especially if you do not know the reasons they might have been linked in the first place. If your contribution to a group project is simply to undo someone else's contribution, and the change is not based on a firmly settled principle, someone's time is being wasted.

Linking two sentences, whether in the same language or in different languages, does not imply that the sentences are freely interchangeable in all situations. It means that the linked sentences are interchangeable in at least ONE situation. Sentences that differ in their use of a synonym ("That house is big" vs. "That house is large"), or in order of clauses ("In the summer, we go to the beach" vs. "We go to the beach in the summer") are interchangeable in many situations, which means it's perfectly justifiable to link them. Similarly, in many situations, the use of an exclamation point versus a period is a matter of taste, meaning that these sentences are indeed interchangeable.

Tatoeba is not designed to be a dictionary, much less the kind of dictionary where every place where multiple alternatives are offered must be accompanied by usage notes. I believe that our users understand that a link between sentences does not mean that we guarantee that they are identical, or that we are obligated to warn them how they might differ in emphasis or formality.

Furthermore, even if a sentence in Japanese could be linked only to "Eat!" and not to "Eat.", or vice versa, there's no problem if someone links "Eat!" with "Eat." It's a basic principle here that the fact that a sentence can be linked to another sentence does not imply that it can necessarily be linked to every sentence to which the second sentence is linked. That's the reason we treat direct and indirect links differently in the user interface.
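
To illustrate the distinction, here is a minimal sketch of direct versus indirect links as a small graph. The class is invented for this example and "JPN-1" is a placeholder for some Japanese sentence, not a real one; it only assumes that an indirect link means "linked to a sentence that is linked to this one".

```python
class LinkGraph:
    """Sentences are nodes; a link is an undirected edge between two of them."""

    def __init__(self):
        self._links = {}

    def link(self, a, b):
        self._links.setdefault(a, set()).add(b)
        self._links.setdefault(b, set()).add(a)

    def direct(self, sentence):
        return set(self._links.get(sentence, ()))

    def indirect(self, sentence):
        # Sentences two steps away that are not themselves directly linked.
        direct = self.direct(sentence)
        reachable = set()
        for neighbour in direct:
            reachable |= self._links.get(neighbour, set())
        return reachable - direct - {sentence}


graph = LinkGraph()
graph.link("Eat.", "Eat!")
graph.link("Eat!", "JPN-1")  # placeholder for a Japanese translation of "Eat!"
print(graph.direct("Eat."))    # {'Eat!'}
print(graph.indirect("Eat."))  # {'JPN-1'}
```

Linking "Eat!" with "Eat." never turns "JPN-1" into a direct link of "Eat."; it only makes it visible one step further away.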

One purpose served by linking "Eat!" with "Eat." is that "Eat!" may be translated into language X while "Eat." is not, and users looking at "Eat." are better served by being able to see the translation of "Eat!", even if it's marked as an indirect link. They can use their judgment to determine whether that translation (with or without a transformation) will suit their needs.
Thanuir
18 days ago
Personally, I link sentences if and only if their meanings overlap in at least one situation where I cannot find a strictly better translation, or where they are close enough and a strictly better translation is not yet in Tatoeba.

For sentences in the same language, this is usually not the case, as different words or punctuation marks suggest or emphasize different things, most of the time.

So "Hei." and "Hei!" are not good translations of each other, since one is more excited than the other. The sentence itself is its best translation; no need to link them.
On the other hand, the words "ekvivalentti" and "yhtäpitävä" have precisely the same meaning in mathematical Finnish, so in any such sentence, one can be replaced by the other while conserving the meaning. So such sentences could be made synonyms. (One would still be communicating a little bit about one's usage of foreign versus more Finnish words, which is why I do not make them synonyms myself, but I would not object to someone else doing it or remove the links. Communicating that difference is not a big deal in modern Finland.)

Examples of why a single situation where two sentences have the same meaning is not sufficient grounds for linking them:
Context: A dead animal was found in a region where the lion is the only great cat around. "It was killed by a lion." and "It was killed by a great cat." would be essentially the same in that context, yet they should not be made synonyms.
Context: Mathematics research within function theory (i.e. complex analysis). "But wait, the function is differentiable." and "But wait, the function is analytic." have precisely the same meaning, since differentiable complex functions are also analytic. But making these sentences synonyms would be a mistake, since in many other contexts they have a crucially different meaning.

Likewise, if you are excited about meeting someone, you might say "God morgen!", and if you are feeling less energetic, maybe you would say "God morgen.". There certainly exists a situation between those two where they are interchangeable. But still, most of the time, the tone of the greeting makes a difference for the meaning and the exclamation mark suggests the tone.
AlanF_US
18 days ago
I agree that where two sentences are equivalent only because there is a situational context that eliminates other possibilities, it doesn't make sense to link them. But I still think that where they are equivalent except for the presence or lack of an exclamation point, it does make sense to link them. In that case, as I mentioned, the vocabulary and grammar are identical, and producing one sentence from another is a trivial matter of adding or removing an exclamation mark, so there's value in making sure that people who see one see the other as well. Note that I'm not trying to convince you to link such sentences yourself. But as I said, I'm not comfortable with someone actively unlinking them because there doesn't seem to be a widely held opinion that that should be done.
Thanuir
18 days ago
To me, this sounds like an ad hoc decision based on the feature of many (maybe all?) European languages that adding the exclamation mark is a trivial thing. I am not sure this is true of every language.

For Finnish, when adding a question mark to a statement, one often does more: "True. True?" could be "Totta. Tottako?" (not good sentences, but I hope you get the point). This might be true of structures related to the exclamation mark in some languages.

This reminds me of the issue of using standardized names of people or cities. Names are inflected (I hope this is the right word; taivuttaa) in Finnish, and especially for foreign names this is non-trivial, so using standardized names would be a loss.

I am generally against almost all links within a language, for the reasons mentioned earlier; I understand I might be in the minority here, so I do not remove those links, most of the time. But when posing the question I was more interested in links between languages, and I still am.

Different languages use the full stop and the exclamation mark in slightly different contexts, which is not communicated at all if sentences with and without exclamation marks are linked indiscriminately. So I suggest not linking phrases across languages where one has a full stop and the other has an exclamation mark without being familiar with the use of the symbols in both. One might have broader use of a symbol than the other, for example.
AlanF_US
18 days ago
That sounds reasonable.
Aiji
17 days ago - 17 days ago
Your reasoning seems biased to me, on at least two points: European-centered, and purpose-centered (Tatoeba is used to learn a language).

>But they contain the same vocabulary, the same grammatical structure, the same word order, and so on.
It is funny how you do not say "the same meaning", although it is my mistake for saying "different sentences" and not "different meaning". However, since we're debating in English, you'll surely excuse me for that mistake.
And now, if you can tell me contexts where "Mange." and "Mange !" are the same, in terms of meaning, and not of "trivial transformation", I'll be happy to hear some of them. But then I guess "Tu vas au restaurant." and "Tu vas au restaurant ?" are also similar sentences that can be linked. And I shall be doomed.

Long story short: don't tell me how to maintain our French corpus, French natives are here to debate on that. :)
For most of the other points, Thanuir has explained them probably better than I could have.
AlanF_US
17 days ago
If you have a policy of unlinking sentences within the French corpus that differ only in the use of an exclamation mark versus a period, then yes, that's your prerogative. I thought that the French was being offered as an example, but that you were talking about unlinking such pairs even in English, where we don't have such a policy. I still think that many people can profit from seeing both elements of such a pair even in French, but they can be found in other ways, such as via search.
Aiji
17 days ago
Then let me apologize for not being precise enough. I was talking only about this specific case.
As another example, I would not unlink "Qu'est-ce que tu fais ?" and "Que fais-tu ?" if they were linked. Personally, I don't find it necessary, but such a link is not harmful and "may" be helpful to somebody.
AlanF_US
16 days ago
I was not sufficiently precise, either.
DostKaplan
17 days ago - 17 days ago
English --> Turkish

Searched for: I hope I can find

Results contained Arabic sentences (!!) as well as other variations in English. I am not interested in results in English. It didn't use to work like this before.

eng
I hope I can find a job in Boston.
ara
آمل بأن أجد وظيفة في بوستن.
eng
I hope that I can find a job in Boston.
tur
Umarım Boston'da bir iş bulabilirim.
CK
17 days ago - 17 days ago
I tried both of these and didn't get anything but Turkish translations in the first 100 results.

dog|cat
https://tatoeba.org/eng/sentenc...rom=eng&to=tur

"What's your favorite"
https://tatoeba.org/eng/sentenc...rom=eng&to=tur

What search did you try that got Arabic, too?
Could you paste in the URL?

DostKaplan
17 days ago
@CK, like I said, I searched for: I hope I can find
DostKaplan
17 days ago
See screenshot:

https://ibb.co/cJDVpPW
Aiji
17 days ago
I got the same six sentences with only their Turkish translations when doing the same search. As CK mentioned, could you give us the complete URL of your results page?
DostKaplan
16 days ago
This is the URL of my results page:

https://tatoeba.org/eng/sentenc...rom=eng&to=tur

I am using Chrome browser on an iPhone 6.