Wall (7,249 threads)
The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.
I'm trying to search for English sentences that are questions or contain questions, i.e. sentences containing the question mark "?" character. I cannot figure out how to do it, and after reading the Advanced Search instruction page, it seems that this may not be possible:
https://en.wiki.tatoeba.org/art...ctuation-marks
This page actually says: "Most punctuation symbols cannot be found via a search." The only listed exceptions are $ and _, which can be escaped by a backslash.
Is it really true that this is impossible with the current search function? If so, I will probably open an issue on the GitHub repo for this, since being able to search for questions seems like a pretty fundamental piece of functionality.
In addition to that, I found that the Chinese period (。) and the Chinese enumeration comma (、) are each treated as a word: they add to the word count and can be searched. The Chinese question mark (?) and most other punctuation marks, however, are not treated as words and cannot be searched.
You could try these approaches to find sentences with a high probability of being English questions.
^who|^what|^where|^when|^why|^how
Sentences beginning with these question words
^Is|^Are|^Was|^Were|^Do|^Does|^Did|^Can|^Could|^Will|^Would
Sentences beginning with these verbs
Here is a link that will give you a random selection with both of the above combined.
Results are limited to sentences with audio and 3 or more words.
After trying this, you can edit the search criteria.
https://tatoeba.org/en/sentence...rd_count_min=3
I removed the &rand_seed, so clicking the link again will give different results.
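For anyone who wants to try the same heuristic offline (for example, on a downloaded list of sentences), here is a minimal sketch in Python. It is only an approximation: it uses Python's `re` module rather than Tatoeba's search engine, it assumes the patterns should match whole words case-insensitively, and the function name `looks_like_question` is made up for this example. Unlike the site search, an offline script can also check for a trailing "?" directly.

```python
import re

# Word lists copied from the two search patterns above. Matching them as
# whole words, case-insensitively, is an assumption made for this sketch.
STARTS_LIKE_QUESTION = re.compile(
    r"^(?:who|what|where|when|why|how"
    r"|is|are|was|were|do|does|did|can|could|will|would)\b",
    re.IGNORECASE,
)


def looks_like_question(sentence: str) -> bool:
    """Heuristic: ends with '?' or starts with a typical question word."""
    text = sentence.strip()
    return text.endswith("?") or bool(STARTS_LIKE_QUESTION.match(text))


if __name__ == "__main__":
    samples = [
        "Where is the station?",
        "You are coming, right?",
        "Dogs bark loudly.",
    ]
    for s in samples:
        print(looks_like_question(s), s)
```

Like the search patterns, this is a high-probability guess rather than a strict test: an exclamation such as "What a beautiful day!" would still be counted as a question.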
I think most of nonong's sentences should be deleted or rewritten. They're too long and when translating his sentences into other languages, they would usually go longer than the limit.
https://tatoeba.org/zh-cn/sentences/of_user/nonong
Agreed. Also, the length of the entries keeps users from attempting to translate them. Content-wise they're good sentences, mostly. Is there a way to somehow automatically split them up into single sentences?
It will not help. Firstly, it breaks the context. Maybe the length keeps users from translating them, but when sentences have more content, they are much more interesting for readers and language learners. Secondly, it will not change anything, because this limit is still not mentioned in any guidelines or site rules, and there is no character counter of any kind when we add sentences. So I don't even know when my text will be too long and will be cut off because of it. This problem will happen again and again. And even if we could change it, there will always be someone who wants to write a text using all the permitted characters. So I think that the number of permitted characters should be different for translations and original sentences (more for translations). To me that seems like the most correct solution.
> I think most of nonong's sentences should be deleted or rewritten. They're too long and when translating his sentences into other languages, they would usually go longer than the limit.
Sentences are deleted only when they break site guidelines, such as when they:
- attack community members
- break copyright or license rules
- are of poor quality and cannot easily be fixed
Sentences are generally rewritten on an individual basis, to correct errors that have been pointed out in comments. Breaking up a collection of sentences, either manually or automatically, is infeasible for a variety of reasons.
Long sentences create a bunch of problems. They tend to go uncorrected, so they often contain errors, subtle or otherwise. As their length increases, so does the probability that they will contain something that is difficult or impossible to translate, rendering the entire content untranslatable. And, as you have noticed, translations are likely to be long as well. Even if your translation does fit within the maximum number of characters, it's quite possible that someone who translates your sentence will not be able to do so within the limit.
There is a simple solution: don't translate long sentences.
Tatoeba is not only for translations. It is also useful to find examples of usage in a single language only.
If other contributors are adding long sentences, it probably means they are useful to them. Maybe they are not useful to you, but it doesn’t mean they shouldn’t be on Tatoeba. What should or should not be on Tatoeba is a totally different thing than your own needs.
If you don’t want to see long sentences, you can filter them out from the search by setting a maximum number of words.
While there is definitely a place for long sentences or walls of text, I think three concise sentences are much more useful (and the better option): one sentence containing the key word, and two sentences providing context or illustration.
E.g. for the word 'car': "He got into the car. The dashboard lights flickered for a moment. He exhaled slowly, deciding whether to drive away or go back inside."
Tatoeba was updated today. What’s new?
- New interface language: Chinese (Taiwan), using traditional characters, thanks to the tremendous translation efforts of @LeviHighway. Levi also helped identify problems with UI strings, which have been fixed; the fixes now benefit all UI languages.
- New features and minor tweaks to ease moderation work by admins. In particular, admins can now edit all the fields of a user's profile.
- User status changes now take effect immediately. For example, if you are promoted to advanced user, you won't have to log out and log back in to get the new status.
- The random "blackholed request" issues are hopefully mostly solved, but they may still occur, since they are difficult to identify.
- Attempting to edit a vocabulary item in a way that results in a duplicate now displays an error message, thanks to a contribution from @nmleuz.
Technical details: https://github.com/Tatoeba/tato...e/301?closed=1