Tips

Here you can ask general questions like how to use Tatoeba, report bugs or strange behavior, or simply socialize with the rest of the community.

Before asking a question, make sure to read the FAQ.

Wall (3803 threads)

pchamorro
5 hours ago
For new contributors, it would be a good idea to include a link for this page: http://blog.tatoeba.org/2010/02...n-tatoeba.html
Impersonator
21 days ago - edited 3 hours ago
*** Limited language list is confusing for new users ***

Currently, a user can only choose from the languages they have selected in their profile. This means that if they made a mistake when choosing a language in their profile, they will carry this mistake over to all the sentences they add.

Here’s an example of a user who chose Cherokee instead of Chinese in their profile, and then added Chinese sentences tagged as Cherokee:
http://tatoeba.org/eng/user/profile/mng

If they had a full list of languages, they probably would have noticed they’re adding something wrong.

I have no idea how to deal with this. :/
gillux
20 days ago
I think we should restore the complete list of languages and put the profile languages at the top, the same way personal lists appear at the top when you click on the “Add to list” icon of a sentence.
sharptoothed
20 days ago
I like the idea.
123xyz
12 hours ago
I see that it's been implemented now.
gillux
8 hours ago
No, it has been implemented everywhere but on the “add sentence” and “add translation” forms.
TRANG
20 hours ago
I personally don't think the problem is due to the restriction of the list but due to the fact that the rules and the form are not very clear for new users.

The form is missing a label and instructions to explain to new users that if a language is not in the list, they should add it to their profile so that they can select the correct language for the sentences they add.

This wasn't much of a problem because there was the auto-detect option. But now, users who have only 1 language in their profile don't have the auto-detect option.

I can bet that if we had the full list of languages, without the auto-detect option, a lot of languages would be wrongly added in Abkhaz, or whatever is the first language in the list.
gillux
7 hours ago
> I personally don't think the problem is due to the restriction of the list but due to the fact that the rules and the form are not very clear for new users.

I think it’s both. If you look at cindycute’s profile, there is only English, although he or she is a native Chinese speaker. When people are learning a language and are asked to add the languages they are interested in to their profile, it’s normal that they don’t add their native language. Who would know? Even when you’re adding a new language to your profile because the “add sentence” or “add translation” form told you to do so, it’s not clear that we mean your native language rather than the language you want to add sentences in or translate into.

We could solve this problem by having the following process:
• Bring back the full list of languages, with autodetection selected by default.
• Autodetect the language *before* submitting the sentence, so that the user is aware of what language he or she’s about to submit.
• If the selected language isn’t in the user’s profile, display an error message “Sounds like you’re trying to add a sentence in X, but you didn’t add this language to your profile.” And if X was autodetected: “If the sentence was mistakenly detected as X, please select the correct language.”
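The process proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `check_language`, the `AUTO` marker and the `detect` callback are hypothetical names, not Tatoeba's actual code, and the language detector itself is left to the caller.

```python
AUTO = "auto"  # hypothetical marker for the "autodetect" entry of the language list


def check_language(text, selected, profile_langs, detect):
    """Resolve the language before submission and return (language, warning).

    `detect` is any callable mapping a sentence to a language code; it stands
    in for whatever autodetection the site uses (an assumption here).
    """
    autodetected = selected == AUTO
    lang = detect(text) if autodetected else selected
    if lang in profile_langs:
        return lang, None  # nothing to warn about
    warning = (
        f"Sounds like you’re trying to add a sentence in {lang}, "
        "but you didn’t add this language to your profile."
    )
    if autodetected:
        warning += (
            f" If the sentence was mistakenly detected as {lang}, "
            "please select the correct language."
        )
    return lang, warning


# Example with a fake detector that always answers "cmn".
print(check_language("你好", AUTO, {"eng"}, lambda t: "cmn"))
```

The point of the sketch is that the check runs before the sentence is saved, so the user sees the resolved language and can correct it.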
123xyz
19 hours ago - edited 19 hours ago
I see that Horus (the duplicate-merging bot) runs all the time now. As soon as I post a duplicate sentence, it's merged with the appropriate pair. Is this something new or has it always been this way? I was under the impression that Horus is manually activated once a month or so (whenever a powerful user such as an administrator feels that there is a need for that). Anyhow, I think that this immediate merging is excellent.
TRANG
14 hours ago
It's not really new. It's been like this for a month now.

Cf. https://github.com/Tatoeba/tatoeba2/issues/665
123xyz
12 hours ago
All right, still rather new. Thank you for the reply.
pchamorro
yesterday - edited yesterday
I wish there were an option to tag translated phrases as "wrong" or "not the main meaning" or "there are other uses more common for this phrase" etc., or simply Thumbs up or Thumbs down, or "check needed". I mean something easy to apply, or at least instructions on what to do in case someone finds weird or wrong translations. I would like to have something easy to use that simply works for potential contributors. Thank you.
123xyz
yesterday
I support the idea with the thumbs up/thumbs down; the rest is already feasible with the existing tags, as far as I'm concerned.

123xyz
2 days ago
Is it compatible with Tatoeba's goals and ideology to post incorrect words and/or grammatical structures that people use all the time in everyday life and then mark them as "common error", or would that be misleading and potentially destructive? I myself think that letting users see what an error looks like is almost as useful as showing them something correct, but I can easily imagine someone reading a sentence containing a common error, not bothering to open it to read the tags/comments, and inadvertently "learning" something wrong from it.
gillux
2 days ago
Yes, I don’t think such sentences go against Tatoeba’s goals. The guidelines say “We want sentences that a native speaker would actually use” ¹, so if an error is common among native speakers, you can put it on Tatoeba. I think we rather use tags like “casual” however.

1. http://en.wiki.tatoeba.org/arti...s,-not-word-fo

> I can easily imagine someone reading a sentence containing a common error, not bothering to open it to read the tags/comments, and inadvertently "learning" something wrong from it.

That’s what you get from learning sentences out of context. And this is true for correct sentences too. If you learn a grammatically correct sentence without knowing it’s exclusively used in, let’s say business context, you’re learning something wrong by thinking it could be used casually.
CK
2 days ago - edited 2 days ago
I'd suggest not contributing such sentences unless you actually use them yourself.

Many of the projects that use the data from tatoeba.org, only use the sentences and don't use the tags and comments. (http://bit.ly/tatoebalinks)

I'd hate to be learning Macedonian, using one of your sentences, thinking that I was learning something that was natural, if it wasn't.
123xyz
2 days ago - edited 2 days ago
>I think we rather use tags like “casual” however.

I think there's a notable difference between casual and wrong - it is possible for something to be casual but still be in accordance with a language's official rules. For example, "what's up" in English is casual, but it doesn't contain any mistakes.

>That’s what you get from learning sentences out of context. And this is true for correct sentences too. If you learn a grammatically correct sentence without knowing it’s exclusively used in, let’s say business context, you’re learning something wrong by thinking it could be used casually.

I definitely agree with this. As I'm translating some general English sentences, I am continually aware of how some of my translations only reflect single meanings of the original sentences, such that it would be easy for a hypothetical amateur learner to use them in the wrong context. I try to post multiple translations where possible, but whereas that helps a Macedonian speaker learning English, it doesn't necessarily help an English speaker learning Macedonian. However, analysing multiple sentences comparatively should allow one to deduce what's to be used in what context, I suppose.



Guybrush88
2 days ago - edited 2 days ago
> Many of the projects that use the data from tatoeba.org, only use the sentences and don't use the tags and comments. (http://bit.ly/tatoebalinks)

> I'd hate to be learning Macedonian, using one of your sentences, thinking that I was learning something that was natural, if it wasn't.

Personally I think this is pretty unrelated to Tatoeba itself. I don't think it's Tatoeba users' fault if other websites decide to use just the sentences and not tags and comments when using Tatoeba sentences for their own projects. If I add a sentence to Tatoeba and I tag or comment it in a certain way, I don't think it's my fault if another project uses my sentence and omits some information I gave on the original sentence.
123xyz
2 days ago
>I'd suggest not contributing such sentences unless you actually use them yourself.

I don't want to contribute such sentences either - I would find it annoying. I was just curious as to what other members think. However, about the "unless you actually use them yourself" part, I don't see how that's important - I might consciously use something wrong myself, just because it's mainstream.

>I'd hate to be learning Macedonian, using one of your sentences, thinking that I was learning something that was natural, if it wasn't.

There would be no problem with something being natural or unnatural - I was specifically referring to very common errors, which sound natural to native speakers, but which violate standard Macedonian's rules nonetheless. I would never post something random, sounding like "I were into the bus", and mark it as wrong, just so as to illustrate one of infinite possible incorrect ways to say "I was on the bus".
al_ex_an_der
2 days ago
> I myself think that letting users see what an error looks like is almost as useful as showing them something correct

As far as tatoeba is concerned, its aim is definitely NOT to register errors, but on the contrary to display examples showing what is considered correct use in a given language.

To register, analyse and comment common errors is certainly useful too. But tatoeba would be the wrongest place for that.
CK
yesterday - edited yesterday
I agree with this.

However, if 123xyz is referring to things wrong according to "prescriptive grammar," but correct according to "descriptive grammar," then I see nothing wrong with it, since such sentences are actually used.

Here's a good 23-minute video talking about "Prescriptive vs. Descriptive Grammar" as it applies to American English.
https://www.youtube.com/watch?v=ugf_yCelFE4
Ooneykcall
yesterday
I thought we aim to provide "natural" sentences, don't we?
gillux
4 days ago - edited 5 hours ago
** Advanced Search **

I’ve been working on implementing an advanced search feature. You can test it here: https://dev.tatoeba.org/sentences/search

Now, unapproved and orphan sentences are no longer put at the end of the search results. Instead, they are filtered out by default when performing a search from the top bar. One can later make them appear in the results by changing the appropriate criterion.

Feel free to comment on anything. Note that searches are performed on a copy of tatoeba.org’s database from February 2nd, so you won’t find sentences added after this date. But you can add new sentences just for the sake of searching. They should be visible within ten minutes.

I’d like you to test not only all the criteria, but also searching for newly-added or newly-modified sentences. I changed the way new or modified sentences get indexed, so this part may contain bugs too. In short, every sentence you add or modify should become visible (or disappear) in a search within ten minutes, whether because of a modification of its contents, tags, ownership, audio, links, or anything else that can be searched.

Previous thread: https://tatoeba.org/wall/show_message/22852
CK
4 days ago - edited 4 days ago
Perhaps the "advanced search" boxes could be put in the right column of the page. It would look nicer, I think, and the search results would display better.

I tried this search and it worked well.

https://dev.tatoeba.org/eng/sen...ans_has_audio=

With the word: Tom
Language: English
With audio
Not translated into Japanese
Tagged: SVC


I tried this one, too. It also worked as expected.

https://dev.tatoeba.org/eng/sen...ans_has_audio=

Words: でしょう
Language: Japanese
Show: English
Owner: tommy_san

Limit to:
Language: English
Link: direct
Owner: CK


This is an interesting possibility, too.
It's an example of what a Japanese native speaker could use if he wanted to find English sentences already translated into German, and easily see which of them haven't yet been translated into Japanese, even though the search doesn't automatically eliminate those.

https://dev.tatoeba.org/eng/sen...ans_has_audio=

Words: house
Language: English
Show translations in: Japanese
Has audio: Yes

Limit to
Language: German
Link: Direct
gillux
3 days ago
I tried to put the search fields on the right, although it feels a bit packed to me, it requires quite some scrolling to get to the submit button, and it uglifies the page when there are no results. Tell me what you think.
AlanF_US
2 days ago - edited 2 days ago
I prefer the original layout, for the reasons you say. (Note that it requires scrolling not only to hit the submit button, but also to access many fields that I would expect would be frequently changed.) However, if you do put it on the right, I would hope that you would at least keep the link to an advanced search, since, as it is now, there's no clue that we support all these additional criteria.
CK
2 days ago - edited 2 days ago
Of course, you could add an additional "submit" button at the top, too, which would help.

See an example:
http://goo.gl/kBy2Ce

The problem with the original layout was that you had to scroll before you could see the search results. I guess it just depends on which scrolling will be more irritating -- scrolling to set the search criteria, or scrolling to see the results on each page.
CK
2 days ago
It might also be nice to have a single page "advanced search" form, with a link to it from one of the drop-down menus at the top of the page.

This would allow visitors to start with an advanced search if they wanted to, rather than doing a "throw away" search, just to get to the advanced search form.

You could also include a bit more information on that page if further information were needed.
AlanF_US
2 days ago
Yes. Also, would it be possible to open up a new window (or tab -- I guess this would depend on how a user's browser is configured) for search results? I've wanted this before, but with the advanced search feature, it makes even more sense. Then one could keep open the search criteria window/tab for future searches even after one has closed the results window/tab. In that case, the layout that would require less scrolling would be the original "horizontal" one rather than the right-sidebar "vertical" one.
AlanF_US
3 days ago
There's a typo: "Oprhan sentences are likely to be incorrect."
gillux
3 days ago
Where?
CK
3 days ago
Oprhan => Orphan
AlanF_US
3 days ago
When you type right-to-left text into the "Words" field, it's left-justified rather than right-justified. This differs from the behavior of, say, the "Example sentences with the words:" field in the main search area.
gillux
3 days ago
Thank you, this should be fixed now.
AlanF_US
2 days ago
Yes, it works fine now.
AlanF_US
3 days ago - edited 3 days ago
It would be nice to have a way to "gray out" the whole "Translations" box for those times when you don't want it to affect your search.

It would also be nice to have a "Randomize" feature that displays the sentences in random order so that you don't end up seeing the same sentences over and over again, especially if it produces a search that you want to bookmark and use repeatedly. I did this in a bookmarklet that I wrote ( http://en.wiki.tatoeba.org/arti...ated-sentences ).

In general, the new feature works very well.
gillux
3 days ago
What do you mean by graying out? Do you mean a reset button for that part only? Or do you mean to gray it out automatically when its fields are all set to their default values?
AlanF_US
2 days ago
I guess a reset button is what I want. Better not to gray it out, since that might make people think that nothing in that area could be selected.
CK
3 days ago - edited 3 days ago
Rather than putting a randomize function just for the advanced search, I'd suggest adding a "rnd" link to the pagination code, so it's possible to jump to a randomly-selected page for any series of pages with that code.

For example, ...

Browse English sentences with audio
http://tatoeba.org/eng/sentence...nly-with-audio

1. Get the highest page number from the last link in the pagination code (>>).
2. Generate the random number from that.
3. Put that number in the "RND" link. (Perhaps this could be located just to the left of > >> , but maybe another location would be better.)
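The three steps above amount to very little code. Here is a minimal sketch, assuming the highest page number has already been read from the pagination links; `random_page` is a hypothetical helper, not part of Tatoeba's code base.

```python
import random


def random_page(last_page, current_page=None, rng=random):
    """Pick a page number in 1..last_page, avoiding the current page if possible."""
    if last_page < 1:
        raise ValueError("last_page must be at least 1")
    candidates = [p for p in range(1, last_page + 1) if p != current_page]
    if not candidates:  # there is only one page, so stay on it
        return last_page
    return rng.choice(candidates)


# Example: a list with 50 pages, viewed from page 3.
print(random_page(50, current_page=3))
```

Excluding the current page is a design choice added for the sketch; the original suggestion only asks for a random number between 1 and the last page.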


Here's an example of something I put together to jump to a randomly-chosen page on my list of proofread English sentences.

http://bit.ly/randomenglish

By having a "random" link as part of the pagination code, this type of thing would be possible for any list. You could even jump to a randomly-chosen Wall page, or page of comments.
gillux
3 days ago
Since adding a “random page” link on every pagination is rather more difficult to do, I only added a random sort order function to the advanced search.
gillux
3 days ago
It’s now possible to randomize the results by selecting the appropriate option of the “Sort” field.
AlanF_US
2 days ago
I love it, including the other sorting options. Thanks!

I suggest these rewordings:

"Results sort" -> "Order" (because the box caption already says "Sort")
"Less number of words should be first" -> "Shortest first" or "Fewest words first"
"Last created first" -> "Most recently created first" (if it fits, otherwise keep what you have)
"Last modified first" -> "Most recently modified first" (if it fits, otherwise keep what you have)
AlanF_US
2 days ago
Also, a rewording for the "More search criteria" group box:

"Owned by a self-proclamed native" -> "Owned by a self-identified native" or "Owned by a self-identified native speaker"
al_ex_an_der
2 days ago - edited 2 days ago
And don't forget this one:
"Owned by a self proclaimed hyper-genius allegedly being fluent at a near-native level in fifty languages"
;-)
CK
2 days ago - edited 2 days ago
Here is a list of members claiming more than one native language if you're interested.

http://goo.gl/kqE19W

Here is a list of those who only claim one native language.

http://goo.gl/7xjY31
CK
2 days ago - edited 2 days ago
The "owned by native speaker" function doesn't work yet, though.

Note that Sharptoothed's sentences show in this search, even though his profile doesn't say he's a native English speaker.

https://dev.tatoeba.org/eng/sen...amp;sort=words
tommy_san
2 days ago
This is strange indeed.
https://dev.tatoeba.org/jpn/sen...amp;sort=words

I don't mind seeing sentences owned by non-native speakers though, as long as they are tagged OK by native speakers. I'd find the criterion "owned or tagged OK by self-proclaimed native speakers" more useful.
CK
2 days ago - edited 2 days ago
>"owned or tagged OK by self-proclaimed native speakers"

This would be possible already.

1. Click the "native speakers" option.
2. Add the tag OK.

This would work assuming that each advanced contributor is following the rules and only adding the OK tag to sentences in their own native language. If we can trust members to be honest about their native languages, we should be able to trust them to follow the rules. (At least, we can trust many of our members.)
tommy_san
2 days ago
You're right, but I still think that the mother tongue of a sentence has nothing to do with its quality. Are you implying that sentences that are in your OK list and owned by non-native speakers are worse in quality than those owned by native speakers?

It'd also be nice if we could search for/among the sentences owned or tagged OK by one or more particular members.
gillux
2 days ago
That’s because dev.tatoeba.org’s database is so old that it doesn’t contain many natives. There are only the people who set themselves as natives on dev.tatoeba.org when Trang introduced this feature. See https://dev.tatoeba.org/stats/native_speakers
CK
2 days ago - edited 2 days ago
1. Another idea to consider would be to allow the option of an alphabetical sort if it doesn't put too much of a load on the server.

This would, for English, often group similar sentence patterns together.

I plan to go.
I plan to swim.

Tom loves fishing.
Tom loves windsurfing.

2. Also, another idea to consider would be to allow the option to sort the sentences by how they end, if this doesn't put too much load on the server.

This would, for Japanese, often group similar sentence patterns together.

あなたはどのくらいのお金が必要なのですか。
ここなら安全なのですか。
あなたは私に何を尋ねたいのですか。

外国語の勉強を始めるのに遅すぎるということはありません。
アラビア語は難しい言語ではありません。
影響はいつも同じではありません。

(You can see such a list of 2,792 sentences here. https://tatoeba.org/sentences_lists/show/4383.)

3. This one may put too much load on the server, but ...

Allow the option to display sentences with audio first, followed by sentences without audio, like the following.

[#2643713] See you in the morning, Tom. *audio*
[#3330300] I'll see Tom in the morning. *audio*
[#2541432] I'll talk to Tom in the morning. *audio*
[#3415139] Tom works in the morning.
[#3475104] Tom left early in the morning.
[#3822250] You can see Tom in the morning.
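The three sort orders suggested above can be illustrated offline on a small in-memory list. This is only a sketch: the function names and the `has_audio` field are hypothetical, and it says nothing about how a search index on the server would implement or afford these orderings.

```python
def alphabetical(texts):
    """Idea 1: plain alphabetical order, ignoring case."""
    return sorted(texts, key=str.casefold)


def by_ending(texts):
    """Idea 2: order by how sentences end, by comparing the reversed text."""
    return sorted(texts, key=lambda t: t[::-1])


def audio_first(sentences):
    """Idea 3: sentences with audio first; the stable sort keeps the rest of the order."""
    return sorted(sentences, key=lambda s: not s["has_audio"])


english = ["Tom loves windsurfing.", "I plan to swim.", "Tom loves fishing.", "I plan to go."]
japanese = [
    "アラビア語は難しい言語ではありません。",
    "ここなら安全なのですか。",
    "影響はいつも同じではありません。",
    "あなたは私に何を尋ねたいのですか。",
]
with_audio = [
    {"id": 3415139, "text": "Tom works in the morning.", "has_audio": False},
    {"id": 2643713, "text": "See you in the morning, Tom.", "has_audio": True},
    {"id": 3475104, "text": "Tom left early in the morning.", "has_audio": False},
    {"id": 3330300, "text": "I'll see Tom in the morning.", "has_audio": True},
]

print(alphabetical(english))
print(by_ending(japanese))
print([s["id"] for s in audio_first(with_audio)])
```

Reversing the text is the simplest way to group common endings; it compares code points, so it is a rough grouping rather than a linguistically aware one.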

gillux
2 days ago - edited 2 days ago
> 1. Another idea to consider would be to allow the option of an alphabetical sort if it doesn't put too much of a load on the server.
> 2. Also, another idea to consider would be to allow the option to sort the sentences by how they end, if this doesn't put too much load on the server.

I think this would be too resource-hungry as we’d need to index all the sentence texts.

> 3. This one may put too much load on the server, but ...

This won’t put any load and is easy to do.
CK
3 days ago - edited 3 days ago
I don't know if it's possible, but here's an idea to consider.

Allow some way to skip the first 1,000 results, so that once a person has browsed (and possibly translated) the first 1,000 results, they can search again, skipping the first 1,000 results, and continue browsing the results. (Maybe, also allow skipping 2,000, 3,000, etc.)

Here's an example search with over 7,000 results.

https://dev.tatoeba.org/eng/sen...ans_has_audio=

Words: and
Language: English
Tags: OK

Not directly translated into Japanese
gillux
3 days ago
No, it’s not possible to display more than the first 1000 results. Why would you want that by the way? Also, what if the search criteria are set so that sentences disappear from the search results once translated? This way, the contributor only needs to translate the first results.
CK
2 days ago - edited 2 days ago
>Why would you want that by the way?

This would allow a student who is studying certain things to find and read all sentences that meet certain criteria, rather than just the first 1,000.

However, if it's "not possible to display more than the first 1000 results", then it's not possible. Maybe in the future someone can figure out how to do it.

Serious students can always do such searches through the downloaded data. I just thought it might be useful and easier for people to do it online here at tatoeba.org.

The following is an example for the English word "need" with 4,968 sentences that have been pre-searched, limited to sentences on this list http://tatoeba.org/sentences_lists/show/907/und.

http://www.manythings.org/sente...ds/need/6.html

If such a search were possible on tatoeba.org, then people could get a similar experience with other languages, too.

TRANG
2 days ago
The dev website has been updated and now only includes the fixes/improvements that will be deployed in the next Tatoeba update (scheduled for Monday). If you have some time, please test these:

https://github.com/Tatoeba/tato...29+is%3Aclosed

For those who were testing the advanced search feature[1], note that this feature will not be testable anymore on the dev website for a couple of days. You will be able to keep testing it after Monday, when Tatoeba will be updated.

Thank you!

-----

[1] http://tatoeba.org/eng/wall/sho...#message_23234
lipao
2015-05-18 13:23 - edited 2015-05-18 13:25
Hello, I'll ask a silly question that must have been asked a thousand times before, but still: Is there any way to search for a part of some words, I mean, for an affix, or even for a single letter inside a word? Would it be possible somehow to teach the search query how to work with asterisks? (I'm computer illiterate, you know.)
gillux
2015-05-18 18:45
Hello lipao. No, it’s not possible the way you describe it, but you might want to have a look at this article: http://en.wiki.tatoeba.org/arti...w/text-search. Pretty much all you can do is described there.
lipao
2015-05-18 19:04
OK, thank you.
tornado
2015-05-18 20:09
I think what lipao is looking for is called a "wildcard search". It's especially useful to study root words and affixes. There are many search functions on that wiki page, but apparently using wildcards is not possible. Support for that in the future would be highly appreciated.
AlanF_US
2015-05-18 22:37
You can download the sentences in a given language and then search through the downloaded file. Naturally, that requires more effort, as well as knowledge of whichever program or programming language you use to process it, but it is an option in the absence of a wildcard search offered on the site itself.
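As a rough illustration of that approach, here is a minimal sketch of a wildcard search over a downloaded file. It assumes the export is tab-separated with three columns (sentence id, language code, text); check the actual download before relying on that. `wildcard_search` is a hypothetical helper, and the wildcards are the shell-style ones from Python's `fnmatch` module (`*`, `?`, `[...]`).

```python
import csv
import fnmatch
import io
import re


def wildcard_search(lines, lang, pattern):
    """Yield (id, text) for sentences in `lang` containing a word that matches `pattern`."""
    word_re = re.compile(fnmatch.translate(pattern), re.IGNORECASE)
    # QUOTE_NONE: quotation marks inside sentences are ordinary characters.
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 3 or row[1] != lang:
            continue  # skip malformed rows and other languages
        sentence_id, _, text = row
        if any(word_re.match(word) for word in re.findall(r"\w+", text)):
            yield sentence_id, text


# Tiny in-memory sample standing in for the downloaded file.
sample = io.StringIO(
    "1\teng\tTom loves fishing.\n"
    "2\teng\tI plan to go.\n"
    "3\tdeu\tIch gehe angeln.\n"
)
for sentence_id, text in wildcard_search(sample, "eng", "*ing"):
    print(sentence_id, text)
```

For a real file, pass an open file object instead of the `StringIO` sample, for example `open(path, encoding="utf-8", newline="")`.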
tornado
2015-05-18 23:22 - edited 2015-05-19 00:54
I didn't think about that. Working on a huge file is definitely much more difficult than having native support in Tatoeba would be, but it's better than nothing. Thank you.
tornado
2 days ago
It's supported now. Thanks to those who spent their time and energy to implement that feature.
AlanF_US
2 days ago
Yes, many thanks to gillux, who made this change.

You can't search for a single letter (because this would make the index huge), but you can search for a string of three or more letters.
tornado
2 days ago
Searching for a single letter is obviously an extreme case, but some may want to search for two letters, depending on language.

I have also managed to conduct some wildcard searches that consist of two letters (they have at least one non-English letter).

http://tatoeba.org/eng/sentence...tur&to=und
http://tatoeba.org/eng/sentence...tur&to=und

Anyway, allowing three letters would usually be sufficient.
AlanF_US
2 days ago
In the first search you mentioned, "uş" is represented internally as three characters: u + s + cedilla. In the second search, "tı" is represented internally as two characters, but the second one is two bytes long (in UTF-8). So maybe the criterion is that the string needs to be at least three bytes long.

Probably the summary, for people who are not interested in the technical details, is that searches for strings of three characters will work, while searches for strings of two characters may or may not, depending on what they are. Searches for strings of one character probably will not work.
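The character and byte counts discussed above are easy to check. The sketch below reports, for each string, the number of code points and UTF-8 bytes in both the composed (NFC) and decomposed (NFD) Unicode normalization forms. Which form the search index actually stores is not stated in this thread, so the output only shows what the counts would be in each case; `describe` is a hypothetical helper.

```python
import unicodedata


def describe(s):
    """Return {form: (code points, UTF-8 bytes)} for the NFC and NFD forms of `s`."""
    result = {}
    for form in ("NFC", "NFD"):
        normalized = unicodedata.normalize(form, s)
        result[form] = (len(normalized), len(normalized.encode("utf-8")))
    return result


print(describe("u\u015f"))  # "uş": u + s with cedilla
print(describe("t\u0131"))  # "tı": t + dotless i
```

In composed form both strings are two code points and three bytes; only "uş" grows when decomposed, because the cedilla becomes a separate combining character.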
danepo
2015-05-19 07:31
You can only search for whole words. You have to search like this:

mi|vi|ni|ili|li|ŝi|Tom|Mary|Tomo|Manjo|Maria havas|havis|havos|havu|havus hundon|hundojn|virhundon|virhundojn|hundinon|hundinojn

I think you can search for more than 200 letters.
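A query of that shape can be assembled from lists of alternatives instead of being typed by hand. This is a minimal sketch that only builds the string, using the `|` and space conventions visible in the example above; `build_query` is a hypothetical helper and makes no claim about any length limit on the site.

```python
def build_query(*groups):
    """Join alternatives with "|" inside each group and groups with spaces."""
    return " ".join("|".join(alternatives) for alternatives in groups)


query = build_query(
    ["mi", "vi", "ni"],
    ["havas", "havis", "havos"],
    ["hundon", "hundojn"],
)
print(query)
print(len(query))
```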

ecorralest101
3 days ago
I have a question. Among the languages supported by Tatoeba, I can see Bereber; however, it is not specified which dialect it is. Bereber is a group of languages: it can be Tamazight, Tachelhit or Kabyle, among others. Could someone clarify this? It would be more precise to know which dialect it corresponds to.
123xyz
3 days ago
It's mostly Tamazight, since Amastan, the user who built most of the Berber corpus, speaks Tamazight (Amazigh).
123xyz
3 days ago
In English, we say "Berber", and not "Bereber", as in Spanish.
ecorralest101
3 days ago
Thanks for your answer
Lebad
3 days ago
** Mass translating **

I'm either missing an obvious feature here, or there is no mode to mass translate all the sentences on a page. For example, I search for English sentences that have no translations in my native language, and I get a list of such sentences.

Then I have to click on the first sentence to translate, type a sentence, and hit Enter. Now if I want to translate a sentence again (to add alternatives), I have to pick up my mouse again! Translating seems to be the most important feature here and I fear there are not enough keyboard shortcuts that allow me to do it efficiently without wasting time on repetitive work.

There should be a shortcut, for example, SHIFT + ENTER that will send the sentence and re-open the dialogue for entering an alternative, and CTRL + ENTER that will send the sentence and open the dialogue for the next sentence. Pressing CTRL + ENTER on the blank entry should just skip to next sentence.

Is it already implemented, either in Tatoeba or as an add-on? If not, I think it would be a huge step toward making our time spent here more efficient.
ricardo14
3 days ago
+1
Guybrush88
3 days ago
maybe also this feature could be implemented to speed up the translation process: https://github.com/Tatoeba/tatoeba2/issues/334