Wall (6,006 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

morbrorper, September 12, 2020 at 7:09:11 AM UTC (edited September 12, 2020 at 7:13:33 AM UTC)

Most often, when I perform an exact search for a Japanese sentence I have found at Clozemaster, the search doesn't find anything. However, if I set "Is orphan" and "Is unapproved" to Any, the sentence will be found. I find it very irritating that I have to go through that extra step, especially if the first search is lengthy.

Try this one: https://tatoeba.org/eng/sentenc...rom=und&to=und

I can possibly understand that Tatoeba does not want to pollute the search results with dodgy sentences, but when there is an exact match I think an exception should be made.

Another issue is why there are so many Japanese sentences that seem to be mostly forgotten, having no owner and sometimes with very "imaginative" English translations. Might this partly be because they are hidden by the search interface?

CK, September 12, 2020 at 8:45:36 AM UTC

Bookmark this URL.
It's a pre-filled advanced search form with what you want.

https://tatoeba.org/eng/sentenc...&sort_reverse=

The pre-filled search form option is something that was added recently.

morbrorper, September 12, 2020 at 9:26:15 AM UTC

Thanks, that helps a bit, but not when I issue the search directly from Clozemaster.

AlanF_US, September 12, 2020 at 12:06:54 PM UTC (edited September 12, 2020 at 12:12:23 PM UTC)

I suggest posting a request at Clozemaster for them to change the parameters of the query issued to include at least orphans (unapproved might be more problematic) when searching for a full sentence. Searching for a single word is probably a different story, since you would probably get enough good matches without relaxing the criteria.

Many of our Japanese sentences come from the Tanaka Corpus, and we've had too few Japanese-speaking members to fix all the problems with it.

http://www.edrdg.org/wiki/index.php/Tanaka_Corpus

Cabo, September 18, 2020 at 5:18:44 AM UTC

Are you sure that all of the sentences originate from here?
Or maybe the sentence was rewritten here, but the information on Clozemaster hasn't been updated.
I also haven't been able to find one of its English sentence pairs so far (in the list of the 100 most used words).

morbrorper, September 18, 2020 at 7:00:48 AM UTC

Judging from my experience, I am quite sure. I don't think Clozemaster uses any other corpus than Tatoeba, and Clozemaster does not rewrite sentences, to my knowledge.

It seems that Clozemaster fetched all its Japanese sentences from Tatoeba some five to eight years ago, so any changes made after that are not reflected in its corpus. On the other hand, not a lot seems to have happened with these sentences here at Tatoeba in the meantime, anyway. But if you don't find a particular sentence, it might have been deleted.

sharptoothed, September 13, 2020 at 4:39:07 PM UTC

** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

Guybrush88, September 13, 2020 at 10:03:04 PM UTC

thanks

CK, September 14, 2020 at 11:48:53 PM UTC

51 usernames from sharptoothed's Tatoeba User Activity Chart from last week had both a tcnt/scnt over 1 and over 1,000 native language sentences.

aldar, alexmarcelo, arh, Balamax, bandeirante, bill, brauchinet, bunbuku, CH, CK, CM, CN, danepo, deniko, diegohn, dotheduyet1999, elenacristina260, Esperantostern, felix63, felvideki, gillux, GrizaLeono, Hybrid, lbdx, LeeSooHa, Luiaard, manese, marafon, MarijnKp, martinod, Micsmithel, morbrorper, mraz, Ninja, Nylez, Objectivesea, odexed, ondo, PaulP, Pfirsichbaeumchen, po_slovensky, Ricardo14, sacredceltic, Selena777, sharptoothed, shekitten, Shishir, Silja, small_snow, Tepan, Yorwba

Source URL: https://tatoeba.j-langtools.com...=1&chtype=alla

Thanuir, September 15, 2020 at 4:32:48 AM UTC

tcnt/scnt?

deniko, September 15, 2020 at 3:20:53 PM UTC (edited September 15, 2020 at 3:21:04 PM UTC)

tcnt/scnt is this guy:

https://i.imgur.com/KSwx03t.png

If it's 1.5, this means that 1000 of your sentences have 1500 translations directly linked to them (on average).

If it's <1 this means you add a lot of sentences without translations.
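
To make the arithmetic concrete, here is a minimal Python sketch of the ratio as described above. The function name `tcnt_scnt` and the input format (one direct-translation count per sentence) are assumptions made for illustration; this is not the code behind the chart.

```python
def tcnt_scnt(translation_counts):
    """Average number of direct translations per sentence.

    translation_counts: one integer per sentence owned by a member,
    giving how many translations are directly linked to that sentence.
    (Hypothetical input format, chosen only for this illustration.)
    """
    scnt = len(translation_counts)   # number of sentences
    tcnt = sum(translation_counts)   # number of directly linked translations
    return tcnt / scnt if scnt else 0.0


# 1000 sentences with 1500 direct translations in total
ratio = tcnt_scnt([2, 1] * 500)
print(ratio)
```

A member whose sentences mostly have no direct translations would get a value below 1 from the same function.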

Cabo, September 15, 2020 at 5:05:30 PM UTC

This number only tells you how many sentences are linked to your sentences.

"If it's <1 this means you add a lot of sentences without translations."
If I write only one sentence (and don't translate it), but 8 other contributors translate it, then my tcnt/scnt is 8.

Thanuir, September 15, 2020 at 6:35:04 PM UTC

So this is a measure of

1. how popular your language is (as a foreign language)
2. how convenient (easy, short, simple) your sentences are to translate
3. how many sentences you write as translations of others, rather than as original sentences.

I would say that it is not particularly virtuous to aim for a high number here.

AlanF_US, September 15, 2020 at 10:36:56 PM UTC

I agree with you.

soliloquist, September 15, 2020 at 8:15:59 PM UTC (edited September 16, 2020 at 11:29:12 AM UTC)

I don't quite understand how tcnt/scnt is determined.

The user below has 3 sentences all of which are linked to a translation, so I would expect their tcnt/scnt to be 1, but it is 0.3 on the chart.

https://tatoeba.org/eng/activit...ces_of/nnireyh


And this user's ratio is 0.8 on the chart, although all of their sentences are linked to a translation.

https://tatoeba.org/eng/activit..._of/0sigmaone0

What am I missing?

sharptoothed, September 16, 2020 at 5:23:36 AM UTC

It looks like the "Count only native contributions" checkbox is checked. Uncheck it and you'll see what you expected. I can't explain it right now, sorry. Maybe there's a bug in my scripts. I'll try to find out.

soliloquist, September 16, 2020 at 11:27:43 AM UTC

Thanks. I'm sorry to have bothered you.

The user glavsaltulo has 31,220 sentences (31,193 of them are in their native language, Lithuanian).

https://tatoeba.org/eng/activit...of/glavsaltulo

And only 4 of their sentences are untranslated.

https://tatoeba.org/eng/sentenc...&sort_reverse=

But their tcnt/scnt is 0.50 on the chart when 'Count only native contributions' is checked.

So there must be something that decreases this number other than untranslated and non-native sentences.

Unchecking that checkbox gives a more accurate number as you mentioned.

sharptoothed, September 16, 2020 at 7:54:29 PM UTC

When that checkbox is checked, only sentences created by natives are counted. That is, if a member has 3 sentences and all of them are translated once, but only one sentence belongs to a native, then the ratio will be 1/3.
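
Here is a small Python sketch of one possible reading of that rule: the translation count only includes sentences written in the member's native language, while the sentence count stays the same. The `Sentence` class, its fields, and the `native_ratio` function are hypothetical names introduced for this illustration, and the actual chart scripts may work differently.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    by_native: bool     # written in the member's native language (assumed flag)
    translations: int   # number of directly linked translations


def native_ratio(sentences, native_only=False):
    """tcnt/scnt where, with native_only set, only native sentences
    contribute translations. This is an assumed interpretation."""
    scnt = len(sentences)
    if scnt == 0:
        return 0.0
    tcnt = sum(
        s.translations
        for s in sentences
        if s.by_native or not native_only
    )
    return tcnt / scnt


# Three sentences, each translated once, only one of them native
member = [Sentence(True, 1), Sentence(False, 1), Sentence(False, 1)]
print(native_ratio(member))                    # checkbox unchecked
print(native_ratio(member, native_only=True))  # checkbox checked: one third
```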

soliloquist, September 16, 2020 at 9:06:52 PM UTC (edited September 18, 2020 at 10:38:33 PM UTC)

.

sharptoothed, September 17, 2020 at 4:39:17 PM UTC

"Count only native contributions" is unchecked by default and saved in cookies, as far as I remember. This checkbox is for those who need statistics for native contributors.

Ricardo14, September 15, 2020 at 4:12:21 PM UTC

Shouldn't the Hebrew sentences be RTL?

> https://tatoeba.org/eng/sentenc...&sort_reverse=

> https://tatoeba.org/eng/contributions/latest/heb

AlanF_US, September 15, 2020 at 10:41:25 PM UTC

They are displayed right-to-left, but I think your question is whether they should be right-justified rather than left-justified, correct?

Ricardo14, September 16, 2020 at 5:29:56 AM UTC

Yes, indeed. Thank you!

gillux, September 3, 2020 at 4:09:55 PM UTC

Check out the new "Browse by language" page: https://dev.tatoeba.org/sentences/index (note that it displays differently for guests and logged-in users).
Related issue: https://github.com/Tatoeba/tatoeba2/issues/2157

Feedback is welcome. Also, a lot of code changed under the hood, so I’d be glad if you could check if the rest of the website is working normally.

AlanF_US, September 3, 2020 at 8:42:32 PM UTC (edited September 4, 2020 at 12:34:29 AM UTC)

I really like it. I see a lot of influence from Wikipedia, but there's nothing wrong with that. I particularly like the display of languages by number of sentences (100,000+, 10,000+, etc.). That might be especially helpful in motivating people to add sentences in order to move their language from one group to another (for instance, another 3,000+ sentences will get Polish into the 100,000+ group).

Regarding the text "0+ sentences": Since you have a "1+ sentence" category, it should suffice to show "0 sentences". But I'm not sure why we have so many languages with no sentences. Don't we require sentences to be added for a language before we support the language? I suppose that we could have languages that have a small number of sentences that are then all deleted, for one reason or another. But displaying languages with zero sentences could just lead people to wonder why we don't support every known language, since we apparently don't even require sentences for the language before we display it.

gillux, September 3, 2020 at 9:44:18 PM UTC

Thanks! I’m glad you like it too. :-)

Right, the "0+ sentences" is weird. It won’t appear on tatoeba.org because all languages have at least one sentence. It does appear on dev.tatoeba.org though because we keep the list of supported languages updated without adding new sentences. I’m going to change it into "0 sentences" nonetheless.

AlanF_US, September 4, 2020 at 12:32:47 AM UTC

I see. That makes sense.

TRANG, September 5, 2020 at 10:18:56 AM UTC

Thanks again for this page gillux, I'm looking forward to having this deployed on prod :)

The issues I've noticed:

1) The "unknown" language has a blank icon and clicking on it leads to an error.

2) The South Levantine Arabic icon is wrongly sized (but you have already noted that).

3) I also thought for a moment that some languages were missing an icon. For instance, with "Tahaggart Tamahaq", if you don't know that it's one language, you could think that Tahaggart is one language and Tamahaq is another, and wonder why the second one doesn't have an icon.
If you could make the space between languages larger, or reduce the line-height of the language names, it would make it clear which string corresponds to the same language.

gillux, September 5, 2020 at 9:51:38 PM UTC

I fixed 1) and 2) already.

As for 3), I borrowed this way of displaying text from Wikipedia. It’s less of a problem for them because they localize each language name. I could change the display like you suggested, but this will make the text unaligned with other columns. I like how straight and organized it looks right now.

I’m not sure how to go about this but I’d rather try to approach the problem differently.

rumpelstilzchen, September 14, 2020 at 7:30:24 AM UTC

The new "Browse by language" page is now deployed on tatoeba.org.

CK, September 12, 2020 at 2:14:17 AM UTC

There are now 777,777 sentences on List 907. 513,553 (66%) of these have audio.

This is the list of good proofread English sentences that I use on my projects. http://www.manythings.org/corpus/tatoeba.html

Bilingual sentence pairs made up of these sentences and sentences by native speakers contributing to the Tatoeba Project can be downloaded from http://www.manythings.org/anki/ .

Screenshot showing the 777,777 number.
https://ibb.co/pXZR22v

For comparison, here are the numbers of sentences for the 2nd and 3rd ranked languages on tatoeba.org.
Russian = 798,683
Italian = 767,143

Link to List 907.
https://tatoeba.org/eng/sentenc...s/show/907/und

Thanuir, September 9, 2020 at 2:50:47 PM UTC

Codidact languages site: https://languages.codidact.com/

This is an open-source, non-commercial Q&A community in the style of Stack Exchange. This particular instance I linked to is about languages; maybe people here will find it of interest.

mccarras, September 10, 2020 at 12:33:10 AM UTC

Cool! Thank you!

Objectivesea, September 3, 2020 at 11:12:49 AM UTC

I had a very brief look at the list of vocabulary words for which sentences are desired. When restricted to English words only, it nevertheless had a few anomalies.

1. ‹Lautgesetzlich›, clearly a German word, is mislabelled as English.

2. ‹thunderstrock› is misspelled; it should be ‹thunderstruck›.

3. ‹щзхлзщхлщзх›, being written in Cyrillic letters, cannot be English. Perhaps it is Russian or Bulgarian, but my guess is that it may be a nonsense word, as I do not see any vowels. Can it even be pronounced?

mccarras, September 3, 2020 at 11:51:00 AM UTC

I have a similar question: what do we do when we make a mistake in a vocab request? I think I was half asleep when I requested sentences with "though" in Dutch. I used the word "though", which is obviously English and not Dutch :(

Thanuir, September 3, 2020 at 12:48:39 PM UTC

https://tatoeba.org/spa/vocabulary/of/mccarras - you can remove words that you added yourself from your own vocabulary. (Translated from Finnish.)

Thanuir, September 3, 2020 at 12:47:13 PM UTC

At the moment, errors and silly entries cannot be removed from the list. There are some practical suggestions on my profile. (Translated from Finnish.)

Objectivesea, September 9, 2020 at 4:52:53 AM UTC

A rough English version of @Thanuir's Finnish remarks:

> You can remove words which you yourself have added to the vocabulary.
> At this time, inaccuracies and stupidities cannot be removed from the list.
> There are some practical suggestions on my profile.

Thank you, Thanuir, for your helpful comments and for the very helpful guidance on your profile page, at <https://tatoeba.org/eng/user/profile/Thanuir>, for contributing sentences based on proposed vocabulary items that are currently rare in the Tatoeba sentence database. Kiitos.

Thanuir, September 9, 2020 at 8:06:03 AM UTC

You're welcome.

Selena777, September 9, 2020 at 6:04:25 PM UTC

‹щзхлзщхлщзх› That's not a word, just 4 letters that are located side by side on the Russian layout. Can be deleted.

Snowy0wI, September 6, 2020 at 3:52:40 PM UTC

Hi everybody,

Way too often grammatical gender or number are being lost in indirect translations.
Examples: #2707014 #1612621

Any ideas on how we can fix it?
Adding correct translations as 'direct' does not remove flawed indirect options immediately.

Thanks.

CK, September 6, 2020 at 10:36:06 PM UTC

This is the expected behavior of tatoeba.org.

See...

https://en.wiki.tatoeba.org/art...tions-in-grey?

mccarras, September 9, 2020 at 12:11:49 AM UTC

But @CK and all, is there no way to tag the originals as, e.g., "female speaker" or "formal" or "addressing multiple people"? It seems like this might help clear things up.

Thanuir, September 9, 2020 at 5:50:19 AM UTC

Tags can be added to sentences. That does not in any way prevent linking them to translations, nor does it reduce the number of indirect translations.

Akenaseryan, September 7, 2020 at 7:58:28 AM UTC

There is a technical problem going on. When a search result spans multiple pages, every page other than the first shows the "no results" message.
Anybody else see that?

Guybrush88, September 7, 2020 at 8:02:35 AM UTC

I saw that, too, and I reported it on the bug tracker: https://github.com/Tatoeba/tatoeba2/issues/2547

samir_t, September 7, 2020 at 8:03:41 AM UTC

I was going to ask the same question since I noticed it too.

Pfirsichbaeumchen, September 7, 2020 at 8:42:59 AM UTC

Yes, I noticed that too.

deniko, September 7, 2020 at 9:23:51 AM UTC (edited September 7, 2020 at 4:20:42 PM UTC)

Also, the random sort order doesn't really work: when you reload the first page of the results (you can't go to the second anyway), you get the same results.

Normally in Random mode every time you reload any page you get different sentences (well, random, some of them can be the same, of course)

rumpelstilzchen, September 7, 2020 at 8:10:14 PM UTC

The problem should be fixed.

Sorry for any inconvenience.

deniko, September 8, 2020 at 1:39:32 PM UTC

It does work fine now, thanks for fixing it.