Wall (5461 threads)

Orava
2019-06-30 08:01
Is there a way to make an advanced search on a list that would show the sentences added to the list first?

If not, I would like that feature because going through a list of someone's sentences with audio is a bit difficult when hundreds of sentences are added to the list every now and then, and it takes time for me to find the page where I left off.

Any ideas how to systematically go through a list that is probably growing in the near future, such as this? https://tatoeba.org/eng/sentenc.../show/8977/eng
CK
2019-06-30 22:58 - 2019-06-30 23:42
You can view a list showing the oldest items on the list first. This way, you could bookmark the page where you leave off and continue from there the next time.

https://tatoeba.org/eng/sentenc...&direction=asc

This needs to be at the end of the list's URL.
?sort=created&direction=asc
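For anyone who wants to build that link with a script, here is a minimal sketch. It only assumes the two query parameters named above (`sort=created` and `direction=asc`); the function name and the example URL are placeholders, not real Tatoeba addresses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_oldest_first(list_url: str) -> str:
    """Return list_url with sort=created and direction=asc set in its query string."""
    parts = urlsplit(list_url)
    # Keep any parameters that are already present (for example a page number).
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update({"sort": "created", "direction": "asc"})
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; substitute the address of the list you are reading.
print(with_oldest_first("https://example.org/some/list/page"))
```

Because existing parameters are preserved, the same helper also works on a bookmarked page of the list.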
CK
2019-06-30 00:10 - 2019-06-30 00:29
Here is a search that will find the 1,000 longest "sentences" with audio.

https://tatoeba.org/eng/sentenc...rt_reverse=yes

Some items are actually multiple-sentence entries.

Even if you jump to the last page, you'll notice that the sentences are still fairly long.
Thanuir
2019-06-28 08:17
Feature suggestion: Sort sentences by what should be translated first.

Rationale: When translating sentences from a given language, it would be nice to get them in an order such that translating the sentences highest up on the list would presumably be more useful than translating those that are farther down.

Ideally, this should be a one-button option, like "Suggest sentences to translate". Conceivably this might also make life easier for newcomers, as they would have a clear thing to start with.

Details:

The suggestion algorithm should be language-neutral.

I think the following priorities would be useful. They are not in any particular order.

1. The rated language competence of the sentence owner.
2. The rated language competence of the sentence author.
3. Original sentence is better than a translation. (Reasoning: Original sentences are more likely to be something used in the language, rather than an attempt to express a foreign concept or structure in the language.)
4. Sentence with audio is better than a sentence without audio.
5. Owned sentence is better than an orphan.
6. Normal sentence is better than one redded out.
7. Highly rated sentence is better than a poorly rated one. Order: the most "OK" ratings and no others; the most "OK" ratings relative to the number of "unsure" ratings and no "not OK" ratings; no ratings; the best ratio of "OK" to "not OK" ratings; the fewest "unsure" ratings; the best ratio of "unsure" to "not OK" ratings; the fewest "not OK" ratings.
8. Sentence tagged "OK" is better than a sentence with no quality tags, which is better than one tagged "check" or "needs native check", which is better than "change", which is better than "delete".
(Maybe I am forgetting some quality indicators.)
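A rough sketch of how a few of these priorities could become a sort key. Everything here is an assumption made for illustration: the field names (`quality_tag`, `has_audio`, `has_owner`, `is_original`) are invented, only items 3, 4, 5 and 8 are covered, and since the priorities above are in no particular order, the order in which the key compares them is arbitrary.

```python
# Tag order taken from item 8: OK > no quality tag > check / needs native check > change > delete.
TAG_RANK = {"OK": 0, None: 1, "check": 2, "needs native check": 2, "change": 3, "delete": 4}


def quality_key(sentence: dict) -> tuple:
    """Smaller tuples sort first. All field names are hypothetical."""
    return (
        TAG_RANK.get(sentence.get("quality_tag"), 1),  # unknown tags count as "no quality tag"
        not sentence.get("has_audio", False),          # item 4: audio first
        not sentence.get("has_owner", False),          # item 5: owned before orphan
        not sentence.get("is_original", False),        # item 3: original before translation
    )


sentences = [
    {"id": 1, "quality_tag": "change", "has_audio": True, "has_owner": True, "is_original": True},
    {"id": 2, "quality_tag": None, "has_audio": False, "has_owner": True, "is_original": False},
    {"id": 3, "quality_tag": "OK", "has_audio": False, "has_owner": False, "is_original": True},
    {"id": 4, "quality_tag": None, "has_audio": True, "has_owner": True, "is_original": True},
]
ranked_ids = [s["id"] for s in sorted(sentences, key=quality_key)]
print(ranked_ids)
```

A tuple key makes it easy to reorder or drop criteria later, which matters because the relative weight of each point is exactly the kind of value judgement that would need discussion.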

Sentences with many translations are likely to be higher quality or more useful, but translating sentences with no translations is likely to contribute more to the diversity of the corpus. One of these conditions might be added, too. (I would prefer "no translations".)

Or maybe the following would be a relevant hierarchy:

a. Sentences with no translations to any language.
b. Sentences without indirect or direct translations to the target language.
c. Sentences without direct translations to the target language.
d. Sentences with as few as possible direct translations to the target language. Maybe break ties with the number of indirect translations.
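The hierarchy a to d can be sketched as a sort key. This is only an illustration under assumptions: the inputs are hypothetical dictionaries mapping a language code to a count of direct or indirect translations, and ties are broken by fewer indirect translations first, which is one possible reading of point d.

```python
def translation_need_key(direct_by_lang: dict, indirect_by_lang: dict, target: str) -> tuple:
    """Smaller tuples mean 'translate this first'. Input shapes are hypothetical."""
    direct = direct_by_lang.get(target, 0)
    indirect = indirect_by_lang.get(target, 0)
    total = sum(direct_by_lang.values()) + sum(indirect_by_lang.values())
    if total == 0:
        tier = 0  # a. no translations to any language
    elif direct == 0 and indirect == 0:
        tier = 1  # b. nothing, direct or indirect, in the target language
    elif direct == 0:
        tier = 2  # c. no direct translation in the target language
    else:
        tier = 3  # d. as few direct translations as possible
    # Assumed tie-break: fewer indirect translations first.
    return (tier, direct, indirect)


keys = [
    translation_need_key({"fin": 2}, {}, "fin"),
    translation_need_key({"deu": 1}, {"fin": 3}, "fin"),
    translation_need_key({}, {}, "fin"),
    translation_need_key({"fin": 1}, {"fin": 2}, "fin"),
    translation_need_key({"deu": 2}, {}, "fin"),
]
print(sorted(keys))
```

Since tier a always exists for a sentence with no translations, a list built this way is never empty as long as the language has at least one sentence.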

...

With English, I can fairly easily find sentences that satisfy many of the quality criteria. CK also has several tools for this. With smaller languages it is more tricky, as there might, for example, not be any sentences with audio, there might be many non-native contributions, maybe most sentences are translated or maybe almost none are. Hence, if one wants to find high quality or useful sentences to translate, it takes lots of experimentation to find them. If there are only few and one translates them, one needs more experimentation to find the next patch.

...

So, the idea would be to have a single button that gives, say, a hundred or a thousand high quality sentences in order of quality. The order does not have to include all the points above, but it should include a sufficient amount to prioritize useful sentences.

More importantly, the list should never be empty, if the language has even a single sentence, or maybe even a single untranslated sentence.

...

Making this would require value judgements from the community or whoever makes the thing. This might cause arguments about what kind of sentences are the most valuable. Maybe it would not be worth it.

This is just a suggestion; feel free to ignore.
AlanF_US
2019-06-29 13:59
> Making this would require value judgements from the community or whoever makes the thing. This might cause arguments about what kind of sentences are the most valuable. Maybe it would not be worth it.

I think it's worthwhile for people to think about which sentences they find most valuable for their own purposes. But coming to a consensus or even a majority decision would be counterproductive. Not only would you not be able to get different people to agree, you wouldn't be able to get one person to agree with this ranking all the time. People use different criteria for different purposes. Is a sentence that has been translated fifty times better than one that has not been translated at all? Yes, if your desire is to see how a phrase (typically a very common one) differs across languages. No, if your desire is to see a more complicated sentence (which is likely to be translated less often) that provides more context for a particular word. Is a sentence with audio better than one without? Yes, if you want to listen to audio. But when I don't want to listen to audio (as is generally the case for me), a search that favors audio gives worse results because they're more homogeneous (since the sentences that received audio were generally chosen by one person) and often shorter than I would like.

As I have said before, I favor making default searches as free from assumptions as possible. Adding assumptions decreases the diversity of the sentences that people see (and translate), and increases processing time.

By the way, I have never found a problem with the overall quality of English sentences in the Tatoeba corpus as a whole, so the concern over low-quality sentences has never resonated with me. I certainly come across poor individual sentences, which I flag or correct, but I never get the sense that they predominate in our collection of sentences. Occasionally I've done informal experiments where I've measured the fraction of sentences with mistakes that come up in a default or random search, and I've generally been impressed by the low proportion.
Thanuir
2019-06-29 14:11
Note that I am not suggesting doing anything to the search function, but rather giving a simple way for someone to just get good and useful sentences to translate. This should not affect search results in any way.
MacGyver
2019-06-27 07:15 - 2019-06-27 09:23
Here's a list with ~2156 duplicates in the corpus (there are 4362 sentences in total). They differ only in spacing or some ~equivalent punctuation. This means Horus can't merge them.

If you own any of these sentences, please consider changing the spacing or the punctuation if it does not change the meaning (or correctness) of the sentence.

This is what the file looks like: #id1 #id2 lang sentence1 lang sentence2

https://github.com/Tatoeba/tato...0/_rep_v02.txt

related: https://github.com/Tatoeba/tatoeba2/issues/642
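For illustration, here is one way such near-duplicates could be found. This is a sketch under assumptions, not the script that produced the file: the set of "equivalent" characters and the rule about spaces before punctuation are guesses, and the sample sentences and ids are made up.

```python
import re
from collections import defaultdict

# Assumed equivalences, for illustration only.
PUNCT_MAP = str.maketrans({"\u2019": "'", "\u00a0": " ", "\u202f": " "})


def normalize(text: str) -> str:
    """Map look-alike characters to one form and ignore spacing differences."""
    text = text.translate(PUNCT_MAP)
    text = re.sub(r"\s+([?!:;.,])", r"\1", text)  # drop spaces before punctuation
    return re.sub(r"\s+", " ", text).strip()


def near_duplicates(sentences):
    """Group (id, lang, text) triples whose normalized text matches within a language."""
    groups = defaultdict(list)
    for sentence_id, lang, text in sentences:
        groups[(lang, normalize(text))].append(sentence_id)
    return [ids for ids in groups.values() if len(ids) > 1]


sample = [
    (1, "fra", "Ça va ?"),
    (2, "fra", "Ça va?"),
    (3, "eng", "I can\u2019t sing."),
    (4, "eng", "I can't sing."),
    (5, "eng", "Hello."),
]
print(near_duplicates(sample))
```

The output is only a list of candidates. Whether a given pair should actually be changed is still up to the sentence owner.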

And here's the number of duplicates by language

fra (1419)
rus (252)
epo (61)
tur (54)
bre (54)
ber (46)
deu (39)
ukr (26)
eng (25)
kab (25)
hun (17)
ara (17)
ita (16)
fin (16)
spa (7)
heb (6)
pol (6)
ell (6)
cor (6)
por (4)
lat (4)
bul (4)
tat (4)
eus (4)
thv (4)
mar (3)
srp (3)
pes (3)
oci (3)
jpn (2)
swe (2)
hin (2)
vie (2)
uig (2)
khm (2)
nld (1)
dan (1)
ina (1)
tlh (1)
afr (1)
slk (1)
urd (1)
ceb (1)
swg (1)
cym (1)
Thanuir
2019-06-27 08:42
If a language has several ways of writing quotation marks or punctuation, I see no reason not to include all of those conventions in Tatoeba as well. On the other hand, usages that go against the norms of the language could be replaced with more correct ones.

A few Finnish sentences had quotation marks that violated the orthography; I corrected them or asked for them to be corrected. Others, however, exist because the language has three different ways of marking quotations or turns of speech. They are all correct.

...

If the language has several different and correct ways of writing something - quotes in Finnish, space or no space before certain punctuation in French - then all of these should be fine in Tatoeba. On the other hand, incorrect use of punctuation should be corrected.
CK
2019-06-27 08:50 - 2019-06-27 08:52
> ...then all of these should be fine in Tatoeba.

I don't think I agree with this, since this would lead to a lot of redundancy and also mean that we do not get as many languages linked to the same series of sentences.

For example, if we had the following, and the first one had a French translation, and the second one had a Japanese translation, there would be no indirect link between the French and the Japanese.

I can’t sing.
I can't sing.

(Two different apostrophes)
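The difference is invisible in many fonts, so here is a small sketch that names the differing characters. It assumes the first sentence uses U+2019 and the second U+0027, as they appear in this post; the function name is just for illustration.

```python
import unicodedata

curly = "I can\u2019t sing."
straight = "I can't sing."


def differing_chars(a: str, b: str):
    """List (position, name in a, name in b) wherever two equal-length strings differ."""
    return [
        (i, unicodedata.name(x), unicodedata.name(y))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


print(curly == straight)
print(differing_chars(curly, straight))
```

To a string comparison these are two unrelated sentences, which is why neither of them picks up the other's translations.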

Thanuir
2019-06-27 09:13
If the symbols look almost the same and are used in the same way, then it might not do too much damage to standardize them.

On the other hand, if the symbols (or the use of punctuation) look markedly different, or if the different uses follow national, socio-economic or other boundaries, then standardization is not a good idea.

If there are various different appearances, then knowing all of them is helpful.

If there is a regional or other difference between the conventions, then choosing one over the others will elevate one cultural practice over another. This is a highly political act and not something that should be done by Tatoeba.

For reference, here are the ways of quoting/expressing speech in Finnish, taken from https://www.kielikello.fi/-/lainausmerkit- :
”Tule mukaan!” hän pyysi.
»Tule mukaan!» hän pyysi.
– Tule mukaan! hän pyysi.

All three ways are visually distinct and the third way works differently than the others. The first and the third are frequently used. The second is met in literature, but is rare elsewhere, as far as I know.
Aiji
2019-06-27 13:49
It is a common question, but on closer thought it is actually not such a big issue.
7+ million sentences, 4000+ duplicates
That is not even 0.1% of the corpus.

Most of them are due to the French spaces. That discussion happened in the past. Numbers were even calculated and posted on github. Suppose that each of them would have 50 direct translations (and they don't), that would still represent only around two percent of the corpus.
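A quick check of the orders of magnitude, using the round figures from this post (7,000,000 sentences, 4,000 duplicates, 50 translations each). These are rounded inputs, not exact corpus counts.

```python
corpus_size = 7_000_000   # "7+ million sentences", rounded down
duplicates = 4_000        # "4000+ duplicates", rounded down
translations_each = 50    # the deliberately generous assumption above

share_duplicates = 100 * duplicates / corpus_size
share_with_translations = 100 * duplicates * translations_each / corpus_size

print(f"duplicates alone: {share_duplicates:.3f}% of the corpus")
print(f"with {translations_each} translations each: {share_with_translations:.2f}% of the corpus")
```

With these inputs the duplicates alone come to about 0.06% of the corpus, and even the generous 50-translation case stays a little under 3%.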

In my opinion, it is not worth putting more effort than human screening to address this point. And as Thanuir mentioned, some of them cannot be de-duplicated for cultural reasons (among others).
Ricardo14
2019-06-23 00:14 - 2019-06-23 00:16
**Suggestion (Feature request?)**

I have some problems that I'd like to share with you.

From time to time I want to contact people who speak or understand a specific language, to translate one or more sentences into other languages. It can be for my own studies, for my students, for other projects I have...

On "Language of members" and "Native speakers" I can find people to help me but...

1st - Are they active on Tatoeba? When did this user last log in? If I go to their profile I'll get an answer, but it's not that clear (it is to me, because I've been around for quite a long time).

2nd - Are they willing to help me? Do they reply to messages often? Always? Never? If they are not around and have also set their settings not to receive emails from Tatoeba, it'll be hard to contact them, don't you think?

OK, but don't we have the "My vocabulary" feature?
Yes, and it's great! However, there are some problems...
- People don't look at it very much (or rather, I don't see people looking at the requests). Maybe because they don't have time? Or interest? Or they don't remember or know that there are requests over there?

Perhaps it'd be good to have a kind of "advanced search" in which we could look for
- active members in x language who understand y language
- people who speak x and y languages but have not been around
- people who have agreed to receive emails from Tatoeba - like PMs!

That's it. Thanks.
Smoky
2019-06-23 07:43
You can ask me if you need to translate something from English into Russian.
Ricardo14
2019-06-24 15:46
Thank you so really much! :D
gillux
2019-06-24 07:08
Ricardo, I agree wholeheartedly with you.

I also think that Tatoeba sorely lacks ways to allow members sharing interest in the same language to reach out to one another. I remember in my last UX test, the first thing the user wanted to do is to find other German contributors. And she had such a hard time doing it. She even looked at the Wall and was disappointed to only see English messages.

I think it would be terribly awesome if we had separate spaces for each language. Members having interest in that language, be it natives or learners, could reach out to one another, follow what’s going on, exchange messages, actively *use* that language etc.

When I first joined Tatoeba, I remember I was happy to see other French active members, and happy to get Japanese messages from tommy_san about my translations. Looking back, it may have been a key point explaining why I’m still around now.
soliloquist
2019-06-24 14:04
> I think it would be terribly awesome if we had separate spaces for each language. Members having interest in that language, be it natives or learners, could reach out to one another, follow what’s going on, exchange messages, actively *use* that language etc.

How about a forum like this one?

https://forum.wordreference.com

https://tatoeba.org/eng/wall/sh...#message_31426
CK
2019-06-24 23:28 - 2019-06-24 23:29
> How about a forum ...

Perhaps it's time for tatoeba.org to add a forum.
A forum could be used in addition to the Wall or perhaps instead of the Wall.
A forum could be used for what Ricardo14 is looking for.

Other advantages:
Most forums are searchable. The Wall isn't.
Forums could be sub-divided into categories. The Wall isn't.

There is apparently at least one CakePHP forum plugin available. Perhaps there are others.
https://github.com/CakeDC/cakephp-forum

Or, possibly use another open-source forum platform, after figuring out how to interface it using the same usernames and passwords. (However, our Wiki isn't interfaced with the main tatoeba.org website yet.)


Related issue on GitHub: https://github.com/Tatoeba/tatoeba2/issues/367
Ricardo14
2019-06-24 15:50
Thanks for your reply, Gillux.

It has been suggested to me to look for answers on other websites but
1 - I really like and *trust* Tatoeba. Everyone here is willing to have an awesome corpus;
2 - By helping me (and other members) we'd be feeding Tatoeba
3 - I don't feel like going to other websites for now. I prefer Tatoeba.

> She even looked at the Wall and was disappointed to only see English messages.

How about "encouraging" members to post in 2+ languages or having a "machine translation" that would be corrected - if necessary - by a native speaker?

> I think it would be terribly awesome if we had separate spaces for each language. Members having interest in that language, be it natives or learners, could reach out to one another, follow what’s going on, exchange messages, actively *use* that language etc.

+ 1,000. + sentences on Tatoeba! More hands to help each other!
Thanuir
2019-06-25 18:14
Writing in several languages is probably a good practice. But machine translation from Finnish does not work very well. It probably works better with larger languages.

Writing in many languages is fine, I think. But machine translation doesn't work so well with Finnish. Maybe it is better with larger languages.

sharptoothed
2019-06-24 09:00
** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/
Guybrush88
2019-06-24 12:14
thanks
Smoky
2019-06-24 15:22
I dunno, looks like nonsense.
Smoky
2019-06-24 08:52
Is it OK to add here quotes by famous people of historical significance?
Like this one: "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety."
By Benjamin Franklin.
Objectivesea
2019-06-24 10:56
We have some historic sentences already. I think including them is fine, especially if tagged, as you suggest, with the author name. One suggestion: since modern English no longer capitalizes nouns mid-sentence, it would be best to lowercase these to conform with modern usage. (German will of course continue to follow its own rules.)
Thanuir
2019-06-24 12:37
There is also the tag "archaic" and others starting with that word. Maybe they are useful.

(If the original is a written source, then keeping the original spelling is appropriate, if one claims it is a quote.)
jegaevi
2019-06-20 16:22 - 2019-06-20 16:27
Hello everyone!

I've been using Tatoeba for a few months now and I have a few things in mind that I could really use.

It would be nice if I could exclude tags. Both in the language I'm translating from and translating to. Or exclude sentences with tags that I added.
I would like to be able to pick more than just one language in the 'Show translations in:' bar.

Or have an option to display all my profile languages.
And if I link a sentence to another one I don't want all the linked languages to appear. It often causes lagging or scrolls down to the end of the page. I don't need to see for example the Italian translations, since I don't speak it and I won't link it.

It would save so much time for me to be able to tag sentences without clicking them. I mean, maybe an 'OK' button could be added next to the 'Translate' button above the sentences.

That's it. These are just my personal preferences, I'm not implying that the site would be any better with them. It's maybe possible that some of these features already exist and I didn't notice them.
I've read the usability test that gillux posted a couple of days ago and I would like to add that in my opinion the front page needs to be changed. Whenever I tell someone about Tatoeba I always have to guide them, because they have no idea what the site is about. And even after I explain it, they won't use it because it's just too complicated for them.
I hope my post won't offend anyone, that's not my intention. I know that Tatoeba is nonprofit and I think it's amazing that this many people sacrifice their free time and devote it to Tatoeba.
jegaevi
2019-06-20 16:57 - 2019-06-20 17:32
> I would like to be able to pick more than just one language in the 'Show translations in:' bar.
Or have an option to display all my profile languages.
And if I link a sentence to another one I don't want all the linked languages to appear, It often
causes lagging or scrolls down to the end of the page. I don't need to see for example the Italian
translations, since I don't speak it and I won't link it.

I just found the way to do that so you may disregard this part.

EDIT
>It would save so much time for me to be able to tag sentences without clicking them. I mean,
maybe an 'OK' button could be added next to the 'Translate' button above the sentences.

I just found out that you can apparently 'rate' sentences. I wonder if it's the same thing as adding an OK, change or check tag.

mraz
2019-06-20 18:31
Dear jegaevi!

I gathered all my knowledge, but even so I could not make sense of what you wrote.

Regards,

mraz
TRANG
2019-06-20 19:19
> Whenever I tell about Tatoeba to someone I always have to guide them, because
> they have no idea what the site is about.

When you say no idea, how clueless are they exactly?

Is it that they don't understand the purpose of the project? They don't understand *why* we are collecting sentences and translations.

Or is it that they don't even understand that Tatoeba is about creating a dataset of sentences and translations?


> I just found out that you can apparently 'rate' sentences. I wonder if it's the same
> thing as adding OK, change or check tag.

The rating feature and the OK/change/check tags definitely overlap in terms of purpose.

For context, tags were implemented much earlier, in June 2010: https://blog.tatoeba.org/2010/0...2th-2010.html. That's 9 years ago. We used tags as a solution to our needs for a quality process, even though it was not the main role of tags. But it was better than nothing.

The ratings were implemented in 2015: https://github.com/Tatoeba/tatoeba2/pull/738, and were an attempt to provide a better alternative than the tags. Ultimately they could replace the tags, but the feature did not make it out of the "beta" phase (yet?) and ratings are currently not visible to everyone.

So tags have more visibility than ratings.
- Tags: everyone can see them on the sentence's page.
- Ratings: only registered members who have activated the feature can see them on the sentence's page.
But if you tag "OK" or rate "OK", both express the same thing.
jegaevi
2019-06-20 19:52
Thank you for replying.

>When you say no idea, how clueless are they exactly?
They understand that this is a collection of sentences. They want to learn English so I show them the search bar and how to search for English sentences with audio. I tell them that they can search for specific words, patterns or expressions. But they never stick around. I tried this with 3 people so far. All of them are really determined to learn their target language. Each one of them was really excited when I introduced them to the site. They said that it is amazing and they were very happy about the audio. But after a few days I asked them if they found it useful and they all said that it wasn't. It's too complicated. Or too random. Maybe I didn't explain it well enough, I don't know.
Thank you for the clarification about the tags and ratings.
mraz
2019-06-20 21:46 - 2019-06-22 06:52
@jegaevi,

.... They want to learn English.....

That is not what it was designed for. >>>

You cannot learn a language just by using Tatoeba.
It doesn't provide any structure for learning.
CK
2019-06-21 03:36 - 2019-06-21 03:37
Perhaps you could introduce this "dashboard" of useful links to your friends.

http://study.aitech.ac.jp/tatoe...late/links.php

I know it's unlikely that it's exactly what they might want, but it might make it easier for them to easily understand some of the website's potential.
jegaevi
2019-06-21 05:59
Thank you CK!
Maybe this will help. I think I'm gonna show them your Listen and repeat videos. Maybe they get more use out of those.
TRANG
2019-06-22 04:57
> But after a few days I asked them if they found it useful and they all said that it
> wasn't. It's too complicated. Or too random. Maybe I didn't explain it well enough,
> I don't know.

If you introduced Tatoeba to them as a language learning platform, then obviously they would be disappointed. Because that's not what Tatoeba is. You cannot learn a language just by using Tatoeba. It doesn't provide any structure for learning.

I wonder if you ever noticed the link "What is Tatoeba?" at the bottom of the page and if you tried to show them the video on that page.
jegaevi
2019-06-22 11:11
Well, I didn't say that it is a language learning platform to any of them. I think I said that it is a collection of example sentences and their translations. And they can look up words, expressions or structures. And I told them that maybe they should copy and paste sentences they found interesting in Anki. Or any other flashcard system. Or maybe write it down on paper or whatever.

>I wonder if you ever noticed the link "What is Tatoeba?"
Yes, actually I read the wiki (not the whole thing, just the parts I was interested in, but most of it) and watched the video, too.
They wouldn't understand the video since they don't speak English very well yet. But I'm gonna meet up with them in a few days and try to explain it again.
And just to add, even though I speak English (I mean my grammar and spelling are horrible) I found Tatoeba confusing at first. Without the members who messaged me I wouldn't even know how to use the Advanced search options. I figured out the rest on my own after I read the wiki, but not many people would go through the whole wiki like I did.
I really don't want to be disrespectful towards anyone here, please don't take it this way. I just wanted to speak my mind.
TRANG
2019-06-22 14:18
> But I'm gonna meet up with them in a few days and try to explain it again.

If they understood it's not a language learning platform but they don't find it helpful for their particular case, then it's okay. They just have different needs. It's just like they needed a spoon, but we gave them a fork. Perhaps they'll need a fork in the future, but right now they need a spoon, that's all :)

At least they know Tatoeba exists, they know what it does. One day they may find themselves in a situation where they really need something like Tatoeba, and hopefully, they will remember about Tatoeba. Better yet, maybe one day they feel inspired to participate in Tatoeba, by creating new sentences, translating sentences or proofreading sentences.

> They wouldn't understand the video since they don't speak English very well yet.

The video has subtitles in Hungarian by the way. Maybe it helps.

> I really don't want to be disrespectful towards anyone here, please don't take it
> this way.

You don't have to worry about it. We appreciate every feedback.
jegaevi
2019-06-22 16:13
I didn't know about the subtitles, thanks for bringing it to my attention.
I hope they will find Tatoeba useful in the future. It helped me a lot in the last few months.
Thank you for being this nice and taking your time replying. I really appreciate it. :)
gillux
2019-06-24 07:41
First, I want you to know that such constructive feedback is very welcome here. Thank you for taking the time to write it.

> I would like to be able to pick more than just one language in the 'Show translations in:' bar.

Actually I’ve been thinking about that too. How about having "Profile languages" as one of the choices for 'Show translations in:' or 'To:'? We make it the default and use the expand/collapse sentence button as a way to show other languages.
jegaevi
2019-06-24 07:56
Yes, that would be amazing. It would make things faster for sure, at least for me.
CK
2019-06-24 08:35 - 2019-06-24 08:43
Seeing the languages listed in your profile first, with the others hidden, but easily visible if wanted, is a great idea. I hope it's possible to do.
DostKaplan
2019-06-23 21:06
"Search error

An error occurred while performing the search. If the problem persists, please let us know."
samir_t
2019-06-23 21:48
Yes, the problem persists.
hamzah
2019-06-23 22:34
Yes there is a problem
rumpelstilzchen
2019-06-24 03:23
Maybe I'm the culprit.

I've looked at https://github.com/Tatoeba/tatoeba2/issues/1895. After trying a few search variants I've got the error.
gillux
2019-06-24 05:36
The problem should be solved now. Sorry for the inconvenience!
CK
2019-04-05 03:01
** Audio Milestone **

https://tatoeba.org/eng/audio/index
Sentences with audio (total 555,555)
Seael
2019-04-20 17:19
Great!
:)
CK
2019-06-05 02:22
In the last 2 months we have added over 22,300 audio files.

https://tatoeba.org/eng/audio/index
Sentences with audio (total 577,944)

You can see which members' audio file lists have had changes recently with this search.
https://tatoeba.org/eng/sentenc...direction:desc
However, in some of the lists, editing has taken place and not additions of new audio files.
CK
2019-06-24 00:51 - 2019-06-24 00:51
We now have 588,888 audio files, up 10,944 since 18 days ago.

https://tatoeba.org/eng/audio/index

See a screenshot showing 588,888.

https://prnt.sc/o5ragn

If you, too, would like to contribute audio files in your own native language, see http://bit.ly/shtooka .