Wall (6,753 threads)

Hello everyone!
I wish you good health, lots of luck and all the best!
Happy New Year 2023!
mraz

あけましておめでとう!
¡Feliz Año Nuevo!
o tenpo sike pona a!
新年好!

B.Ú.É.K.!
Gott nytt år!
Godt nytt år!
Godt nytår!

Dear Tatoeba, can you add the Yotvingian language?

It’s an extinct language also known as Sudovian or Jatvingian. The language code is “xsv”.
Flags can be found on the page below.
https://www.crwflags.com/fotw/flags/by%7Dyet.html
More info on the wiki.
https://en.wiki.tatoeba.org/art...nguage-request

Thanks for getting more information.
I’m out of town right now but when I get back in a few days I’ll work on it.

Hello, I'm part of the team in charge of language requests in Tatoeba!
When you request a language, we need a few sentences in that language to be added to Tatoeba. Are you able to do that?
Cangarejo already provided a flag and more information about it, including an ISO code.
Let me know if you have any questions!

Kails! Kai jūms ait? — Hello! How are you?
Sandeiv! — Goodbye!
Denkauj, spartas laban — Thank you, very well

An intrinsic part of Tatoeba's value is how remarkably censorship-free it is, so that anyone can post linguistic content as they like, so long as it is free of plain errors. Any limitation of that would simply involve one group of people enforcing their will and vision onto another group, which is of course no bueno (unless you belong to the former group, then you may naturally consider doing so the proper course of action).
I am able to handle the presence of sentences I dislike - believe me, there's plenty - and do not call for them to be curated and their contributors disciplined simply because I might like that to happen for this or that reason. I would greatly enjoy if everyone strove to adopt a fair mindset like this and let each other be.

Funny to read from a MAJOR censor who attempted to ban so many of my sentences…

You have a leaky memory, sir. I did specifically not suggest your sentences be deleted, although I did not care for some of them getting altered (like that frenchy space-before-question-and-exclamation-mark thing, except in English) but I wouldn't vote to do that myself.

Unfortunately, a community that allows all kinds of content drives away the more sensitive and vulnerable users and leaves behind only those of us who are indifferent, strong, or privileged enough to put up with all kinds of sentences.
It is impossible to build a community in such a way that everyone's voice is heard there. Some people's voices drown out or drive away the voices of others. This is very unfortunate; the world would be a nicer place if freedom of speech were easier to realize.

There is no community in existence where all people can be a part of it.
Communities exist for mutual goals, mutual interests.
If the community is about translating sentences then it would be about translating sentences.
So, maybe some of us just stopped believing that the community is about translating sentences. There is no weak and strong when we are talking about contributing to a free site. There are only translators, language learners.
If your first priority is not translating, writing, and curating sentences and helping others arrive at correct sentences here, then you are not in the right community. You are in a different one in the first place, and I ask everyone who wants to promote their own community, instead of being a member of the translators, writers, and curators on Tatoeba, to stop believing that theirs is the right community here.
The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

It has been more than 10 years now since I joined Tatoeba. It has been a constant in my life for all of these years and I have contributed extensively to adding sentences in Marathi, Hindi, and English, as well as audio in Marathi. I have also worked on the team as an advisor for language icons, and translated the UI into Marathi to completion.
It has been fascinating working alongside and communicating with other members from all over the world here, including members from cultures and languages I had never heard of before. I think that we have one of the most diverse communities in the world here.
I also think it's time for me to take my leave. I had previously imagined I would be on here till the day I die, but as the years have passed, I have become increasingly disillusioned with the community here.
I now see corpus maintainers flagrantly adding transphobic, homophobic, and xenophobic sentences. I see users making racist comments allowed to go scot-free despite complaints. It's a terrible environment to work in, and a community I feel ashamed to now be seen as a part of.
With Ricardo's passing especially, I no longer feel any interest in continuing with my work here. I apologise to him for that, and I also apologise for any negative discussion that might follow this message.
With that, I thank you all for bearing with me for the past 10 years, and I leave now for different horizons.
Goodbye, all of Tatoeba.

I’m sorry to see you go, but I appreciate everything you did for the project, and wish you luck in your future endeavors.
I hope Tatoeba can become a better environment for new contributors.

sabretou, I'm sorry to hear that you're encountering a hostile environment at Tatoeba. You are valuable to us, and I hope we can resolve whatever issues you are confronting so that we can persuade you to stay. I sent you a private message.

Dear sabretou,
I hope you are still coming back to find these responses. Please do answer Alan's message so we can resolve the issues. As he said, you are valuable to us, and it would be very, very sad to see you leave like this. Please don't let 10 years end this abruptly.

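For a language corpus like Tatoeba, I think freedom of expression and diversity are important, but I think order matters too. I trust that you will resolve this issue and give us a place where we can contribute comfortably. @Pfirsichbaeumchen, @AlanF_US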

Yes, @Sabretou, please do stay and continue working to help make Tatoeba an even better and more welcoming place. I am in awe of your contributions to date; I also joined Tatoeba about ten years ago and have contributed less than a tenth as many sentences as you have. I am cisgender, straight, white, Anglo-Saxon and Caucasian, but I have no tolerance whatsoever for hatred and opprobrium grounded in a desire to arbitrarily and artificially elevate oneself by degrading anyone else.
I am so very sorry that you are finding instances of transphobia, homophobia, xenophobia and racism on this site — appallingly, some of it, as you say, from actual corpus maintainers. I trust that over 95 percent of all contributors to Tatoeba will be as disgusted as I am with any comments or sentences expressing archaic, loathsome and hate-grounded views that seek only to divide the Tatoeba community and to lend support to fascistic narrow-mindedness.
Such sentences, when they occasionally rear their ugly heads, should, in my opinion, be ruthlessly eliminated by the admin staff or, if they have any redeemable element, be rewritten to emphasize the disapproval of the Tatoeba community. For example, we might recast the loathsome and unacceptable sentence #8734076; salvaged, it might read like this, which I think would be much more respectful and healthier for civil discourse on Tatoeba:
When Karen said that queer adoption is a sort of child abuse, Tom pointed out that the Bible says nothing about gay parents adopting children. "Furthermore," he argued, "over 17 million children worldwide have neither father nor mother, and they would unquestionably be better off with two loving parents. Whether those parents are gay or straight should not matter in the slightest."
If people collect heinous and unacceptable sentences and notify me, I would be more than happy to recommend improvements along such lines to the admins or, alternatively, to pronounce them unsalvageable and recommend their immediate deletion.
We should not find any serious objection arising to this remedy on spurious free-speech grounds, as one person's civil rights must end at the point where they interfere with the civil rights of others, and no individual or group should be abused by calculated repetition of poisonous comment.
If we get a promise from the admins to purge Tatoeba of hostile sentences, could we persuade you, @Sabretou, to resume your yeoman service to world culture and peace among nations?
— Objectivesea (Victoria, B.C., Canada)

10 years is a long time. I've only been here for about 3 and a half years (and this isn't my first account). I think I've learned what Tatoeba is to me. It took me some time, and I quit 3 times, but I'm still here. Now the only answer I accept from myself to "why would I leave Tatoeba" is "I don't want to translate sentences here anymore".
Haters will be haters at the end of the day, but I will do what I want, and I want to translate sentences and try some ideas. Others do the same; there is no one who is really in control. So trying to control or regulate the flow of hateful sentences may be not just difficult, but impossible. Still, there can be light in the dark.
I do tolerate everything, but I do not translate what I don't like, so by contributing I might regulate something.
And you can do the same by doing something, anything.

[#1723] Merry Christmas!
Maybe you would like to translate some sentences tagged "Christmas" today.
https://tatoeba.org/en/tags/sho...s_with_tag/493
Christmas (963 sentences)

Gleðileg jól!

I want to contribute by reading sentences. How can I do it?

You mean, by checking them for correctness? That'd be very helpful!
You can pick a set of sentences to check, and then write a comment on those you feel are incorrect, suggesting a correction if you know what it's trying to say.
Some sentences have a "@needs native check" tag on them, put on it either by the original author or by someone else who feels it might be wrong. These are the Spanish sentences with that tag.
https://tatoeba.org/en/tags/sho...h_tag/1207/spa
Most sentences that are wrong or unnatural don't have a tag on them, though. For those, you can pick your own strategy for going through sentences.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Como, ¿revisar oraciones para corregir errores? ¡Eso sería de mucha ayuda!
Puedes elegir una serie de oraciones para repasar, y luego escribir un comentario en aquellas que sientas que son incorrectas, sugiriendo una corrección si crees que sabes lo que quiere decir.
Algunas oraciones tienen una etiqueta de "@needs native check" en ellas, usualmente puesta por el autor original o por alguien más que piensa que podría estar incorrecta o sonar no natural. Estas son las oraciones en español con esa etiqueta:
https://tatoeba.org/en/tags/sho...h_tag/1207/spa
Por la mayoría hay oraciones por allí con errores o que no suenan naturales pero que no tienen cualquiera etiqueta. Para esas puedes elegir tu propia estrategia para decidir cuáles revisar.

Or do you mean reading them out loud, recording your voice?
CK is in charge of that, I'd send a private message to him, and he can tell you the next steps.
¿O quieres decir leerlas en voz alta, grabando tu voz?
CK está a cargo de eso. Yo le mandaría un mensaje privado, y él te puede decir los siguientes pasos.
https://tatoeba.org/en/user/profile/CK

More info about contributing audio can be found in the wiki.
https://wiki.tatoeba.org/articl...ntribute-audio

For the purpose of testing myself, is there a way to force the results of a search to be displayed WITHOUT the target translation? Example:
https://tatoeba.org/en/sentence...=%3Dsaw&to=ron
I want the results to show only the English sentences with the Romanian translations [temporarily] suppressed. Is there a parameter I can add to the URL?

It would be a good learning and self-testing feature if a parameter can be added to the URL to blank out the target translation with a checkbox displayed at the end of the hidden sentence. When you check the checkbox, the translation is revealed. When you uncheck it, the translation is hidden again. Add to feature list? Please? 😊

"show translations in: none" should do the trick. You only get the sentences translated into Romanian, you just don't see the translations. If you want to see the translation you open the sentence.

@Shishir
Thanks. I was able to only show the English sentences without any translations.
https://tatoeba.org/en/sentence...roved=no&user=
What do you mean by "If you want to see the translation you open the sentence"? How do I "open the sentence"?

You click on the "#" or on the sentence number and go to the sentence page. There you see all the translations including the one in the language you want.

@Shishir
@CK
OK, I think there's a bug. My profile has English, Malay, Turkish, Romanian shown in the Languages section.
My search URL is:
https://tatoeba.org/en/sentence...3DGoing&to=ron
I see the translations in Romanian. In the "Show translations in" drop-down, I select "None" and then hit "Search". The results are sentences in English without any translations. That's good. That's what I want.
I then tap on "#" of one of the English sentences, and the next page shows only Turkish translations—even though Romanian is one of the Languages in my profile.

In addition to specifying your languages in the profile, you can specify a list of language codes in your settings that will restrict the languages in which you see translations. Is it possible that you've done this?

@AlanF_US
You're right. Language code "ron" is missing in the comma-delimited list.
Why is it "ron" and not "rom" for Romanian, by the way?


Boo! Romansh is cooler!

I beg to differ with @nipbud, for whom I have great respect, on this point. In my opinion (and perhaps by Tatoeba policy), Rhaeto-Romantsh is cool, but Romani and Rumanian are equally cool. The story of how and why languages have evolved over time in different places, sometimes from contact and sometimes in isolation, is extremely fascinating to me.
The equal treatment of multiple languages is one of the things I really appreciate about Tatoeba. Best wishes to all Tatoebans for happy holidays.

It is kind of amazing how so many languages are named after one city. 🏛️

The name "Romani" isn't derived from this city.

Right! "From romani, feminine form of romano (“of or pertaining to the Roma”), from rom (“man”)."

Maybe it's derived from the fact that they roam around, being itinerant?

From Wikipedia: "Romani is an Indo-Aryan language with strong Balkan and Greek influence.... The linguistic evidence has indisputably shown that the roots of the Romani language lie in India: the language has grammatical characteristics of Indian languages and shares with them a large part of the basic lexicon, for example, regarding body parts or daily routines." The article mentions that the Dom or Domba people of North India have genetic, cultural and linguistic links to the Roma.
Also, "Romani and Domari share some similarities: agglutination of postpositions of the second layer (or case marking clitics) to the nominal stem, concord markers for the past tense, the neutralisation of gender marking in the plural, and the use of the oblique case as an accusative. This has prompted much discussion about the relationships between these two languages. Domari was once thought to be a sister language of Romani, the two languages having split after the departure from the Indian subcontinent – but later research suggests that the differences between them are significant enough to treat them as two separate languages within the Central zone (Hindustani) group of languages. The Dom and the Rom therefore likely descend from two migration waves out of India, separated by several centuries."

Yeah, I was memeing. I just don't get very often to say that Romansh is cool :)

Also, would this work? It hides the translations until you hover over them, and only on that page.
https://add0n.com/stylus.html
Go to import and paste:
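/* Applies only to pages whose URL matches this pattern. */
@-moz-document regexp("https://tatoeba.org/is/sentence...saw&to=ron.*") {
    /* Hide the translation text by default. */
    .translations .text span {
        opacity: 0;
    }
    /* Reveal a translation while the pointer hovers over it. */
    .translations .text span:hover {
        opacity: 1;
    }
}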

"Hover" implies a mouse pointer over an area on the screen. There is no mouse pointer when accessing a web page in a browser on a smart phone or tablet. A long click or long press might work.

Hello, how can I download the sentences belonging to a particular user? There is a "detailed sentences" option whose file includes the user's name, but it doesn't let me choose a user before downloading. Can I change that?

I don't think that's possible before downloading. Maybe the best thing to do is to download the sentences_detailed file of the desired language and then filter the user's sentences with a spreadsheet.
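The filtering step could also be done with a short script instead of a spreadsheet. The following is only a sketch: it assumes the export is tab-separated with the sentence ID, language code, text, and owner name in the first four columns, and the sample rows and the user name `alice` are made up for illustration. It reads one line at a time, so the size of the file should not matter much.

```python
import csv
import io
from typing import Iterable, Iterator, List


def sentences_by_user(lines: Iterable[str], username: str) -> Iterator[List[str]]:
    """Yield the rows whose owner column equals `username`, one line at a time."""
    # QUOTE_NONE: double quotes inside a sentence are plain text, not CSV quoting.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Assumed layout: id, language code, text, owner, then optional extra columns.
        if len(row) >= 4 and row[3] == username:
            yield row


# Tiny in-memory sample standing in for the real export (made-up rows).
sample = io.StringIO(
    "1\teng\tI saw him.\talice\n"
    "2\teng\tShe said \"hi\".\tbob\n"
    "3\teng\tGoing home now.\talice\n"
)
mine = list(sentences_by_user(sample, "alice"))
print(len(mine), "sentences owned by alice")
```

With a real download, the same function can be given an open file object (opened with `encoding="utf-8"` and `newline=""`) in place of the sample.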

If I take this approach, I must download more than one million English sentences when I really need only about 30,000 right now. I guess the file would be too big to process.
Plain scraping of Tatoeba.org with Beautiful Soup looks better than that, but maybe it's possible to get the data from its source database without HTML tags?

> maybe it's possible to scrape the data from its source database without html tags?
If you have coding skills, maybe you can use the API (in beta).
https://en.wiki.tatoeba.org/articles/show/api#

That's exactly what I need, thanks a lot!

How can I get more than 1000 sentences, beyond what the Tatoeba search returns? Is there a parameter for that in the API? I tried "count=all" and "count=5000", but neither helped.

FI
Jos tiedoston lataaminen ei ole ongelma itsessään, niin senhän voisi ladata ja sitten muokata tekstitiedostona. Poistaa vaan suurimman osan datasta, niin että jäljelle jää vain haluttu määrä lauseita.
EN
If downloading the file is not an issue as such, you could do that and edit the file (as a text file) to remove most of the data, leaving behind only a usable amount.

you'd be surprised, it's only 27mb compressed and 156mb uncompressed
using the fantastic `tatoebatools` python library (made by lbdx, funnily enough) i was able to filter for my own sentences in 5 seconds
granted, your computer might be weaker than my 4-year-old core i5 laptop, but it should still take less time than figuring out the api *or* beautifulsoup, let alone running them

Where can I learn more about the 'tatoebatools' library? (I assume it lets you keep and process the whole Tatoeba database on your own computer, right?)

there's some documentation on the pypi page https://pypi.org/project/tatoebatools/
and yeah that's what it is, it handles all the downloading and csv quirks for you

Thanks! Btw, do I need to download the database just once, or every time I use the library? Unfortunately, I didn't have enough patience to wait for the database to download, but if it's just a one-time thing I can wait.

it saves it after downloading but it does redownload it if there's an update
updates are released every saturday at 6:30am utc
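For anyone scheduling their own refresh around that weekly release, here is a small sketch that computes the next Saturday 06:30 UTC after a given moment. The function name `next_export` is made up for this example, and the schedule itself is taken only from the message above.

```python
from datetime import datetime, timedelta, timezone


def next_export(now: datetime) -> datetime:
    """Return the first Saturday 06:30 UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    # datetime.weekday(): Monday is 0, so Saturday is 5.
    days_ahead = (5 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=6, minute=30, second=0, microsecond=0
    )
    # If it is already Saturday at or past 06:30, move on to next week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


print(next_export(datetime.now(timezone.utc)).isoformat())
```

Pass a timezone-aware datetime; a naive one would be interpreted as local time by `astimezone`.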

Thanks for the answer, I'll try it, too.

🍎 Here are files with only 4 fields, omitting the dates.
sentence_ID + tab + language_code + tab + text + tab + sentence_owner
🥝 All the exported sentences
http://study.aitech.ac.jp/4flds...2022-12-17.zip
228 MB
🥝 Just the English sentences
http://study.aitech.ac.jp/4flds...2022-12-17.zip
27 MB
🥝 Just the 891,125 sentences on List 907
http://study.aitech.ac.jp/4flds...2022-12-17.zip
12 MB
Perhaps one of these will help you.

Thanks a lot!
Where can I get updates with new sentences?

At the bottom of every page on Tatoeba there’s a link to the Downloads page, which is updated weekly.
https://tatoeba.org/downloads

If it is the English sentences on List 907 that you are interested in, you can download them at any time from this URL.
https://tatoeba.org/en/sentence...s/download/907
You will also have the option to include any linked Russian sentence.
Note that this export doesn't include the names of the sentence owners, though.

Hi!
I don't know whether this has been asked before, but I was wondering whether it would be possible to search for example sentences of a specific length, for example sentences that are at least 4 words long and at most 9 words long.
Thanks in advance!

It’s not possible to do that kind of search at the present time. The closest you can get is sorting by length.
It should be relatively straightforward to implement though. You should open an issue on GitHub.
https://github.com/Tatoeba/tatoeba2
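Until the search supports this, a length filter can be run on a downloaded copy of the sentences. This is a rough sketch, not how the site counts words: it assumes a tab-separated file with the sentence ID, language code, and text in the first three columns, and it counts words by splitting on whitespace, which will not work for languages written without spaces. The sample lines are made up.

```python
from typing import Iterable, Iterator, Tuple


def by_word_count(
    lines: Iterable[str], min_words: int = 4, max_words: int = 9
) -> Iterator[Tuple[str, str]]:
    """Yield (sentence_id, text) for sentences with min_words..max_words words."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed lines
        word_count = len(fields[2].split())  # naive whitespace-based count
        if min_words <= word_count <= max_words:
            yield fields[0], fields[2]


# Made-up sample lines in the assumed layout.
sample = [
    "1\teng\tI saw him.\n",
    "2\teng\tI saw him at the station.\n",
    "3\teng\tHe told me that he would not be able to come today.\n",
]
for sentence_id, text in by_word_count(sample):
    print(sentence_id, text)
```

The bounds are inclusive, so the defaults match the "at least 4, at most 9" request.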

Though not exactly what you are asking for, you could try these 2 lists.
🥝 7&8-word English Sentences with Audio (Over 115,000 Sentences)
Browse the sentences.
https://tatoeba.org/en/sentence.../show/9364/und
Or, try this pre-filled advanced search form.
https://tatoeba.org/en/sentence...roved=no&user=
🥝 9-10-11-12-13-word English Sentences with Audio (Almost 50,000 Sentences)
Browse the sentences.
https://tatoeba.org/en/sentence.../show/9355/und
Or, try this pre-filled advanced search form.
https://tatoeba.org/en/sentence...roved=no&user=
🥝 Also, here are some pages with 4-to-8 word sentences that I created in 2020 for a similar request.
http://study.aitech.ac.jp/sente...audio/x-words/