Tatoeba

AlanF_US's messages on the Wall (total 1,324)

AlanF_US, April 6, 2024 at 9:34:52 PM UTC

I think your best bet would be to go to the Downloads page. A link to this page ( https://tatoeba.org/en/downloads ) can be found at the bottom of every page. The first item on the page lets you download sentence pairs. You could download an English->German and an English->Romanian file. Then you could use the sentence IDs in the files to help you find sentences that exist in both languages. Depending on your situation, you could do this manually, with spreadsheet functionality, or with a programming language (in which case the Links download might also be useful). I don't think we're likely to provide this functionality from the advanced search page, which is already quite complex.
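For the programming-language route, here is a minimal sketch. It assumes each sentence-pairs file is tab-separated with four columns (English sentence ID, English text, translation ID, translation text); that layout, the function names, and the sample rows are assumptions made for illustration, so check the real files before relying on it.

```python
import csv
import io

def load_pairs(tsv_file):
    """Map each source sentence ID to (source text, list of translations)."""
    pairs = {}
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            continue  # skip malformed or empty lines
        src_id, src_text, _tgt_id, tgt_text = row[:4]
        entry = pairs.setdefault(src_id, (src_text, []))
        entry[1].append(tgt_text)
    return pairs

def find_trilingual(deu_file, ron_file):
    """Return English sentences that have both a German and a Romanian translation."""
    deu = load_pairs(deu_file)
    ron = load_pairs(ron_file)
    # The English sentence ID is the join key shared by the two files.
    shared = sorted(deu.keys() & ron.keys(), key=int)
    return [(sid, deu[sid][0], deu[sid][1], ron[sid][1]) for sid in shared]

# Made-up sample data standing in for the two downloaded files.
ENG_DEU = (
    "1\tHello.\t101\tHallo.\n"
    "2\tThank you.\t102\tDanke.\n"
    "3\tGood night.\t103\tGute Nacht.\n"
)
ENG_RON = (
    "2\tThank you.\t202\tMulțumesc.\n"
    "3\tGood night.\t203\tNoapte bună.\n"
    "4\tI am tired.\t204\tSunt obosit.\n"
)

rows = find_trilingual(io.StringIO(ENG_DEU), io.StringIO(ENG_RON))
for sid, eng, german, romanian in rows:
    print(sid, eng, german, romanian)
```

With real downloads, you would pass file objects opened with `open(path, encoding="utf-8", newline="")` instead of the `io.StringIO` samples.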

AlanF_US, March 21, 2024 at 12:33:32 PM UTC, edited March 21, 2024 at 12:34:48 PM UTC

Your point about a better error message that points to the search help page is a good one. However, I want to respond to this statement:

> Obviously this means I wouldn't expect to get any results because a search string like "-personally" is effectively equivalent to searching for an empty string.

Searching for an empty string should ideally find *ALL* sentences that match the other criteria (such as the source language and/or target language, if specified). The reason that this often results in an error is that it produces too many results, and something in the code is not handling that well. (For what it's worth, the error message in this case is different, but also unhelpful.) As you say, specifying "-personally" should find all sentences that don't contain the word "personally", and since this is a relatively uncommon word (at least compared to a word like "the"), the search is effectively equivalent to searching for all sentences.

If you search for an empty string and specify the "from" language as English and the "to" language as one that only has a few translations from English, like Madurese, you will see a finite number of results and no error.
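The semantics described above can be illustrated with a toy filter. This is only a sketch of how the query is meant to behave, not the site's actual search code; the function name `matches` and the sample sentences are made up for illustration.

```python
def matches(query, sentence):
    """Return True if the sentence satisfies every term in the query.

    A term prefixed with "-" excludes sentences containing that word;
    any other term must be present. An empty query has no terms,
    so it excludes nothing and every sentence matches.
    """
    words = {w.strip(".,!?\"'").lower() for w in sentence.split()}
    for term in query.split():
        if term.startswith("-"):
            if term[1:].lower() in words:
                return False
        elif term.lower() not in words:
            return False
    return True

sentences = [
    "I like tea.",
    "Personally, I prefer coffee.",
    "The train is late.",
]

# An empty query matches every sentence.
everything = [s for s in sentences if matches("", s)]

# "-personally" matches every sentence except those containing the word.
without_personally = [s for s in sentences if matches("-personally", s)]
```

Because "personally" is rare, the second result set is almost as large as the first, which is why both searches can run into the same too-many-results problem.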

AlanF_US, January 12, 2024 at 3:42:34 AM UTC

Tatoeba is a collection of (1) sentences that are meant to look and sound natural but are not necessarily limited to a particular situation and (2) links between two sentences that can match in at least one situation, but are not guaranteed to match in every respect. Once a translation is added to a sentence, the translation and the original sentence are equally important. The translation is meant to be able to stand on its own; it is not as though its only purpose is to tell you what the original sentence means. Therefore, it also has to be natural.

The English sentence "Did you work yesterday?" could be either formal or informal. There's not enough context to decide. Capitalizing "you" would violate the "look natural" requirement, especially because a capitalized pronoun is generally reserved for a deity in a monotheistic religion (as is mentioned on the linked page in Thanuir's comment). Similarly, adding "(formal)" to the text of the sentence would also make it look unnatural, and is against the site's guidelines.

The Italian sentence "Lei ha lavorato ieri?" matches at least one situation in which "Did you work yesterday?" could be used, namely to address someone formally. So they are valid translations of each other in the Tatoeba world. If someone wants to copy the pair to a place outside Tatoeba and manipulate one or both of the sentences in some way (add tags, colors, or explanatory notes) so that learners could know that they're supposed to provide the formal "Lei" where English has the formal or informal "you", they are free to do it there. But doing it inside Tatoeba would violate the way the site works.

AlanF_US, January 8, 2024 at 9:18:46 PM UTC

Is this a reference to the field labeled "Metadata" on the right-hand side of a sentence page? It contains information ("Reviewed by", "Tags", "Lists", "Sentence text", "Correctness", "Logs") about the sentence, rather than the text of the sentence and its translations. You don't need to refer to the metadata in order to see the existing sentence or its translations, or to perform operations on it (translate, edit, add to list, etc.). Therefore, metadata can be pushed off to the side, and when the window is too narrow to show it together with the sentence, it remains accessible via an extra click.

AlanF_US, January 5, 2024 at 12:18:48 AM UTC

Fixed. Thanks for reporting the issue.

AlanF_US, December 26, 2023 at 12:58:32 PM UTC, edited December 26, 2023 at 12:59:05 PM UTC

It looks fine to me. Which menu are you looking at, and what are the problems you're seeing? Did you try clearing cookies for the site? Did you try another browser?

AlanF_US, December 24, 2023 at 2:14:05 PM UTC

Did you check the favorites list immediately after you added the new sentences, or did you wait some time? I know that there are some operations that take time to process.

My advice is to write down the number of favorites before you add any more. Then, the next time you add some, put them into a list as well. If the final number doesn't match the original number plus the ones you added, let us know these numbers, as well as the list you added them to.

AlanF_US, December 14, 2023 at 4:14:22 PM UTC

Yes, and perhaps also because "mar" is a word in its own right. Unless the search engine is told to do otherwise, or is acting on a language without a "stemmer", it "stems" words by discarding sequences of letters that sometimes act as suffixes/endings from the words in both the search query and candidate sentences and determining whether there is a match. This works only imperfectly, but injecting a human's full knowledge of the words in question into the search engine may not be feasible.
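A toy illustration of suffix stripping shows how this kind of unexpected match can arise. The suffix list, the minimum stem length, and the example words are hypothetical and chosen only for the example; a real stemmer is language-specific and considerably more careful.

```python
# Hypothetical suffix list; longer suffixes are listed before shorter ones
# so that "es" is tried before "s".
SUFFIXES = ("ing", "es", "ia", "s")

def stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

def stem_match(query, sentence):
    """True if every stemmed query word appears among the stemmed sentence words."""
    query_stems = {stem(w) for w in query.split()}
    sentence_stems = {stem(w.strip(".,!?")) for w in sentence.split()}
    return query_stems <= sentence_stems

# "mares" is reduced to "mar", which is also a word in its own right,
# so a sentence containing "mar" is treated as a match.
print(stem("mares"))
print(stem_match("mares", "El mar está tranquilo."))
```

The stemmer has no way of knowing whether the stripped letters were really an ending in a given word, which is the imperfection described above.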

AlanF_US, December 3, 2023 at 5:06:39 PM UTC, edited December 3, 2023 at 5:21:31 PM UTC

Let's be clear: Amastan's account was suspended because he contributed sentences that were against the terms of use, not because he "was a hater". We can't presume to know what's in his mind, nor is it our goal to punish him for it.

I wasn't involved in the decision to choose a neutral flag for Kabyle, but I recall that after considering evidence based on external sources, not solely the opinions of Tatoeba members involved in the debate, the proposed alternative flag was seen as too new and too controversial. That decision process makes sense to me, regardless of whether one of the chief opponents of that flag has an active account on the site.

Flags are not chosen on Tatoeba as symbols of group pride, any more than an envelope or pencil icon is. They're meant to be indices that serve as a quick reference to an operation or category. The icon for my primary language, English, is the flag of the United Kingdom, a country to which I have no connection and towards which I feel neither pride nor animosity. The antagonism between my part of the world and the UK faded two centuries ago, but if Tatoeba had been around in 1776, I would have been perfectly content to work with an icon that contained the Union Jack and a language/variant abbreviation, as long as that preserved the peace.

Any pride you feel toward your language should not depend on whether an apolitical site like Tatoeba chooses your preferred symbol to represent it.

AlanF_US, December 3, 2023 at 4:50:39 PM UTC

Kabyle and Berber don't have the same flag. They have a flag based on the same icon, differentiated by the letters KAB for Kabyle and BER for Berber. There's nothing unique about that situation: other groups of languages, such as Central Bikol, Cuyonon, Chavacano, and Southern Subanen; Berom, Igbo, and Nigerian Fulfulde; and Assamese and Ho also share an icon and are differentiated by language abbreviations.

AlanF_US, November 30, 2023 at 1:00:11 PM UTC

Amastan was suspended for contributing sentences that violated the site's terms of use. The user "always" has not done that. It's not clear what you consider "vandalism": perhaps the contribution of large numbers of similar sentences? This is counter to the spirit of diversity that the site promotes, so indeed contributors should avoid it. If either of you wants to continue this discussion, please send me a private message.

AlanF_US, November 26, 2023 at 3:37:41 PM UTC

You would have more trouble finding the "cart" translation than the "car" translation, given that there are fewer than 20 sentences where "carro" is translated as "cart" and many more where it's translated as "car". But let's imagine that there are lots of sentences of the form you mentioned that refer to someone pushing a cart. There is an additional way to make it easier to filter out unwanted sentences: Use the "-" prefix before a word that you don't want. This only works for the source language, not the target language, but chances are that you could filter out sentences containing the forms of an unwanted word in the target language by filtering out sentences containing a corresponding word in the source language.

If you did find a sequence of sentences of the type you mentioned with only minor differences ("He is pushing the cart", "She is pushing the cart", etc.), and it was so long that it was driving out the other sentences you wanted, you could specify that you want the sentences listed in random order rather than the default, which tends to put short sentences first. I generally prefer random order anyway since it displays a more interesting and diverse set of sentences.

AlanF_US, November 26, 2023 at 3:13:31 PM UTC

I think that's a good change.

I see that you're using the International Phonetic Alphabet underneath words to indicate their pronunciation. While there are advantages to doing this, I believe that many of your users will not know the IPA, and it takes a fair amount of effort to learn. Furthermore, for a language like Russian, where the pronunciation can almost always be determined from the spelling plus position of stress, a pronunciation in IPA is overkill, and you'd be better off just using the regular spelling plus an acute mark over the vowel in the stressed syllable. For languages where there isn't even a question as to which syllable is stressed, providing any pronunciation simply adds visual clutter.

AlanF_US, November 24, 2023 at 10:12:27 PM UTC

It looks good, but I was confused by the fact that the user interface is translated partially into German and partially into English. If you're not going to provide separate translations into a variety of languages (and I could see why that would not be feasible in the beginning stages of your project), I think you should start by picking a single language for the user interface. This would also make it easier to produce translations in the future.

AlanF_US, November 24, 2023 at 12:07:06 AM UTC, edited November 24, 2023 at 9:41:25 PM UTC

Thanks for pointing that out. I changed the title of the section from "I would like to use Tatoeba's data for my project. How do I give proper attribution?" to "I would like to use data from Tatoeba for my project. How do I give proper attribution?" and the link started working correctly. I also saw this behavior with another section title. I think the wiki software may not be handling apostrophes correctly.

AlanF_US, November 23, 2023 at 2:27:50 PM UTC

Many thanks for your kind words, morbrorper!

AlanF_US, November 22, 2023 at 6:11:18 PM UTC, edited November 22, 2023 at 9:26:43 PM UTC

I've been doing some thinking about why politics and Tatoeba mix so badly. I took a look at the site's Terms of Use (which I encourage you also to do, as you might find them more interesting than you would think), and a few phrases jumped out at me:

- "Our decisions are independent of any political, national, religious, cultural or ideological current..."
- "We judge neither the logic nor the veracity of statements..."
- "... our will is not to censor, but to preserve the tranquility of the project..."
- "To maintain harmony within our community, we invite you to show moderation..."
- "In particular, we prohibit, but not exhaustively, the purposes of propaganda (political, religious, ideological, denial, etc.), pornography, defamation (individual or community), etc. ..."

They resonated with me because they captured the spirit that drew me to Tatoeba as much as the promise of finding a mind-blowing variety of sentences and translations in all the languages I was learning. I discovered soon enough that people at Tatoeba did not always get along perfectly, and both their sentences and their comments could reflect divisiveness. Nonetheless, the voices of calm seemed to win out more often than not. This was especially welcome to me because in an earlier phase of life, I had read political blogs obsessively, and then burned out. Even though the blogs tended to attract people from the same part of the political spectrum, who generally agreed with each other, and turned the majority of their anger against political opponents outside the blog itself, there was something exhausting about the whole atmosphere. Tatoeba appealed to me because it was a neutral zone: a place where I could get satisfaction from immersing myself in other languages, in writing sentences and translations that could help me and others at the same time. It was about language, not about battle, and I found that refreshing.

My earlier experience with that other world makes me highly attuned to the way that I feel when a sentence throws me "out of the zone". Pornographic-style sentences, or ones that glorify violence, can do it, but political sentences, including but not limited to those about groups and ideologies, have a disruptive power all their own. They instantly evoke allies and foes pitted against each other in the outer world, and challenge you not only to choose a side, but to throw yourself into an arms race. The fact that Tatoeba is built around individual sentences, not around essays where an idea can be developed and context or the various sides of a question can be addressed, makes the sloganeering even more obnoxious. And also ineffective: in a political blog, you're engaging with a self-selected group of people who are sufficiently interested in politics and current events that you might actually learn something from them. Writing political sound bites in a fundamentally apolitical environment like Tatoeba, especially on a large scale, amounts to propaganda: throwing out assertions in the hope that they will lodge in someone's mind and stick there, preferably in the subconscious. It's not whether these assertions are true, partially true, or completely false that determines whether they're propaganda: it's the way in which they're used, and more particularly the goal they're meant to serve.

It is not our job as admins to decide Tatoeba's positions on individual political questions relating to the outside world, and this is for a simple reason: Tatoeba doesn't take such positions. It *is* our job to maintain harmony on the site so that people can search for and contribute sentences in peace. Sometimes we find that a contributor has written a batch of provocative sentences, usually political, that threaten this atmosphere. One of the tools at our disposal, other than deleting sentences outright, is to mark the sentences "unapproved" so that they do not appear in default searches or downloads. When we resort to this, which we try to do as little as possible, we may or may not explain why, and if we do explain, we may not do so immediately, or as comments on the sentences themselves. It simply takes us time to formulate our thoughts, and even more time to have a group discussion among ourselves. Furthermore, we sometimes find ourselves in a position where we've already explained to a contributor why a certain kind of sentence is objectionable, and we don't see how repeating ourselves would be productive.

Sometimes we mark sentences unapproved even though we actually agree (in part or in full) with the sentiments they express. We need to do this because (1) we have to be equitable in the way we treat sentences that violate the site's terms of use, regardless of where on the spectrum they fall, and (2) we don't want to start an arms race in which the sentences posted in response are more objectionable than the original. Sometimes we encounter a person who calls us out for "censorship" because we unapprove or delete sentences they agree with, and sometimes we run into someone who castigates us for not taking action against a sentence they disagree with -- and sometimes these people are one and the same.

If you think that a sentence, or group of sentences, should be unapproved, or conversely, was unfairly marked as unapproved, your best bet is to send a private message to TatoebaAdmins. This will give us the ability to discuss and decide on the best way to approach the problem. The more public, and the more confrontational, you make your request or complaint, the greater the risk that the situation will escalate in a way that is not what you wanted.

I know that this has been a long post, and a personal one. Thanks for your patience.

AlanF_US, October 27, 2023 at 2:34:03 AM UTC

> If only one sentence has audio, that one is kept.

I can see why the algorithm works this way, from both a conceptual and a technical standpoint. The audio file name is based on the number of the sentence with which it was submitted. If the algorithm were to instead choose the lower number, the audio file would have to be renamed, which is tricky and could cause problems if something happened to the system at that moment.
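The selection rule under discussion can be sketched as follows. It assumes, based on the paragraph above, that the lower-numbered sentence is preferred when audio does not decide the matter; the handling of cases where both or neither sentence has audio, the `Sentence` type, and the function name are hypothetical and not taken from the site's code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sentence:
    id: int
    has_audio: bool

def choose_survivor(a, b):
    """Pick which of two duplicate sentences to keep when merging."""
    if a.has_audio != b.has_audio:
        # Exactly one sentence has audio: keep it, so the audio file
        # (named after that sentence's number) never has to be renamed.
        return a if a.has_audio else b
    # Assumed fallback: keep the lower-numbered sentence.
    return a if a.id < b.id else b

older = Sentence(id=10, has_audio=False)
newer_with_audio = Sentence(id=20, has_audio=True)
newer_without_audio = Sentence(id=30, has_audio=False)

kept_with_audio = choose_survivor(older, newer_with_audio)
kept_without_audio = choose_survivor(older, newer_without_audio)
```

In the first case the newer sentence survives only because it carries the audio; in the second, the older sentence is kept.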

Without knowing examples of the sentences, it's hard to know how often you see this occurrence or how likely it might be that two versions of these sentences, one correct and one incorrect, are submitted by chance. You're welcome to send me a private message with more information if you want.

Given the number of English sentences marked "@change" at any time, many of which are not particularly suited to recording, it's difficult to imagine how this could be exploited on a scale large enough to significantly affect one's ranking in terms of the number of sentences owned, either on the part of the owner of the original sentence or the owner of the one with audio. That doesn't rule out the possibility that someone would do it, but it does lower the stakes. It goes without saying that if this behavior were deliberate, it would be petty and worthy of reprimand. But I feel I don't currently have enough information to come to that conclusion.

AlanF_US, October 13, 2023 at 3:08:18 PM UTC

I hope you receive excellent care in the hospital and get well soon.

AlanF_US, August 5, 2023 at 10:36:59 PM UTC

Thanks.