Cangarejo


  • Email notifications are ENABLED.
  • Access to this profile is PUBLIC. All the information can be seen by everyone.


Member since
February 13, 2022
advanced contributor
Frequency lists of words with number of translations:
Plain frequency lists of words:
Frequency lists of characters:
Selections of random sentences:
Sentences potentially with problems:
Potentially duplicated sentences:

I use the Google query below to find public domain sentences from VOA articles. Articles and image captions by AFP, AP, Reuters, RFA, RFE, and some other news companies are not in the public domain.
"+-"RFE"

I use the Google query below to find public domain sentences from Gutenberg books containing given words. Most of the books here, but not all, are in the public domain.

I use the Google queries below to find sentences in the public domain from US governmental sites. Federal content is in the public domain but state content is not. Research papers and other third-party material are also not. Read the terms of use.
"word"|

I have a GitHub repository with scripts for processing Tatoeba dump files.

Dump files and sentence pairs can be downloaded here:
It's also possible to download lists of sentences. Each list has a page with a download button.

Translation dictionaries:

Translation corpora:

Machine translators:




Etymology dictionaries:

Frequency dictionary:


Frequency lists:

Writing assistants:

Random name generator:

Text tools:

VOA journalists:
Ade Astuti
Art Chimes
Brian Allen
Bruce Alpert
Deana Mitchell
Deborah Block
Erika Celeste
Faith Lapidus
Faiza Elmasry
George Putic
Hannah McNeish
Jeff Lunden
Jeff Swicord
Jessica Berman
Joe DeCapua
JoEllen McBride
Kim Lewis
Lenny Ruvaga
Marsha James
Mary Morningstar
Mike O'Sullivan
Parke Brewer
Ray Kouguell
Rebecca Ward
Refael Klein
Richard Paul
Rick Pantaleo
Rosanne Skirble
Shelley Schlender
Steve Baragona
Suzanne Presto
Ted Landphair
Tom Banse
Vidushi Sinha

Lists of undertranslated words:

Contribution statistics:

User, language, and sentence statistics:

Tatoeba’s Twitter page:

Tatoeba’s blog:

What’s new on Tatoeba:

Sentences rated “not okay”:

Sentences rated “unsure”:

Sentences tagged with “@change”:

Sentences tagged with “@check”:

Sentences tagged with “@check translation”:

Sentences tagged with “@change flag”:

How do I get started?

How do I get less repetitive sentences?

What kind of sentences are allowed?

Should traditional and simplified Chinese be separated?

Can I delete one of my sentences?

Does Tatoeba have too many simple, repetitive sentences?

How do I change my email address?

How do I transfer sentences from one account to another?

How do I search for exact words?

Can I mass-tag sentences?

Can I post sentences in a language I’m not a native speaker of?

How do I unlink sentences?

Can I sort sentences by difficulty?

How do I link sentences?

Can I upload a very large quantity of sentences?

How do I add audio to sentences?

Is it possible to search for sentences with a specific length?

How do I download sentences belonging to a particular user?

Can I get a list of all the sentences that haven’t been reviewed yet?

Is Tatoeba losing users?

Can I look at the most recent translations of my sentences?

Can I get a list of all the sentences that haven’t been translated yet?

Do indirect translations need to match the original sentence?

Do other projects misuse the sentences on Tatoeba?

Should there be restrictions on which names are allowed?

What is considered a sentence?

Where can I find a list of current corpus maintainers?

Has the Tatoeba corpus been poisoned?

Some public-domain translations by the US Department of State:

Lay or lie?

Reason why or reason that?

Segway or segue?

Better than I or better than me?

Snuck or sneaked?

Mr or Mr.?

Ingenuity or ingeniosity?

News is or news are?

Didn’t used to or didn’t use to?

Cite, site, or sight?

Nevermind or never mind?

Bended or bent?

Auger or augur?

Comma before “such as”?

Punctuation inside or outside quotation marks?

Which quotation marks should I use?

Which dashes should I use?

Quotation marks around nicknames?

Spaces before punctuation?

Que ou do que?

Há anos atrás?

Me a mim?

Vos a vocês?

Muitas mais coisas ou muito mais coisas?

Ouve-se sons ou ouvem-se sons?

Concordância com “maioria”:

Povos com inicial maiúscula?

Espaços à volta de travessões?

Pontuação e aspas:

Aspirantes a realizador ou aspirantes a realizadores?

Plural de CD e ONG:

Duas milhões ou dois milhões de estrelas?

¿Puntuación dentro o fuera de comillas?

Citation de vers:

La punteggiatura in italiano:

All my comments and sentences are in the public domain, assuming they were derived from content in the public domain.

VOA terms of use:

Gutenberg terms of use:

NASA usage guidelines:

NOAA policies:

FDA website policies:

MedlinePlus license:

National Cancer Institute license:

CDC policies:

USGS copyrights:

FBI / Department of Justice copyright status:

DOL / OSHA copyright information:

CIA copyright notice:

House of Representatives terms of use:

National Human Genome Research Institute copyright policy:

Department of Energy web policies:

National Archives copyright and permissions:

Department of Education copyright status notice:

NIH guidance on use:

SEC policies:

GovInfo policies:

White House policies:

National Weather Service disclaimer:

FDIC content and copyright:

Federal Reserve disclaimer:

Fish & Wildlife Service disclaimer:

Senate website policies:

Women's Health policies:

Veterans Affairs copyright policy:

CC licenses:

I do not endorse any of the articles or books from which my sentences were taken. I am in no way affiliated with any of these organizations. Let me know if you do not like any of my sentences. I might replace them.

I do not usually translate other users’ sentences, and I do not expect other users to translate my own. I am also not looking for people to proofread my sentences. Tatoeba has a huge number of sentences. If you want your sentences to ever be translated, you had better ask your friends to do it for you; otherwise, no one is ever going to even see them.

I am too busy playing with words in my little corner. I usually only translate English sentences containing words that have not been translated yet into Portuguese.

Does Tatoeba have too much negative content?

I prefer the old design of the sentence page. The new design unnecessarily buries some buttons under menus. Also, it’s not possible to change the language of a sentence and the translations become hidden when editing a sentence. The new design is also less compact on computers; it’s better on phones.

I like to disagree. Sorry about that. You will suffer my opinions. Let’s agree to disagree.

When a sentence is giving too much trouble, it’s best to just find a different one.

I prefer longer sentences, because there is more of a one-to-one relationship between the original sentence and its translation. Shorter sentences tend to have slightly different interpretations in different languages. Also, longer sentences from articles or books tend to be more interesting.

Am I a spammer? I should stop opening issues on GitHub.

You can safely ignore everything I say. Also, my corrections may be wrong. I am just an amateur. If you disagree with any one of my suggestions, you should just ignore it and remove the “@change” tag. Also, do not bother asking me questions about my corrections because I probably will not be able to answer them.

It wouldn’t bother me if Tatoeba decided one day to arbitrarily exclude certain sentences from the files exported in the Downloads section to make the files more usable in politically correct learning environments where sentences have to be flawless, as long as users still had the option to download the entire unfiltered corpus. This would be similar to what CK already does anyway.

It’s easy to filter out sentences written by nonnative speakers, or by an arbitrary selection of users, using the data exported on the Downloads page. There’s even a related option on the Search page. Some users and administrators are okay with non-natives writing sentences. Others are against it. Personally, I hope they don’t decide to start barring users from posting sentences in languages they’re not native speakers of.
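Here is a minimal sketch of that kind of filtering. It assumes the exports are tab-separated, that the user-languages file has the columns language, skill level, username, and that the detailed sentences file has the columns id, language, text, username. It also assumes that skill level 5 means "native". The function names are made up for illustration, so check the Downloads page for the actual layout before relying on it.

```python
import csv
import io

NATIVE_LEVEL = "5"  # assumption: skill level 5 means "native"

def native_pairs(user_language_rows):
    """Collect (username, lang) pairs for self-declared native speakers.

    Assumed row layout: lang, skill_level, username, ...
    """
    return {(row[2], row[0]) for row in user_language_rows
            if len(row) >= 3 and row[1] == NATIVE_LEVEL}

def keep_native(sentence_rows, natives):
    """Yield only sentences whose owner is native in the sentence's language.

    Assumed row layout: id, lang, text, username, ...
    """
    for row in sentence_rows:
        if len(row) >= 4 and (row[3], row[1]) in natives:
            yield row

def read_tsv(text):
    """Parse tab-separated text without treating quotes specially."""
    return list(csv.reader(io.StringIO(text), delimiter="\t",
                           quoting=csv.QUOTE_NONE))

# Tiny made-up sample standing in for the real dump files.
users = read_tsv("eng\t5\talice\npor\t3\talice\npor\t5\tbob\n")
sentences = read_tsv(
    "1\teng\tHello.\talice\n2\tpor\tOlá.\talice\n3\tpor\tBom dia.\tbob\n"
)
kept = [row[0] for row in keep_native(sentences, native_pairs(users))]
print(kept)
```

Inverting the test in `keep_native` would give the opposite selection, sentences by non-native speakers, which is what you would want when looking for sentences to proofread.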

I think it could be useful to be able to sort the list of all users who speak a given language by date of latest contribution.

It would be nice if I could receive notifications the same way I receive personal messages. That way, I would only need to think about notifications while I am actually using the website. Receiving email notifications in real time is somewhat stressful.

You can’t edit your own sentences after audio has been added to them.

There are over a thousand sentences in Chinese tagged with @change.

Maybe Horus could be modified to make suggestions when it finds certain words in certain languages. It could detect incorrectly capitalized words or suggest translations for names of places.

If Tatoeba ever develops a script to automatically remove spaces around em dashes, they can use it on my English sentences (not my Portuguese sentences because in Portuguese it's more common the other way around). They can also substitute round quotes with straight quotes. In Portuguese, they should substitute hyphens with non-breaking hyphens because of the way words are broken at the end of a line when the word already contains a hyphen. In French, they should substitute spaces with non-breaking spaces before punctuation.
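A rough sketch of what such substitutions could look like, one function per rule. This is only an illustration with made-up function names, not anything Tatoeba actually runs. The hyphen rule is deliberately naive: it also touches number ranges written with a hyphen.

```python
import re

NBSP = "\u00A0"        # non-breaking space
NB_HYPHEN = "\u2011"   # non-breaking hyphen

def tighten_em_dashes(text):
    """Remove plain or non-breaking spaces on either side of an em dash."""
    return re.sub(r"[ \u00A0]*\u2014[ \u00A0]*", "\u2014", text)

def straighten_quotes(text):
    """Replace curly single and double quotes with straight ones."""
    table = {0x2018: "'", 0x2019: "'", 0x201C: '"', 0x201D: '"'}
    return text.translate(table)

def protect_hyphens(text):
    """Turn a hyphen-minus between two word characters into a non-breaking hyphen."""
    return re.sub(r"(?<=\w)-(?=\w)", NB_HYPHEN, text)

def french_spacing(text):
    """Turn a plain space before ; : ! ? into a non-breaking space."""
    return re.sub(r" (?=[;:!?])", NBSP, text)

print(straighten_quotes("\u201cHi,\u201d she said."))
print(protect_hyphens("guarda-chuva") == "guarda\u2011chuva")
```

Each function would be applied only to sentences in the relevant language, for example `tighten_em_dashes` to English and `protect_hyphens` to Portuguese.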

My sentences were taken from articles and books in the public domain. Please do not translate copyrighted sentences.

You can use Wikipedia to find out what the translation for certain names is, especially names of animals or plants. Just remember to use other sources to double-check your results.

TRANG and CK ask that users not link sentences written in the same language. Personally, I thought doing so could be useful for monolinguals who want to translate expressions or sayings into the same language, or useful for linking sentences that have the same meaning, to make it easier to find indirect links that could be made direct.

If you find a sentence that needs to be changed, don’t forget to tag it with “@change” in addition to leaving a comment, so that, if the user never replies, a corpus maintainer can later find it and decide what to do. It’s better to write “@change” in a tag than in a comment. Also use the review feature.

Maybe the list-of-all-members page should show which users have been active in the past 24 hours instead of a list of who made the latest 200 contributions. I’m also curious about who has been most active in the past week, month, and year. The rate of contributions keeps growing.

It would be nice if users were automatically assigned different-colored avatars when joining the website. That way it would be easier to distinguish between new users that have not yet uploaded an avatar. Like @brauchinet’s avatar.

In theory, it should be easy to implement a feature that allows excluding multiple users from the search results, but what you can do right now is just create a list of friends and only translate sentences belonging to those users. It should also be simple to make changes to the Advanced Search page to allow only searching for sentences written by non-native speakers when looking for sentences to proofread.

Users want their free translations, so there’s going to be friction if the sentences in the search results are more likely to belong to some users than to others. The probability that someone will translate your sentences diminishes as Tatoeba grows in size.

When adding tags to a sentence, the autocompletion feature tries to find existing tags with the same prefix. It would be nice to have a similar feature on the Advanced Search page when looking for sentences with certain tags, and also on the Members page when looking for users.

You can’t edit another user’s sentences, but you can add alternate translations.

If you’re going to write a script in Python to calculate the novelty of sentences, you should probably exclude words that are not accepted by spellcheckers because those words are probably untranslatable.

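from re import findall
from spellchecker import SpellChecker  # from the pyspellchecker package
spellchecker = SpellChecker(language="en")
" ".join(word for word in findall(r"\w+", sentence.lower()) if not spellchecker.unknown([word]))  # keep only words the spellchecker knows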

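One possible way to turn that filter into a novelty score is sketched below. "Novelty" is taken here to mean the number of distinct words in a sentence that are not yet in a set of already-seen words; that definition, the function name, and the `is_real_word` hook (where a spellchecker test could be plugged in) are all assumptions made for illustration.

```python
from re import findall

def novelty(sentence, seen_words, is_real_word=lambda word: True):
    """Count distinct words in `sentence` that are absent from `seen_words`.

    `is_real_word` is a hypothetical hook for a spellchecker test; by
    default every word is accepted.
    """
    words = set(findall(r"\w+", sentence.lower()))
    return sum(1 for word in words
               if word not in seen_words and is_real_word(word))

seen = {"the", "cat", "sat", "on"}
print(novelty("The cat sat on the windowsill.", seen))
```

Passing something like `lambda word: not spellchecker.unknown([word])` as `is_real_word` would exclude the words a spellchecker rejects.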
If you think you have found hate speech, don’t forget to tag it with “hate speech”.

Tatoeba should allow different accounts to use the same email address.

Users who complain about the quality of content on Tatoeba just need more features to be able to sort and filter sentences in the search results.

I think it could be useful to be able to sort sentences by number of translations, where translations into the same language count as a single translation.
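As a sketch of that ordering, the function below counts distinct target languages per sentence and sorts by that count. It assumes you have already loaded a mapping from sentence id to language and a list of (sentence id, translation id) link pairs that contains each link in both directions. The function name and the sample data are made up.

```python
from collections import defaultdict

def rank_by_translated_languages(sentence_langs, links):
    """Return (sentence_id, language_count) pairs, most languages first.

    sentence_langs: dict mapping sentence id -> language code
    links: iterable of (sentence_id, translation_id) pairs
    Several translations into the same language count once.
    """
    langs = defaultdict(set)
    for source, target in links:
        if source in sentence_langs and target in sentence_langs:
            langs[source].add(sentence_langs[target])
    counts = [(sid, len(langs[sid])) for sid in sentence_langs]
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))

sentence_langs = {1: "eng", 2: "por", 3: "por", 4: "fra", 5: "eng"}
links = [(1, 2), (1, 4), (5, 2), (5, 3), (2, 1), (2, 5), (3, 5), (4, 1)]
ranking = rank_by_translated_languages(sentence_langs, links)
print(ranking)
```

In the sample, sentence 5 has two Portuguese translations but still counts as translated into one language, so it ranks below sentence 1.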

Maybe the users’ reviews file, users_sentences.csv, should exclude outdated reviews.
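A small sketch of one way to do that, treating a review as outdated when it was made before the sentence was last modified. That interpretation is an assumption, as are the function name and the tuple layout, which do not necessarily match the columns of the exported file.

```python
from datetime import datetime

def current_reviews(reviews, sentence_modified):
    """Keep reviews made at or after the sentence's last modification.

    reviews: iterable of (username, sentence_id, rating, reviewed_at)
    sentence_modified: dict mapping sentence id -> datetime of last edit
    Reviews of sentences missing from `sentence_modified` are kept.
    """
    kept = []
    for username, sentence_id, rating, reviewed_at in reviews:
        modified = sentence_modified.get(sentence_id)
        if modified is None or reviewed_at >= modified:
            kept.append((username, sentence_id, rating, reviewed_at))
    return kept

modified = {10: datetime(2023, 5, 1), 11: datetime(2023, 1, 1)}
reviews = [
    ("alice", 10, 1, datetime(2023, 4, 1)),   # older than the last edit
    ("bob", 11, -1, datetime(2023, 2, 1)),    # newer than the last edit
]
print([review[0] for review in current_reviews(reviews, modified)])
```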

The sentences I’ve been posting on Tatoeba aren’t messages I’m trying to get across. They’re just the first sentences I come across that meet certain criteria, such as containing yet-untranslated words, having a certain length, making sense out of context, being easy for me to translate, not being too negative or violent, not criticizing people, companies, or countries, and being in the public domain. I don’t really care about famous books or authors.

Maybe the vocabulary request page should allow sorting words by frequency according to frequency lists available on the internet. Also, it could be useful to be able to download the full list for each language and for corpus maintainers to be able to delete nonwords.

It’s easy to say a translation is bad. It’s not as easy to say how it could be better.

It would be useful if there were a character count when adding or translating sentences.

Using foreign words or untranslatable words in sentences is probably a bad idea. I think I might start avoiding sentences with names.

It seems the textbox for leaving comments on sentences is missing a checkbox for users to choose whether they want their comments to show up on the discussion page.

It’s probably a bad idea to get into arguments on the Wall or the discussion page.

I forgot the password for my previous email account, so I may not have seen your messages.

If people really dislike some of your sentences, they’re not going to want to translate any of your other sentences. You will be canceled.

Some users leave comments on sentences just to send messages to other users on the discussion page, so it’s probably pointless to correct other users’ sentences. The sentences might be messages or the comments might be the messages. It's probably more worthwhile to translate new sentences than to correct minor details in old ones. Text data are meant to be dirty. Tatoeba is more interested in translations than in corrections.

I disagree with @Idbx that contributions need to be “balanced”. Users should be allowed to contribute the way they want to and it’s up to Tatoeba to recommend lists of sentences. Tatoeba is like Twitter but grammatical and with translations. Some groups are interested in being able to communicate a set of messages, not in teaching or learning a language.

I disagree with @maaster that users should be allowed to keep sentences that are incorrect. If a native speaker tells you a sentence is wrong and how to fix it, the sentence needs to be changed.

It’s not possible to delete a review after a sentence has been deleted.

When someone changes the language of a sentence, that isn't registered in that sentence's history.

Tatoeba should include more language learning features.

Tatoeba was down October 14 and 15, 2023.

Tatoeba has a hierarchy of users. From top to bottom, there are “administrators”, “corpus maintainers”, “advanced contributors”, and “contributors”. “Advanced contributors” and above can add tags to sentences. “Corpus maintainers” and above can edit other users’ sentences. It’s written at the top of a user’s profile page what their current rank is.

All the sentences on Tatoeba are under an open license. You are free to make a copy of any of them and edit it. To show that a sentence was derived from another, just make it a translation of the other sentence and then unlink the sentences. The link will appear in the log.

I’ve added too many “to-days” to Tatoeba to be bothering other users about extra or missing hyphens or spaces.

Tatoeba is mostly used to build datasets of translations. The datasets can be used for learning, but on software like Anki or on other websites.

As long as your translations sound natural and are faithful to the original sentences, punctuation shouldn’t matter much.

The policy of writing original sentences in your native language and translating sentences from non-native languages has the consequence that almost everyone will be translating from English and almost no one will be translating from other languages.

Tatoeba and the organizations that use Tatoeba’s data probably want sentences to be as neutral as possible, so users don’t feel put off by the sentences. Political messages are probably inadvisable, because messages like that are never able to please all sides on an issue. And, anyway, Tatoeba is for spreading languages, not for solving disputes.

Anything you post on the internet will probably be studied by artificial intelligences.

That annoying paperclip from Microsoft Office might be my cousin.

0x002D - hyphen-minus, dual purpose character
0x00AB « left-pointing double angle quotation mark
0x00B0 ° degree sign
0x00B2 ² superscript two
0x00BB » right-pointing double angle quotation mark
0x00D7 × multiplication sign
0x2010 ‐ hyphen, joins words
0x2011 ‑ non-breaking hyphen, prevents word wrapping
0x2012 ‒ figure dash, used for numbers (phone numbers, sports scores)
0x2013 – en dash, used for number ranges (time, year intervals)
0x2014 — em dash, signals interruptions
0x2015 ― quotation dash, used for dialog
0x2018 ‘ high six quotation mark
0x2019 ’ high nine quotation mark
0x201A ‚ low nine quotation mark
0x201C “ high sixty-six quotation mark
0x201D ” high ninety-nine quotation mark
0x201E „ low ninety-nine quotation mark
0x2082 ₂ subscript two
0x2192 → right-pointing arrow
0x2212 − minus sign
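If you want to check which of these characters a piece of text actually contains, Python's standard `unicodedata` module can print the official Unicode name for a code point. The helper name `describe` is made up. Note that the official names sometimes differ from the informal labels in the table; for example, 0x2015 is officially HORIZONTAL BAR.

```python
import unicodedata

# The hyphen, dash, and minus code points from the table.
CODE_POINTS = [0x002D, 0x2010, 0x2011, 0x2012, 0x2013, 0x2014, 0x2015, 0x2212]

def describe(code_point):
    """Return 'U+XXXX <char> <official Unicode name>' for one code point."""
    char = chr(code_point)
    return f"U+{code_point:04X} {char} {unicodedata.name(char)}"

for code_point in CODE_POINTS:
    print(describe(code_point))
```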

