Tips

Here you can ask how to use Tatoeba, report bugs or oddities, or simply chat with the rest of the community.

Before asking a question, make sure you have read the FAQ.

Wall (4639 threads)

Ricardo14
yesterday
** A reflection about posting sentences on Tatoeba **

As everybody knows, Tatoeba's mission is to get sentences and translations from as many languages as possible. For now, 300+ languages are supported here - and this is the point I'd like to cover now.

Take countries, for example. In English we'd say "the United States (of America)", "Russia", "France", "Germany", "Portugal", "Japan", etc. (there are over 200 countries). Are we "allowed" to post one sentence - and translate it - for each country? I mean:

I'm from Japan.
I'm from Portugal.
I'm from Germany.
I'm from France.
I'm from Russia. etc.

They would be near-duplicate sentences in English, but for most languages it would help - A LOT - to study them.

Take Portuguese, for example:

(Eu) sou **dO** Japão.
(Eu) sou **dE** Portugal.
(Eu) sou **dA** Alemanha.
(Eu) sou **dA** França.
(Eu) sou **dOS** Estados Unidos. etc

"de" is a preposition and, as in many languages, we "link" it with the article (o, a, os): de+o=do, de+a=da, de+os=dos.
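
A rough sketch of that contraction rule in Python, in case it helps to see it spelled out. The small country table and the function name are made up for illustration and cover only the examples above (plus de+as=das for feminine plural names):

```python
# Contractions of the preposition "de" with the definite articles.
CONTRACTIONS = {"o": "do", "a": "da", "os": "dos", "as": "das"}

# Article each country name takes; None means no article (plain "de").
# Illustrative table only, limited to the countries mentioned in this post.
COUNTRY_ARTICLE = {
    "Japão": "o",
    "Portugal": None,
    "Alemanha": "a",
    "França": "a",
    "Estados Unidos": "os",
}


def i_am_from(country):
    """Build "Sou de/do/da/dos <country>." for a country in the table."""
    article = COUNTRY_ARTICLE[country]
    preposition = CONTRACTIONS[article] if article else "de"
    return f"Sou {preposition} {country}."
```

The rule itself fits in one dictionary; what a learner has to memorize is the article that goes with each name.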

I'm also asking because many people - me included - wouldn't be able to think of a sentence that uses the given words.

OsoHombre
19 hours ago
I agree with you, Ricardo. There are approximately 190 independent sovereign countries, each with its own name. If I write 190 sentences following the grammatical structure of 'I am from {country name}', I might have 190 sentences with exactly the same structure but with valuable linguistic information about the country names mentioned in the sentences, as this would help the reader-learner-translator learn how each country is called in a given language and how these country names are affected by the language's grammatical rules.

And, I'd like to remind everybody here that that issue of nearly-identical sentences is dead and buried. I no longer bother about that and I'd like to ask any normal human being that's a member of Tatoeba not to worry about that, too. I think that we're pretty free to write any sentence that's grammatically correct and meaningful as long as our intentions are to help the project, not to intentionally destroy it or affect its quality. Therefore, people contributing nearly-identical sentences for more linguistic information are not dumb and should not feel they're doing a bad thing.
Aiji
13 hours ago
We've already been through that several times.
I'll just quickly make a point. Again, please do not confuse the purpose of the tool with how to use the tool. Tatoeba is supposed to be a corpus of sentences, not a website to learn a language. If you think that it is helping your learning of languages, that is good, but please do not forget that other people share the same tool, and therefore may have different goals. Claiming full liberty in the name of learning a language is doing to others what you don't want others to do to you.

Now, specifically for country names: again, a corpus of sentences and a corpus of words are two different things. Dictionaries are there to help find the translation of a specific word. Tatoeba is not a dictionary. I'm pretty sure that no language in the world has 200 grammatical rules, one for each country name... Therefore, for a corpus of sentences, the value of having the same sentences with 100 country names over having them with ten names is pretty much zero. However, the value of having 10 country names over only 2 is huge; I agree with you on this point.
Ricardo14
8 hours ago
> Now, specifically for the countries name, again, a corpus of sentences and a corpus of words are two different things. Dictionaries are here to help finding the translation of a specific word. Tatoeba is not a dictionary.

I hadn't seen it that way. Thanks for that.

> However the value of having 10 country names over only 2 is huge, I join you on this point.

Thanks.
AlanF_US
7 hours ago
I agree with Aiji. Some additional points of my own:

Tatoeba should try to do what it is best at, not what it is not suited for.
Individuals should use Tatoeba for what it is best at.
Individuals should use whichever tool is best suited for whatever immediate task is at hand.

People learned languages long before Tatoeba, and indeed long before the Internet Age. In the old days, if you wanted to learn French, you used a textbook. Somewhere in that book, there was a lesson on how to write phrases with country names. It presented an explanation (the fact that country names have gender and number, etc.), lists of countries with masculine, feminine, and plural names, sample sentences, and exercises. The good ones even contained pictures. The fact that a textbook was written by a small number of knowledgeable people, proofread, and published, helped ensure that the explanation and the sentences were coherent and of high quality.

Nowadays, there are resources on the Internet that do pieces of what textbooks have always done, although in a more scattered form, so you can find the equivalent of the textbook country name explanation online (usually together with some advertising), at least for major languages.

Tatoeba allows people all over the world to collaborate on the "sample sentence" part of a textbook, and furthermore builds links between hundreds of different languages. I don't need to tell anyone here all the wonderful advantages that that brings. However, Tatoeba cannot and should not try to replace textbooks (printed or electronic), or electronic references. Unless you're a very young child, whose brain is wired to construct rules entirely from examples, you need to learn the concepts. There's only so far that tags and comments on individual sentences can take you if you don't start with an explanation.

>> I'm pretty much sure that no language in the world has 200 grammatical rules, one for each country name... Therefore, for a corpus of sentences, the value to have the same sentences with 100 country names over having them with ten names is pretty much zero.

Well put. Furthermore, I would say that in many ways, Tatoeba would decrease its value by covering 100 country names rather than 10, because it depends on human readers and translators. Let's say that I want to find sentences in language X that contain the phrase "I live in". Or let's say I want to translate the most recent sentences, or the most recent sentences in language X, or the most recent sentences by contributor Y. If contributor Y has just contributed 100 sentences of the form "I live in" in language X, every one of those lists will be watered down by the same mind-deadening, content-poor material. While it's true that you increase the chances that someone looking for a sentence with specific content will find it, you will also increase the chances that someone who is looking for something broader or deeper will need to filter out all the unwanted noise.
Ricardo14
yesterday
** Adding languages **

This has been intriguing me a little bit

For now, whenever someone wants Tatoeba to support a certain language, this user has to
1) get the
-ISO 639-3 code
-an icon that would represent this language better
-a link on wikipedia that talks about this language

2) send the information above to me, Nuno and cueyayotl

3) wait

In three simple steps, we might have lost some members who could have helped us by posting sentences - even more so in endangered languages. This can even be harmful - "why is *** language supported and mine isn't?" / "it's taking too long to have my language on Tatoeba", etc.

Suggestion

During the next weeks, I will find people who speak these languages and ask them to
1 - create an account at Tatoeba and start posting sentences (I've already done that). However, if it doesn't work,
2 - I'd post his/her sentences under my account after checking them in forums, on other websites, etc.

All that so that Tatoeba supports as many languages as possible.

Suggestions
Selena777
21 hours ago
Maybe it's better to create a special account for every "indirect" contributor? I.e. "All sentences added from this account were created by Mr./Mrs N, who is a native speaker of X language. As long as Mr./Mrs N declines to create his/her own account and post sentences by him/herself (lack of time, bad internet connection and so on), this account will be maintained by Ricardo14." If there are a few such "indirect" contributors in the same language, it will be much easier to separate their contributions.
Ricardo14
13 hours ago
It's a good idea, but I wonder if we can "speed up" this process by already having such sentences on Tatoeba.
Aiji
13 hours ago
Isn't that the purpose of the "other language" flag?
I think it already happened in the past, a user added sentences with the question mark flag, and in his profile it was written he was waiting for approval of his language.
Ricardo14
13 hours ago
@Aiji

Yes, you're right. My point is: Can't we "speed up" this process by adding some languages before a new user requests it?
brauchinet
12 hours ago - 9 hours ago
I'm not convinced that it's really a good idea to speed up the process.
We now have more than 100 languages with fewer than 30 entries (many of doubtful quality, we may guess).
I find this is even more deterring for potential contributors than not having a language at all.
What would you think of a website when your language is represented by only 10 sentences, one of them perhaps something like Good day dear landwoman?

Maybe, on the contrary, it would be better to require at least say 100-200 sentences, before a new language is added.
This would ensure that the contributor (ideally a native speaker of course) has a serious interest in the language. Remember that the possibility to add sentences in "exotic" languages will likely - as it did in the past - attract non-natives. Their sentences remain uncorrected, and when resources of that language on the internet are scarce, these low quality sentences will find their way into all kinds of other language websites. So maybe we are not even doing those languages a favour by just providing a platform without any requirements in quantity (and quality).
Ricardo14
8 hours ago
> We have now more than 100 languages with less than 30 entries (many of doubtful qualitiy, we may guess).

They're checked by either me or cueyayotl. I ask my friends (most of them natives) whether those sentences are correct, and I even ask on forums.

> What would you think of a website when your language is represented by only 10 sentences, one of them perhaps something like Good day dear landwoman?

I don't think that happens here. I've never seen that before, and neither has cueyayotl, who "controls" which languages get added. Many weren't added because they were either not sentences or don't have an ISO 639-3 code - https://tatoeba.org/eng/sentenc...ne/indifferent

> Remember that the possibility to add sentences in "exotic" languages will likely - as it did in the past - attract non-natives.

Is it really that risky? I'm sure that we can bring in people who speak those languages. Some of my friends will join Tatoeba in the coming days, and their languages have not been added to Tatoeba yet.

> Their sentences remain uncorrected, and when resources of that language on the internet are scarce, these low quality sentences will find their way into all kinds of other language websites.

We've been cleaning that up. I've asked people to adopt/delete sentences from people whose accounts are set as inactive or who were banned. It's a small effort so far, I know, but we're taking care of that.

For sure I want a sentence like "I don't have a car" instead of "I have car not" or "I haven't haven't have car not" or "I don't not have car", but I believe we're already doing that every day, all the time.

Just thoughts...
beggi
4 days ago
Adding all the vowel marks (diacritics) to Arabic words seems essential for those who, like me, want to learn this language here. Can something be done about this?
Gulo_Luscus
4 days ago
First of all, welcome. If you write about these matters in English, you will get clearer and more definite answers from the admins or from other members interested in Arabic. Since I have no knowledge of this language I won't be able to help, but I can write up your problem in English.


beggi asks if it is possible to add all Arabic diacritics to words, for they are needed to learn the language.

odexed
4 days ago - 4 days ago
> add all Arabic diacritics to words

This is not how native speakers write in Arabic. Newspapers, books (except the Holy Quran) are written without diacritics.
beggi
4 days ago
Sure, they don't add them since they already know them. But I am not a native Arabic speaker, and I want to know how to write ...
odexed
4 days ago
The point here is that Tatoeba prefers the most natural form for the sentences.

> We want sentences that a native speaker would actually use.
http://en.wiki.tatoeba.org/arti...how/guidelines
beggi
4 days ago
Not a practical preference for Arabic learners like me... Thank you for the answers.
OsoHombre
18 hours ago
I wish diacritics were systematically added. They are systematically used in the Quran to indicate how to precisely read the Holy Book, because any error in the pronunciation could change the meaning of a word, and misinterpretation of the meanings of sacred texts might be a very bad thing. However, diacritics aren't systematically used in normal writing (media, books, etc.) and this is one of the major obstacles for the promotion of the learning of Arabic by non-natives.
OsoHombre
18 hours ago
It is true that Arabic diacritics are very helpful for Arabic-language beginners. However, it is difficult to oblige ordinary contributors to systematically 'diacritize' (add tashkil = add diacritics) every single word they write. It's time-consuming.

Yet if we want to make Tatoeba a really helpful website for Arabic-language learners, we should think about a system that I've observed is used for Chinese: the feature that lets people read a sentence in Chinese pinyin (Latin phonetic alphabet for Chinese) in addition to reading it in Chinese characters. It would be great if Tatoeba adopted the same system for Arabic with a feature that shows the same sentence with Arabic diacritics, but I think that, contrary to the feature used for Chinese pinyin, this cannot be automatically done for Arabic, because Arabic diacritization represents vowels (and sometimes tense consonants), but the occurrence of many of these vowels (especially at the end of an Arabic word) is governed by grammatical rules.

Therefore, developing a program that automatically diacritizes Arabic sentences means that the program has to know Arabic grammatical rules. This is not only very complex to prepare, but its complexity would also give way to many potential errors that the program would be making, and this would need years of continuous improvement before we see that program perform its function properly on Tatoeba, whew!!! Just talking about that makes me tired. It has to be done by someone, somewhere, some time, but I think that Tatoeba's volunteers aren't prepared for carrying out this from start to finish.

Another solution would be having Arabic sentences diacritized by Tatoeba contributors. But before a sentence is diacritized, it needs to be really correct, because this diacritization involves literally re-writing the whole sentence by adding the diacritics manually. In this case, a sentence really has to be correct to be worth re-typing.
One last solution that has just crossed my mind is the transliteration of Arabic sentences into the Latin alphabet. There is already a Latin alphabet that's used by specialists to transliterate Arabic words and names (especially proper nouns), similar to the one used by Encyclopaedia Britannica (look at the map on this page: https://www.britannica.com/place/Saudi-Arabia). This is a partial, quick, and practical solution that would give people an idea of how to read an Arabic sentence or phrase, but the accuracy of such a transliteration system also needs to be checked by a native speaker (especially the word endings that are determined by grammatical rules).


Pfirsichbaeumchen
4 days ago
[Suggestion]

I agree it is a hindrance for Arabic beginners, who need diacritics to learn. Japanese and Chinese come with Furigana and Pīnyīn, respectively. It would certainly help to have a feature to display (or not display) full diacritics for Arabic.
odexed
3 days ago - 3 days ago
The problem here is that it's not possible to place diacritics correctly automatically. And I don't think anybody would contribute sentences with a full set of diacritics (most Arabic sentences here, except for those from our experienced contributors, don't even have proper punctuation).
Pfirsichbaeumchen
3 days ago
Naturally people would have to go to the trouble of adding them manually. It is the same problem with Japanese. No machine can do that job reliably, yet the feature exists and is used by some.

I think the same was suggested for Hebrew a while ago, and there was the same resistance.
odexed
3 days ago - 3 days ago
I don't resist, I'm just pondering your suggestion. I myself am a student of Arabic and I know how hard it is to read it when you don't know the words yet. But I think it's bad to learn something wrong (I wouldn't trust this feature). In Arabic diacritics differ for the words depending on their position in the sentence and on the previous words so it's not very useful to only know how to pronounce standalone words.
I'd rather suggest making audio recording simpler (just clicking a button, without having to use any programs). This way people could contribute audio as fast as they translate, so we wouldn't have this problem.
OsoHombre
18 hours ago
@odexed

Yes, I'm with you regarding making audio recordings simpler. I've read about that and it's a real conundrum (peppered with some bureaucracy).
odexed
3 days ago
By the way, I think your suggestion would make more sense for Russian (if somebody needs the feature for displaying the accents).
AlanF_US
3 days ago
Another issue worth thinking about is whether characters with diacritics are treated like characters without them for the purpose of search, the same way that uppercase characters are treated like lowercase characters. This has to be set up manually. I did this for Russian with the stress mark (acute sign), and I also did it for Hebrew with the vowels and final consonants. Currently, there's nothing like this in place for Arabic.
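
To make the idea concrete, here is a minimal Python sketch of that kind of folding for Arabic. It is not how Tatoeba's search is actually configured; the function name is hypothetical, and it assumes that only the eight basic tashkil marks (U+064B to U+0652) should be ignored:

```python
# The eight basic tashkil marks: fathatan, dammatan, kasratan,
# fatha, damma, kasra, shadda, sukun (U+064B through U+0652).
TASHKIL = {chr(code_point) for code_point in range(0x064B, 0x0653)}


def strip_tashkil(text):
    """Return text with the basic Arabic vowel marks removed.

    Base letters (including hamza and madda forms) are left untouched,
    because the string is filtered character by character without any
    Unicode decomposition.
    """
    return "".join(ch for ch in text if ch not in TASHKIL)
```

Applied to both the indexed text and the query, this would let an undiacritized query match diacritized sentences. The trade-off is that words differing only in their marks become indistinguishable to the search.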
odexed
3 days ago
I wouldn't change the current settings because sometimes diacritics in Arabic are indispensable because they could change the meaning.
For example, دَرَسَ means to study something and دَرَّسَ - to teach someone
OsoHombre
18 hours ago
@odexed:

Another example:

نَزَلَ - to descend, to go down

and:

نَزْل - hotel

OsoHombre
18 hours ago
@AlanF_US

This is another problem, indeed. A user might be frustrated if they can't find an Arabic word in a search engine simply because the word isn't diacritized properly. However, there is another risk. The simplest solution would be to have the search engine ignore diacritics, but in that case the search might yield many irrelevant results. For example, the word 'درس' may mean either:

دَرَسَ - to study

or:

دَرْس - lesson


odexed
18 hours ago - 16 hours ago
If I don't know how to put diacritics on a word I can always look it up in my dictionary. On the other hand, I'd get frustrated while looking for examples with "دَرَّسَ" if there are thousands of sentences with "دَرْس" and "دَرَسَ".
Ricardo14
3 days ago
+1

Tatoeba is also a big tool for learners. There should be a way to do so, I think.
OsoHombre
18 hours ago
@Pfirsichbaeumchen:

If there are any concrete suggestions, then I volunteer to take part in them. After all, I'm here for the promotion of the Arabic language (although I don't have much free time at any moment of the year and I may be absent for prolonged periods at a time).
Tirifto
29 days ago
Hello! I have read «preni la buson» ("to take the bus") in several Esperanto sentences. That sense of «preni» seems to be an idiom from certain languages; neither PIV nor ReVo mentions it. Is it really appropriate?
PaulP
2 days ago
I only just saw your comment. I believe a more suitable place for it would be under one of the sentences concerned. I personally also consider the expression „preni la buson” best avoided and use „veturi per buso”, „uzi la buson”.
arh
3 days ago
As I got some requests regarding problems with recording sentences, I have just added a short troubleshooting section to the guide.
Please remember that you may access the latest version using this link:
https://mega.nz/#F!9cEixIAL!A5Al-5OWAzB_VTh0fBl0Ig
I also left the original document (in .odg format) in case anyone would like to translate or adapt it to his/her native language.
May I once again encourage you to record all your sentences, so that the users can not only read them, but also listen to them.
odexed
2 days ago - 2 days ago
With all due respect, I would like to ask you a question. Do you really think that such incomplete sentences as
#5967339 - Porque lo quieres (Because you want it)
#5963710 - Quiero regalar (I want to give)
#5965211 - Comprar verduras y carne. (To buy vegetables and meat)
#5965212 - Echar sal (To rub salt)
may be helpful?
arh
2 days ago
With all due respect, I do.
And, with all due respect, I don't feel this particular thread was the appropriate one for your comment.
But you know what? I'm too fed up and busy to keep on arguing with you about the quality of my work.
So, you won't have to worry any longer about my contributions.
I am done.
deniko
2 days ago
Thanks arh, that's a very good guide. Maybe I'll start recording audio one day too.
Hybrid
18 days ago
Firefox tells me that Tatoeba's login is unsafe.
AlanF_US
16 days ago
I tried logging out and in again in Firefox, but I didn't see any warning. Could you elaborate? I suppose that one of your Firefox settings could be responsible, but my security settings are pretty high, and I'm not seeing any problem.
Hybrid
11 days ago
AlanF_US
11 days ago
That message should be displayed if you're trying to log in from an address that does not start with "https:". The login page that I use has this address:

https://tatoeba.org/eng/users/login

Which address are you using?
Hybrid
5 days ago - 5 days ago
Thanks. I use tatoeba.org. It doesn't give the warning if I use https://tatoeba.org, but that takes much longer to type.
AlanF_US
4 days ago
I wrote a ticket about this:

https://github.com/Tatoeba/tatoeba2/issues/1457

Is it possible that Firefox is pulling up an old instance of the http address when you type "tatoeba.org" (no quotes) into the address bar? What happens if you clear all the instances of addresses that contain the string "tatoeba.org" from the list of suggestions? (You would use the down arrow to select a suggestion, then press the delete button to delete it.)
Hybrid
3 days ago
Thanks. It seems to use https now. So maybe that was the problem.
baudelaire
3 days ago
Does Tatoeba have an export option? Checking the downloads link, it looks like only pre-made exports are available. Is it possible to create custom exports?
odexed
3 days ago
Just curious, what kind of exports are you interested in? Could you please elaborate?
baudelaire
3 days ago
Just make an arbitrary search and export it in the detailed format.
odexed
3 days ago
What do you mean by a detailed format?
baudelaire
3 days ago
https://tatoeba.org/eng/downloads

Fields and structure
1. Sentence id [tab] Lang [tab] Text
2. Sentence id [tab] Lang [tab] Text [tab] Username [tab] Date added [tab] Date last modified

AlanF_US
3 days ago
You can create a list (via "Browse/Browse by list") and then manually add items that you find during a search to it. Then you can use the "DOWNLOAD THIS LIST" button in the right sidebar to export the list (provided it contains no more than 100 sentences). Optionally, you can include the sentence ids and translations into a selected language. The format is similar to the first one you mentioned. Specifically, it looks like this:

Sentence id [tab] Text [tab] Translation

where "Sentence id" and "Translation" are optional.

There is no option to export a custom list in detailed format, but you can always download the entire list of sentences in the corpus in detailed format and use that along with your custom list. You can load the full detailed list (~560 MB) into a text editor on your computer (provided you have enough memory). You can also extract information from it via utilities or programs, depending on how comfortable you feel with them. I have been experimenting with a Python package called csvkit that includes both a ready-made program that can be called from the command line and a set of utilities that can be called from Python. It works fine with the nondetailed format, but after some processing, it often produces an error related to character encoding when trying to read our detailed format. Thus, when I need information from the detailed download, I load it into a text editor.
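
For anyone who would rather script this than open the file in an editor, here is a hedged sketch using only Python's standard csv module. It assumes the detailed download is UTF-8, tab-separated, and unquoted, with the six fields listed on the downloads page; the function name, the column keys, and the file path in the usage comment are my own placeholders:

```python
import csv

# Column keys for the detailed format (names chosen for this sketch).
FIELDS = ["id", "lang", "text", "username", "date_added", "date_modified"]


def select_detailed_rows(path, wanted_ids):
    """Return detailed rows whose sentence id is in wanted_ids (a set of strings)."""
    rows = []
    # errors="replace" keeps one stray byte from aborting the whole read;
    # QUOTE_NONE treats quotation marks inside sentences as ordinary text.
    with open(path, encoding="utf-8", errors="replace", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for record in reader:
            if len(record) != len(FIELDS):
                continue  # skip malformed lines rather than guessing
            row = dict(zip(FIELDS, record))
            if row["id"] in wanted_ids:
                rows.append(row)
    return rows


# Usage (path and ids are placeholders):
# rows = select_detailed_rows("sentences_detailed.csv", {"1234", "5678"})
```

The ids could come from the first column of a downloaded custom list, which would give the detailed fields for exactly those sentences. I have not verified that this avoids the encoding error I saw with csvkit.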
Joutsentaika
9 days ago
I have just created an account, and now want to add languages to my profile. I keep getting internal errors. Is this a common thing, a transient thing, an annoying occasional thing . . .? Is there anything I can do about it?
AlanF_US
9 days ago
Could you please give an example of what you're doing and the errors you're seeing?
Joutsentaika
9 days ago
My profile > Languages > ADD A LANGUAGE
Select Albanian from dropdown menu.
Click radiobutton 1: Beginner.
Click ADD LANGUAGE

Result:

An Internal Error Has Occurred.
Error: An Internal Error Has Occurred
AlanF_US
9 days ago
I was able to do it successfully three times in my browser (Firefox on Windows 7). What browser and operating system are you using?
Ricardo14
8 days ago
It happened in my Chrome on Windows 7

http://prntscr.com/elgdi7
AlanF_US
7 days ago - 7 days ago
Thanks, Ricardo. I find that if I go to the URL in your screenshot (https://tatoeba.org/eng/users_languages/save), I see the issue on Windows 7 in both Firefox and Chrome. However, I don't see it when I go through the steps that Joutsentaika described on Windows 7 in either Firefox or Chrome.
Ricardo14
7 days ago - 7 days ago
I followed the same steps, basically

1> Click on "My Profile";
2> From there, click on "Add a language"
3> Choose a language (I got "Scottish Gaelic") and the level ("0 - Almost no knowledge")
4> Click on "Add language" http://prntscr.com/em04ty
Error - http://prntscr.com/em052b - http://tatoeba.org/eng/users_languages/save

"An Internal Error Has Occurred.

Error: An Internal Error Has Occurred."
AlanF_US
4 days ago
Sometimes I get that error, but once I got a more detailed error message, namely:

:::
Not Found

Error: The requested address '/eng/users_languages/save)' was not found on this server.
:::

I added an issue ticket:

https://github.com/Tatoeba/tatoeba2/issues/1458
Ricardo14
4 days ago
Thank you for that!
Joutsentaika
6 days ago
I'm using Internet Explorer 11 under Windows 10, and as I'm at work I've got no choice about that.
sharptoothed
6 days ago
** Sentences & Translations Stats **

Tatoeba stats, graphs & charts have been updated.
Everything is accessible from one single page now:

http://tatoeba.j-langtools.com/allstats/
raggione
6 days ago
Thanks for the good work.
Guybrush88
6 days ago
thanks!
Aiji
6 days ago
Russian got over 500K! Congratulations :)

The new one is a pretty detailed chart.
OsoHombre
5 days ago
I would like to congratulate my dear Russian and Russian-speaking friends on the fact that Russian now has over half a million sentences on Tatoeba. I would also like to thank the Russian friends who always translate my sentences into Russian, together with those who supported me during my battle for name diversity and name-choice freedom on this website. Congratulations.
odexed
5 days ago
Thanks a lot. شكرا جزيلا
Selena777
5 days ago
Thanks, Oso!
CK
8 days ago - 8 days ago
There are now over 444,000 sentences on my list of proofread English sentences.


► You can browse the list.
https://tatoeba.org/eng/sentenc...s/show/907/und


► You can choose to limit searches to this list using the "Belongs to list:" item in the "advanced search".
https://tatoeba.org/eng/sentences/advanced_search

To make things easier for you, I've created a page that is already set up to do this.
http://tatoeba.ueuo.com/search/


► You can find sentences on this list that aren't yet translated into your language.

Here is one easy way to do that.
http://tatoeba.ueuo.com/search/untranslated.html


► You can search with keywords to find English sentences on this list that aren't yet translated into your language.

Here are some pre-selected search keywords based on words that often appear on the TOEIC test.
http://tatoeba.ueuo.com/searcheng/?v=toeic


► You can be the first one to translate English sentences on this list.

Here is an advanced search to find sentences on List 907 that don't yet have any translations.
https://tatoeba.org/eng/sentenc...d&list=907


► You can study vocabulary found on the New General Service List (NGSL).

Choose your native language, then you can easily choose to find sentences with translations to study, or find sentences without translations to translate.
http://tatoeba.byethost3.com/vocab/
AlanF_US
7 days ago
As I told CK via e-mail a few months ago, I've analyzed his list pretty thoroughly over a prolonged period of time, and determined that the quality of sentences on it is similar to that of the set of sentences owned by self-identified native English speakers, and even to that of the set of all owned English sentences. However, the diversity of sentences on CK's list is much lower than it is in the other two sets. For example, in January, when I did my analysis, I found that there were no sentences on his list that contained the word "cluster" (or variants of it), while there were six perfectly good sentences by native English speakers and two by non-native speakers. I also found that sentences added to exercise vocabulary that was not represented in the corpus, using a list that CK himself had formulated, were excluded. They included sentences like these:

"No matter how much I prod her, she refuses to start working on the assignment."
"I don't think that such an insult necessitates a response."

I don't have a problem in general with people formulating lists of sentences that they would like others to translate, especially if they want to identify them by specific criteria (such as "for beginners"). However, I do think that if someone creates the impression that the quality of some set of sentences that they've identified is higher than that of the remaining third of the sentences in their language, they are suggesting a standard that they cannot adhere to, because there is no practical way to indicate why each particular sentence is or is not on the list. Furthermore, if they are successful in influencing people to translate or learn their sentences before they translate or learn the others, the net effect is to cut off a third of the corpus consisting of perfectly good sentences, and sentences with "cluster" and "prod" will be excluded.

Note also that English sentences for which CK has recorded audio are overwhelmingly drawn from his list, and the sentences that he has recommended that non-English speakers record are generally translations of these English sentences. This doesn't bother me as much because I feel that if he's going to go to the considerable effort of recording great quantities of audio and coordinating other people's contributions, he has more of a prerogative in terms of choosing what the sentences should be. However, the net effect is that many fewer English vocabulary items are represented in audio than could be.

At the time that I had the e-mail exchange, CK had been publicizing the list quite a bit. I asked him not to do so, not only for its potential negative effect on the coverage and diversity of sentences, but also because I felt that frequent posts of this type were dominating the Wall, making it less interesting and harder to find other information. He did refrain from writing such posts (until now), which I was glad to see.

Like many of us, I truly appreciate CK's many contributions to Tatoeba. However, I feel as though this gatekeeping activity may do more harm than good. In particular, I want to make sure that sentences not on his list also get the attention they deserve.
OsoHombre
5 days ago
@AlanF_US

I appreciate your analysis. Let me just add the following: from my humble point of view, the sentences you're talking about are relatively undiverse from the point of view of vocabulary. The ideas tend to be very repetitive, very similar, and they also tend to be grammatically and semantically undiverse in the sense that they don't explore a wide variety of domains or ideas. In short, they are too everyday-life. They tend to revolve around a simple world of everyday life similar to the world of a primary school or a language school textbook. For example, there are very few sentences in that selected corpus that deal with technical, scientific, professional and political domains, yet most translation students and professionals deal with texts related to these domains. Translators and translators-to-be like challenging sentences, quotes, passages, and texts to hone their translation skills.