pullnosemans's messages on the Wall (total 111)

pullnosemans, April 24, 2016 at 4:55:17 PM UTC

maybe it would be a good idea to display an "is your language not in our database yet? click this link for instructions on how to add a new language" kinda message on the front page or another page that is very frequently visited.

pullnosemans, February 15, 2016 at 12:48:08 PM UTC

TRANG, I hope I'll find the time to answer you tomorrow. For now, I'll only quickly answer gillux:

I have the impression that we are on the same page here. I'm glad about that, especially seeing that only two people gave constructive feedback to my post.

I absolutely think having a forum used only for discussing important issues would be an important step towards getting things more organized.
one possibility would be to have only a handful of people who can start threads on the forum, and only one thread can be open at a time, but when one is open, it's serious. after one or two weeks of discussion in the thread, the people with the required technical knowledge will make a decision and implement it.

of course the tool is not the problem (or rather not the only problem), but starting with a new tool could help change everything else, I think.
as far as prioritising tasks goes, for example, we could have it that people can still make proposals on this wall, and when a proposal really kicks off a discussion and/or comes up several times (as e.g. the orphan sentences problem), it gets a thread on the "serious" forum, and after one or two weeks, work on it begins.
this way, problems can still fall by the wayside, e.g. when they are addressed while another big issue is being discussed on the forum, but it would be a start.

pullnosemans, February 15, 2016 at 12:20:14 PM UTC

this is absolutely fantastic! I have no connection to your language (at least not yet), but I'm very happy all the same. congratulations on your success.

pullnosemans, February 10, 2016 at 6:04:26 AM UTC (edited February 10, 2016 at 6:05:33 AM UTC)

**creating a team of coordinated core members on Tatoeba**

I think that now that Tatoeba has grown into a community of decent size, with some fairly constantly committed members, it is time to think of creating a team of core contributors.

Many ideas are presented on this wall, but they are often so uncoordinated that they are lost in the chaos of everyone just saying what they think should be done without any possibility of them actually doing it because in the end, they have no say in what is done with the site. There are no clearly assigned roles as to who is able to decide what.

To make better use of the people's ideas, I think we need to have some kind of interface where a discussion can be started with the fixed goal that at the end of the discussion, all suggestions are taken into account and a decision is made.

To be able to do this, I suggest we form a "family" of experienced and trustworthy Tatoebans who are familiar with each other's competences, willing to pick issues to work on in an organised manner, and then working together to make it happen.

For a more in-detail explanation of why I think we need this, see https://tatoeba.org/fra/wall/sh...#message_25456

I am thinking of taking the time to create a poll where people can say they would be willing to be members of this core community.

Do you think this would be a good idea?
If yes, what else do you think should be included in the poll?

Please let me know. Thank you.

pullnosemans, February 10, 2016 at 2:19:37 AM UTC (edited February 10, 2016 at 5:41:41 AM UTC)

**to CK**
"I was misquoted. I never said "if not zero"."
"I didn't read the original post carefully, since I find it tedious to read so much text that doesn't have proper capitalization, so I may have missed a few things."

One of the things you missed is that I only quoted you saying the percentage was "very low"; "if not zero" was outside of the quotation marks.
I'm sorry that my disuse of capitalisation is hard for you to read. I'll try to remember this when I post something in the future.


"[opinions on deleting unowned sentences or modifying the non-display feature]"

I like the idea of using colour coding. If I remember correctly, the last time we talked about this we already addressed the issue of colourblind people and agreed to use colour coding in combination with icons. I think this could be implemented very well.




**to everyone**

Unfortunately, I wasn't successful in my request to have people clearly state "for deleting" or "against deleting", but if I read the answers correctly, for now everyone is against it because they think it needs to be done carefully, even if right now no one seems to have a specific idea of what this would mean.

I thank everyone who suggested other ways to solve my personal problem with the unadopted sentences in English and Japanese, via commenting on this thread or sending a pm, and I will try all the suggestions out and see what they can do for me.


However, and this is very important:
I have not primarily started this thread because I have a problem with my search results working with the Japanese corpus.
I have started it because Tatoeba right now has a significant problem that is frequently being addressed, and yet even after the last rather big discussion about the problem, we arrived at no result at all. The discussion just sort of ended.


I may be reading too much into this right now, but I am getting the impression that another important thing Tatoeba lacks right now is a sense of determined companionship, teamwork, call it whatever. To me, subjectively, most people on here appear to be doing their thing, trying to work around things they don't like, and through everyone's individual interest or belief in the project, things eventually do change, but much less than they could.
I can think of some prime examples of people just contributing their own stuff without a lot of context or interaction with others and then simply leaving, as well as of some examples of people who do work together to achieve something that just makes Tatoeba a better site as a whole, but the point here is not to scold or praise any individual people.
The point is to tell you that I think with all the competent people that we have on here, that are constantly involved each on their own, we could gain much more momentum if the completely open landscape of the site would develop a more intimate, closely-knit core. Right now, it appears to me like it's simply TRANG's site, and everyone else is just getting involved where they please, some more and some less, but generally unable to really get together to get something moving on a larger scale, and often times even against one another instead of together. A lot of energy simply evaporates, people shout their thoughts into the prairie and then mostly go on doing their own thing again. This is also the case with this thread: Everyone states their opinions, but there is no certainty at all that in the end *anything will be implemented*. The only concrete proposal was made by CK, and so far, no one has said anything about it.

What could be happening instead is a thread being started in a forum after choosing a single problem to work on (in this case, "What should we do with the large number of unowned sentences right now?") with a 100% goal of arriving at a decision on what to do with them within a week. This thread would not simply be started; rather, the community would choose the problem as the next one to get a thread.

I will try to find the time to create the poll I recently spoke of, and see if it finds any response and can actually start a change to this to a certain degree. But the paradoxical thing about this is that if I am the only one who thinks this should happen, I myself, as an individual, will not be able to make it happen.
I will therefore now open another thread with a link to this one (thanks for the suggestion, CK) where I will ask whether people are interested in this concept and whether they think making such a poll would be worthwhile.

pullnosemans, February 10, 2016 at 1:28:48 AM UTC

cool!

pullnosemans, February 8, 2016 at 5:18:41 AM UTC

**deleting unowned english sentences**
**introducing a feature to keep unowned translations from being displayed**


to extract one central point from the recent big discussion about improving the quality of the tatoeba corpus, I would like to again address separately the issue of deleting the unowned english sentences (and possibly, the japanese and/or french ones).

ck has stated that they have checked the entire english corpus and adopted all sentences they deemed worthwhile, so that the percentage of good unadopted english sentences right now should be "very low", if not zero.


*since the discussion about deleting the unowned sentences has not yielded any real results, I hereby request that they be deleted, and ask anyone interested to state whether they are for or against this.*


I want them to be deleted because it often happens that I search for a japanese word, and most if not all hits have only one direct translation in one of the languages tatoeba displays to me, english. these translations are frequently unowned sentences from the tanaka corpus, and they feel very unnatural to me. they also include many prime examples of my recently mentioned problem of tatoeba sentences lacking context (e.g. "he shuddered at the sight.": who? what sight?).
this generally makes me abandon the japanese sentence they are linked to as well because my japanese is not good enough to judge whether a sentence is good or not.
this process is always frustrating, because I have to read through sentence after sentence only to find that the english translation is rubbish, or at least untrustworthy. I would much prefer not having these translations displayed at all and speeding up my search by being able to only pay attention to sentences that are usable.


the situation is more complicated with japanese, where deleting all unowned sentences would mean losing 60-70% of the corpus, so I cannot really say anything on this one.
with french, I have only read one clearly stated opinion, which was sacredceltic's, saying that he would rather have them deleted right away because they are "a huge smear on the french corpus".


alternatively, I call for the feature of not displaying orphan sentences by default to be extended to direct and indirect translations of owned sentences. this way, even if the bad sentences remain, they are easy to avoid entirely.
another alternative would be not deleting the sentences, but simply hiding them from public display until a decision is made on how to treat them.

the only option I think we should definitely NOT go for is simply leaving things as they are.

pullnosemans, February 2, 2016 at 10:54:51 AM UTC

everything you say is true, and the problem of losing links might be the biggest issue to take care of if we want significant change, but as I said, I think we should suck it up and deal with these issues as best as we can rather than being like, "yeah, there's too many problems, we can't really do anything right now, let's talk about it again next year".

pullnosemans, February 2, 2016 at 7:49:07 AM UTC (edited February 2, 2016 at 7:54:33 AM UTC)

"I had discussed, a while ago, the case of the unadopted Japanese sentences and asked what if we simply delete them? The answer I got was basically that they are not harmful to the point that they should be deleted. Therefore in the case of Japanese, we will keep them.
But if, for another language, the community considers that the unadopted sentences are basically useless should be all deleted, I would not necessarily reject the idea."
(TRANG)

as for this, see my reply to tommy_san's post above. I generally think it's better to lose some not-completely-downright-useless material than to stay inactive and get no improvement to the current problematic situation in the corpora of some languages (english, french, japanese, korean come to my mind).


"So I don't think we could make it a general rule, to limit members to contribute only in their native language. This is something that should rather be decided case by case, for each user."
(TRANG)

I agree. I also think this could well be combined with the creation of a more clearly defined body of trusted/responsible users.


"I'm all up for having more corpus maintainers and advanced contributors. But how do we achieve this? And how do we make sure that we are promoting the right people?
At the moment we lack people who want to or can invest time and effort into building a stronger community of contributors. We probably also lack people who are motivated to take new responsibilities, or who have the right mindset, knowledge and skills to take these responsibilities.
[...] There are people who want to become corpus maintainers but are not ready for it. There are people who would be great corpus maintainers but don't want to take the responsibilities.
Maybe there are certain things that we're doing wrong and we could fix, to create a more engaged community. But I can't see this happening if we don't have very dedicated and active community admins."
(TRANG)

we could have a one-week long poll that is advertised on the front page where users can state that they are willing to become part of an engaged, clearly defined community of people who work to systematically improve this site, by clear guidelines and clearly divided responsibilities. anyone who wants to be a part of it and doesn't seem untrustworthy picks or is assigned a certain job (maybe more important jobs for people already known for their good work on the project), and their activity is watched by the other members of the team. if anyone performs badly or there are problems, it is discussed with them in a group, and if no consensus is found, they can lose their responsibility. this doesn't mean, however, that they cannot regain it or get another one. I think the best way to verify that you're "promoting" the right people is to have a certain fluidity in the system, and have everyone be aware that they are collaborating in a team.
I, for one, would be very happy to be a corpus maintainer for german, since improving the sentences already existing interests me more than creating new ones. I once asked for this, but was told that german already had enough maintainers, which was fine with me (though I thought, "can you have too many maintainers?").


"4. Money"
(TRANG)

for now, I would say that recruiting new contributors via ads (possibly in combination with the poll for creating a dedicated core of contributors as I described above) would be more beneficial than paying people for contributing. this site does run on an open source concept, after all. everything else can be pondered over when there actually are immediate perspectives for getting a significant amount of funding.

pullnosemans, February 2, 2016 at 7:25:11 AM UTC (edited February 2, 2016 at 7:27:12 AM UTC)

"There are actually some (though not many) sentences that I find plainly wrong or clearly unnatural, and thus harmful. I rate them "not OK" and "unsure" (even though I'm not unsure about anything) respectively to warn other users. However, most people don't see my ratings, so these sentences keep getting translated, especially often by new members.
If the community thinks it's better for me to delete these sentences, I can do so. In that case, you'd need to excuse me for accidentally deleting sentences that are correct in some variety of Japanese I'm not familiar with, or even ones that are correct in standard Japanese that include a word or phrase I don't know. You'd also need to excuse me for deleting sentences that could be turned into good example sentences with some changes. I don't have the time or ability to improve them and make sure the new sentences match all the translations, and there are many sentences that, in my opinion, wouldn't make good standalone example sentences anyway."
(tommy_san)


as a non-native speaker both of japanese and english, I can say that I would absolutely be in favour of your doing so. I have the impression that tatoeba lacks japanese speaking members who are reliable and willing to take action, and that this is the main reason why the japanese corpus is still so corrupted, so I think it would absolutely be worth paying the price of having some potentially salvageable sentences be lost if we can get a start on the cleaning up of the japanese corpus for it.

and if you ask me, any english tanaka sentence not yet adopted can be deleted right along with the japanese ones. "The outcries of the angels go unheard by ordinary human ears." may be a sentence you could potentially come across in poetry or lord of the rings-style fiction, but without context on a site such as this, it just sounds silly to me.

I generally think we need to become bolder in cleaning up this page, even if that means that we lose some material that could maybe eventually at some point in time be useful. this goes for any language with a great number of bad sentences right now.

I'd rather have a change now, and then be able to build up from that with revised concepts.

pullnosemans, February 1, 2016 at 2:33:00 AM UTC (edited February 1, 2016 at 2:34:08 AM UTC)

ah, I see. so the central misunderstanding here is that you were actually mostly speaking of written language, when I thought you were talking about language in general. under this premise I can understand much better why you are saying what you are saying, because written language is largely a human construct and thus indeed primarily taught, not acquired naturally.
in this case, I can also agree that the system of english orthography is among the most erratic and inconsistent alphabet-based systems in the world, if not the most erratic (I assume this is what you meant when talking about english having fewer rules than many other languages). however, I still don't see how this would lead to english writing being more prone to language change.

and so you suggest that since we want tatoeba to serve an educational purpose and linguistic prescription by influential people is simply real, the site should have an orientation toward high-prestige conservative language varieties. I guess that's a relatable opinion, even if I ideologically oppose language prescription on an a priori basis.

glad we worked this out. I ask you to be more explicit about your focus on written language when making statements about the character or profile of languages, or talking about language education. I think it would make your point of view easier to understand and help create dialogue, especially for those whose perspective differs from yours.

pullnosemans, February 1, 2016 at 2:09:54 AM UTC

thanks, agreed.

pullnosemans, January 31, 2016 at 11:42:59 AM UTC (edited January 31, 2016 at 12:07:01 PM UTC)

"If you watch urban dictionary, for instance, you'll see that whenever a "smart" contributor coins a silly definition for a word, it's immediately voted up by many others, who think they're funny, and that is the way crappy definitions end up having a high rank."

that's unfortunate. why exactly are you so convinced that tatoeba will show the same result, especially seeing as tatoeba is not a site with a humorous component like urban dictionary is?


"But there are also countless exemples on Internet (and also in the media, alas) of wrong syntaxes or spellings that are progressively becoming dominant since uneducated users or non-natives come to outnumber educated natives, and their belief that what they write is correct is reinforced by what they see on Internet."

so you are saying language use should generally be dictated by a small elite upper class?


"I know what you will retort : that's how languages evolve (or so people believe...), but it isn't true, otherwise schools and teachers would not even know what spelling or syntax to instruct."

I don't think I really understand this conclusion. what relevance do teachers play for a phenomenon such as language, which is acquired naturally by most humans?


"I know English may evolve a lot under this pressure, especially since English has few rules. But that is not the same in other languages to which far more rules apply, and where mistakes are more obvious in their regard."

have you ever counted the "rules" in english? do you have a statistic with an average of rules per language on the globe?
edit - added later: and in the framework of which grammatical theory are you making this quite bold claim, which even many reputed professors of comparative linguistics would never claim to have a deep enough understanding of the way language works in our brains to be able to make?


"A parallel example in French is the following : A majority of French people mispronounce « Les haricots ». At first, you may say : "So what ? Then their wrong pronunciation is the correct one"."

no, no, nobody says that, you are the one speaking about there being "the" correct one.


"But once you've said that, you haven't helped much the "mispronouncers" getting a job, because this mispronunciation actually works as a social/educational marker for French educated people. It tells them immediately what is the educational level of the person saying it..."

I don't know if this applies to all "educated people", or only to those who take delight in regarding their idiolectal variety of their native language as the only one with a right to exist.


"A voting system would actually strengthen people in their mistakes, to their own detriment, creating havoc in language rules that would end up being impossible to teach."

I would say languages are per se impossible to teach. they can only be learned, through careful observation. all teachers ever could do for me at least was give me access to material (including their own native output) for me to work with.

pullnosemans, January 29, 2016 at 5:44:14 AM UTC (edited January 29, 2016 at 5:48:58 AM UTC)

one more thing about tatoeba that right now paralyzes the possibility for change is the fact that many bad sentences have links to good ones in so many different languages that it is impossible for a single person to change them while verifying that the new, better sentence still is a fitting translation for all the sentences it is linked to. therefore, no bad sentence like this can be changed without either having all linked sentences verified and if necessary changed by various contributors (which would be bullshit) or losing links to any language that the respective contributor does not know well.

pullnosemans, January 29, 2016 at 5:23:45 AM UTC (edited January 29, 2016 at 5:32:21 AM UTC)

an important wall post, I think. let's see if we can actually get something going.

I think the main problem is that tatoeba right now has no clear identity as to where it wants to go (this is only my impression, no implication on trang's vision or anything like that). I have the impression that the project has no real means to actually ensure any quality; its openness is as much a curse right now as it is a blessing.

maybe through a clearer division of roles among contributors things could become more stable. my ideas for features that would reduce the openness of the site, but might nonetheless be beneficial in the long run are:

- increasing the number of corpus maintainers and encouraging them to take more radical action if they think sentences need to be changed or deleted
- forming bodies of trusted contributors who are known to be able to create good sentences and translations and emphasizing their importance to the project
- introducing a feature where contributors can be labeled responsible for creating translations from one specific language into one or maybe several specific language(s)
- creating a forum open only to key members (advanced/trusted contributors, corpus maintainers, etc.) with features for discussion and creating an overview over stuff that needs to be done (evaluating sentences, etc.)

the effects of these features could be increased by limiting the contributing rights of unknown members (e.g. introducing a feature where their sentences have to be evaluated by a corpus maintainer or trusted contributor before being displayed for everyone to see) and advertising on other sites for people to contribute to tatoeba (e.g., for japanese speakers to take on the tanaka corpus and change all the sentences into good, natural japanese).

the second point leads to the question of money. is tatoeba right now creating any money that could be used for ads, or even small monetary rewards for contributors?

all these ideas are based on the aim of increasing the motivation for competent people to invest time into making tatoeba into something more stable. I don't know the details of how e.g. wikipedia manages the cleaning up of their articles, but I just generally feel that this site could be much more than it already is, and that it is right now in a sort of identity crisis, as it has become big enough for its ambitions to grow higher.

pullnosemans, January 18, 2016 at 9:02:53 AM UTC

y'all a bunch o' fighunters, I say!

pullnosemans, January 2, 2016 at 12:14:21 PM UTC (edited January 2, 2016 at 12:19:36 PM UTC)

the phenomenon is not at all limited to unowned sentences or sentences from the tanaka corpus. I just refreshed the home page about twenty to thirty times, and I would say that about 30% to 40% of random sentences exhibit this kind of thing, mostly in the form of deictic expressions without any reference, such as "then", "there", "she", "this one", and so on.

however, it is true that bad tanaka sentences tend to be among the most puzzling ones. it's a real pity that the japanese corpus has this huge problem.

pullnosemans, January 2, 2016 at 8:28:18 AM UTC (edited January 2, 2016 at 8:35:11 AM UTC)

**content and context in example sentences**

has there ever been any discussion about how to make example sentences in the tatoeba project more complete in terms of content and context? I remember sacredceltic raising the issue of the one-sidedness of the english corpus in that a large portion start with "tom", but has there ever been any thought about creating a guideline of what information to include in a sentence?

as a site that works with example sentences, tatoeba has to be aware that example sentences generally have the problem of existing outside of discourse, and therefore can easily become hard or impossible to interpret without giving a sufficient amount of context.

for example, I just changed a sentence of mine that read
"Sie teilen die Entfernung und Richtung des Futters durch einen Tanz mit.",
a more or less close equivalent of japanese "踊りによってその食糧までの距離や方角を伝える。"
and english "They communicate the distance and direction of the food by dancing."

these sentences do not contain actual content as to who communicates, with or to whom they communicate, what food we're talking about, and so on. if the reader happens to not know that this is the way bees communicate, the sentence would probably make little sense to them, and would therefore be hard to use. both sentences suggest that they are taken from a text, especially the japanese one, which contains an anaphoric "その".
I have now changed the german one to say that it is bees who communicate, and that it is each other who they communicate with. with this information, it also becomes clear that it is general food sources that would be relevant for bees: http://tatoeba.org/jpn/sentences/show/4418682

skimming through my example sentences, I have unfortunately noticed that many of them are such sentences without grammatical or idiomatic problems, but no real semantic content, often because they're translations of sentences with the same problem. I would like to try and go over them and make them more contextual, but this would create the big problem that I would have to unlink them from their less contextual equivalents, maybe leaving comments encouraging those sentences' owners to change them accordingly and re-establish the links.

so because this would be very cumbersome for me and other people, and because I don't think semantically thin sentences like "they shook hands" are per se bad (they can be useful in early stages of language learning), I instead propose that the site admonish its contributors to avoid creating too many "pronoun - verb - specific assertion without context" sentences, and to try to create more self-contained sentences that contain general assertions ("the sun is the center of our solar system") or that contain enough context for specific assertions ("whenever I go jogging, I listen to music" rather than "I listen to music") to be interpretable.

I think this would greatly improve the usefulness of this site. especially if this message gets through to the small central group of highly active people in creating sentences, it could make a big change.

pullnosemans, January 1, 2016 at 6:40:36 AM UTC

Happy New Year!

pullnosemans, December 30, 2015 at 2:22:11 AM UTC (edited December 30, 2015 at 2:22:28 AM UTC)

aah, thanks.

anyway, great list!