Wall (6,175 threads)
Before asking a question, make sure to read the FAQ.
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.
Would be really cool if we could add audio someday to the example sentences ;-)
We plan to do so. You will have more details, and maybe a proof of concept, at the beginning of April :)
Looking forward to that!
Did you change something with the database dump? This Saturday's jpn_indices contain invalid utf8 characters and the affected lines seem to be truncated.
The following sentence ids have problems: 83767, 91272, 140460, 146080, 152054, 190707, 195118, 199753, 205628, 211131, 213530, 235850
Ah, indeed, indeed. I had changed the 'text' field from varchar to varbinary, but kept the length at 500. That's why those entries were truncated. I've fixed it and done a new export of the jpn_indices.
> This Saturday's jpn_indices [...]
How do you know about that, by the way? I don't remember making it official yet that the download files would be updated on Saturdays. (Or did I? o.o)
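For anyone curious why a length of 500 can truncate Japanese text once the column is binary, here is a rough sketch in Python. It assumes the binary column counts bytes while the old text column counted characters; the thread does not name the database, so this is a guess at the mechanism and not the actual Tatoeba export code. The helper names are made up for illustration.

```python
def truncate_to_bytes(text: str, limit: int = 500) -> bytes:
    """Encode as UTF-8 and cut at `limit` bytes, as a byte-limited column might."""
    return text.encode("utf-8")[:limit]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def broken_ids(rows: dict[int, bytes]) -> list[int]:
    """Return the ids whose stored bytes are not valid UTF-8, in sorted order."""
    return sorted(i for i, raw in rows.items() if not is_valid_utf8(raw))


# 300 characters, but 900 bytes in UTF-8 (3 bytes per character here).
japanese = "日本語" * 100
# 300 characters and also 300 bytes.
ascii_text = "abc" * 100

# Hypothetical rows: id -> bytes as they would come back from the column.
rows = {
    1: truncate_to_bytes(japanese),
    2: truncate_to_bytes(ascii_text),
}
print(broken_ids(rows))
```

Note that a cut that happens to land exactly on a character boundary still decodes fine, so a check like this would only catch some of the truncated lines. Comparing lengths against the previous export would be needed to find the rest.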
1000+ sentences in arabic.
I'd like 2 thank everyone that has ever thanked everyone. On behalf of all of us you've thanked I say thank u for thanking us.
lol dane cook is brilliant :D
:D thank you^^.
I believe in ghosts. I believe in aliens. But theres no way u will ever persuade me into believing in alien ghosts. Ridiculous.
I believe in the sentence method. I believe in language websites. But theres no way u will ever persuade me into believing in sentence websites. Ridiculous
yay! first tatoeba joke :P (hmm I wonder if I can consider this a wall abuse..)
omg you're so funny, stop "abusing" the wall :D
I should just mention I never said that :P
But I do think it. Well, especially the "abusing the wall" part, because now I'm working on figuring out how to paginate this wall. Certainly there will be more abuse.
Just wanted to let everyone know, Tatoeba has been updated.
*shock* just figured out TRANG is a she, he he.
Ah, who told on me, that was supposed to be a secret.
lol, I never said this before but I actually stalked this website for quite a while before I finally decided to join, and I always imagined that you'd be like these programmers who like anime and have studied japanese for 5 yrs in their university..you know..with a cool blog about every obsessive detail of their life...and eyeglasses...you know the whole shabang... :D
P.S. guys like that do really exist :D
There's a lot of english sentences that are grammatically correct but I don't think anyone will ever say them, use them, or even see them in any english media...you know they're just "out of this world". What do you think we should do with these? Should we just ignore them for the moment, and focus on those that are totally wrong?
my take is, I'm gonna stay away from translating these and stop reporting them as wrong. I'm just hoping arabic natives can use sentences I'm translating to learn english.
what do you guys think? trang? sysko?
> There's a lot of english sentences that are grammatically
> correct but I don't think anyone will ever say them, use
> them, or even see them in any english media.
I think that it's more correct to say "any _current_ English media". The Tanaka corpus is old, and it used even older sources of sentences. Quite a few of them would not be out of place in books published before 1940, but are rather confusing to those of us in 2010.
I think those that are old-fashioned or highly idiomatic should be kept as demonstrating historical usage but should not be used as guides to writing English (or for translating into English). I think they are good candidates for an [Old-fashioned] tag or something. ;-)
Another problem is those that are written like dictionary entries (lots of 'one' usage) and those that are not really whole sentences. I think these are worth improving, as time permits, but are probably not a high priority for translation into other languages.
I agree with tagging them in the future as "old-fashioned", "40's English", "book-style", etc., rather than just "modernizing"/"oralizing" them.
I am not sure how promising this is, but there is a Japanese-German sentence database hosted by the University of Hiroshima (Katsumi Iwasaki). It seems to have been created in 2004, without major updates since then. Maybe there could be a collaboration with Tatoeba, which would increase the number of sentence pairs. Of course, I am not sure whether they want to publish the corpus; I am especially unfamiliar with data policies in Japan.
Here are the links to the search engine, the data description and the researcher's website:
And there's one also for spanish, I think it's free:
Maybe there could be a collaboration with tatoeba with that too :)
For the Spanish one, yep, I've been meaning to contact the guy for a long time, but hmmm, I never find the motivation to write an email :blush: I will try to do so, I promise.
you can do it sysko! :)=)
Some more suggestions.
I know that time is limited, so I shall try to keep a high ratio of usefulness to time required to implement. ;-)
1. Add a prominent link from the Tatoeba Project home page to the Tatoeba Project Blog. Actually I think it's worth adding a "Links" item to the list of headings on the top of the page. Useful links would include popular dictionaries, language sites, and sites that host collections of sentences.
2. Wish list. Maybe best as a blog article? I think it would be nice to have an idea of what features are planned, how likely they are to be implemented and how soon. Users could comment on possible features and suggest new ones.
3. Active dictionary links. This would be a long term and high effort suggestion but I think it would be useful to have active linking available from words in example sentences. Some languages (Japanese, Chinese) would require more effort than others, but I think it would be well worth it in the long run.
1. Like sysko said, the top menu has reached its limits. Someone with a 20-character username (which I think is the maximum length) and using the French interface... might actually not even "fit" up there if (s)he's a Linux user... Something we'll have to check.
What I can do for now though, is to have a link to the blog from the "What's new" (along with the Twitter link). And in the blog, there's a "Links" section which only has Tatoeba, but I can add other things.
2. You will have more hints on what we are working on in the next blog post. I can't be writing about everything we have in mind because there are just so many things. But I can at least mention what's planned for the next few weeks :)
3. This is actually not very easy to implement because each sentence itself is already a link, and clicking on it leads to the page where you can see the comments and the logs. But I agree that it would be definitely useful.
> This is actually not very easy to implement because each
> sentence itself is already a link, and clicking on it
> leads to the page where you can see the comments and the
> logs.
Yeah, I thought about that. What you could do, though, is implement the links in 'tooltip' style windows. For Japanese it could look rather like ...
click on 成る（する） to get the full dictionary entry in a separate window / tab.
1 - In the French version, and also for ergonomic reasons, 7 items is already the maximum. At the same time, I agree it would be better to have the links in a more visible place, but what if we add a wiki, "dialogs" and so on? So I think we need to review what is needed and where, and make it as practical as possible. I don't really want the top menu to be overbloated (but who does ^^).
2 - At first, yep, it can be a temporary solution while waiting for a wiki (after finishing all the "small issues", I'll make it my first priority).
3 - For Chinese, adso (which is definitely my Swiss army knife for Chinese) gives the possibility to segment a Chinese sentence into "words", or at least consistent n-grams, so it would not be "so" difficult, and I'm sure such a tool also exists for Japanese.
> I'm sure such a tool also exists for Japanese
It does, but it's not 100% accurate. In any case this is the primary reason for the existence of the 'index' data for Japanese sentences.
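To make the segmentation idea a bit more concrete, here is a minimal sketch of greedy longest-match segmentation in Python. This is not how adso or the Japanese 'index' data works; the tiny lexicon and the function name are made up for illustration, and real tools handle ambiguity much better than this.

```python
def segment(sentence: str, lexicon: set[str], max_len: int = 4) -> list[str]:
    """Split `sentence` into words by always taking the longest lexicon match.

    Characters that match nothing in the lexicon are emitted one at a time.
    """
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shrink down to one character.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical mini-lexicon; a real one would come from a dictionary file.
lexicon = {"我们", "学习", "中文"}
print(segment("我们学习中文", lexicon))
```

Each token could then become its own dictionary link. The approach is only as good as the lexicon, and it will happily pick a wrong split when a longer match overlaps the right one, which is one way to see why the result is "not 100% accurate".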
I think that commenting on English sentences that do not sound right is very inefficient, since there's a load of them. So... I created a list that everyone can dump them into for native speakers to correct.
Yes, I think once a sentence has been corrected, it should be removed from the list. Otherwise you would end up with a very long list...
Anyway, this is a good initiative! The only problem is that this list won't have a lot of visibility (for now), contrary to posting a comment, because the latest comments are displayed on the homepage.
So until we have time to do something about it, what you can do, I guess, is to contact other members who are native English speakers, and ask them if they could help with those sentences.
true, will do both then...
I just hope it doesn't look like I'm hijacking the homepage :D
Any suggestions on what to do with sentences that have been corrected, commented on, etc.. shall I just remove them from the list to keep it clean? (I think everyone should just correct them directly and then remove them from the list...)