Wall (7.219 threads)
Tips
Before asking a question, make sure you have read the FAQ.
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.
[not needed anymore- removed by CK]
Once the tags are added, we will be able to do the following:
Add tags such as "unsuitable_for_children" and maintain a list of tags that will not be accessible to non-users, or to users who have not activated the "unsafe search" option.
yes
me, I am very, very small
[not needed anymore- removed by CK]
Simply because we haven't yet found the time to rewrite the hint page :P
If okurigana is incorrect, can we just file this as bugs in MeCab?
In theory. However they are not really bugs in MeCab, but problems resulting from the dictionary used with MeCab. The dictionary used can both be selected (from a very short list ;-) and can be altered or aided by user-defined dictionaries.
So really what it needs is someone familiar with MeCab to find the best dictionary available and to add fixes for the problems noted.
However it probably is never going to be possible to be 100% accurate so to get the best results manual corrections will be needed at some point.
It's not that simple.
Consider the sentence: 君たちの訳文と黒板の訳を比較しなさい。 MeCab suggests わけ as the reading of the solo 訳, whereas we all know it's やく. MeCab's usual dictionary (NAIST-JDIC) has both versions of 訳, and no amount of adding dictionaries is going to "fix" it. MeCab uses some very sophisticated AI to segment sentences, and the dictionaries have parameters derived from training on hand-segmented texts. The trouble is that ...の訳を... could be either, and you need the context of the whole sentence to decide which is which. In fact the weightings for 訳/わけ and 訳/やく as solo lexemes are the same. You could probably fiddle the weights on 訳 to make it produce やく, but most of the solo appearances of 訳 in Tatoeba are in fact わけ.
There is a whole research field of Word Sense Disambiguation (WSD) working on problems related to this, but I don't think there are any packaged solutions for Japanese that can be plugged into Tatoeba. Just be grateful we have MeCab - 20 years ago automatic Japanese segmenters were thought to be impossible to build.
*furigana
Help wanted!
I'm looking for someone to help with Tatoeba / WWWJDIC integration. Specifically I'd like someone with database / web experience to work on tools for validating / completing the index data needed to link WWWJDIC dictionary entries to Tatoeba example sentences. If you're interested post here for more details or PM me.
revisiting sentence variation...
I came across a sentence in arabic (thx to qahwa's comment)...one of my sentences :D...It shows a property of the arabic script that I want to document with 3 or 4 variations:
هل استلمتَ الرسالة؟
Did you(male) receive the letter?
هل استلمتِ الرسالة؟
Did you(fem) receive the letter?
هل استلمَتْ الرسالة؟
Did she receive the letter?
Without the harakaat (vowel marks) they're all written the same.
Now if I do add them, I'm afraid they'll just get reported as similar sentences and get merged/deleted/etc... (I mean the English sentences ofc)
What's Tatoeba's 'official' statement on how to deal with this?
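To illustrate the point about harakaat, here is a minimal Python sketch. It is not anything Tatoeba itself runs, and the function name `strip_harakat` is made up for this example. It removes Unicode combining marks and checks that the three Arabic sentences above are distinct strings that collapse into one once the vowel marks are dropped, which is why a naive duplicate check could merge them.

```python
import unicodedata

def strip_harakat(text):
    """Drop combining marks (fatha, kasra, sukun, ...) from text.

    Caution: this removes every combining mark, not only vowel marks,
    so it is a rough illustration rather than a proper Arabic normalizer.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# The three sentences from the post above, with vowel marks.
sentences = [
    "هل استلمتَ الرسالة؟",
    "هل استلمتِ الرسالة؟",
    "هل استلمَتْ الرسالة؟",
]

# Number of different strings before and after removing the marks.
distinct_with_marks = len(set(sentences))
distinct_without_marks = len({strip_harakat(s) for s in sentences})
```

A duplicate detector that compares the stripped form would treat all three as one sentence; one that compares the raw strings keeps them apart.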
Similar things happen in Portuguese
Esse é seu brinquedo.
can be
[Hey, you, ]
This is your toy.
[Bob likes to play.]
This is his toy.
[Mary likes to play.]
This is her toy.
[Tex the armadillo likes to play.]
This is its toy.
There are unambiguous ways to say these sentences in Portuguese, but they are not used that often.
I don't think that this is the same problem as in Arabic.
It doesn't cause problems when Portuguese duplicates like your example are merged. But in Arabic it does. The example that saeb posted is the same *without* vowel marks, but with vowel marks it's no longer the same, and the pronunciation isn't the same either. So it *looks* like a duplicate, but it isn't.
I don’t know about Arabic, but I personally prefer to differentiate sentences that are different in speech.
E.g. in Russian, Belarusian and Ukrainian one doesn’t normally mark stress, but when it’s important, I do (as in the case with зáмок/замóк in sentences No. 385729 and 385728).
New link for the typing game:
http://braulio.home.dyndns.org/
Now a little more user friendly and with longer texts.
A general remark: Given the huge amount of Japanese data, I think we need more Japanese contributors to help us with the correction and to keep the Japanese sentences more consistent. Talking to my Japanese language partner recently, I came to the conclusion that with its current features, tatoeba is rather unattractive for Japanese natives who want to learn another language, because most sentences are already translated to Japanese. For example, what would be the benefit for my partner who learns German?
There are probably more sentences in German that aren't translated into Japanese than you realize.
What you can do is ask Trang or Sysko to add a list
"ger->jpn translations needed"
They can filter the data to see exactly what sentences in German aren't linked to a sentence in Japanese.
Dear Trang, dear Sysko:
Would it be possible to do this on the contribution webpage, e.g. when one selects "ger->jpn", to show example sentences in German which have no Japanese equivalent, or is this too heavy a burden for the database? As the current system shows only 15 sentences, maybe this might also reduce the complexity of the search.
It's in our plans :)
...and has been for a long long time. But to be honest, I cannot tell you for sure when we can implement it. Perhaps in two weeks, perhaps in one month, perhaps in two months.
In the meantime, I can generate a list of German sentences that have no *direct* translation in Japanese. Most of them will have an indirect translation though.
...perhaps in two years :P
I've done some checking and there are 9570 German sentences that are not translated into Japanese. Starting with 123 and ending with 399357.
I can send you them in an email if you want.
Yes, please do so, though I think it would be preferable to have a list or even a search function for such sentences.
Just to say it's now possible to view all "German sentences not translated into Japanese" here http://tatoeba.org/eng/sentence...pn/indifferent (it will display indirect translations in Japanese, so it can be a good way to find sentences that can be linked)
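For anyone curious how "no direct translation, but maybe an indirect one" can be computed, here is a rough Python sketch. It assumes a made-up data layout (a dict of sentence id to language code, and a set of linked id pairs); the real Tatoeba database and code are certainly different, and the function names are hypothetical.

```python
from collections import defaultdict

def build_neighbors(links):
    """Turn undirected (id, id) link pairs into an adjacency map."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def untranslated(sentences, links, src, tgt):
    """Map each src sentence lacking a direct tgt translation
    to the sorted ids of its indirect tgt translations (one hop away)."""
    neighbors = build_neighbors(links)
    result = {}
    for sid, lang in sentences.items():
        if lang != src:
            continue
        direct = neighbors.get(sid, set())
        if any(sentences[n] == tgt for n in direct):
            continue  # already has a direct translation
        indirect = sorted({
            m
            for n in direct
            for m in neighbors.get(n, set())
            if m != sid and sentences[m] == tgt
        })
        result[sid] = indirect
    return result

# Toy data (hypothetical ids): 1 is linked to Japanese only through English.
sentences = {1: "deu", 2: "eng", 3: "jpn", 4: "deu", 5: "jpn", 6: "deu"}
links = {(1, 2), (2, 3), (4, 5)}
missing = untranslated(sentences, links, "deu", "jpn")
```

In this toy data, sentence 1 has an indirect Japanese translation (3) that a contributor could check and link, sentence 6 has none at all, and sentence 4 is left out because it is already linked directly.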
Here are a few things...
1) If your partner has a good level in German, he/she can practice translating from Japanese into German, and you would be checking his/her sentences.
Note that you can view someone's sentences by going to their profile, then clicking on "See this user's contribution" (below the link to send a private message). Then on the user's contributions page, you go to "See all" next to "Latest sentences" (yeah, it's a bit complicated, but we'll improve the profile someday). And you can select "German" to only keep German sentences.
2) Your partner can enter in Tatoeba Japanese sentences that he/she would like to know how to say in German, and you can try to translate his/her sentences. It would be good of course to search first if the sentence doesn't already exist.
Basically, you could be chatting together and, as the conversation goes, your partner could add Japanese sentences he/she can't figure out how to say in German. Your partner would then send you the link to the sentence once it's added, and you would translate right after that. This is pretty much what most tandems do anyway, when they chat together "How do you say...?" But instead of keeping that knowledge in your private chat logs, you can share it with the rest of the world :)
It would be a good idea to also make a list out of it, so that you can keep track of the various sentences you learned.
sysko had made such a list: http://tatoeba.org/eng/sentences_lists/show/73/eng
It resulted from him and Dorenda chatting together in IRC.
3) Your partner can practice translating from German to Japanese, while also contributing. Because as I said, most of the German sentences do not have a DIRECT translation in Japanese, but they do have an INDIRECT translation. So your partner could take on the task of linking them.
But not just by reading and saying "okay, those translations match". I can create a deu->jpn list for him/her. He/she would then look at the German sentences from the list, and try to think of what the translation is. Then he/she can click on the sentence to view its details, and most of the time there should be an indirect Japanese translation. If that indirect translation matches what he/she thought of, then he/she can link the two sentences.
The only problem with this is that not everyone can link or unlink sentences yet. Only "trusted users" can (and even then, they can't link or unlink anything). The link/unlink feature requires you to understand very well the structure of Tatoeba, so it's not available to everyone because it's going to be confusing to new users more than it's going to be useful.
But I hope to make the link/unlink feature available to everyone through the concept I just explained.
So those are the ideas I have in mind. There are certainly more creative ones, but I'd have to think harder ^^
I have a game for you: "Tatoeba Typing"
http://braulio.home.dyndns.org:1004/
or
http://braulio.home.dyndns.org:1004/?lang=fra
http://braulio.home.dyndns.org:1004/?lang=deu
http://braulio.home.dyndns.org:1004/?lang=por
http://braulio.home.dyndns.org:1004/?lang=cmn
etc...
Please perform the following substitutions to make it usable for Russian, Belarusian and Ukrainian:
« » (non-breaking space) to « » (space)
«—» (M-dash) to «-» (or «--»)
«–» (N-dash) to «-» (I don’t add these, but who knows?)
«, », „, “ (different quotes) to «"».
I prefer to add these when adding Russian sentences, but most people have no way to type them.
Perhaps «’» should be changed to «'», but I'm not sure. Some people have only the first, some only the second. (Microsoft didn't have this in their Ukr. layouts up to Vista, so some people even use * instead!) The best way would be to accept this as an alternative in the JavaScript.
I had noticed this problem, mainly with smart quotes. First I thought it was a simple matter of making some substitutions. But I guess the best solution is to let the user choose what character he/she will use.
For now I will make the substitutions you suggested (including «’» for «'»).
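The substitutions listed above could look something like this in Python. This is only a sketch: the game's actual code is not shown in this thread, and the names `SUBSTITUTIONS` and `normalize_for_typing` are invented here.

```python
# Characters that are hard to type, mapped to plain keyboard equivalents.
SUBSTITUTIONS = str.maketrans({
    "\u00a0": " ",   # non-breaking space -> space
    "\u2014": "-",   # em dash -> hyphen
    "\u2013": "-",   # en dash -> hyphen
    "\u00ab": '"',   # « -> "
    "\u00bb": '"',   # » -> "
    "\u201e": '"',   # „ -> "
    "\u201c": '"',   # “ -> "
    "\u2019": "'",   # ’ -> '
})

def normalize_for_typing(sentence):
    """Replace typographic characters with ones found on most keyboards."""
    return sentence.translate(SUBSTITUTIONS)

example = normalize_for_typing("\u00abПривет\u00bb\u00a0\u2014 сказал он.")
```

Applying the same function to both the target text and the user's input would be one way to accept either form of a character, instead of forcing one choice.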
[not needed anymore- removed by CK]
Yes, actually, for a long time already I'm planning to have a "links" section.
Huh, I thought a word counted as '8 characters'. It looks like I've suddenly got much faster. :-P
I counted it as five, since this is what I've seen elsewhere.
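For reference, the speed calculation being discussed is roughly the following. This is a sketch of the common "one word = five characters" convention, not the game's actual code, and the function name is made up.

```python
def words_per_minute(chars_typed, seconds, chars_per_word=5):
    """Typing speed, counting every `chars_per_word` characters as one word."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (chars_typed / chars_per_word) / (seconds / 60)

# The same 200 characters typed in one minute, under both conventions.
speed_five = words_per_minute(200, 60)                     # 5 characters per word
speed_eight = words_per_minute(200, 60, chars_per_word=8)  # 8 characters per word
```

Going from 8 to 5 characters per word makes the same typing look 1.6 times faster, which would explain the sudden improvement.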
That's a nice utilization!
It matches the speed I get at http://www.typeonline.co.uk/typingspeed.php pretty closely - shorter sentences go a fair bit faster, probably because I can memorize them before the timer starts.
Typing game is working again. Database server wasn't running after Ubuntu update.
I was looking at the list: "English sentences in need of correction, confirmation from a native, etc..." and I notice I can no longer remove sentences from it. Some have already been corrected and are now OK, so how do they get removed from that list?
I think you need to press on 'edit this list' for the remove buttons to appear...
Ah. Thanks.
Drumbeat - Register! Join! Like! (please ^^)
http://www.drumbeat.org/project/tatoeba-project
I have recently submitted Tatoeba to Mozilla's Drumbeat platform. This is basically a platform for people to promote their big ideas that can make the Web better.
I'll need all of you to register on Drumbeat, and to "join" and "like" Tatoeba! Let's make it to the top 10 most popular projects! :D
http://www.drumbeat.org/project-index?sort=popular
More importantly though, I'd like to apply to this Mozilla/Shuttleworth fellowship:
http://www.mozilla.org/grants/e...ellowship.html
Submitting a project on Drumbeat is the first step, which is why it is very (very) important to me that as many people as possible "join" and "like" Tatoeba on Drumbeat.
If I am granted this fellowship, it basically means a whole year where I can (finally) work *full time* on Tatoeba =]
Knowing that I currently work on it in my spare time, I'll let you imagine what that means :P
The current state of Tatoeba is really only the tip of the iceberg of what it could really be.
Now, the deadline is June 7th, so there isn't that much time left for me to:
1) Make a 5 min video explaining the idea.
2) Write a description of how the internet will be better if Tatoeba succeeds.
3) Define a roadmap.
...all of this while promoting Tatoeba. There is no way I can promote Tatoeba alone, so I'm really counting on you all, to get more people to support Tatoeba on Drumbeat :)
We made it to top 5 ^_^
http://www.drumbeat.org/project-index?sort=popular
Thank you everyone who voted!!
Now I can peacefully let fate do its job and go back to my normal activities in Tatoeba.
^_____^ i'm too happy about it <3
!!! TATOEBA LOVERS OF THE WORLD ;) stand up for this wonderful project !!!
Only 2 days left till the deadline!
Only 9 more votes needed to get into the top 5!
Join and vote please:
http://www.drumbeat.org/project/tatoeba-project
;)
join and luv motha'lova's...matta' 'f fact, get you' motha', fatha', bro's 'n homies, dogs 'n cats ta join 'n luv
join or I shoot motha'lova's!
http://www.drumbeat.org/project/tatoeba-project
@muiriel, the deadline is only for application to the fellowship ^^ I think we have a bit more time than that to promote :)
But even then, I'm confident we can get in the top 5 by tomorrow =P
And yes, I'm finally done with this application. Perhaps next week I can finally sleep at night :D
NOTE: If anyone is going to read the details and wonders why the content is formatted strangely, it's a bug. Some HTML tags get stripped off, and I have no idea how to get around that...
Anyway, for those who didn't see, I made a 5 minute video for Tatoeba. You can watch it from the project homepage on Drumbeat:
http://www.drumbeat.org/project/tatoeba-project
or from YouTube: http://www.youtube.com/watch?v=ac9SmJuwHqk.
If someone has a looot (and I mean, a looot) of time on their hands and wants to make another, better, funnier, smarter, prettier -- or whatever -- version of the video, please go ahead :D
I would actually be quite interested in watching videos of "What is Tatoeba?" made by the community =]
Or videos of fake commercials for Tatoeba, that would be funny ^^
*push*
I think Tatoeba still needs us. Trang needs us.
Register! Join! Like!
http://www.drumbeat.org/project/tatoeba-project
;)
*push #2* :)
We do want Trang to work full-time on Tatoeba. :)
I want tags, and lots of other features! :)
Thanks Demetrius, thanks muiriel ^^
So far we have 39 votes.
That makes us the 8th most popular project on Drumbeat!
We only need 5 more votes, and we'll be 6th.
50 votes, and we'll be 5th.
I'll try to have the project details and video done before the weekend. Then I'll see what I can do to get as many votes as possible. I haven't yet contacted all the people I could :P
btw, 'dictionnary of sentences' in the project description has to be fixed...
Ah right, thanks ^^ I still need to improve this description though... I'm not very satisfied with it.
Not to mention that if we manage to reach the top projects and receive some sort of funding, we can start sending all sorts of goodies to our beloved and devoted contributors :D
T-shirts, caps, mugs, scarves, plushies, USB flash drives, beautiful posters to decorate your walls with your favorite sentences on them, the revolutionary pencil that will translate your sentences as you write them... You name it!
And then in twenty years you will be able to brag to your friends that you were among the veterans of this epic quest, which brought peace and understanding among all civilizations in the world, and it was all thanks to you! (Or something like that)
We still have some catching up to do to these guys: http://www.drumbeat.org/project...rsal-subtitles
What's interesting is that they're also a kind of translation project.
It was actually thanks to them that we learned about Drumbeat :) I saw a small link featuring their project on Firefox' default homepage, when I was at work. It directly caught my attention.
We're planning to contact them some time, because there is definitely a possible collaboration here. As far as we're concerned, we already had a discussion (a long long time ago) in the team, about how Tatoeba could be a nice platform to add and translate subtitles of videos. But, it's still not in the urgent things to do.
Anyway, thanks for joining and liking ^_^ We have 16 likes so far. Another 5 and we make it to the top 10!
we need TRANG to start tweeting! who's with me? (*cheers* tweet, tweet) :P
I'm with you
Tweet ! Tweet!
Yes, yes, tonight I will :)
Yep, come on, join us, tell the world you like us (even if you don't like us :p)
It takes very little time to register, and the benefit for us (you, Trang, me, everyone) can be really amazing :)
For a world where every sentence can be expressed in any language :D
We're in the first page! We made it to the top 10! :D
(...for now)
=> http://www.drumbeat.org/project-index?sort=popular
Can we make it to the top 5... =]
Also created a Facebook group for Tatoeba.
http://www.facebook.com/group.p...29340017083187
If you haven't annoyed your friends with Tatoeba yet, now you have an easy way to do it :P