Wall (7,084 threads)

What do those [M] and [F] tags mean that some sentences have attached to the end?
And in those sentences that have stuff like {this}{1}, should I add that in the translation as well, or is it just some remnant from the past and not important? I'm wondering, because we're told not to annotate sentences, but I'm not sure if this is considered annotating or not.

The [M] and [F] flags are only required in English sentences used in WWWJDIC. Ideally that information would be stored separately from the main sentence text and added as required later.
The {stuff}{1} like {this}{2} is an old mark-up system that is not currently in use. I would not add it anywhere it isn't already used, and Jim doesn't want it near the sentences he uses with WWWJDIC. ;-)

I see. Thank you. :)

I was just wondering, would mining sentences from speeches be okay? (since speeches are in the public domain, right?)

I actually have no idea if every speech is in the public domain. But if you are only adding one or two sentences from a speech, then you should be fine. If you are adding the whole speech, then do make sure that it is in the public domain.
I would still encourage you to translate what is already here though. It wouldn't be very productive for you to add manually, one by one, sentences from a long text :P
Because yes, we will someday have a feature for mass adding.
As for having a special section for long text, we surely will have that too, someday. But in a much much more distant future than the mass adding. There is still a lot to do with sentences, before we move on to long texts.

you know, a translated sentence will always be a translated sentence...at least for Arabic, I can always tell if a sentence was translated no matter how good the translator is...so I'm mixing both mining sentences from native sources and translating existing sentences.
(looking forward to all the cool stuff that will be added to tatoeba :), I'm patient though, after all it took wikipedia a couple of years to become what it is...)

There is no natural way to say "The book is on the table" in Arabic? I don't speak Arabic, but I guess a good translator wouldn't try to keep the same sentence structure or even the very exact meaning if this produces an unnatural sentence. In Portuguese, French and Spanish I'm pretty sure we can say that there is a book on a table in a very natural way.

:D you got me there...simple short sentences are hard to argue against :)...but I'm pretty sure I've never used such a...uhm lame and simplistic sentence.
I don't know...I guess to get a native sounding sentence...you need context...a life situation...but then your translated sentence might end up being..."too native"...basically lose direct connection to what's being translated...because of idiomatic and figurative usage
Maybe I'm thinking too much :D

One more question (sorry), are there any plans in the future for mass adding of sentences to tatoeba?
[for example importing sentences to tatoeba from a txt file of, e.g., sentences separated by whitespace characters]

No problem at all with questions, they show us there are people interested in Tatoeba :D Yep, in the future we would like to, because I have a big Anki deck with a lot of example sentences, and so does another user (and maybe others), and as long as the sentences respect copyright law and so on, it would be really good to have them in Tatoeba too.
But to be honest, I don't know when we will implement this. In fact, if several people send us tab-separated files, then it will be implemented sooner :P
But as I've said to another user, at least for the moment, Trang and I can import them into the Tatoeba database even if there's no way for users to do it yet.

hmmm...tell me more about the tab separated files...how should the format go...sentence,tab,translation...?

yep this way
sentence1[tab]translation1
sentence2[tab]translation2

so I take it there's a \n between the 2 entries? (mass adding here I come :P)

yep SENTENCE1\tTRANSLATION1\nSENTENCE2 etc... :)
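
For anyone preparing such a file, here is a minimal Python sketch of how the format described above (one pair per line, a tab between sentence and translation) could be read. The function name `parse_pairs` and the choice to reject malformed lines are assumptions for illustration, not how the Tatoeba importer actually works.

```python
def parse_pairs(text):
    """Split tab-separated text into (sentence, translation) tuples."""
    pairs = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # skip blank lines, e.g. the one after a trailing newline
        fields = line.split("\t")
        if len(fields) != 2:
            # Assumption: a line without exactly one tab is treated as an error.
            raise ValueError(
                f"line {line_number}: expected 2 fields, got {len(fields)}"
            )
        sentence, translation = (field.strip() for field in fields)
        pairs.append((sentence, translation))
    return pairs


sample = "SENTENCE1\tTRANSLATION1\nSENTENCE2\tTRANSLATION2\n"
print(parse_pairs(sample))
# [('SENTENCE1', 'TRANSLATION1'), ('SENTENCE2', 'TRANSLATION2')]
```

Running a file through something like this before sending it should catch lines where a tab is missing or doubled.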

In principle it's not a problem for me, but maybe we could put them in a special section reserved for long texts. That way you wouldn't need to worry about the current sentence length limit, and people who are looking for long texts would find them easily.
what do you think?
and Trang ?

That would be sweet! you know, then people could just go on adding short fairy tales...

and is there a limit on the number of characters a sentence on tatoeba can contain?

500 characters exactly. I think we will need to mention this limit on the page where people add sentences :P
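
If you want to check your sentences before adding them, a tiny Python sketch could look like this. It assumes the limit counts characters (not bytes) and that a sentence of exactly 500 characters is still accepted; both are guesses, and `fits_length_limit` is just a made-up name.

```python
MAX_SENTENCE_LENGTH = 500  # figure quoted above; assumed to mean characters


def fits_length_limit(sentence, limit=MAX_SENTENCE_LENGTH):
    """Return True if the sentence has at most `limit` characters."""
    # len() on a Python str counts code points, so kana and kanji count as 1 each.
    return len(sentence) <= limit


print(fits_length_limit("The book is on the table."))  # True
```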

It took a while, but finally 2,000+ sentences in Arabic...I already feel it's gonna take longer to reach 3,000 (zipang! how did you reach over 9,000 in no time?! god-like)

I think Zipangu has developed some mantra for this :P
anyway, congrats on reaching 2,000+ sentences in Arabic, it's already amazing :)

thx sysko :), are you studying arabic behind our backs?...other than that idk how arabic can be amazing :P (lol @mantra)

There is a set of duplicate sentences which it would be good to clear up, but I don't really know how.
The Japanese "もし君が私と一緒に行かないのなら、私は行きたくない。" and the English "I don't want to go if you don't go with me. [M]" appear twice. Each has a French sentence attached, and the two French sentences are slightly different.
I can stop the duplication from appearing in WWWJDIC by deleting the indices of one of them, but it would be better if it could be collapsed into one set.

If it were to be done manually, the steps would be:
1) unlink one of the French sentences from its Japanese and English translations
2) link it to the other Japanese and English sentences (which would then both have 2 French translations)
3) delete the Japanese-English pair that is left without French translations
But this should be done automatically by the duplicate deletion script, when we run it again.
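
To make the relinking concrete, here is a rough Python sketch of what collapsing such a duplicate set could look like. It is not the actual duplicate deletion script; the data layout (a dict of sentences and a set of links), the rule of keeping the lowest id, and the French placeholder texts are all assumptions.

```python
def merge_duplicates(sentences, links):
    """Collapse sentences with identical (language, text) and relink them.

    sentences: {sentence_id: (language, text)}
    links: set of frozenset({id_a, id_b})
    Returns (merged_sentences, merged_links).
    """
    keeper_for = {}   # (language, text) -> id that survives
    replacement = {}  # every id -> id that survives
    for sentence_id in sorted(sentences):
        key = sentences[sentence_id]
        # Assumption: the lowest id among duplicates is the one kept.
        replacement[sentence_id] = keeper_for.setdefault(key, sentence_id)

    merged_sentences = {
        sentence_id: value
        for sentence_id, value in sentences.items()
        if replacement[sentence_id] == sentence_id
    }
    merged_links = set()
    for link in links:
        a, b = tuple(link)
        new_a, new_b = replacement[a], replacement[b]
        if new_a != new_b:
            merged_links.add(frozenset({new_a, new_b}))
    return merged_sentences, merged_links


# Two Japanese-English pairs with one French sentence each (placeholder text).
sentences = {
    1: ("jpn", "もし君が私と一緒に行かないのなら、私は行きたくない。"),
    2: ("eng", "I don't want to go if you don't go with me."),
    3: ("fra", "French version A"),
    4: ("jpn", "もし君が私と一緒に行かないのなら、私は行きたくない。"),
    5: ("eng", "I don't want to go if you don't go with me."),
    6: ("fra", "French version B"),
}
links = {frozenset(p) for p in [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6)]}

merged_sentences, merged_links = merge_duplicates(sentences, links)
print(sorted(merged_sentences))  # [1, 2, 3, 6]
```

After the merge, sentences 1 and 2 are each linked to both French sentences (3 and 6), which matches the outcome of steps 1 to 3.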

I know how to do (1), but not (2), unless it is by entering it again.

Well, there is currently no way for you to do (2). I would have to do it.
Anyway, it is better to leave things as they are, so that we can check if our script handled the deletion and relinking properly in this specific case.

For audio on tatoeba...has this site been considered for a collaboration of some sort:
http://rhinospike.com
and if recording off a laptop's mic isn't an option, how should the recordings be done...and will it be more convenient in the future?

What about allowing users to directly link to audio files hosted on free (or not) hosting sites that allow audio streaming (for easier integration)?

We've discovered Rhinospike quite recently actually. But you know, we were in the phase of releasing all these new things, we didn't have time to consider contacting anyone (because we still need to live our lives and stuff ^^). Anyway, we just checked their license, it appears they are distributing their audio under CC-BY, just like we do! So yes, we will be contacting them and see what happens :)
Now, concerning not recording with your laptop's microphone, the key point was not really that you cannot record with your laptop's mic, but that *quality* is important. Usually, your laptop's mic will not give the best quality. However, I do not have any experience with sound, so I don't know what the cheapest way to get the best sound is. This is more something that you'll have to discuss with Shtooka.
Now the problem we're facing is a "Fast, cheap, good. Pick two" type of problem. Tatoeba is cheap for sure. If we pick "fast", then we could have lots of audio, not necessarily optimal for beginners, but always better than nothing. If we pick "good", then the audio we get will be more sustainable, in a way (no need to throw it and replace it), but this will exclude people who will not have the patience and determination to aim for quality (like 99% of people).
Despite this, I decided to pick "cheap and good". We are not in a hurry, and actually, we NEED time. The whole audio integration process has barely started. We don't really have a good platform to add audio... Right now, it's quite artisanal. You would have to rename your audio files with the id of the sentence and send them to us via email or something, so that we can upload them to our server. The good thing, though, is that Shtooka has good software that will let you record "massively".
Anyway, besides that, audio is not really the "core" function of Tatoeba. Not that I don't want it, I've ALWAYS wanted it. But Tatoeba itself still needs a lot of improvements in order to reach a broader audience... not just in terms of features but also in terms of performance (that is, making it faster, and making sure it doesn't crash if we suddenly have thousands of visitors a day).
We decided to introduce audio in order to get awareness and mostly feedback (just like you did), so that we can start thinking about how we can adapt Tatoeba to make the audio integration easier. But things won't really be settled before AT LEAST a couple of months ^^
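
For the renaming step mentioned above (audio files named after the sentence id), here is a small Python sketch that only works out the new names and does not touch any files. The exact naming scheme is an assumption: it guesses "sentence id plus the original extension", and `audio_upload_names` is a hypothetical helper.

```python
from pathlib import PurePath


def audio_upload_names(recordings):
    """Map each local file name to a name based on its sentence id.

    recordings: {local_file_name: sentence_id}
    """
    return {
        name: f"{sentence_id}{PurePath(name).suffix}"  # keep the extension
        for name, sentence_id in recordings.items()
    }


print(audio_upload_names({"my-recording.wav": 21085}))
# {'my-recording.wav': '21085.wav'}
```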

Congratulations on the new kana versions of Japanese sentences. I like them much better than the old romaji. I see that it's overdoing the spaces a little, e.g. 背の高さ is becoming せ の たか さ when it's really a single "word", but that is really a minor point.
The generated simplified Chinese looks good too (not that I can make out too much of it.)

for those who wonder, the wall is now ordered by last reply and should be a bit faster than before :)

* To those who can link/unlink.
You have to be careful when unlinking. In order not to get it wrong, always link everything you can before you unlink.
The thing is, if you have the chain A-B-C, and you cut A-B, you won't be able to link A-C because C will not appear anymore.
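
A small Python sketch of why the order matters, assuming a sentence page lists direct translations plus translations of translations (that display rule is my reading of the post, and the function names are made up):

```python
def neighbours(links, sentence):
    """Sentences directly linked to `sentence`."""
    return {
        other
        for link in links
        if sentence in link
        for other in link
        if other != sentence
    }


def visible_from(links, sentence):
    """Return (direct, indirect) translations reachable from `sentence`."""
    direct = neighbours(links, sentence)
    indirect = set()
    for translation in direct:
        indirect |= neighbours(links, translation)
    indirect -= direct | {sentence}
    return direct, indirect


AB, BC, AC = frozenset({"A", "B"}), frozenset({"B", "C"}), frozenset({"A", "C"})
chain = {AB, BC}

# Cutting A-B first: C is no longer reachable from A, so A-C cannot be linked.
print(visible_from(chain - {AB}, "A"))  # (set(), set())

# Linking A-C first, then cutting A-B: A keeps C as a direct translation.
direct, indirect = visible_from((chain | {AC}) - {AB}, "A")
print(sorted(direct), sorted(indirect))  # ['C'] ['B']
```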

So, any chance of 'trusted user' status? There's a sentence here
http://tatoeba.org/eng/sentences/show/21085
that needs unlinking from one of the others.

Ah, I forgot to say, even if you have the right to (un)link, you cannot (un)link everything.
If you want to unlink A-B, you have to be the owner of either A or B.
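
Put as a quick Python sketch, the rule above reads roughly like this. The data structures (`owners` as a dict, `trusted_users` as a set) and the function name are assumptions for illustration only.

```python
def can_unlink(user, sentence_a, sentence_b, owners, trusted_users):
    """Check whether `user` may unlink sentence_a from sentence_b.

    owners: {sentence_id: user name}; unowned sentences are simply absent.
    trusted_users: set of user names that have the right to (un)link.
    """
    if user not in trusted_users:
        return False
    # The user has to own at least one side of the link.
    return user in (owners.get(sentence_a), owners.get(sentence_b))


owners = {1: "alice", 2: "bob"}
print(can_unlink("alice", 1, 2, owners, {"alice"}))  # True
print(can_unlink("carol", 1, 2, owners, {"carol"}))  # False
```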

is the owner of the other sentence informed if you unlink your sentence from theirs?
If not, I suggest implementing this... even "trusted users" shouldn't be trusted too much ;).
great update, btw =)))!

I trust people a lot :P
But yes, your suggestion will be implemented. It's just a matter of time (as always).

Got it. I think that one is sorted out now.
Note that you need to refresh the page after 'owning' a sentence to see the link / unlink icons.

クレジット or 謝辞 ?
Anyone have an opinion on whether クレジット, 謝辞 (or, indeed, something else entirely) should be used with the following interface elements?
https://translations.launchpad....AC%9D%E8%BE%9E

Hellu!
How can I help to translate the Tatoeba interface into Italian?
There are a few errors and not everything is translated.
thanks!

You need to register here, as this is the service we use to manage the translation of tatoeba
https://translations.launchpad..../+translations
and then from there you can translate the interface/ correct mistakes
feel free to ask if you have questions :)

Very good, I've started with the translation!
When will everything be updated?
Anyway, it's a work-in-progress project (but I'm really interested in completing it) because I can't find some of the sentences on the site >.<

cool
it will be updated soon :p
See http://tatoeba.org/fre/wall/show_message/360 when you can't find some sentences, and if you still can't find them, ask us ;-) you can send me or Trang a pm

(just trying a thing)