Wall (7,120 threads)

Hi, I'd like to share two of my ideas.
1. Export to CSV is a great feature! However, I had a little problem with the CSV file: when I opened it in MS Excel, it was not recognized as a Unicode file, so non-Latin letters were displayed incorrectly.
After I opened and re-saved it in Notepad, Excel correctly recognized the encoding of the file.
I compared the original CSV and the re-saved one, and I noticed that the latter contained three extra bytes at the beginning, with hex codes 0xEF, 0xBB, 0xBF. I'm not very skilled in the technical stuff, but I believe this is some attribute that indicates the encoding of the file.
If you could add these three bytes by default, that would be great! If not, it's also OK with me - Notepad is always handy.
2. Could you please implement the notification for new personal messages? We've already discussed this with Demetrius. He's told me that this hadn't been implemented because of potential spam threat, but maybe you could make this checkmark unchecked by default, and only those who are fully aware of what they are doing will check it?
It's really convenient to know immediately that you have a new message.
That's all for now :)
Many thanks to the Tatoeba team for such a great project!

> three bytes at the beginning with hex codes 0xEF, 0xBB, 0xBF.
This is all about the BOM. Microsoft programs love the BOM, but many Linux, Unix and Mac OS programs hate the BOM.
Technically the BOM is not required for UTF-8 (which Tatoeba uses), but it is still valid.
There are a bunch of free utilities around on the web to add/remove BOMs.
(P.S. BOM = Byte Order Mark).
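
For anyone who would rather script this than re-save in Notepad, here is a minimal Python sketch of adding or removing those three bytes. The names `add_bom`, `strip_bom` and `rewrite_file` are made up for illustration, and the sketch assumes the file is already valid UTF-8. It says nothing about how Tatoeba's own export is written.

```python
import codecs

# codecs.BOM_UTF8 is the three-byte sequence 0xEF 0xBB 0xBF.
BOM = codecs.BOM_UTF8


def add_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM in front, unless one is already there."""
    return data if data.startswith(BOM) else BOM + data


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if it has one."""
    return data[len(BOM):] if data.startswith(BOM) else data


def rewrite_file(path: str, with_bom: bool) -> None:
    """Rewrite the file at path (hypothetical helper) with or without a BOM."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(add_bom(data) if with_bom else strip_bom(data))
```

Calling `rewrite_file("sentences.csv", with_bom=True)` (the file name is hypothetical) before opening the file in Excel should have the same effect as the Notepad round trip. Python's built-in `utf-8-sig` codec handles the same three bytes when reading or writing text.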

> when I opened it in MS Excel, it was not recognized as a
> Unicode file
This is a known Excel bug (= Excel sucks ;-). Here is how to import a file in Excel 2007 (and later).
1. Create a new, blank Excel workbook.
2. Click 'Data'
3. Click 'From text'
4. Select file and follow instructions.

I've just tried it - and it works, thanks for another way around.
But for most users (including me) it's way faster and easier to click "open" after the file is downloaded than to open Excel and load the file through the Import feature. With the three bytes that I mentioned, Excel opens the file fine.
However, as I've already mentioned, I'm not a technician, maybe they work only for Windows or maybe there are other potential problems.

This site can be so useful for new learners, but it still has some gaps. I really need such a site, but I wish the founder of this site would make it perfect, like its name, Tatoeba. Thanks a lot, all the best...

Just about everything on this site is open source and available for use. You can take a copy for yourself and do what you want with it (within the terms of the license).
But the best way to make this site perfect is to take part in improving it. For example, by helping the Tatoeba interface support more languages.

what is your goal?

To have examples of sentences for every word, in all languages in the world. ^^
It will help:
- language learners (and teachers -- they can create exercises based on this data)
- program developers (you can download all the sentences from http://tatoeba.org/eng/download...mple_sentences )
It's also good because rare languages are supported, like Uyghur or Shanghainese. It's hard to find examples in these languages.

Dear sysko! I meant the TATOEBA project!
Dear Demetrius! If I translate a sentence into Persian wrongly, do you check whether it's correct?

I can’t check Persian. :( I don’t know it.
But other people can.
I can check only Russian and Ukrainian translations.

Some corrections:
*I can’t check Persian _yet_.
*I don’t know it _yet_.
*But other people can _for now_.

=)))
No, this is not going to happen. If I were to learn Iranian languages, I would take up Tajik, because it’s written in Cyrillic. ^^

It's actually a brilliant idea to add Tajik sentences here; we merely need to find a native Tajik speaker :)

The language of Tajikistan is Persian!
The Persian of Iran is different from Tajik, but the difference isn't great.
We call their language FARSI-E-DARI,
but our language is FARSI.

Farsi, Tajik, Dari.

Dari is a branch of Farsi!

Dari and Farsi are separate languages. What you are saying is like saying that Dutch is a branch of English.

No! Do you know? I'm from the north-east of Iran, which is near Afghanistan, Tajikistan and Uzbekistan, and I have heard how Tajik and Afghan people talk.
I have heard their language. In scientific texts, it is written that Dari and Tajiki are kinds of Persian (or Farsi).
I understand their (Tajik and Afghan) words!

Wiki gives an indicative list of differences between Farsi and Dari here:
http://en.wikipedia.org/wiki/Dari_%28Persian%29

Perhaps I have too much patriotic emotion! :)

Well, understanding does not mean the same language. Ukrainians understand 99% of Belarusian and vice versa, but it does not mean they are the same language.

I'm not sure whether it's relevant or not, but we and Iranians might have a different understanding of what a language really is.
I used to study Modern Standard Arabic and I communicated a lot with Arabs from different countries. They were all damned sure that Standard Arabic, Egyptian Arabic, Saudi Arabian Arabic and all other kinds of Arabic are the same language. They are just different dialects. That's what they are taught at school. Any Western linguist, however, would classify Egyptian Arabic and, say, MSA as two different though closely related languages.

you took the words from my mouth...exactly my view of how it should work ;D

Yes, but I can read Tajik letters, and I can’t read Persian. :)

So do I. In fact, Cyrillic or Latin characters are much more precise than Arabic ones in rendering the real pronunciation, because they do show vowels, which the Arabic alphabet does not show.

It shows some vowels, such as a and ou.
In addition, we don't have many vowels in Farsi.
Anyway, the Arabic script became common when the Arabs invaded Iran to spread their religion, about 1300 years ago.
After this war, the first Persian poetry was written by Rodaki, who lived in what we now call Uzbekistan. This happened about two centuries later.

The Arabic letters are not at all suitable for Indo-European languages such as Farsi or Tajik simply because they do not allow writing words with the same consonant profiles and different vowels (homography), which leads to misunderstanding.

Yes, you are right. More than 1300 years ago (before the Arab invasion of Iran),
Iranians wrote in a script that is like Sanskrit!

Well, it depends on the language.
Arabic script for Uyghur shows all the vowels.

That would be my decision as well :)

Transliteration seems to be up for Uzbek.
Question: do you know why е is transliterated as "ye"?

I'll try to check their sentences!

Yeah! This site is good for learners. I learned some grammar structures.

You mean, what is the goal of the Tatoeba project? :)

Is it just me, or has there been a sudden flood of new users here today?

In fact, it seems something has made Tatoeba famous in Iran tonight. Can someone explain why? Anyway, I hope they will find Tatoeba useful, though we only have 31 sentences in Persian so far :(

Do you know? Your site was introduced by the BBC Persian channel!
I think this is the reason!

Do you have a link or something? :)

What do you mean?
Please explain more.
I am not bilingual; I speak English with difficulty, ha ha ha.
However, I have registered and joined your community.

I mean, do you have a YouTube link, or one on the BBC Persian website, where we can see this? :) I'm really curious ^^
No problem for not being bilingual, I'm also struggling with English hehe

I can't give a link, because Iranians can't access
this site or YouTube. They are filtered.
Besides, even if the previous episode of the CLICK program (the program that introduced you) is on their site, you can't understand Persian.

http://www.bbc.co.uk/persian/tv...tv_click.shtml Saeb found a link :)
I have an Iranian friend, so when I see him, I can ask him to help me understand :)

We do need a safe search!
Now I'm not sure I should have added 398986 ;)

I was really happy to know you Demetrius.

Exponential growth!

Help me find the meanings of words.
I don't need sentences interpreted!

As blay_paul said, the goal of Tatoeba is to do what dictionaries don't, so it is meant to be used in parallel with them, not to replace them. The main reason is that there is no use in duplicating existing work, and we prefer to do only one thing and do it well, rather than too many things of poor quality/quantity.

Thanks, I already knew some of these sites!
I didn't express my meaning correctly!
I mean a smart site; these sites are badly designed!
It doesn't matter to me whether the online dictionary is into my mother tongue!
If a dictionary exists that is free to download, let me know!

Tried a dictionary? It would probably help if we knew which language you were interested in as well.

Thank you for the reply!
I need a perfect dictionary.
My mother tongue is Persian.

I don't know of any perfect dictionary.
Are any of the following any good?
http://www.aryanpour.com/
http://www.math.columbia.edu/~s...asood/cgi-bin/
http://www.farsidic.com/
http://www.farsisites.com/engli...si-dictionary/

Thursday WWWJDIC examples update summary.
36 records deleted.
17 new records.

Double-indirect links and 'do-not-link!' links.
I think it would be useful, possibly as an option, to show sentences that are 'three times removed'. That is, not linked as a translation, not indirectly linked as a translation of a translation, but double-indirect-linked as a translation of a translation of a translation.
It is at this level that you often get accidental duplicates (and near duplicates) being added. The person adding a new translation can't see that there is already one very like it just a few clicks away.
Do-not-link links.
Some sentences in the same 'web' of translations do not match each other and should not be linked. e.g.
A: I went to see him yesterday.
B: 昨日あいつに会いに行った。
C: I went to see her yesterday.
A matches B, and B matches C but A does not match C.
It would be nice if you could put a "do not link" (X) between A and C instead of C showing up as a normal indirect link. This would be a way of saying "We know these don't match, it isn't a mistake."
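
A rough sketch of how both ideas could fit together, written in Python. This is not how Tatoeba stores its links; the names `related_within` and `do_not_link` and the data layout are assumptions made up for the example. Links are treated as an undirected graph, a breadth-first search collects everything within a chosen number of hops, and pairs marked do-not-link are dropped from the result.

```python
from collections import deque


def related_within(links, start, max_hops, do_not_link=frozenset()):
    """Map each sentence reachable from start to its distance in hops.

    links: dict of sentence id -> set of directly linked sentence ids.
    do_not_link: set of frozenset pairs that must never be shown together.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not look past the requested depth
        for neighbour in links.get(current, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    # Drop the start itself and anything explicitly marked as not matching.
    return {
        s: d
        for s, d in distance.items()
        if s != start and frozenset((start, s)) not in do_not_link
    }


# The example from the post: A-B and B-C are linked, A and C do not match.
links = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
blocked = {frozenset(("A", "C"))}

print(related_within(links, "A", 2))           # {'B': 1, 'C': 2}
print(related_within(links, "A", 2, blocked))  # {'B': 1}
```

Raising `max_hops` to 3 would give the 'three times removed' view, and the distance values could be used to label each result as direct, indirect or double-indirect.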

I've been meaning to bring up the same point, but feared I would sound repetitive. Hasn't somebody mentioned this idea somewhere at some point? I forget what the reason was for not implementing it (well, it's probably a lot of coding, again...)
But yea, I agree. Adding the third layer would be really, really helpful - especially when doing the chain translating. I also think that it's very true that this is where the majority of duplicates come from.

I think sysko said something about this being planned for the new version. It would be very useful indeed to be able to crank up or down the number of nodes one should go through to find related sentences.
Not least for linking a new translation. Often when translating English sentences into Icelandic, I see that one of xtofu's translations of the Japanese matches the Icelandic. This is the reason I set up my manual linker. It's a bit annoying, and the ability to see more distantly related sentences would make life a lot easier.
As for the do-not-match links, these could be one specific example of the "qualified links" that are also planned with the new version. :-)

Looks like there’s something wrong with the “show random sentences” feature — when you search for random Cantonese sentences, only those with English characters are shown.

my bad, it's corrected now :)

Thanks for the quick fix, sysko! :-D

** Public service announcement: Maintenance tags **
There has been some discussion on IRC regarding standardisation of maintenance related tags and moderators, Pharamp in particular, have requested that trusted users use slightly more descriptive tags to label sentences that need changing.
The suggested split uses four tags:
- @change grammar
- @change spelling
- @change punctuation
- @change language
There has been some debate whether this is actually useful. I'm personally not convinced, but trying it out is harmless and will give us the experience we need to have an informed discussion of how to manage these issues most efficiently.
There seems, however, to be general agreement that we should deprecate @check in favour of
- @translation check (for issues with translations)
- @Needs Native Check (for issues with sentences)
The naming of @NNC has been debated, but in the interest of solving one problem at a time, I figure we'll leave this for the time being.
In the last set of tags there are:
- @possible copyright violation
- @delete
that are pretty self explanatory.
Finally, as mentioned in a comment to another wall post[1], these tags should always come with a comment outlining the perceived issue.
For a handy overview of these with links to the list of tagged sentences on Tatoeba, see
http://martin.swift.is/tatoeba/mods.html
[1] http://tatoeba.org/eng/wall/show_message/2764

...and also it’s easier to look for the sentences that need attention with just one @tag, at least unless we have an advanced tag search that allows us to look for «@change grammar AND @change spelling AND @change punctuation».

Ditto.

My impression was that Pharamp had the support of enough @change patrollers to give this a go. I personally agree that it's wiser to keep them under one tag, seeing how the comments should make it obvious what the issue is and a proficient speaker is generally needed to ensure the fix is correct.
As I understand her argument, the split would be helpful for moderators able to complete simple tasks such as adding periods or fixing capitalisation or the language flag. Please wait for her to give a more accurate account of it.
We could perhaps revive the @moderator tag or set up a new one for such simple tasks that don't require one to be proficient in the language in question to take action.

>The suggested split uses four tags:
> - @change grammar
> - @change spelling
> - @change punctuation
> - @change language
Do you mind my continuing to use "@change" for East Slavic languages? It’s likely I’ll be changing them anyway, but @change is easier to type (and I don’t really see the advantages of this split).

I think you will finish all the work soon.
For all the admins.