Zifre's messages on the Wall (total 154)

Zifre, 25 March 2012, 4:09:29 PM UTC

When a new user adds tons of sentences with the same repeated errors (punctuation, capitalization, etc.), would it be okay for corpus maintainers to leave comments on the first few and then just correct the rest immediately?

In my opinion, having to wait two weeks is really harming the quality of the corpus because the @change list just gets longer and longer. Many users show up, add a bunch of incorrect sentences, and then never come back to fix them. Then we waste time commenting on and tagging each one, and the sentences still never get fixed.

Zifre, 26 February 2012, 8:48:15 PM UTC

It's probably based off an old country list...

Zifre, 22 February 2012, 8:50:44 PM UTC

I think we should discourage this (and all duplicates differing only by orthography, such as other British/American spelling differences and traditional/simplified characters in Chinese). However, I think any current pairs like this can be left alone.

Probably the easiest way to solve this problem without sacrificing either the interconnectedness or the "rawness" of the corpus would be to implement a type of link between sentences of the same language that would automatically share all links to other sentences (because it is not possible for a translation to be appropriate for one sentence but inappropriate for one that differs only in spelling).
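
The idea of a same-language link that shares all translation links can be pictured with a small sketch. This is only an illustration under stated assumptions, not how Tatoeba stores its data: the class `SentenceGraph`, its method names, and the sentence ids are all hypothetical. Variant sentences are grouped with a union-find structure, and a lookup returns every translation attached to any member of the group.

```python
from collections import defaultdict


class SentenceGraph:
    """Toy model (hypothetical): translation links plus same-language variant links."""

    def __init__(self):
        self._parent = {}               # union-find parents for variant groups
        self._links = defaultdict(set)  # sentence id -> directly linked sentence ids

    def _find(self, sid):
        # Return the representative of sid's variant group (with path halving).
        self._parent.setdefault(sid, sid)
        while self._parent[sid] != sid:
            self._parent[sid] = self._parent[self._parent[sid]]
            sid = self._parent[sid]
        return sid

    def add_translation(self, a, b):
        # Ordinary two-way translation link between two sentences.
        self._links[a].add(b)
        self._links[b].add(a)

    def add_variant(self, a, b):
        # Mark a and b as differing only in orthography.
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def variants(self, sid):
        # All sentences in the same variant group as sid, including sid itself.
        root = self._find(sid)
        return {s for s in list(self._parent) if self._find(s) == root}

    def translations(self, sid):
        # Translations shared across the whole variant group of sid.
        group = self.variants(sid)
        direct = set()
        for member in group:
            direct |= self._links[member]
        shared = set()
        for linked in direct:
            shared |= self.variants(linked)
        return shared - group


graph = SentenceGraph()
graph.add_variant("en-color", "en-colour")
graph.add_translation("en-color", "fr-couleur")
graph.add_translation("en-colour", "de-farbe")
print(sorted(graph.translations("en-colour")))  # ['de-farbe', 'fr-couleur']
```

Each sentence stays in the corpus as entered, so nothing is merged or lost; only the lookup treats the group as one node.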

Zifre, 16 February 2012, 10:36:45 PM UTC

I think they should be totally merged, not just linked. Simplified and traditional characters are just two orthographies for the same language.

Zifre, 13 February 2012, 11:06:52 PM UTC

I think both orthographies are commonly used.

Zifre, 12 February 2012, 12:23:32 AM UTC

Private lists are public but can only be modified by the owner. (I think the feature is poorly named.)

Zifre, 10 February 2012, 12:46:59 AM UTC

On Linux, one can also use the Compose key.

Zifre, 10 February 2012, 12:40:42 AM UTC

+1

It is amazing how off-topic this has become...

But I agree. Just "Taiwan" is pretty neutral. I don't think it should be "Republic of China" or anything like that.

Zifre, 7 February 2012, 3:07:24 AM UTC

Capitalization is not distinctive in most poetry. (Line breaks are, however.) So I think it should be changed to fit normal rules for Tatoeba.

However, certain poets (e.e. cummings comes to mind) used capitalization in unusual ways. In those cases, I'm not really sure what we should do.

Zifre, 5 February 2012, 7:30:13 PM UTC

http://tatoeba.org/spa/wall/sho...#message_11256

Zifre, 5 February 2012, 6:08:15 PM UTC

There seems to be some consensus that the "lie" tag should be renamed to something like false, not true, untrue, misleading, etc.

Zifre, 4 February 2012, 3:42:31 PM UTC

Well, I think most "false" statements will not change too often.

For the ones that do, it's not super critical that we catch them immediately. (Sentences that were at one point true are probably the false sentences that are least harmful in a corpus where you are trying to filter out all false sentences.)

Zifre, 4 February 2012, 3:31:39 PM UTC

Yeah, I meant to say something about this.

I think the tag should be restricted to sentences with relatively unambiguous references. Tom, Mary, you, and I could be anybody.

As for time, the tag should reflect the truth at the present time. So we can remove and re-add it as necessary.

Zifre, 4 February 2012, 3:13:50 PM UTC

What do you mean?

Zifre, 4 February 2012, 3:13:18 PM UTC

> Anyway, even for all those uses you all describe, it seems obvious to me that what you want is a "false" or "not true" tag, not a "lie" tag. Not because of offensiveness, but simply because not everything that is false is a lie.

Yes, I do think it should be renamed.

Zifre, 4 February 2012, 4:02:46 AM UTC

> I just think it is a useless tag. Who cares about whether Tatoeba example sentences happen to be true or false? It says "example sentences" on the title page, not "wisdoms to guide people through their lives".

It may not be useful for you or even for anyone using the Tatoeba web interface. But we all seem to forget the goal of Tatoeba is to produce a high quality corpus of sentences that can be used for a variety of purposes. Some of these purposes may require filtering out sentences that are false, which becomes a lot easier when you have a tag for it.
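
To make the filtering point concrete, here is a minimal sketch of how a corpus user might drop tagged sentences. The data shapes (a dict of sentence ids to text and a list of `(sentence_id, tag)` pairs), the function name `filter_out_tag`, and the tag name "false" are assumptions for illustration only; they are not a description of the actual Tatoeba exports.

```python
def filter_out_tag(sentences, tags, unwanted):
    """Return the sentences that do not carry the unwanted tag.

    sentences: dict mapping sentence id -> text (hypothetical shape)
    tags: iterable of (sentence id, tag name) pairs (hypothetical shape)
    unwanted: tag name to exclude, compared case-insensitively
    """
    target = unwanted.casefold()
    flagged = {sid for sid, tag in tags if tag.casefold() == target}
    return {sid: text for sid, text in sentences.items() if sid not in flagged}


# Made-up sample data, not real corpus entries.
sentences = {
    1: "Water boils at 100 degrees Celsius at sea level.",
    2: "The Moon is larger than the Earth.",
    3: "I ate a sandwich yesterday.",
}
tags = [(2, "False"), (3, "food")]

kept = filter_out_tag(sentences, tags, "false")
print(sorted(kept))  # [1, 3]
```

Without the tag, the same filtering would need someone to read and judge every sentence by hand.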

Zifre, 4 February 2012, 3:58:58 AM UTC

I think a comment should always be added, but tags make it easy for someone using the Tatoeba corpus to filter out the sentences with lies, if that is what they desire.

Zifre, 4 February 2012, 3:56:28 AM UTC

I think the "Lie" tag is useful, but like CK, I think it should be renamed. (I'd suggest "false" or "not true".)

I don't think the tag is necessary for sentences like "Chuck Norris is a platypus." or "2 + 2 = 5". However, there is nothing wrong with using it in these cases. I wouldn't encourage it though.

The tag is most useful for things that sound like they could be facts, but aren't. If I said:

"According to an article in a recent issue of Scientific American, there is a 96% chance that there is intelligent life within 1,000 light-years of Earth."

it seems possible that some people might interpret it as fact even though I just made it up. They might tell their friends or even just waste five minutes looking it up to see if it's true. This could be prevented with a simple tag.

The tag should NOT be used for things like "I ate a sandwich yesterday." even if it is indeed false for the author. If you really feel the need to let everyone know that you didn't eat a sandwich, do it in a comment.

So, here are my criteria for using this tag:

1) It must be objective - nothing that is reasonably disputable
2) It must not be about the author
3) It must not be an inside joke (sentences about Tatoeba members probably count)
4) It should not be obviously false

Zifre, 24 January 2012, 10:47:40 PM UTC

Cool, thanks. Marcelo is right though, it should be "@needs native check".

Zifre, 21 January 2012, 10:07:37 PM UTC

I don't think they're terribly useful. I would prefer if we removed them for now, and possibly re-added that information when we get metadata (but only if users can provide compelling use cases). Hopefully, it would be automated in most cases. (It is easy for a computer to count syllables in any language with a phonemic orthography, and counting words in a language with spaces is trivial. However, it would get tricky when things like numbers and foreign names are added.)
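
As a rough sketch of how such counts could be automated, the example below counts words by splitting on whitespace and counts syllables for Esperanto, where each of the five vowel letters marks one syllable. The function names are hypothetical, and Esperanto is chosen here only as a convenient example of a phonemic orthography.

```python
ESPERANTO_VOWELS = set("aeiou")  # "j" and "ŭ" are not syllabic


def count_words(text):
    # Whitespace splitting: only meaningful for languages written with spaces.
    return len(text.split())


def count_syllables_esperanto(text):
    # One syllable per vowel letter; digits and symbols contribute nothing.
    return sum(1 for ch in text.casefold() if ch in ESPERANTO_VOWELS)


print(count_words("Mi amas vin."))                      # 3
print(count_syllables_esperanto("Mi amas vin."))        # 4
print(count_syllables_esperanto("Mi havas 3 katojn."))  # 5
```

The last call shows the tricky case mentioned above: the digit "3" is pronounced "tri" but adds nothing to the count, so numerals would have to be spelled out before counting. Foreign names that do not follow the orthography would be miscounted in a similar way.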