Wall (7,144 threads)

Tips

Before asking a question, make sure you have read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.


CK, 12 June 2010, 4:39:35 PM UTC (edited 25 October 2019, 8:09:35 AM UTC)

[not needed anymore- removed by CK]

TRANG, 13 June 2010, 12:40:45 AM UTC

Well, the definition of "sentence" is still not exactly clear, to me at least. I mean, sometimes a "sentence" can be just a word... Like "Hello".

One thing is sure: you should avoid adding something that is clearly a partially formed sentence. Instead of "to be in love", you should add "He is in love" (for instance).

But in general, we are not very strict on the matter of what is accepted or not because we haven't decided yet what is a sentence.

We have only decided that a sentence has punctuation :)

The other problem is that I actually wouldn't even know what word to use instead of "sentence"...

blay_paul, 13 June 2010, 6:24:04 AM UTC

> The other problem is that I actually wouldn't even know
> what word to use instead of "sentence"...

I think 'sentence' is a useful approximation, especially if you take off your grammarian hat. ;-)

hamid, 9 June 2010, 11:09:46 AM UTC

I thought Tatoeba supported my native language (Farsi).
But now I know it was just a thought.

hamid, 9 June 2010, 2:31:03 PM UTC

Thanks. So I'll start adding some sentences, and after a while I'll ask you to add my language to your list.
I think this is better, because I'm not that active and don't have much free time.

sysko, 9 June 2010, 2:33:15 PM UTC

No problem. Even if there are only a dozen, that's enough; most of the time it's enough to attract more people to contribute in your language :)

hamid, 9 June 2010, 2:47:39 PM UTC

Ok. But could you tell me how to stick a flag?

sysko, 9 June 2010, 2:50:12 PM UTC

We will add the flag to the interface at the same time as the language itself.

Pharamp, 9 June 2010, 2:39:32 PM UTC

Hi Hamid :)
Which flag do you think should be used?
Iran, Afghanistan...?

hamid, 9 June 2010, 2:43:26 PM UTC

Of course Iran (Farsi).
Afghanistan is Pashto.

blay_paul, 9 June 2010, 11:31:48 AM UTC

You can add sentences now and correct the flag when it's added to the list. Probably wouldn't take long.

sysko, 9 June 2010, 1:14:17 PM UTC

Hi hamid, the fact that your language is not in the list doesn't mean we don't want it or won't support it. We add a language to the list as soon as we have some sentences in it; otherwise the list would be full of hundreds of languages with 0 sentences, which would not be comfortable for users.
But if you're ready to add some sentences in your language, we will be glad to add it to the list :)

TRANG, 13 June 2010, 12:28:34 AM UTC

Your language has been added :)

You can now set your sentences to "Persian".

Little note: for the name of the language we used "Persian" and not "Farsi", as Wikipedia says it's "the more widely used name of the language in English".

http://en.wikipedia.org/wiki/Persian_language

xtofu80, 3 June 2010, 8:12:20 AM UTC

I am not sure whether this is worth discussing, but there are some sentences which are really redundant, e.g.
162883, 83091, two rather long sentences which only differ in the subject being "my mom" vs. "my dad".
Shouldn't we remove one of such pairs and concentrate on the gist instead of wasting our efforts on translating countless variants?
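
One way to surface pairs like these for review is a simple string-similarity pass. The sketch below is only an illustration: it uses Python's standard `difflib.SequenceMatcher`, the example sentences and the 0.9 threshold are made up, and it is not a description of how Tatoeba itself handles duplicates.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(sentences, threshold=0.9):
    """Return index pairs whose similarity ratio is at least `threshold`.

    `threshold` is an arbitrary illustrative cutoff, not a Tatoeba setting.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(sentences), 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical sentences standing in for a "my mom" / "my dad" pair.
examples = [
    "My mom bought me a bicycle for my birthday.",
    "My dad bought me a bicycle for my birthday.",
    "The weather is nice today.",
]
print(near_duplicates(examples))
```

A pass like this only flags candidates; whether a flagged pair is clutter or a useful variation is still a human decision.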

CK, 12 June 2010, 3:31:00 AM UTC (edited 25 October 2019, 8:09:42 AM UTC)

[not needed anymore- removed by CK]

xtofu80, 12 June 2010, 10:43:16 AM UTC

Hi CK,
I completely agree with your notion of near duplicates versus clutter.
I think that besides "dealing" with clutter that already exists, we should also put some effort into guidelines about creating new content.

TRANG, 11 June 2010, 7:08:29 PM UTC

Okay I haven't replied to this yet so I will, to make it clear about "variations" of sentences.

Our position is: people can do whatever they like. If they want to add all the possible variations, they can. If they don't want to, they don't have to.

It doesn't hurt to have "near duplicates". It just makes Tatoeba a bit noisy. But that's our job, as engineers, to figure out how to filter and organize data so that it can be used efficiently by language learners.

Meanwhile, as sysko said, variations of sentences can be very useful for language processing, so we shouldn't delete them.

blay_paul, 11 June 2010, 7:43:26 PM UTC

Just to clarify the clarification. Near duplicates will be removed from WWWJDIC - but not by deleting them from Tatoeba. So feel free to point out Japanese sentences and English sentences linked to Japanese sentences that are near duplicates.

xtofu80, 19 June 2010, 1:26:52 PM UTC

Hi Paul, I saw you always post a comment "Not for WWWJDIC" in each sentence. Shouldn't that be solved by using tags?

blay_paul, 19 June 2010, 1:33:44 PM UTC

I could, but I started doing that before tags existed.

It also gives people a chance to notice what sentences I'm excluding and ask why (or just complain ;-).

xtofu80, 19 June 2010, 1:53:38 PM UTC

So do you filter the sentences according to your comment, or do you mark them somewhere else AND put a comment in?
I just want to know how we should approach sentences we find should not appear there (e.g. hiragana-kanji variants of exactly the same sentence.)

blay_paul, 19 June 2010, 2:08:01 PM UTC

In the secret sentence annotation page, where the Japanese index can be entered / edited, I put -1 in the meaning field.

No one else can see that so the note is just to let people know what I'm doing (generally excluding near-duplicate sentences from WWWJDIC).

sysko, 3 June 2010, 10:34:31 AM UTC

On the other hand, I'm working with another guy on a machine-learning-based automated translator, and these kinds of "near" duplicate sentences are REALLY useful.

sysko, 3 June 2010, 10:37:24 AM UTC

In fact, as a learner I also sometimes like to find this kind of sentence where only one part changes; it's easier to see some grammar points this way (for example, in French sentences, changing "my mom" to "my dad" could change the verbs, adjectives and so on, and it is always interesting to see this variation on the same sentence).

Swift, 3 June 2010, 5:23:29 PM UTC

On this point, I've chosen to add these nuances in comments. Otherwise there are just going to be way too many similar sentences.

blay_paul, 3 June 2010, 9:24:01 AM UTC

> Shouldn't we remove one of such pairs and concentrate on
> the gist instead of wasting our efforts on translating
> countless variants?

There is a constant effort to remove near-duplicates. At the current rate we're probably losing a couple of dozen a week, if not more.

However removing duplicates does not produce _new_ content. And new content is what's needed to fill out Tatoeba and make it more appealing.

xtofu80, 3 June 2010, 10:06:29 AM UTC

Yes, you are right, producing new content is also important, though I as a native German speaker am right now mostly busy with adding German translations to the already existing Jap-Eng. sentence pairs. And that's when I came across these near-duplicates.
Currently I am thinking about how I could involve my Japanese language exchange partner to produce some content. At least, I will check with her some sentences I found dubious.

So what would be the best procedure if I come across such a sentence pair? Make a comment? Add it to the "mark for deletion" list?

sysko, 3 June 2010, 10:44:48 AM UTC

Moreover, I think the problem here is not whether or not to have these countless variants (for the reasons below I would prefer to keep them), but rather "how to show contributors only the 'useful' sentences".

blay_paul, 10 June 2010, 1:00:04 PM UTC

Please could we have the duplicate removal script run soon? (Before Saturday, anyway)

Pharamp, 10 June 2010, 5:32:08 PM UTC

I would also like to ask for a manual update of the Launchpad translations (in the Tatoeba > Launchpad direction) so that all the new stuff can be translated :) thanks ^^

TRANG, 11 June 2010, 12:55:13 AM UTC

Done.

CK, 10 June 2010, 1:29:24 AM UTC (edited 25 October 2019, 8:10:02 AM UTC)

[not needed anymore- removed by CK]

TRANG, 10 June 2010, 11:36:05 AM UTC

But that would imply that everyone uses Firefox (or has to) ^^

Also, since Tatoeba has a broader public than students in a classroom, it wouldn't necessarily be a good thing to drop the furigana. A lot of people would rather have something that is 80-90% accurate than not having anything at all, because it saves them time. And in the end, it's people's own responsibility to decide whether they want something perfect or not.

Generating furigana is not what slows down Tatoeba the most, you wouldn't see a difference in speed if we took it out, so it wouldn't justify that we take it out.

But as a teacher, you can (and actually *must*) educate your students not to rely on the furigana line, and use Rikaichan instead. It's not incompatible. I use it myself (and always did) whenever I want to figure out the reading of a Japanese sentence in Tatoeba, despite the fact that the reading is already displayed.

CK, 9 June 2010, 5:05:24 PM UTC (edited 25 October 2019, 8:10:08 AM UTC)

[not needed anymore- removed by CK]

sysko, 9 June 2010, 5:21:26 PM UTC

Once tags are added, we will be able to do the following:
use tags such as "unsuitable_for_children" and maintain a list of tags that will not be accessible to non-users or to users who have not activated the "unsafe search" option.
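
The idea could look roughly like the sketch below. Everything in it is hypothetical: the `RESTRICTED_TAGS` set, the `logged_in` and `unsafe_search` flags and the sentence records are illustrative names, not Tatoeba's actual schema or code.

```python
# Hypothetical list of tags hidden from visitors and from users
# who have not switched on the "unsafe search" option.
RESTRICTED_TAGS = {"unsuitable_for_children"}


def visible_sentences(sentences, logged_in=False, unsafe_search=False):
    """Filter out sentences carrying a restricted tag unless allowed."""
    if logged_in and unsafe_search:
        return list(sentences)
    return [s for s in sentences if not (set(s["tags"]) & RESTRICTED_TAGS)]


# Made-up records for illustration.
sample = [
    {"id": 1, "text": "Hello.", "tags": []},
    {"id": 2, "text": "(some crude sentence)", "tags": ["unsuitable_for_children"]},
]
print([s["id"] for s in visible_sentences(sample)])
print([s["id"] for s in visible_sentences(sample, logged_in=True, unsafe_search=True)])
```

Keeping the restricted tags in one list means a new tag can be hidden without touching the filter itself.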

Pharamp, 10 June 2010, 10:24:18 AM UTC

yes
me, I'm very, very little

CK, 10 June 2010, 9:21:49 AM UTC (edited 25 October 2019, 8:09:55 AM UTC)

[not needed anymore- removed by CK]

sysko, 10 June 2010, 9:33:26 AM UTC

Simply because we haven't yet found the time to rewrite the hint page :P

Demetrius, 9 June 2010, 5:58:33 PM UTC

If okurigana is incorrect, can we just file this as bugs in MeCab?

blay_paul, 9 June 2010, 6:57:14 PM UTC

In theory. However they are not really bugs in MeCab, but problems resulting from the dictionary used with MeCab. The dictionary used can both be selected (from a very short list ;-) and can be altered or aided by user-defined dictionaries.

So really what it needs is someone familiar with MeCab to find the best dictionary available and to add fixes for the problems noted.

However it probably is never going to be possible to be 100% accurate so to get the best results manual corrections will be needed at some point.

JimBreen, 10 June 2010, 8:44:49 AM UTC

It's not that simple.

Consider the sentence: 君たちの訳文と黒板の訳を比較しなさい。 MeCab suggests わけ as the reading of the solo 訳, whereas we all know it's やく. MeCab's usual dictionary (NAIST-JDIC) has both versions of 訳, and no amount of adding dictionaries is going to "fix" it. MeCab uses some very sophisticated AI to segment sentences, and the dictionaries have parameters derived from training on hand-segmented texts. The trouble is that ...の訳を... could be either, and you need the context of the whole sentence to decide which is which. In fact the weightings for 訳/わけ and 訳/やく as solo lexemes are the same. You could probably fiddle the weights on 訳 to make it produce やく, but most of the solo appearances of 訳 in Tatoeba are in fact わけ.

There is a whole research field of Word Sense Disambiguation (WSD) working on problems related to this, but I don't think there are any packaged solutions for Japanese that can be plugged into Tatoeba. Just be grateful we have MeCab - 20 years ago automatic Japanese segmenters were thought to be impossible to build.

Demetrius, 9 June 2010, 6:43:11 PM UTC

*furigana

blay_paul, 9 June 2010, 10:38:48 AM UTC

Help wanted!

I'm looking for someone to help with Tatoeba / WWWJDIC integration. Specifically I'd like someone with database / web experience to work on tools for validating / completing the index data needed to link WWWJDIC dictionary entries to Tatoeba example sentences. If you're interested post here for more details or PM me.

saeb, 8 June 2010, 4:53:40 AM UTC

revisiting sentence variation...

I came across a sentence in Arabic (thanks to qahwa's comment)... one of my own sentences :D ... It shows a property of the Arabic script that I want to document with 3 or 4 variations:

هل استلمتَ الرسالة؟
Did you (male) receive the letter?

هل استلمتِ الرسالة؟
Did you (fem) receive the letter?

هل استلمَتْ الرسالة؟
Did she receive the letter?

Without the harakaat (vowel marks) they're all written the same.
Now if I do add them, I'm afraid they'll just get reported as similar sentences and get merged/deleted/etc... (I mean the English sentences, of course)

what's tatoeba's 'official' statement on how to deal with this?
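
The collision is easy to reproduce. The sketch below, in Python, strips the vowel marks from the three sentences above by dropping Unicode nonspacing combining marks (category `Mn`) and checks how many distinct strings remain. The helper name `strip_harakat` is made up for this illustration, and it assumes the marks are stored as separate combining characters, as they are in the text above.

```python
import unicodedata


def strip_harakat(text):
    """Drop nonspacing combining marks (Unicode category Mn), which
    includes the Arabic harakat such as fatha, kasra and sukun."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


sentences = [
    # Did you (male) receive the letter?
    "هل استلمتَ الرسالة؟",
    # Did you (fem) receive the letter?
    "هل استلمتِ الرسالة؟",
    # Did she receive the letter?
    "هل استلمَتْ الرسالة؟",
]

stripped = {strip_harakat(s) for s in sentences}
print(len(set(sentences)), "distinct sentences with vowel marks")
print(len(stripped), "distinct sentence(s) without them")
```

Any duplicate check that normalizes text this way would treat the three sentences as one, which is exactly the merge risk described above.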

brauliobezerra, 8 June 2010, 3:01:09 PM UTC

Similar things happen in Portuguese

Esse é seu brinquedo.

can be

[Hey, you, ]
This is your toy.

[Bob likes to play.]
This is his toy.

[Mary likes to play.]
This is her toy.

[Tex the armadillo likes to play.]
This is its toy.

There are unambiguous ways to say these sentences in Portuguese, but they are not used that often.

MUIRIEL, 8 June 2010, 8:41:18 PM UTC

I don't think that this is the same problem as in Arabic.
It doesn't cause problems when Portuguese duplicates like your example are merged. But in Arabic it does. The example that saeb posted is the same *without* vowel marks, but with vowel marks it's no longer the same, and the pronunciation isn't the same either. So it *looks* like a duplicate, but it isn't.

Demetrius, 9 June 2010, 4:07:28 PM UTC

I don’t know about Arabic, but I personally prefer to differentiate sentences that are different in speech.

E.g. in Russian, Belarusian and Ukrainian one doesn’t normally mark stress, but when it’s important, I do (as in the case with зáмок/замóк in sentences No. 385729 and 385728).