Our data is released under various Creative Commons licenses.
Wall (5241 threads)
Three duplicates of the exact same phrase, "Näkemiin!". They have been there for a while now.
https://tatoeba.org/fin/sentences/show/752738
https://tatoeba.org/fin/sentences/show/7522877
https://tatoeba.org/fin/sentences/show/7522879
I was under the impression that the Horus-bot would combine these. Is there a problem with "ä" or did I misunderstand how the bot works?
As far as I understand, they should be linked. I linked them now. Please check if Horus will delete the oldest sentences now.
So Horus only notices linked duplicates? That makes sense, also from a performance point of view. Thanks.
But there must be another explanation. Horus still doesn't "see" the sentences.
2019-01-11 19:21
It was reported a while ago that Horus missed some duplicates: https://github.com/Tatoeba/tatoeba2/issues/1722
So I guess we'll have to wait for the issue to be fixed, likely after the code migration is complete and everything is stable again.
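The question about "ä" raised earlier in this thread can be illustrated with a minimal sketch. This is not how Horus actually works (the thread does not say, and the linked issue is the place to find out); it only shows one way an exact-text comparison could miss duplicates if the same letter is stored in two Unicode forms, and how normalizing to NFC before comparing avoids that. The function name `find_duplicates` and the encodings in the sample data are assumptions made for illustration.

```python
import unicodedata
from collections import defaultdict


def find_duplicates(sentences):
    """Group sentence IDs whose language and NFC-normalized text match.

    `sentences` is an iterable of (sentence_id, lang, text) tuples.
    Returns a sorted list of ID lists, one per group of duplicates.
    """
    groups = defaultdict(list)
    for sentence_id, lang, text in sentences:
        # NFC turns "a" + combining diaeresis into the single character "ä",
        # so both spellings end up with the same key.
        key = (lang, unicodedata.normalize("NFC", text))
        groups[key].append(sentence_id)
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


# Hypothetical sample: the IDs come from the thread, but the encodings are
# made up here; how the real sentences are stored is not known.
sample = [
    (752738, "fin", "N\u00e4kemiin!"),    # precomposed "ä" (U+00E4)
    (7522877, "fin", "Na\u0308kemiin!"),  # "a" followed by U+0308
    (7522879, "fin", "N\u00e4kemiin!"),
    (1, "fin", "Hei!"),                   # made-up non-duplicate
]
duplicate_groups = find_duplicates(sample)
```

Without the `normalize` call, the second sample sentence would get a different key from the other two and would not be grouped with them.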
2019-01-10 08:49
Happy birthday, Denis (deniko)! З днем народження! 😊
2019-01-05 14:45
Automatic language detection is being polluted by languages that have only a small number of sentences.
I think languages that have not yet reached a certain mass of sentences should be excluded from detection, so that the identification algorithm can rely on more significant samples of segments.
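The proposal above, excluding languages below a minimum corpus size from the candidate set, can be sketched roughly as follows. This is a toy character-trigram scorer, not Tatoeba's actual detector, and the names `detect`, `min_sentences`, the language code `tiny`, and the sample corpora are all hypothetical. It shows one way a naive scorer could favor a language with very few sentences: a profile built from a single short sentence is highly concentrated, so it scores very well on similar input.

```python
from collections import Counter


def trigrams(text):
    """Return the character trigrams of the lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def build_profile(sentences):
    """Relative trigram frequencies for one language's sample sentences."""
    counts = Counter(t for s in sentences for t in trigrams(s))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def detect(text, corpora, min_sentences):
    """Return the best-scoring language among those with enough sentences.

    Languages with fewer than `min_sentences` samples are skipped.
    Returns None if no language qualifies.
    """
    best_lang, best_score = None, -1.0
    for lang, sentences in corpora.items():
        if len(sentences) < min_sentences:
            continue  # too few samples: not a candidate
        profile = build_profile(sentences)
        score = sum(profile.get(t, 0.0) for t in trigrams(text))
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang


# Made-up corpora: "tiny" stands for a language with a single sentence.
corpora = {
    "fra": ["Oui, je viens.", "Je ne sais pas.", "Merci beaucoup."],
    "tiny": ["Oui."],
}
without_threshold = detect("Oui.", corpora, min_sentences=1)
with_threshold = detect("Oui.", corpora, min_sentences=3)
```

With no effective threshold the one-sentence language wins; with `min_sentences=3` it is no longer a candidate. Whether the real detector fails for this reason is not established in the thread.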
I've created a ticket on GitHub: https://github.com/Tatoeba/tatoeba2/issues/1731
Do you have some examples of sentences whose language was detected incorrectly?
Language detection can never be perfect, but it can surely be improved. Larger languages are affected too. I recently had problems with Portuguese and Esperanto.
And most recently...
https://tatoeba.org/fra/sentences/show/7699348
2019-01-06 13:44
Nearly all of the short French sentences I have created recently were misidentified as improbable languages: Chavacano, Bavarian, ...
So I suspect that it is the small sample size of these languages that causes them to be picked as candidates...
2019-01-06 13:47
This sentence has just been identified as Emilian...
https://tatoeba.org/fra/sentences/show/7699792
2019-01-10 18:26 - 2019-01-10 18:27
OK, now it's getting worse and worse: fairly long French sentences are being recognized as English (!) or Interlingua.
I think this can only be a hoax...
** Stats - 2019-01-05 - English Sentences on List 907 Translated by Native Speakers **
http://tatoeba.ueuo.com/stats-2019-01-05.html
This shows the percentage of English sentences on List 907 translated into each language that has identified native speakers working here.
** Stats - 2019-01-05 - Username & Number of Audio by CK **
http://tatoeba.ueuo.com/stats-190105-audio.html
See how many English sentences by various members have audio files by CK.
How is it even possible for a sentence to have 0 direct translations and a few indirect ones?
https://i.imgur.com/8qXBncU.png
https://tatoeba.org/eng/sentences/show/6116885
According to the logs, it was added as a translation for this:
https://tatoeba.org/eng/sentences/show/6116872
That sentence doesn't exist for some reason, which might be part of the explanation, but it's still a weird bug. Also, why doesn't #6116872 exist? There is nothing in the logs to indicate that it was deleted.
2019-01-05 18:46 - 2019-01-05 22:21
The deleted sentence #6116872 still shows up in other sentences' logs as being a direct link (for example in #6117014).
There are quite a lot of sentences that seem to have been deleted improperly; they probably got lost during database problems (e.g. #6087670, #6116982, #6116989).
Well, this doesn’t really belong here, but I tried to find out how many such dead links exist. I couldn't, because to my surprise, the downloadable ‘Links’ file contains thousands and thousands of sentence numbers that aren’t in the ‘Sentences’ file. The ‘Links’ file seems to keep the sentence numbers of all the deleted sentences that had links, but also of red sentences (I heard that these are excluded from download), and quite a lot of numbers that have no logs at all.
I wonder how these numbers still make their way into the export files.
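A count like the one attempted above could be sketched as follows. This assumes the exports are tab-separated text, with the sentence ID in the first column of the ‘Sentences’ file and two sentence IDs per row in the ‘Links’ file; check the download page for the actual layout before relying on it. The function names and the sample data are made up for illustration.

```python
import csv
import io


def read_sentence_ids(sentences_file):
    """Collect the IDs found in the first column of a tab-separated file."""
    reader = csv.reader(sentences_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {int(row[0]) for row in reader if row}


def find_dangling_ids(links_file, sentence_ids):
    """Return every ID used in the links file that is not a known sentence."""
    dangling = set()
    reader = csv.reader(links_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue
        for value in row[:2]:
            sentence_id = int(value)
            if sentence_id not in sentence_ids:
                dangling.add(sentence_id)
    return dangling


# Made-up sample data: sentence 99 appears in links but not in sentences.
sentences_data = io.StringIO("1\teng\tHello.\n2\tfra\tBonjour.\n")
links_data = io.StringIO("1\t2\n2\t1\n1\t99\n99\t1\n")

known_ids = read_sentence_ids(sentences_data)
dangling_ids = find_dangling_ids(links_data, known_ids)
```

With real exports, the `io.StringIO` objects would be replaced by files opened with `open(path, encoding="utf-8", newline="")`.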
Thanks for that, and especially for this bit:
> to my surprise, the downloadable ‘Links’-file contains thousands and thousands of sentence numbers that aren’t in the ‘Sentences’-file
So this actually explains how a sentence can have no direct translations but still have indirect ones. It also means there could be many sentences whose indirect translations are not translations of any of their visible direct translations (sorry for the convoluted phrasing).
The indirect translations are presumably reconstructed from the links database, and if that database contains links to non-existent sentences, then different sentences can end up indirectly linked via deleted sentences.
Before your explanation I hadn't been able to imagine how that might have happened, so I had been thoroughly confused.
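The explanation above can be made concrete with a small sketch. It assumes, as the post hypothesizes, that indirect translations are computed as two-hop neighbors in the links table without checking that the middle sentence still exists; the thread does not confirm that the site works this way. The function name `translations` and the IDs are made up.

```python
from collections import defaultdict


def build_graph(links):
    """Map each sentence ID to the set of IDs it links to."""
    graph = defaultdict(set)
    for source, target in links:
        graph[source].add(target)
    return graph


def translations(sentence_id, links, existing_ids):
    """Return (direct, indirect) translation IDs that can be displayed.

    The intermediate hop is NOT required to exist, which is the
    assumed behavior being illustrated.
    """
    graph = build_graph(links)
    neighbours = graph.get(sentence_id, set())
    direct = neighbours & existing_ids
    reachable = set()
    for middle in neighbours:  # may include deleted sentences
        reachable |= graph.get(middle, set())
    indirect = (reachable & existing_ids) - direct - {sentence_id}
    return direct, indirect


# Made-up data: sentence 99 was deleted, but its links were left behind.
links = [(1, 99), (99, 1), (2, 99), (99, 2), (3, 99), (99, 3)]
existing_ids = {1, 2, 3}
direct, indirect = translations(1, links, existing_ids)
```

Here sentence 1 ends up with no direct translations and two indirect ones, which matches the situation shown in the screenshot.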
I noticed that Catalan in the Portuguese UI is called Calalão instead of Catalão.
https://i.imgur.com/qyT3OIb.png
Can someone fix this?
I've fixed this - http://prntscr.com/m3iz0v
Now Trang will have to update the translations.pot. She usually does that on Sunday nights.
2019-01-04 00:00 - 2019-01-04 00:07
The usual forms are „this form“ and »this form« (inverted guillemets, so-called chevrons). The first form is also used in handwriting; in books, especially novels and the like, the second is mostly used, as you have already noticed. The reason is probably that it leaves fewer gaps in the text, which makes it look more even and more pleasant to read. At Tatoeba, however, we have agreed on the familiar form for the German corpus.
Thanks for the detailed answer! : )
And what about dashes?
2019-01-05 00:08
What exactly do you want to know? 🙂
Are they really used in dialogues before the speakers' sentences?
And if so, do they have to be at the start of lines?
Could you perhaps write a short dialogue the way it should look, not like here, where everything is written one after another without real line breaks?
2019-01-06 02:44
Between sentences, the dash marks a change of speaker or topic. In novels and the like, you often see a new paragraph started instead:
Der fremde Bürgersmann staunte. „Du eine Mörderin?“ rief er zweifelnd. „Unmöglich! Zwar sind die Menschen aller Ränke und Bosheiten fähig, das weiß ich wohl, doch siehst du nicht wie eine Mörderin aus.“
„Und doch ist es so“, erwiderte sie trübsinnig. „Wenn Ihr mögt, erzähle ich Euch meine Geschichte.“
„So sprich!“ forderte der Berggeist.
That is from a randomly picked book, which incidentally uses these very quotation marks. In the single-line Tatoeba format, you would have to put dashes in place of the paragraph breaks.
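The conversion described above, dashes in place of paragraph breaks, could be sketched like this. The function name `to_single_line` is made up, and the choice of an en dash with a space on each side as the separator is an assumption about the desired typography, not a Tatoeba rule.

```python
def to_single_line(dialogue, separator=" – "):
    """Join the non-empty paragraphs of a dialogue into a single line.

    Each paragraph break is replaced by `separator`, and whitespace
    inside a paragraph is collapsed to single spaces.
    """
    paragraphs = [
        " ".join(paragraph.split())
        for paragraph in dialogue.splitlines()
        if paragraph.strip()
    ]
    return separator.join(paragraphs)


# Two paragraphs from the passage quoted in this thread.
dialogue = (
    "„Und doch ist es so“, erwiderte sie trübsinnig.\n"
    "„So sprich!“ forderte der Berggeist.\n"
)
single_line = to_single_line(dialogue)
```

The result keeps the German quotation marks untouched and only replaces the line break between the two speakers with the dash.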
Is it possible to use OR in the owner field when searching sentences?
Motivation: Sometimes one would like to search among the sentences of several specific users. Of course one can run a separate search for each user, but it would be better to be able to do it in a single search.
2019-01-03 21:56
This was requested before.
https://github.com/Tatoeba/tatoeba2/issues/797
I also requested a similar feature.
https://tatoeba.org/eng/wall/sh...#message_29459
Having both of them would be very useful. This way, typing 'A, B' into the owner field would give results only from the users A and B, while typing '-A, -B' would exclude results from them.
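The requested syntax could be parsed along these lines. This is only a sketch of the idea in the post, not an existing Tatoeba feature, and the names `parse_owner_filter` and `owner_matches` are hypothetical. It assumes usernames never start with a hyphen and never contain a comma.

```python
def parse_owner_filter(value):
    """Split an owner field such as 'A, B' or '-A, -B' into two sets.

    Returns (include, exclude): names to restrict results to, and
    names whose sentences should be left out.
    """
    include, exclude = set(), set()
    for token in value.split(","):
        name = token.strip()
        if not name:
            continue
        if name.startswith("-"):
            excluded_name = name[1:].strip()
            if excluded_name:
                exclude.add(excluded_name)
        else:
            include.add(name)
    return include, exclude


def owner_matches(owner, include, exclude):
    """Decide whether a sentence owned by `owner` passes the filter."""
    if owner in exclude:
        return False
    # An empty include set means "any owner not explicitly excluded".
    return not include or owner in include


only_a_b = parse_owner_filter("A, B")
not_a_b = parse_owner_filter("-A, -B")
```

For example, `owner_matches("C", *not_a_b)` is true while `owner_matches("C", *only_a_b)` is false.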
** Happy New Year, my friends! **
2019 is coming up. I'd like to thank Tatoeba and its members for everything that was done in 2018. Now it's time to celebrate a New Year, to make Tatoeba bigger!
Happy New Year!!!
https://tatoeba.org/eng/sentences/show/361350