Wall (7,017 threads)
Tips
Before asking a question, make sure to read the FAQ.
We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.
I may have breached the website's rules with my previous account, but this time, I just don't get it. Tatoeba certainly needs to clarify this both to me and the rest of the contributors and users of this website.
If you suspended me for posting sentences with the Amazigh name "Ziri," I find this (and I'm sure that others would also find that) totally unfair given the fact that others give themselves the right to post hundreds of thousands of sentences with the name "Tom."
Just for your knowledge (everyone): I was encouraged to settle this issue with the admins via private messages, but it turned out that it wasn't worth it since several of my messages were ignored or made fun of.
Again, this is a serious discussion and I'm here to try and find a solution (if there ever can be a solution), not to squabble with those people who have nothing constructive to add to this discussion and whose only goal is to gloat over my suspension. Still, solutions should be reached through true, sincere, transparent, and serious dialogue, not through arbitrary and unjustified measures.
The blocking of Amastan/always accounts is excellent news! Many thanks to the admins for their moderation work.
Note: This is a serious discussion and I'm here to try and find a solution (if there ever can be a solution), not to squabble with those people who have nothing constructive to add to this discussion and whose only goal is to gloat over my suspension.
You're only here to gloat. I want to discuss the issue with the admins and any sensible person who could explain to me and everyone what is actually wrong and how to fix it. Fixing this issue might prevent other issues in the future. I'd like to ask gloaters to keep away from this discussion. I'm sure that the admins as well as the general public here wouldn't be interested in wasting their time reading a lengthy argument and a long series of messages of people gloating over the suspension of a contributor. So please save your energy and go do something more useful.
Is that Amastan? His sentences were always brimming with political undertones and you could smell the bitterness from a mile away, especially in sentences pertaining to his beloved Algeria and Islam.
I'm probably one of the few who'd rather we stay away from political messages and keep sentences as generic as possible without using real names. That's probably what got another lady banned (I don't remember her name, but she knew Yiddish and even had sentences calling for the death of certain political figures).
That's not to say we should get rid of sentences about politics. Not at all. Sentences like "Tom is a left-wing politician." and "Tom disappointed the members of his party." are all perfectly fine and useful in my opinion.
I don't know why you were blocked, but (over)using a particular name in sentences sounds like a very strange and insufficient reason, especially considering the Tom sentences. From an ordinary user's point of view, this looks odd. That's why I assume/suspect that there is something else behind it.
The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.
✹✹ Stats & Graphs ✹✹
Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/
The Tatoeba website has been running very slowly since yesterday's shutdown. Has anyone else had a similar experience?
It is slow for me too.
Yes, it's been slow for me too.
Not just slow. Mostly I get the message "Tatoeba is currently unavailable. We are sorry for the inconvenience. You can check our blog or Twitter for more information."
But neither the blog nor Twitter gives any information. Yesterday I managed barely 10% of the work I usually do in one day.
There was a little problem indeed! Tatoeba should run smoothly now.
It's running quick and snappy now 🤭. Thank you very much.
I think adopting and unadopting sentences doesn't really work.
One has to beg to get one's unadopted sentences adopted.
(I suppose that's why the sentences aren't checked.)
And I think sentences that are simply unadopted tend to remain unchecked.
(As for me, I don't like adopting sentences added by others.)
I don't understand, Maaster. I regularly adopt and correct sentences in "my languages". If an author unadopts sentences, nobody needs to be begged to correct them. You can use a simple link to find them all:
https://tatoeba.org/eo/activiti..._sentences/epo
(change "epo" to the code of your language).
Some sentences have this info "This sentence is original and was not derived from translation."
Is this information anywhere in the downloadable data?
Thank you!
It's in the sentences_base file.
You're right. It's right there. Sorry, I just didn't see it.
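For anyone who wants to filter on this flag programmatically, here is a minimal sketch. It assumes the file is tab-separated with the sentence id in the first column and a "base" field in the second, where 0 marks an original sentence, a positive number is the id of the sentence it was translated from, and \N means unknown. Please check that layout against the downloads page before relying on it; the function name and the sample rows are made up for illustration.

```python
import csv
import io

def original_sentence_ids(lines):
    """Yield ids of rows whose base field is "0" (assumed to mean "original")."""
    # QUOTE_NONE: treat quote characters as ordinary text, not as CSV quoting.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) >= 2 and row[1] == "0":
            yield int(row[0])

# Made-up sample rows in the assumed "id<TAB>base" layout.
sample = io.StringIO("1\t0\n2\t1\n3\t\\N\n")
print(list(original_sentence_ids(sample)))
```

With a real download you would pass an open file object instead of the sample.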
Is the format of transcriptions (Japanese, if that makes any difference) explained anywhere? (Nothing in the Wiki, afaik.)
I found three different cases (there may be more):
A: [Kanji|Reading] which makes sense
B: [Kanji1Kanji2|Reading1|Reading2] which is probably short for [Kanji1|Reading1][Kanji2|Reading2]
C: [Kanji1Kanji2|Reading] which probably means the two Kanji combined have this reading
Is this correct?
And can I expect to always find something that fits either A, B or C?
That is, can I expect to *never* find something like [Kanji1Kanji2Kanji3|Reading1|reading2], i.e. a number of Kanji and readings which are not equal? (In that case, how would I know whether Reading1 belongs to Kanji1Kanji2 or just Kanji1?)
I hope my ad-hoc syntax makes sense.
I assume you're asking this question because you want to transform the data programmatically (otherwise you could just handle edge cases whenever you encounter them). If my assumption is correct, it might be easiest to look at Tatoeba's own code for Japanese transcriptions. (Note that Tatoeba is AGPL-licensed, in case that's an issue for you.)
The validation code for user-provided furigana is here: https://github.com/Tatoeba/tato...ption.php#L220 but I think it might not apply to those that are generated automatically using MeCab.
The testcases might also be helpful: https://github.com/Tatoeba/tato...onTest.php#L27
If you just want to display furigana using HTML <ruby> tags, our code for that is here: https://github.com/Tatoeba/tato...naTrait.php#L9 To be honest, it's not written in an easily readable manner, but I think what it does is basically to assume without validation that there are at least as many kanji as there are readings, and if there is a kanji without reading (|| or end of list) it will merge it with the preceding kanji until the numbers are equal.
So [Kanji1Kanji2Kanji3|Reading1|reading2] would be equivalent to [Kanji1|Reading1][Kanji2Kanji3|reading2], I think.
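To make that merging rule concrete, here is a rough Python sketch of the behaviour as I described it above. It is not a port of the Tatoeba PHP code, and the names (GROUP, split_group, parse_transcription) are made up. It assumes one kanji per character and at least as many kanji as readings; text outside brackets is simply skipped.

```python
import re

# One bracketed group: base text, then one or more "|reading" parts.
GROUP = re.compile(r"\[([^\[\]|]+)((?:\|[^\[\]|]*)+)\]")

def split_group(kanji, readings):
    """Pair each character with a reading; a character whose reading is
    empty or missing is merged into the preceding character."""
    pairs = []
    for i, ch in enumerate(kanji):
        reading = readings[i] if i < len(readings) else ""
        if reading or not pairs:
            pairs.append([ch, reading])
        else:
            pairs[-1][0] += ch  # no reading of its own: attach to previous
    return [tuple(p) for p in pairs]

def parse_transcription(text):
    """Return (base, reading) pairs for every bracketed group in text."""
    result = []
    for m in GROUP.finditer(text):
        readings = m.group(2)[1:].split("|")  # drop the leading "|"
        result.extend(split_group(m.group(1), readings))
    return result

print(parse_transcription("[今日|きょう]は[天気|てん|き]"))
print(parse_transcription("[ABC|r1|r2]"))
```

The second call uses the letters A, B, C as stand-ins for Kanji1, Kanji2, Kanji3 and yields the pairs (A, r1) and (BC, r2), matching the equivalence above.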
Yes, ruby is a good example. This looks good, thanks!
I will take a look at the code, especially the one where it handles unequal numbers of Kanji and readings.
When you search on Tatoeba.org, it only shows 1000 results. That is, it shows a maximum of 10 pages. It says the total number of results, but it only shows 1000. How can I fix this?
This is a technical limitation to avoid overloading the server.
Here are ways to see more than 1,000 sentences for a given search, using "advanced search."
1. After seeing the original 1,000 results, "reverse" the order the search results are shown.
2. Try the other ways of sorting sentences, plus their "reverse" searches.
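If you collect the sentence ids from both passes, combining them is straightforward. The sketch below only illustrates the idea: it does not call any Tatoeba API, and merge_passes and the simulated id lists are hypothetical. With a cap of 1,000 per pass, a search and its reverse together can cover at most 2,000 distinct results.

```python
def merge_passes(forward, backward):
    """Combine ids from a search and from the same search in reverse order,
    keeping the forward ordering and dropping duplicates."""
    seen = set()
    merged = []
    # Re-reverse the backward pass so it continues in forward order.
    for sentence_id in list(forward) + list(reversed(backward)):
        if sentence_id not in seen:
            seen.add(sentence_id)
            merged.append(sentence_id)
    return merged

# Simulated search with 1,500 hits and a display cap of 1,000 per pass.
all_hits = list(range(1, 1501))
forward = all_hits[:1000]         # first 1,000 in normal order
backward = all_hits[::-1][:1000]  # first 1,000 in reversed order
print(len(merge_passes(forward, backward)))
```

In this simulated case the two passes overlap in the middle, so all 1,500 hits are recovered; beyond 2,000 hits, the middle of the list would still be missing.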
I also wonder if this limit of 1000 sentences is too low.
I use this feature to find recently added sentences (in German) and sometimes the last 1000 sentences don't even cover one day.
The limit doesn't apply to sentences of specific users. Some of them own a huge number of sentences (more than 700,000).
Currently, displaying or even re-sorting these is reasonably fast.