We should probably update the wiki, but in short:
• break → matches break and other words that have the same stem in each language that supports stemming, and just break in languages that don’t support stemming. For instance, in English sentences it will match break, breaks, breaking; in French it will match break, breaks (it’s a car type); in Esperanto it will only match break because Esperanto doesn’t have a stemmer.
• =break → matches any sentence that has the word break exactly, regardless of stemming
• break* → matches any word starting with break (including break), like breakfast, regardless of stemming
Note that you need to provide at least three characters before or after the star. Which means th*, *th or *th* won’t work but the*, *the or *the* will.
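To make the rules above concrete, here is a minimal Python sketch that only classifies a single search term according to them. It is not Tatoeba's actual search code; the function name `classify_term` and the returned labels are made up for illustration.

```python
def classify_term(term: str) -> str:
    """Say how one search term would be interpreted under the rules above.

    Hypothetical helper: the labels are illustrative, not real API values.
    """
    if "*" in term:
        # The star needs at least three other characters next to it,
        # so th*, *th and *th* are rejected while the*, *the, *the* pass.
        literal = term.replace("*", "")
        return "wildcard" if len(literal) >= 3 else "invalid"
    if term.startswith("="):
        # =break: exact word, stemming is bypassed
        return "exact"
    # break: stemmed match where the language has a stemmer
    return "stemmed"


for term in ["break", "=break", "break*", "th*", "*the*"]:
    print(term, "->", classify_term(term))
```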
Use minus, just like in Google.
https://tatoeba.org/sentences/s...rom=eng&to=und
This delay will also allow us to translate the interface. I updated the resource on Transifex; there are a few new strings to translate.
Yes, and I think it’s probably good enough. All recent dates are reported in an “x ago” form, while only older ones (more than 30 days old) are displayed as a date, in which case the precise hour and minutes are likely to be unimportant.
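The display rule described above can be sketched roughly as follows. This is only an illustration in Python, not the code the website uses; the function name `display_date`, the wording of the output and the date format are assumptions, and only the 30-day threshold comes from the description.

```python
from datetime import datetime, timedelta


def display_date(when: datetime, now: datetime, threshold_days: int = 30) -> str:
    """Return an "x ago" string for recent dates, a plain date otherwise."""
    age = now - when
    if age > timedelta(days=threshold_days):
        # Old enough that hour and minutes no longer matter: date only.
        return when.strftime("%Y-%m-%d")
    minutes = int(age.total_seconds() // 60)
    if minutes < 60:
        return f"{minutes} minute(s) ago"
    hours = minutes // 60
    if hours < 24:
        return f"{hours} hour(s) ago"
    return f"{hours // 24} day(s) ago"


now = datetime(2016, 3, 1, 12, 0)
print(display_date(now - timedelta(hours=5), now))
print(display_date(now - timedelta(days=45), now))
```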
I added an “empty trash” function to remove all the messages from the trash.
What do you mean by “search history”? If you mean the previously submitted queries, this is a browser-specific thing I don’t think we can mess with in any way.
One way to solve the slowness problem is to rely exclusively on Sphinx to sort out sentences. My recent progress in that area makes me believe it’s a workable solution.
How about that? http://dev.tatoeba.org/
Do you think this button should reset the “from” and “to” lists to “Any” as well?
Go to https://dev.tatoeba.org/user/settings and try changing the new “Number of sentences per page” setting.
I’m working on it (ticket: https://github.com/Tatoeba/tatoeba2/issues/30).
The wiki is going to be shut down for maintenance for a few hours. Sorry for the inconvenience!
> After going through your post I installed a plugin in Chrome and since then it has been showing the beautiful Nastaleeq script. So thank you for that.
> However I personally believe that websites should render the Nastaleeq script automatically as not every Urdu speaker is technically sound enough to know how they can get the script.
Of course. Actually, the problem comes from your OS. It should ask and allow you to install an Urdu font so that Urdu can be displayed correctly, not only in browsers but in any graphical interface of your system, like text editors, menus, file names etc. By installing a plugin, you’re fixing Chrome but the rest of your system is still Urdu-unaware. We at Tatoeba can try to help members with broken OSes by providing web fonts, but the root of the problem is the OS. That’s why the author of the blog post you mentioned first asked Microsoft, Apple and Google.
I updated the ticket.
What you can do is advise us on the best web font for Urdu. By “best”, I mean one that is free to use, that most people won’t find hard to read and that is accurate. I insist on accuracy because Urdu seems quite complex to render, and fonts like Google Noto still have problems [1]. It would be best to get confirmation from a native speaker (like you ☺) regarding accuracy (and readability) before deploying a font. Also, note that not every font can be used as a web font. It needs to be size-optimized and available in multiple formats. In other words, look for web fonts, not just fonts.
[1] https://github.com/googlei18n/n...Aopen+Nastaliq
I don’t think there is much Tatoeba can do to “implement” Nasta'liq script. In fact, it seems to already work. Technically speaking, Nasta'liq and Naskh use the same Unicode codepoints but are rendered differently. This means the existing Urdu sentences are not wrong; they are just displayed in Naskh instead of Nasta'liq on most systems because of a lack of support for Nasta'liq.
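As a small illustration of the codepoint point, the Python sketch below lists the codepoints stored for the word “Urdu” written in Urdu. The sample word is my own choice, not taken from the corpus. Nothing in the output says Naskh or Nasta'liq: the stored text is identical either way, and the style is decided by whichever font renders it.

```python
import unicodedata

# The word "Urdu" in Urdu, written with escapes so the source stays ASCII.
word = "\u0627\u0631\u062f\u0648"

# Each entry is just an Arabic-script letter; no calligraphic style is stored.
for ch in word:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```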
It’s as if you needed a different font for French and Italian just because French had to be written in cursive and Italian in block letters (of course this is not true, it’s just an example). Every time you were on a French website, your browser would need to automatically select a cursive font, and switch to a block-letter one when displaying Italian. This is only possible if the browser is aware of the language of the content. Fortunately, we already provide language information on Tatoeba (using the lang attribute) for sentences in every language, including Urdu. So, using browsers that handle this correctly, on systems that have Nasta'liq fonts installed and are configured to use them when displaying Urdu, Urdu sentences are automatically displayed in Nasta'liq.
I installed Nasta'liq fonts on my system and configured it appropriately, and the result looks good: https://i.imgur.com/oQActxn.png
Thanks!
I’m not a corpus maintainer, but I still want to have access to such a feature, and I’m pretty sure I’m not the only one.
This makes me think we probably need some sort of highlighting for invisible characters like zero-width spaces or non-breaking spaces, so that we at least know where they are being used. We don’t want them to be highlighted normally, otherwise it would defeat their purpose, but we may have something that triggers highlighting or some visual information. I have no idea how that could fit in the UI, though; I’m sure nobody wants yet another button above sentences.
I think we should restore the complete list of languages and put the profile languages at the top, the same way personal lists appear at the top when you click on the “Add to list” icon of a sentence.
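To give an idea of what such on-demand highlighting could do, here is a minimal Python sketch that replaces a few invisible characters with visible labels. It is only an assumption of how it might work, not an existing Tatoeba feature; the function name `reveal_invisible`, the bracketed labels and the choice of characters are all made up for illustration.

```python
# Invisible or hard-to-spot characters and the label shown in their place.
# This list is an illustrative selection, not an exhaustive one.
INVISIBLE = {
    "\u00a0": "NBSP",   # no-break space
    "\u200b": "ZWSP",   # zero width space
    "\u200c": "ZWNJ",   # zero width non-joiner
    "\u200d": "ZWJ",    # zero width joiner
    "\u202f": "NNBSP",  # narrow no-break space
}


def reveal_invisible(sentence: str) -> str:
    """Return the sentence with invisible characters shown as [LABEL]."""
    return "".join(
        f"[{INVISIBLE[ch]}]" if ch in INVISIBLE else ch for ch in sentence
    )


print(reveal_invisible("Bonjour\u00a0!"))
```

The sentence itself stays untouched; only the displayed copy is altered, so the characters keep doing their job when highlighting is off.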