Wall (7,128 threads)

Tips

Before asking a question, please read the list of frequently asked questions.

We aim to maintain a healthy place for civilized discussion. Please read our rules against bad behavior.

superduperimpose, 2 September 2024, 21:53:48 UTC

Some sentences have this info "This sentence is original and was not derived from translation."

Is this information anywhere in the downloadable data?
thank you!

Yorwba, 4 September 2024, 18:54:50 UTC

It's in the sentences_base file.
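
For anyone who wants to read that file programmatically, here is a minimal Python sketch. It assumes the layout I recall from the downloads page (sentence id, a tab, then a base value where 0 means "original", a positive number is the id of the sentence it was translated from, and \N means unknown); please check the downloads page before relying on it. The sample rows and the function names are made up for illustration.

```python
import csv
import io


def load_bases(lines):
    """Map sentence id -> base value (None when the base is unknown)."""
    bases = {}
    for sentence_id, base in csv.reader(lines, delimiter="\t"):
        # Assumption: "\N" marks an unknown base, 0 marks an original sentence.
        bases[int(sentence_id)] = None if base == r"\N" else int(base)
    return bases


def original_ids(bases):
    """Return the ids of sentences that were not derived from translation."""
    return sorted(sid for sid, base in bases.items() if base == 0)


# Made-up sample standing in for the real sentences_base export.
sample = io.StringIO("1\t0\n2\t1\n3\t\\N\n4\t0\n")
bases = load_bases(sample)
print(original_ids(bases))
```

With the real export you would pass an open file object instead of the in-memory sample.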

superduperimpose, 4 September 2024, 19:07:22 UTC

You're right. It's right there. Sorry, I just didn't see it.

superduperimpose, 31 August 2024, 11:54:06 UTC

Is the format of transcriptions (Japanese, if that makes any difference) explained anywhere? (Nothing in the Wiki, afaik.)

I found three different cases (there may be more):

A: [Kanji|Reading] which makes sense

B: [Kanji1Kanji2|Reading1|Reading2] which is probably short for [Kanji1|Reading1][Kanji2|Reading2]

C: [Kanji1Kanji2|Reading] which probably means the two Kanji combined have this reading

Is this correct?
And can I expect to always find something that fits either A, B or C?
That is, can I expect to *never* find something like [Kanji1Kanji2Kanji3|Reading1|reading2], i.e. a number of Kanji and readings which are not equal? (In that case, how would I know whether Reading1 belongs to Kanji1Kanji2 or just Kanji1?)

I hope my ad-hoc syntax makes sense.

Yorwba, 31 August 2024, 13:38:45 UTC

I assume you're asking this question because you want to transform the data programmatically (otherwise you could just handle edge cases whenever you encounter them). If my assumption is correct, it might be easiest to look at Tatoeba's own code for Japanese transcriptions. (Note that Tatoeba is AGPL-licensed, in case that's an issue for you.)

The validation code for user-provided furigana is here: https://github.com/Tatoeba/tato...ption.php#L220 but I think it might not apply to those that are generated automatically using MeCab.

The testcases might also be helpful: https://github.com/Tatoeba/tato...onTest.php#L27

If you just want to display furigana using HTML <ruby> tags, our code for that is here: https://github.com/Tatoeba/tato...naTrait.php#L9 To be honest, it's not written in an easily readable manner, but I think what it does is basically to assume without validation that there are at least as many kanji as there are readings, and if there is a kanji without reading (|| or end of list) it will merge it with the preceding kanji until the numbers are equal.

So [Kanji1Kanji2Kanji3|Reading1|reading2] would be equivalent to [Kanji1|Reading1][Kanji2Kanji3|reading2], I think.
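
The merging rule described above can be sketched in Python as follows. This is only an illustration of my reading of the behaviour, not a port of Tatoeba's code: the names `split_group` and `to_ruby` are hypothetical, each character before the first `|` is treated as one kanji, and surplus readings (more readings than kanji) are simply ignored, which is an assumption on my part.

```python
import re

# One bracketed group: kanji before the first "|", readings after it.
GROUP = re.compile(r"\[([^\[\]|]+)\|([^\[\]]*)\]")


def split_group(kanji, readings):
    """Pair each kanji with its reading; a kanji without a reading
    (empty slot or end of list) is merged into the preceding kanji."""
    pairs = []
    for i, ch in enumerate(kanji):
        reading = readings[i] if i < len(readings) else ""
        if reading or not pairs:
            pairs.append((ch, reading))
        else:
            prev_kanji, prev_reading = pairs[-1]
            pairs[-1] = (prev_kanji + ch, prev_reading)
    return pairs


def to_ruby(transcription):
    """Replace every [kanji|reading...] group with HTML <ruby> markup."""
    def render(match):
        pairs = split_group(match.group(1), match.group(2).split("|"))
        return "".join(f"<ruby>{k}<rt>{r}</rt></ruby>" for k, r in pairs)
    return GROUP.sub(render, transcription)


print(split_group("ABC", ["r1", "r2"]))
print(to_ruby("[今日|きょう]は"))
```

Under these assumptions, cases A, B and C from the question all fall out of the same loop, and the unequal case pairs the first kanji with Reading1 and merges the rest under reading2.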

superduperimpose, 31 August 2024, 15:07:10 UTC

Yes, ruby is a good example. This looks good, thanks!
I will take a look at the code, especially the one where it handles unequal numbers of Kanji and readings.

charcoalis, 27 August 2024, 12:23:06 UTC

When you search on Tatoeba.org, it only shows 1000 results. That is, it shows a maximum of 10 pages. It says the total number of results, but it only shows 1000. How can I fix this?

Guybrush88, 27 August 2024, 13:34:43 UTC

This is a technical limitation to avoid overloading the server.

brauchinet, 28 August 2024, 10:05:50 UTC

I also wonder if this limit of 1000 sentences is too low.
I use this feature to find recently added sentences (in German) and sometimes the last 1000 sentences don't even cover one day.
The limit doesn't apply to sentences of specific users. Some of them own a huge amount of sentences (> 700000).
Currently, displaying or even re-sorting these is reasonably fast.

sharptoothed, 25 August 2024, 16:04:08 UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

Ergulis, 17 August 2024, 17:46:37 UTC (edited 17 August 2024, 18:06:15 UTC)

While searching for a solution to my problem with text on Tatoeba being displayed in italics, I tried downloading another browser. From what Google offered me, I chose Brave. To my great surprise, the site displays normally in it; the italics are gone.
It seems that something went wrong with the settings in my usual browsers (Edge, Google Chrome, even Firefox), resulting in the issue.

Ergulis, 17 August 2024, 17:56:02 UTC

There is a shield in the Brave browser. If it is on, the text shows normally. However, if I disable it, the italics appear there too. Very strange.

Yorwba, 17 August 2024, 19:25:11 UTC

https://support.brave.com/hc/en...while-browsing indicates that the shield combines various blocking features that you can also toggle individually using the advanced controls. My guess is that you have a non-standard system font that shows up as italics and the font fingerprinting protection in Brave, when enabled, is preventing the browser from loading it.

In Firefox, by right-clicking the italic text and selecting "Inspect", you should be able to open a panel with three columns, the rightmost of which shows "Layout" initially, but one of the other options is "Fonts", which should show you which font is being used.

Ergulis, 18 August 2024, 11:21:13 UTC (edited 20 August 2024, 20:27:25 UTC)

Thank you for your insight, Yorwba. I checked that and found out that the Noto Sans Italic font is used. If I disable it, the site displays normally. However, that works only temporarily, until the next launch. I just need to figure out how to change it permanently.
I'm glad to understand the problem and for now, I'm ok with running Tatoeba on Brave.

PrasantaHembram, 10 August 2024, 19:06:39 UTC

Hi,
I'm reaching out to inquire about importing thousands of bilingual English-Santali sentences into the Tatoeba database. I have a large collection of sentences in two languages that I'd like to contribute to the platform. Could you please provide guidance on the recommended format for preparing the sentence files, the process for uploading them to the database, and any specific requirements or guidelines for ensuring data quality and consistency? I'd greatly appreciate any assistance or documentation to help me import my sentence collection efficiently.

Thanks
Prasanta Hembram

gillux, 12 August 2024, 10:35:58 UTC

Hello, this sounds awesome, but Tatoeba does not support mass import of sentences just yet. This is because we lack the resources to implement a proper import system. If you know how to program, you are welcome to contribute such a system. If you know anybody who is willing to implement an import system, you can ask them. If you want to get notified about any progress on that matter, you can mention your interest on this GitHub issue thread: https://github.com/Tatoeba/tatoeba2/issues/1762

As for importing sentences in general, you should pay attention to the license of the data you want to contribute. It should be legal to re-use the data, as Tatoeba will publish it under Creative Commons CC-BY.

As for the data quality, the sentences should follow these rules: https://en.wiki.tatoeba.org/art...h-explanations There are no particular expectations in terms of consistency, because Tatoeba already receives contributions from various people who are not really following any consistency guidelines.

As for the data format, since we don’t have the tool to import just yet, there is no requirement yet, but I think CSV or TSV should be okay.
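
To make the TSV suggestion concrete, here is a minimal Python sketch of how a collection of sentence pairs could be tidied up and written out. Since no import tool exists yet, the column layout (a header row with the language codes "eng" and "sat") is purely an assumption, the function names are hypothetical, and the Santali strings are placeholders rather than real sentences.

```python
import csv
import io

# Hypothetical header: ISO 639-3 codes for English and Santali.
COLUMNS = ["eng", "sat"]


def clean_pairs(pairs):
    """Normalize whitespace, then drop incomplete and duplicate pairs."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        pair = (" ".join(source.split()), " ".join(target.split()))
        if not all(pair) or pair in seen:
            continue
        seen.add(pair)
        cleaned.append(pair)
    return cleaned


def write_tsv(pairs, out):
    """Write one sentence pair per line, tab-separated, with a header row."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(pairs)


pairs = [
    ("Hello.", "<Santali translation 1>"),
    ("  Hello. ", "<Santali translation 1>"),  # duplicate after cleanup
    ("Thank you.", ""),                        # missing translation
    ("Good morning.", "<Santali translation 2>"),
]
buffer = io.StringIO()
write_tsv(clean_pairs(pairs), buffer)
print(buffer.getvalue())
```

Keeping one pair per line in a plain tab-separated file should make it easy to convert to whatever format an eventual import system asks for.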

PrasantaHembram, 14 August 2024, 15:14:13 UTC

Hi, @gillux. Thank you for the information. I have some basic programming knowledge, but I'm not confident in my ability to contribute to the development of an import system. Will refer someone. I think for now, only admins can do mass import and is used rarely ?? and only way to contribute right now is to add/translate sentences one by one.

gillux, 16 August 2024, 15:31:59 UTC

> I have some basic programming knowledge, but I'm not confident in my ability to contribute to the development of an import system.

I think that creating an import system is a complex task, too. Not only on the technical level, but also on the social level, as one can see from the discussions on the GitHub issue page. I think that such an import system needs to be designed collaboratively, so you are more than welcome to share your ideas.

> I think for now, only admins can do mass import and is used rarely ??

Admins used to be able to do some kind of basic mass import, but, for technical reasons, not anymore.

> and only way to contribute right now is to add/translate sentences one by one.

That is correct.

sharptoothed, 11 August 2024, 05:58:47 UTC

✹✹ Stats & Graphs ✹✹

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

miketheknight, 13 July 2020, 13:04:10 UTC

In advanced search there's a checkbox "Owned by a self-identified native". Would it be reasonable to extend this functionality to "Owned or approved by a self-identified native"?

TRANG, 16 July 2020, 19:31:34 UTC

It could be.

Can you tell us what made you think of this? With a bit more context, we can better assess whether we should extend the checkbox as you suggested or whether we should add another search option.

Note that there has been a similar request raised on GitHub:
https://github.com/Tatoeba/tatoeba2/issues/2261

miketheknight, 17 July 2020, 08:22:47 UTC (edited 17 July 2020, 08:23:14 UTC)

Because I like working with sentences added by native speakers. They are less likely to be awkward, they are less likely to contain structural mistakes. I even enjoy noticing typical mistakes that native speakers make - for example, I used to pronounce "they're" and "there" differently in English a long time ago, and only after having noticed that native speakers of English regularly confuse "they're", "there" and "their" in writing did I understand those three have identical pronunciation.

Anyway, I have a lot of reasons to work only with sentences added by native speakers, so I almost always use the "Added by self-identified native speakers" checkbox in my searches.

However, I think I'm missing out on sentences added by non-native speakers that were approved / corrected by native speakers. I don't see why those would be any worse than sentences added by native speakers.

So I believe it would be useful to treat "Sentences added by native speakers" + "Sentences approved by native speakers" as one set.

The github link is not the same. If I OK an English sentence, it doesn't make it any more reliable than it was before me okaying it, but it's important for a sentence to be reviewed by a native speaker.

AlanF_US, 17 July 2020, 15:52:34 UTC

> only after having noticed that native speakers of English regularly confuse "they're", "there" and "their" in writing did I understand those three have identical pronunciation.

Not identical, at least not for every speaker, but definitely similar. :)

morbrorper, 18 July 2020, 21:10:41 UTC

There are a lot of correct sentences that belong to users that have not indicated their native language. And I have come across quite a few sentences with errors, by native contributors. Nevertheless, I think this would be a useful feature.

morbrorper, 19 July 2020, 08:51:45 UTC

In relation to this, I would like to call for an overview of all the sentences having the @needs native check tag. I find it discouraging to see sentences with this tag being ignored for ages.

There are also quite a few sentences that have both this tag and an "OK" tag, which is confusing. I understand that may be because the person who OK's a sentence does not always have the right to delete other people's tags, or they just forget; this makes me think the issue is perhaps not best handled using tags.

AlanF_US, 19 July 2020, 13:44:05 UTC

> In relation to this, I would like to call for an overview of all the sentences having the @needs native check tag. I find it discouraging to see sentences with this tag being ignored for ages.

In which languages? I make a point of frequently reviewing the English sentences with this tag (along with @check and @change). Sometimes they build up in the short term because they are owned by an active member who hasn't had a chance to get through all of them yet.

> There are also quite a few sentences that have both this tag and an "OK" tag, which is confusing.

Again, in which languages?

morbrorper, 19 July 2020, 15:12:05 UTC

OK, I looked a bit closer, using the web interface, and found that Norwegian Bokmål actually accounts for more than half of the roughly 4,500 sentences in total. It is among these that I found the OK'd ones.

Other languages that stand out are Japanese (412), and Mandarin Chinese (384). To get the whole picture, for each language, maybe somebody could run an SQL query?

Indeed, English has very few unhandled @NNC requests, for which I am grateful.
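
As an alternative to an SQL query on the server, the per-language counts could probably be computed from the downloadable exports. Below is a minimal Python sketch, assuming the sentences export has the columns id, language, text and the tags export has the columns sentence id, tag name, all tab-separated (please verify against the downloads page). The function name and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter

TAG = "@needs native check"


def count_tagged_by_language(sentences, tags, tag=TAG):
    """Count sentences carrying `tag`, grouped by language code."""
    lang_by_id = {}
    for row in csv.reader(sentences, delimiter="\t", quoting=csv.QUOTE_NONE):
        lang_by_id[int(row[0])] = row[1]

    # Collect ids in a set so a sentence is counted once per tag.
    tagged_ids = set()
    for row in csv.reader(tags, delimiter="\t", quoting=csv.QUOTE_NONE):
        if row[1] == tag:
            tagged_ids.add(int(row[0]))

    return Counter(lang_by_id[i] for i in tagged_ids if i in lang_by_id)


# Made-up rows standing in for the real exports.
sentences = io.StringIO(
    "1\tnob\tText A\n2\tnob\tText B\n3\tjpn\tText C\n4\teng\tText D\n"
)
tags = io.StringIO(
    "1\t@needs native check\n1\tOK\n2\t@needs native check\n"
    "3\t@needs native check\n4\tOK\n"
)
for lang, count in count_tagged_by_language(sentences, tags).most_common():
    print(lang, count)
```

The same loop could also collect the ids that carry both this tag and "OK", which would help with the confusing double-tagged sentences mentioned earlier.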

AlanF_US, 19 July 2020, 18:11:42 UTC

Norwegian Bokmål has no corpus maintainers. Maybe it's time to recruit one.

Thanuir, 20 July 2020, 07:17:08 UTC

That would be a big improvement.

Thanuir, 7 August 2024, 06:07:49 UTC

Regarding the Norwegian corpus: there are several sentences with an easily noticeable spelling mistake that several users have reported. In many cases, one of those users has Norwegian as their native language. In many cases, the error has also been documented with dictionary references.

Even a person who does not have Norwegian as their native language could go through the sentences and make such obvious corrections, leaving aside only those that are not well documented or supported.

CK, 29 July 2020, 05:43:48 UTC

We have a new Japanese voice.

Lowteq has contributed 123 audio files.

https://tatoeba.org/eng/sentenc...how/168586/und

Ricardo14, 29 July 2020, 13:56:58 UTC

Amazing! :D

TRANG, 30 June 2020, 22:44:50 UTC (edited 30 June 2020, 22:45:02 UTC)

** What's New on Tatoeba? - Your biweekly recap #20 **

(What's New on Tatoeba will be published biweekly until the end of August.)


EVENT

It has now been one month since our Kodoeba event[1] started.

※ As far as the internal code goes:

• Our participants[2] have solved five issues, and seven others are on their way. You can find the details on GitHub[3].
• Alexs has asked for feedback about the tags: https://tatoeba.org/eng/wall/show_message/35555. Be sure to share your thoughts if you'd like to see the tags in Tatoeba become more useful!

※ As for the external projects:

• lbdx has updated Tatominer: https://tatoeba.org/eng/wall/show_message/35527.
• The other projects are starting to take shape, but it's still too early to showcase anything. We'll have to wait until mid or end of July.

[1] https://blog.tatoeba.org/2020/0...kodoeba-1.html
[2] https://blog.tatoeba.org/2020/0...ticipants.html
[3] https://github.com/orgs/Tatoeba/projects/1


UPDATES

※ The search has been improved for languages using Arabic scripts, Indonesian and Tagalog. Many thanks to Yorwba.

※ The number of messages in the private messages has been localized, thanks to Ricardo14.

※ There's now a reset icon in the inputs of the advanced search. Thanks to Roverandom789133 for adding this.

※ We no longer unnecessarily store IPs in our contributions logs. Thanks to jpear1 for cleaning this up.


ON THE WALL

※ Trang has been working on making the landing page responsive: https://tatoeba.org/eng/wall/show_message/35464

※ gillux has asked which sentences would be a good candidate to print on a Tatoeba T-shirt or mug: https://tatoeba.org/eng/wall/show_message/35547

※ tommg has announced the release of his language learning app that uses Tatoeba's data: https://tatoeba.org/eng/wall/show_message/35512


LANGUAGES

※ Ricardo14 posted some updates about the progress of the translation of our UI: https://tatoeba.org/eng/wall/show_message/35518.

※ A new UI language has been enabled on the dev website: Serbian.

※ As usual, thanks to all the members who helped to translate the website!


----------

If you'd like to help with the development of Tatoeba, report issues, or are just curious, have a look at the GitHub repository.

If you want to help us translate the website to your language, you can join us on Transifex: https://www.transifex.com/tatoe...ite/dashboard/ and check this article on the wiki https://en.wiki.tatoeba.org/art...e-translation.

If you're especially happy with one of the updates, don't hesitate to personally thank our developers :) They're working in the shadows but they'll be glad to hear your feedback.

----------

Last recap: https://tatoeba.org/eng/wall/show_message/35504
See this recap on the blog: http://blog.tatoeba.org/2020/06...weekly_30.html

samir_t, 4 July 2020, 21:03:27 UTC

I would like to know why the FAQ is not published in the Kabyle interface although the translation is finished on Transifex.

TRANG, 4 July 2020, 21:39:58 UTC

Translating wiki content through Transifex is a new process. In this case, we simply didn't pay enough attention and we missed the Transifex notifications about how the FAQ was fully translated into Kabyle. So it was simply forgotten (sorry!).

The wiki is a separate application from the tatoeba.org website. The UI languages available on the wiki are actually not in sync with the languages available on tatoeba.org. Currently, the wiki doesn't support Kabyle. We first have to add it as a supported language, then we have to manually add the Kabyle translations.

But in any case, we will make sure the Kabyle translation of the FAQ is made available soon :) Thanks for reporting it and thanks for translating!

samir_t, 5 July 2020, 00:15:23 UTC

Thanks.

TRANG, 9 July 2020, 14:33:53 UTC

It's online now: https://kab.wiki.tatoeba.org/articles/show/faq

I have a couple of questions:

(1) What would be a translation of "main" (or "main page") in Kabyle?

For instance, the URL of the English main page is like this:
https://en.wiki.tatoeba.org/articles/show/main

While for French it's "page-principale" instead of "main":
https://fr.wiki.tatoeba.org/art...age-principale

For each language the name in the URL is in the language itself. For now, I have named the Kabyle page "main" but it would be more suitable to have a string in Kabyle.

(2) Same question for the FAQ URL. Is it fine to leave it as "faq" or is there another acronym in Kabyle?

Ricardo14, 9 July 2020, 17:19:28 UTC

samir_t replied on another thread - https://tatoeba.org/eng/wall/sh...#message_35614

"The translation of "main" in Kabyle is "agejdan".

As for the FAQ URL, it would be better to leave it as "faq".

Thanks."

https://prnt.sc/tevi8q