Wall - Tatoeba

Wall (7,389 threads)

Tips

Before asking a question, please make sure you have read the FAQ.

We aim to maintain a positive atmosphere for civil discussion. Please read our rules against inappropriate behavior.

rul, 3 days ago (18 May 2026 at 14:17:32 UTC)

I'm having issues with search: my usual searches return little or nothing from the past week or so, even when sorting by last created.

For example, this search is of my sentences that are in the "Translated by Tatoebans" list. The list was updated on Saturday, and I had sentences added to it (I verified this with the list interface), but only sentences from over a week ago show up in search.

https://tatoeba.org/en/sentence...rd_count_min=1

gillux, 3 days ago (18 May 2026 at 16:24:08 UTC)

Hi rul, thanks for reporting this issue, I just had a look into it. I confirm there is a problem, I see a difference between what the search returns and what’s actually in the corpus.

It was actually a rare and temporary problem. We may fix the root cause in the future, but for now I have simply reindexed the affected sentences on the "Translated by Tatoebans" list. Your search should now return the latest sentences.

rul, 3 days ago (18 May 2026 at 16:28:47 UTC)

Thanks, but you seem to have made the problem worse. Now even the list view itself only shows sentences that are 9+ days old, and the actual list was supposed to have been updated 4 days ago. The sentences I had that were supposed to show up in search don't even show up on the view for the list at all anymore.

gillux, 3 days ago (18 May 2026 at 17:19:09 UTC)

The list view is correct. I think it was not updated correctly this time.

I have compared the web server logs from when the list was updated on May 16th and May 9th. The exact same set of sentences was added and removed on those two days. In other words, the update on May 16th ran fine, but it had no effect because the sentences it added were already on the list and the sentences it removed were already not on the list.
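To illustrate why replaying the same update changes nothing, here is a minimal sketch, assuming the list can be modeled as a plain set of sentence IDs. The function name `apply_update` and the IDs are hypothetical and are not taken from Tatoeba's code.

```python
def apply_update(current: set[int], added: set[int], removed: set[int]) -> set[int]:
    """Return the list contents after adding and then removing sentence IDs."""
    return (current | added) - removed


# Hypothetical list contents and weekly change set (not real sentence IDs).
before = {101, 102, 103}
added, removed = {104, 105}, {101}

after_may_9 = apply_update(before, added, removed)
# Replaying the identical change set a week later is a no-op.
after_may_16 = apply_update(after_may_9, added, removed)

assert after_may_16 == after_may_9
```

Under this model, an update only has a visible effect when its change set differs from the previous one, which fits the logs showing identical additions and removals on both days.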

gillux, 3 days ago (18 May 2026 at 17:20:56 UTC, edited 18 May 2026 at 17:45:00 UTC)

And the root cause is probably that the weekly exports did not run last week. o_O

EDIT: As it turns out, I didn’t pay attention as I was patching the kernel against the latest CVE, and I rebooted the server right when the weekly export started… Let’s just try again next Saturday and sorry for the inconvenience.

rul, 3 days ago (18 May 2026 at 17:43:13 UTC)

Alright, thanks. I seem to remember having seen more recent sentences in the list view, but I have no way of proving it, and I probably just misunderstood.

Seael, 2 days ago (18 May 2026 at 22:10:21 UTC, edited 18 May 2026 at 22:14:32 UTC)

One of the items I keep on https://tatoeba.org/es/vocabulary/of/Seael is "hantavirus", but it says there are 0 sentences with it in Spanish, even though I created #13902897, which contains it, two days ago.

Same with other words like "cataplasma"... OK, so let's wait until Saturday, then... Thanks!

gillux, yesterday (20 May 2026 at 11:10:08 UTC)

@Seael Having to wait for Saturday is only required to get the list "Translated by Tatoebans" properly updated.

There is actually a bigger problem that’s currently preventing many sentences from showing up in search results. Even the sentences that I manually reindexed yesterday for @rul are not showing up any more. I am investigating the issue. https://github.com/Tatoeba/tatoeba2/issues/3291

LeviHighway, yesterday (20 May 2026 at 02:10:23 UTC, edited 20 May 2026 at 02:12:46 UTC)

https://tatoeba.org/zh-tw/sente...rd_count_min=1

The writing system currently used in the Minnan corpus is mixed. Pure Pe̍h-ōe-jī (Latin) sentences and pure Chinese-character sentences each make up about half of the corpus. I’ve been wondering whether it would make sense to split Minnan into two separate entries: one using only Pe̍h-ōe-jī, and the other using only Chinese characters. Another possibility would be to set either Pe̍h-ōe-jī or Chinese characters as an alternative script.

However, the problem is that there are currently no tools capable of converting Chinese-character Minnan into Pe̍h-ōe-jī, or vice versa. In addition, Minnan pronunciation is complicated, and most characters have multiple readings, which makes automatic conversion very difficult.

Another issue is that, in actual usage among the public, mixed writing combining Chinese characters and Pe̍h-ōe-jī is very common. I’m not sure how this situation should be handled, or which writing system for Minnan Tatoeba should ultimately include in its corpus.

Thanuir, yesterday (20 May 2026 at 06:24:01 UTC)

Speaking without knowing the language: would any problems arise if both scripts could be used on their own or mixed together, given that the language itself does this?

gillux, yesterday (20 May 2026 at 06:40:50 UTC)

The main drawback of mixing both writing systems is that it is difficult to search for words. You need to make two searches, one for each script, in order to find all potential sentences. And you need to know how to write the word in both scripts.

gillux, yesterday (20 May 2026 at 07:41:44 UTC)

Good question. It could be interesting to know what led to the decision of using POJ only on Minnan Wikipedia https://zh-min-nan.wikipedia.org/wiki/Th%C3%A2u-ia%CC%8Dh

I guess we can have a transcription system where users manually enter alternative scripts (without autogeneration), as long as there is some way to validate the consistency between the sentence and the alternative script, and of the alternative script alone.

As for mixing both scripts within a single sentence, I am not sure if we want to explicitly allow it. On the one hand, if it reflects real-world usage of the language, it should be on Tatoeba. On the other hand, it makes it difficult to classify and search. Can you clarify whether a typical sentence would include a majority of Chinese characters and just a few POJ words, or if it's 50/50, or the opposite?

For your information, Tatoeba already has quite a few languages that use two scripts without automatic conversion. Just to name a few:

- Arabic/Latin:
https://tatoeba.org/sentences/s...ta&sort=random
https://tatoeba.org/sentences/s...hg&sort=random

- Cyrillic/Latin
https://tatoeba.org/sentences/s...fn&sort=random
https://tatoeba.org/sentences/s...rp&sort=random

Some contributors add both scripts and link both sentences to one another. Not ideal, and not officially recommended, but it helps with searching.

LeviHighway, yesterday (20 May 2026 at 08:59:52 UTC, edited 20 May 2026 at 09:05:01 UTC)

The Minnan Wikipedia was founded in 2004, at a time when there was no unified standard for writing Minnan in Chinese characters — the situation was fairly fragmented, with no academic consensus. Pe̍h-ōe-jī, developed by Western missionaries, offered a consistent and well-documented alternative, which is likely the main reason the Hokkien Wikipedia adopted it as its sole writing system.

The drawback, however, is significant: the vast majority of Minnan speakers have never learned POJ, and writing in Chinese characters has been the traditional practice for centuries.

In 2009, Taiwan's Ministry of Education published an official recommended character set for Taiwanese Minnan, which remains in use by government institutions today. That said, it has attracted considerable debate in academic circles, and many are reluctant to adopt it. This is partly because Minnan speakers are not exclusively Taiwanese — speakers in other regions have no particular reason to accept a Taiwan-specific standard — and even within Taiwan itself, the standard remains contested.

There have been proposals to add Chinese-character articles to the Minnan Wikipedia, but nothing has come of them, largely for the reasons above.

As for how mixed writing works in practice: most core Minnan vocabulary is of Sinitic origin, and for these words the character spellings are generally uncontroversial. The disputed cases — function words, sentence-final particles, and loanwords — are where some writers switch to POJ.

The proportion varies considerably and is hard to generalize: a given sentence might consist almost entirely of Sinitic vocabulary, or it might be dominated by particles and loanwords, in which case the romanized portion would be much larger.
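One way to put rough numbers on that proportion is to count characters per script in each sentence. The following is only a sketch under stated assumptions: it relies on Python's standard `unicodedata` character names to tell Chinese characters from Latin letters, the function names `script_counts` and `classify` are hypothetical, and the sample strings are illustrative rather than taken from the corpus.

```python
import unicodedata


def script_counts(sentence: str) -> dict[str, int]:
    """Count Chinese characters and Latin letters in a sentence."""
    counts = {"han": 0, "latin": 0}
    for ch in sentence:
        name = unicodedata.name(ch, "")
        if name.startswith("CJK UNIFIED IDEOGRAPH"):
            counts["han"] += 1
        elif "LATIN" in name:
            # Covers accented letters; combining tone marks are not counted.
            counts["latin"] += 1
    return counts


def classify(sentence: str) -> str:
    """Label a sentence as 'han', 'latin', 'mixed', or 'other'."""
    counts = script_counts(sentence)
    if counts["han"] and counts["latin"]:
        return "mixed"
    if counts["han"]:
        return "han"
    if counts["latin"]:
        return "latin"
    return "other"


# Illustrative strings only (assumption: not verified corpus sentences).
print(classify("我是學生"))
print(classify("Góa sī ha̍k-seng"))
print(script_counts("我 sī 學生"))
```

Counting characters is a crude measure, since one Chinese character often corresponds to a whole romanized syllable, so a per-syllable count might reflect the balance better.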

LeviHighway, yesterday (20 May 2026 at 09:15:19 UTC)

https://tatoeba.org/zh-tw/sente...f_user/tsunhua

BTW, 472 of 482 Minnan sentences are by tsunhua. And it seems that they add one sentence in Chinese characters and then add the same sentence in POJ for every sentence they add.

PaulP, 6 days ago (15 May 2026 at 06:22:07 UTC)

Copied from the Tatoeba group on Telegram (because I think that no admin is still active on Telegram):

"When if ever Tatoeba will be able to capture audio directly on site?"

AlanF_US, 6 days ago (15 May 2026 at 12:47:22 UTC, edited 15 May 2026 at 12:48:07 UTC)

@gillux is really the only person who can answer that question directly. However, I just want to insert my opinion that even if someday we are able to capture audio on the site, it shouldn't be made audible until it has been reviewed. Audio differs from text in two ways:

(1) A native speaker can spot errors in text virtually immediately, but listening to audio takes time.

(2) Audio can suffer from background noise and lack of clarity, which is not a factor with text.

I can easily imagine a large number of submissions of poor quality damaging the usefulness of the site if they "went live" immediately.

gillux, 5 days ago (16 May 2026 at 07:11:42 UTC)

> "When if ever Tatoeba will be able to capture audio directly on site?"

This year hopefully!

I think I was waiting to get something working before making any announcement here, but since you brought up the topic, let me explain what has been going on behind the scenes.

I have been in contact with Hugo from Lingua Libre for some time on a different channel. Hugo was part of the Shtooka project back in the day, along with its creator Nicolas. Later, they worked on a cloud version, which was rebranded as Lingua Libre. So essentially Lingua Libre and Shtooka are the same piece of software at two different points in time.

A quick introduction to Lingua Libre. Lingua Libre [1] is the name of a recording tool, but also a Wikimedia France project used to gather recordings. It uses Wikimedia Commons for audio storage, and members of the Wikimedia community help connect with native speakers to have their voices recorded. The recordings are used by other Wikimedia projects such as Wiktionary or Wikipedia, so they mainly focus on recording words.

I connected with Hugo in early 2025. Hugo was actually astonished to learn that Tatoeba audio contributors still rely on the good old Shtooka. We quickly figured out there was room for collaboration. Lingua Libre has a strong recorder and is starting to support sentences in addition to words, but it needs open text content, so it could benefit from Tatoeba’s linguistically diverse corpus. Tatoeba lacks an easy-to-use recorder and audio support is rather basic, so it could benefit from Lingua Libre’s tooling and Wikimedia Foundation’s infrastructure and "aura".

In 2025, Hugo has been working on a new version of the recorder that makes it easier for other projects to reuse. This new version has been in "beta test" for some time now and I think it will become their new official recorder soon.

At some point in late 2025, Hugo and I tried to apply for a Microsoft grant [2] to kickstart collaboration between Tatoeba and Lingua Libre, but our application was rejected. This means it will take more time to get things done, but we will get there eventually.

After discussing with Trang and CK, I drafted an initial technical plan to allow using Lingua Libre to record Tatoeba sentences, and you can follow the progress on GitHub [3]. Basically, it is much harder to make two pieces of software collaborate than to develop everything in-house, but I believe it will pay off in the long run. Generally speaking, Tatoeba and Lingua Libre share the common goals of creating open and diverse linguistic resources and preserving endangered languages, so I believe our projects and communities should be more connected and aware of what the other party is doing. I hope the recorder can be the first step in that direction.

[1] https://lingualibre.org/
[2] https://www.microsoft.com/en-us...-voices-in-ai/
[3] https://github.com/Tatoeba/tatoeba2/issues/3183

ciampix, 3 days ago (18 May 2026 at 16:20:03 UTC)

Wow! Great to hear that!

5 days ago (16 May 2026 at 10:23:40 UTC, edited 16 May 2026 at 10:24:07 UTC)

The content of this message goes against our rules and has therefore been hidden. It is displayed only to admins and to the author of the message.

5 days ago (16 May 2026 at 07:08:28 UTC, edited 16 May 2026 at 07:09:48 UTC)

The content of this message goes against our rules and has therefore been hidden. It is displayed only to admins and to the author of the message.

efinah, 9 days ago (12 May 2026 at 13:42:50 UTC)

https://tatoeba.org/en/sentences/show/13898325

My sentence should be deleted. Entered in wrong place. Sorry.
Can't see how to delete it myself.

LeviHighway, 9 days ago (12 May 2026 at 13:49:14 UTC)

If you want your own sentence to be deleted, you can edit it and replace the text with DELETE.

marafon, 9 days ago (12 May 2026 at 14:38:33 UTC, edited 12 May 2026 at 14:42:45 UTC)

I unlinked it from the Turkish. Now just copy and paste it to the right place.

jan_OkulaJu, 9 days ago (12 May 2026 at 02:04:07 UTC)

I found that transcriptions can be edited on the All My Sentences page. However, there is no option to edit transcriptions on the individual sentence page.

Could you please consider adding a transcription editing option to the sentence page?

LeviHighway, 9 days ago (12 May 2026 at 11:39:57 UTC, edited 12 May 2026 at 11:41:02 UTC)

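Currently, only advanced contributors can edit transcriptions.
https://zh-tw.wiki.tatoeba.org/...d-contributors

I'm not quite sure what you mean about being able to edit transcriptions on "All My Sentences". Generally speaking, regular contributors cannot edit transcriptions.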

jan_OkulaJu, 9 days ago (12 May 2026 at 11:53:04 UTC, edited 12 May 2026 at 11:57:27 UTC)

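Sorry, my English may have been unclear. I already know the rule that only advanced contributors can edit transcriptions. However, I found that I can view my own sentences from my profile page, and that I can edit the phonetic annotations, the traditional/simplified conversion, and the furigana there. I also looked at other people's pages, and I cannot edit the transcriptions of their sentences.

If regular contributors are not supposed to be able to edit transcriptions, does this count as a bug in the system?

For example, I edited the furigana of this sentence.
https://tatoeba.org/zh-cn/sentences/show/13880609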

LeviHighway, 9 days ago (12 May 2026 at 12:01:22 UTC)

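That does seem to be the case.

If you are willing to describe the situation in more detail, I can report the problem on GitHub. You can also report it yourself: https://github.com/Tatoeba/tatoeba2/issues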

jan_OkulaJu, 9 days ago (12 May 2026 at 13:00:52 UTC)

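However, I just checked, and that rule only states that advanced contributors can edit other members' transcriptions that have not yet been reviewed. It does not spell out what permissions regular contributors have for editing the transcriptions of their own sentences. Personally, I think Tatoeba allows users to edit the transcriptions of their own sentences by default.

So I think the feature itself is normal and reasonable. The only issue is that the entry point is too hidden: you have to go to the sentences page from your profile in order to edit, and I only found it by chance.

So I stand by my original suggestion. I hope the platform can give this feature a more visible, easier-to-find entry point, or add clearer instructions (perhaps I have not read the existing instructions carefully), so that users at least do not have to discover the transcription editing feature on their own.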

LeviHighway, 9 days ago (12 May 2026 at 13:08:09 UTC)

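Could you explain how a regular contributor can edit the transcriptions of their own sentences? I just tested it and could not find a way to edit transcriptions.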

jan_OkulaJu, 9 days ago (12 May 2026 at 13:16:48 UTC)

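Go to your profile page → on the right side of the page, above the "Send a message to xxx" option, there is the username → click it to expand → choose the Sentences option → if there are sentences with transcriptions (such as Chinese or Japanese), a pencil icon appears next to the transcription → click it to edit the transcription.

Other people's pages can be opened and viewed in the same way. As the platform's rules say, I do not have permission to edit the transcriptions of other people's sentences. On other people's pages the pencil icon is lighter in color, to indicate that editing is not possible.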

LeviHighway, 9 days ago (12 May 2026 at 13:27:23 UTC)

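I really cannot reproduce this. Whether it is a suggestion or a bug, it is best to report it on GitHub, where someone will deal with system issues. You can report it yourself, or I can report it on your behalf:
https://github.com/Tatoeba/tatoeba2/issues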

efinah, 9 days ago (12 May 2026 at 12:18:36 UTC, edited 12 May 2026 at 12:21:31 UTC)

New feature? word type specification

As far as I can see, there's no way to do this with the current adv search settings. Pls let me know if there is.

I'm working from Turkish to English and I want to search for "resmi" meaning official.
However, "resmi" can also mean painting or picture so those are being picked up too.
I just want the adjective, hence the feature idea.
Thx 🌸
PS I just realised doing the search in reverse finds only "official" but interestingly it also produces ENG sentences with no TUR equivalents.

LeviHighway, 9 days ago (12 May 2026 at 13:03:00 UTC)

This is not what you asked for, but you can search English sentences containing "official" with Turkish translations this way:
https://tatoeba.org/zh-tw/sente...rd_count_min=1

By setting "limit to" and specifying "Language:" in the Translation field, you can limit the search results to only show sentences with translation(s) in that language.

efinah, 10 days ago (11 May 2026 at 11:47:26 UTC)

New feature? I cannot find this anywhere.

It would be handy to be able to save search criteria.

CK, 10 days ago (11 May 2026 at 13:22:16 UTC)

See the "create search template" button in the advanced search.

frpzzd, 11 days ago (10 May 2026 at 16:32:09 UTC)

Can CC BY-NC-ND 4.0 sentences be added to Tatoeba? If so, is there any admin here who would be willing to bulk add some sentences with this license to the database for me, if I provide a high-quality data file?

For context, I am looking at the following Palauan-English dictionary:
https://scholarspace.manoa.hawa...d3e754/content
I have been in contact with one of the maintainers of the website https://tekinged.com and (with his permission) have been transferring many of the site's volunteer-written Palauan sentences to Tatoeba. The site also has many sentences taken from this dictionary, and out of caution I have held off from adding those sentences to Tatoeba for now, but it would be cool if they could be added as well.

CK, 10 days ago (10 May 2026 at 23:18:41 UTC, edited 10 May 2026 at 23:23:46 UTC)

No, CC BY-NC-ND 4.0 content cannot be added to the Tatoeba Corpus.

Our content is re-distributed with a less restrictive license.

Note that even CC-BY content cannot be used because users of our downloadable data cannot properly give the required "BY" credit.
