gillux, 2015-05-30 17:44:58 UTC, edited 2015-06-08 12:30:34 UTC

*** The editable transcription feature is now testable on https://dev.tatoeba.org/ ***

Currently only transcriptions for Chinese and Japanese sentences may be edited.

Summary of changes
• On every sentence that may have transcriptions, you’ll see an additional icon “Show transcriptions” along with “Add to list” etc. For the moment it’s the same icon as “Edit sentence” but we’ll change it later.
• Clicking on that icon pops up transcriptions for every sentence of the group, along with a warning message about their unreviewed state.
• You can review/edit them by clicking on the transcription text. Once sent, a transcription will always appear along with the sentence, just like now.
• In your settings, there is a new option that allows you to show unreviewed transcriptions by default, without having to click on the “Show transcriptions” button. If enabled, the warning text is replaced by a simple warning icon on the right.

Any suggestions are welcome, but before posting a comment, please have a look at the current implementation status here: https://github.com/Tatoeba/tatoeba2/pull/661. There are some unsettled questions and known problems (like missing icons, which is why you see Hrkt or Latn instead).

Previous thread: https://tatoeba.org/wall/show_message/22679

tommy_san, 2015-05-30 23:37:05 UTC, edited 2015-05-30 23:40:39 UTC

Thanks a lot! I tried it out.

1. I find it quite troublesome to edit furigana, for example from "[男の子|おとこのこ]" to "[男|おとこ] の [子|こ]". It would become much easier for us to edit if you accept a style like "男(おとこ) の 子(こ)" with zenkaku parentheses and spaces.

2. It might be nice if there were a button for corpus maintainers and advanced contributors to verify an auto-generated transcription with one click when it's correct.

3. I don't really welcome the revival of the "click to edit" UI. Especially Chinese learners would want to drag and copy pinyin.

4. romaji and pinyin sentences should start with capital letters.

5. Are spaces between the words in texts with furigana useful for Japanese learners? Personally, I prefer 「そうなの? ごめんね、心配(しんぱい)かけて」 to 「 そう な の ? ごめん ね 、 心配(しんぱい) かけて 」. However, perhaps we don't need this extra sentence with furigana in the first place. (See below.)

6. I feel Japanese sentences take up too much space. Most people don't need romaji if they can read hiragana. I also wonder if we can't simply place furigana on top of the main sentence instead of displaying the same sentence twice. At least that's what sentences in Japanese books for children and learners look like.

7. Would it be possible to show a new transcription right away after editing it? Now we seem to need to reload the page to see it. It's especially problematic when we want to edit the transcription again just after submitting it.

8. http://prntscr.com/7bad6x This doesn't work, most possibly because of the latin alphabet.

9. Something very weird happened when I was on a search results page and tried to edit the furigana. http://prntscr.com/7baj45

10. When I provide an invalid transcription, for example "[愛あい|あい]してる 。", they just ignore me without saying anything. Everything I type is discarded when there's just one mistake. This isn't a very nice behavior. (It seems that the only sign that they ignore me is that the editing form reappears for a while after I submit a transcription.)

11. https://dev.tatoeba.org/eng/sentences/show/3803811 Why can't I edit the transcription of this sentence?

12. The warning sign gives me the impression as if the sentence itself were untrustworthy. Maybe it should be placed just on the left of the transcription.

13. I think there should be some way to find out who added the transcription. I'd also like a way to browse latest added transcriptions and the list of transcriptions added by a particular member so that we can check if a spammer is not adding lots of bad transcriptions.

14. Maybe a sentence (itself) shouldn't be changed after another native speaker has provided a transcription for it.

gillux, 2015-05-31 07:35:09 UTC

> 1. I find it quite troublesome to edit furigana, for example from "[男の子|おとこのこ]" to "[男|おとこ] の [子|こ]". It would become much easier for us to edit if you accept a style like "男(おとこ) の 子(こ)" with zenkaku parentheses and spaces.

I admit the current square bracket syntax (お[弁当|べんとう]) is less easy to edit. I used it because it’s way easier to parse (both for us and for anyone using our data): it doesn’t need extra processing to figure out which characters the furigana belongs to. Maybe we could use the parenthesis syntax for editing and internally store it bracketed to get the best of both.
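To illustrate why the bracketed form is easy to consume, here is a minimal sketch of a parser for it. It is not Tatoeba's actual code: the function name `parse_furigana` and the exact grammar (one `[base|reading]` group, no nesting) are assumptions based only on the examples in this thread.

```python
import re

# Assumed grammar: one annotated group looks like [弁当|べんとう], with no nesting.
GROUP = re.compile(r"\[([^\[\]|]+)\|([^\[\]|]+)\]")


def parse_furigana(transcription):
    """Split a bracketed transcription into (base, reading) pairs.

    Text outside brackets is returned with a reading of None.
    """
    parts = []
    pos = 0
    for m in GROUP.finditer(transcription):
        if m.start() > pos:
            # Plain text between two annotated groups.
            parts.append((transcription[pos:m.start()], None))
        parts.append((m.group(1), m.group(2)))
        pos = m.end()
    if pos < len(transcription):
        parts.append((transcription[pos:], None))
    return parts
```

Because each group names its own base text, a consumer never has to guess how far back a reading applies, which is the extra processing a parenthesis style would require.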

> 5. Are spaces between the words in texts with furigana useful for Japanese learners? Personally, I prefer 「そうなの? ごめんね、心配(しんぱい)かけて」 to 「 そう な の ? ごめん ね 、 心配(しんぱい) かけて 」. However, perhaps we don't need this extra sentence with furigana in the first place. (See below.)

I mostly agree. The main purpose of these spaces is to allow the romaji to be broken down into words, because it’s directly derived from the furigana. As a secondary purpose, I think it eases reading for learners. So currently the furigana serves two purposes: word segmentation and reading aids. So if we drop romaji, we can drop these spaces as well. However, romaji isn’t only here for learners. I planned to store it in order to allow romaji search. But maybe the romaji/kana conversion could be done during query-time instead.

> 6. I feel Japanese sentences take up too much space. Most people don't need romaji if they can read hiragana. I also wonder if we can't simply place furigana on top of the main sentence instead of displaying the same sentence twice. At least that's what sentences in Japanese books for children and learners look like.

That would be great indeed, but it comes with other problems as CK said. I added a CSS trick to prevent the furigana from getting into the copy and paste buffer, but it only works on Firefox. Maybe a copy button could come in handy here. And we still need two different fields to edit either the sentence or the furigana, otherwise it would be too messy because we don’t want annotations in sentences either.

> 2. It might be nice if there were a button for corpus maintainers and advanced contributors to verify an auto-generated transcription with one click when it's correct.
>
> 3. I don't really welcome the revival of the "click to edit" UI. Especially Chinese learners would want to drag and copy pinyin.

I packed the UI because I was lacking space, but yes, I agree. Although I don’t really know how to organize this. Adding a new buttons bar just like the sentence’s?

> 7. Would it be possible to show a new transcription right away after editing it? Now we seem to need to reload the page to see it. It's especially problematic when we want to edit the transcription again just after submitting it.

It should be the case already.

> 9. Something very weird happened when I was on a search results page and tried to edit the furigana. http://prntscr.com/7baj45

What did you do exactly?

> 10. When I provide an invalid transcription, for example "[愛あい|あい]してる 。", they just ignore me without saying anything. Everything I type is discarded when there's just one mistake. This isn't a very nice behavior. (It seems that the only sign that they ignore me is that the editing form reappears for a while after I submit a transcription.)

The current behavior surely lacks some error message, but it shouldn’t discard what you type. When the submitted transcription is invalid, it just shows the form again, and its contents are what you just submitted.

> 11. https://dev.tatoeba.org/eng/sentences/show/3803811 Why can't I edit the transcription of this sentence?

This is a bug. The autogenerated transcription is invalid: 々 is considered as a kanji and thus needs furigana, whereas it doesn’t have furigana in the autogenerated transcription.
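The rule described here (every kanji must sit inside an annotated group, and 々 counts as a kanji) can be sketched as a small check. This is only an illustration under stated assumptions: the kanji test below covers the basic CJK Unified Ideographs block plus 々, the names `is_kanji` and `unannotated_kanji` are hypothetical, and 人々 is a made-up example rather than the content of the sentence linked above.

```python
import re

# Assumed grammar: annotated groups look like [base|reading], with no nesting.
GROUP = re.compile(r"\[[^\[\]|]+\|[^\[\]|]+\]")


def is_kanji(ch):
    # Assumption: basic CJK Unified Ideographs plus the iteration mark 々 (U+3005).
    return "\u4e00" <= ch <= "\u9fff" or ch == "\u3005"


def unannotated_kanji(transcription):
    """Return the kanji that appear outside any [base|reading] group."""
    outside = GROUP.sub("", transcription)
    return [ch for ch in outside if is_kanji(ch)]
```

With this check, a generated transcription such as `[人|ひと]々` is reported as invalid because 々 is left without furigana, while `[人々|ひとびと]` passes.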

> 12. The warning sign gives me the impression as if the sentence itself were untrustworthy. Maybe it should be placed just on the left of the transcription.

Initially I didn’t want to pack two icons on the left, but you’re probably right.

> 13. I think there should be some way to find out who added the transcription. I'd also like a way to browse latest added transcriptions and the list of transcriptions added by a particular member so that we can check if a spammer is not adding lots of bad transcriptions.

The current way is to mouseover, just like with tags. Again, I still don’t really know how to organize the UI.

> 14. Maybe a sentence (itself) shouldn't be changed after another native speaker has provided a transcription for it.

I think we just need a proper review system.

tommy_san, 2015-06-02 01:47:38 UTC

>> 1. I find it quite troublesome to edit furigana, for example from "[男の子|おとこのこ]" to "[男|おとこ] の [子|こ]". It would become much easier for us to edit if you accept a style like "男(おとこ) の 子(こ)" with zenkaku parentheses and spaces.
>
> Maybe we could use the parenthesis syntax for editing and internally store it bracketed to get the better of the two.

Yes, that's what I had in mind.


>> 5. Are spaces between the words in texts with furigana useful for Japanese learners? Personally, I prefer 「そうなの? ごめんね、心配(しんぱい)かけて」 to 「 そう な の ? ごめん ね 、 心配(しんぱい) かけて 」.
>
> I mostly agree. The main purpose of these spaces is to allow the romaji to be broken down into words, because it’s directly derived from the furigana. As a secondary purpose, I think it eases reading for learners. So currently the furigana serves two purposes: word segmentation and reading aids. So if we drop romaji, we can drop these spaces as well. However, romaji isn’t only here for learners. I planned to store it in order to allow romaji search. But maybe the romaji/kana conversion could be done during query-time instead.

One problem is that most Japanese don't know how to write *sentences* with romaji. It seems that the most common way to write the sentence #3652121 with romaji is "Sūgaku de wakaranai tokoro ga arun dakedo oshiete moraenai kana?" but almost nobody can do this. At school, we learn 文節 and 単語.
(文節)数学で 分からない ところが あるんだけど 教えて もらえないかな?
(単語)数学 で 分から ない ところ が ある ん だ けど 教え て もらえ ない かな?
If you don't display romaji, then it might be nice to have spaces between 文節. However, I'm not sure either if many people can do this properly.
Maybe you say it doesn't need to be so consistent and each contributor can just put spaces where they want, but I don't feel like having that kind of thing in an official part of the corpus.
For romaji search, you don't have to bother about this all. You can just ignore all spaces, like this. http://japan.de/wb/index.php?q=...=Volltextsuche
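The space-insensitive matching suggested here can be sketched in a few lines. This is a simplified illustration, not how Tatoeba's search works: the helper names `normalize` and `romaji_matches` are hypothetical, and a real search engine would apply the same normalization at indexing time rather than scanning strings.

```python
def normalize(text):
    """Lowercase the text and remove all whitespace, so segmentation no longer matters."""
    return "".join(text.lower().split())


def romaji_matches(query, romaji_sentence):
    """Return True if the query occurs in the sentence once spaces are ignored."""
    return normalize(query) in normalize(romaji_sentence)
```

For the romaji sentence quoted above, the queries "wakaranai tokoro", "wakaranaitokoro" and "aru n da kedo" all match, whichever way the contributor or the searcher chose to split words.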


>> 6. I wonder if we can't simply place furigana on top of the main sentence instead of displaying the same sentence twice.
>
> That would be great indeed, but it comes with other problems as CK said. I added a CSS trick to prevent the furigana from getting into the copy and paste buffer, but it only works on Firefox. Maybe a copy button could come in handy here. And we still need two different fields to edit either the sentence or the furigana, otherwise it would be too messy because we don’t want annotations in sentences either.

I see. Let's wait and see what other people have to say.


>> 2. It might be nice if there were a button for corpus maintainers and advanced contributors to verify an auto-generated transcription with one click when it's correct.
>
>> 3. I don't really welcome the revival of the "click to edit" UI. Especially Chinese learners would want to drag and copy pinyin.
>
> I packed the UI because I was lacking space, but yes, I agree. Although I don’t really know how to organize this. Adding a new buttons bar just like the sentence’s?

There's some space on the left of a transcription, so maybe you can place some buttons there. (A verify button for native speakers, an edit button, and a robot/user icon.)


>> 7. Would it be possible to show a new transcription right away after editing it? Now we seem to need to reload the page to see it. It's especially problematic when we want to edit the transcription again just after submitting it.
>
> It should be the case already.

It sometimes doesn't work, at least on Chrome.
http://prntscr.com/7c3h51


>> 9. Something very weird happened when I was on a search results page and tried to edit the furigana. http://prntscr.com/7baj45
>
> What did you do exactly?

I was on https://dev.tatoeba.org/eng/sen...rom=und&to=und and tried to edit the furigana.


>> 10. When I provide an invalid transcription, for example "[愛あい|あい]してる 。", they just ignore me without saying anything.
>
> The current behavior surely lacks some error message, but it shouldn’t discard what you type. When the submitted transcription is invalid, it just shows the form again, and its contents are what you just submitted.

Again, could you try it on Chrome? This is how it looks right after I submit an invalid transcription. http://prntscr.com/7c3jg4


15. Could you perform the "男の子(おとこのこ)→男(おとこ)の子(こ)" type of change automatically? That would surely save us a lot of time. I don't mind if 飼い犬 were to become 飼(かい)い犬(ぬ) by that. I could easily fix this kind of thing.
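One possible way to automate the change requested in point 15 is to use the kana inside the base text as anchors when splitting the reading. The sketch below works on the internal bracket syntax and is only an assumption about how such a pass could look; the names `split_group` and `split_all` are hypothetical. It takes the shortest possible reading for each kanji run, so ambiguous words can still be split at the wrong place and need a manual fix.

```python
import re
from itertools import groupby

# Assumed grammar: annotated groups look like [base|reading], with no nesting.
GROUP = re.compile(r"\[([^\[\]|]+)\|([^\[\]|]+)\]")


def is_kana(ch):
    # Hiragana (U+3040-U+309F) and katakana (U+30A0-U+30FF) blocks.
    return "\u3040" <= ch <= "\u30ff"


def split_group(base, reading):
    """Split one [base|reading] group around the kana contained in the base."""
    runs = [(kana, "".join(chars)) for kana, chars in groupby(base, key=is_kana)]
    # Kana runs must appear literally in the reading; other runs capture a reading.
    pattern = "".join(re.escape(text) if kana else "(.+?)" for kana, text in runs)
    m = re.fullmatch(pattern, reading)
    if m is None:
        # The reading does not line up with the base: leave the group untouched.
        return "[%s|%s]" % (base, reading)
    readings = iter(m.groups())
    return "".join(
        text if kana else "[%s|%s]" % (text, next(readings)) for kana, text in runs
    )


def split_all(transcription):
    """Apply split_group to every annotated group of a transcription."""
    return GROUP.sub(lambda m: split_group(m.group(1), m.group(2)), transcription)
```

For example, `split_all("[男の子|おとこのこ]")` gives `[男|おとこ]の[子|こ]`, and groups whose base contains no kana are returned unchanged.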

gillux, 2015-06-02 09:41:49 UTC

You convinced me. Let’s drop spaces in furigana and romaji.

> There's some space on the left of a transcription, so maybe you can place some buttons there. (A verify button for native speakers, an edit button, and a robot/user icon.)

You’re right, I’ll try to put buttons there.

> 15. Could you perform the "男の子(おとこのこ)→男(おとこ)の子(こ)" type of change automatically?

Will do.

tommy_san, 2015-06-02 10:03:22 UTC

> Let’s drop spaces in furigana and romaji.

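Korewa "drop spaces in furigana and drop romaji" tteiuimidesuyone? Supēsunonaikonnarōmajigahyōjisareteirunowamitakunaidesuyo. ☺
[Translation: This means "drop spaces in furigana and drop romaji", right? I don't want to see romaji without spaces, like this, being displayed.]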

gillux, 2015-06-02 12:25:23 UTC

Ha ha ha. :-) Yes, let’s drop romaji and spaces in furigana.

gillux, 2015-06-04 08:33:11 UTC

Thinking again, we can’t use parentheses to mark furigana if some are used in the sentence itself, such as in #1614313 or #464018.

Objectivesea, 2015-06-04 08:47:55 UTC

There are other types of parentheses that could be used. Here are some Unicode possibilities, some of which are sometimes used in Japanese text. If one kind was consistently used to mark normal parentheses and another kind for furigana, the risk of confusion should be minimized.

[ ] { } < > 〈 〉 《 》 「 」 『 』 〔 〕 〚 〛 ⦑ ⦒

tommy_san, 2015-06-04 08:57:34 UTC

How about {}? It's less likely to be used in a sentence and as easy to type.
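To make the idea concrete, here is a minimal sketch of converting a curly-brace editing syntax into the bracketed storage syntax. Nothing here is confirmed by the thread beyond the choice of `{}`: the function name `braces_to_brackets` is hypothetical, and the rule that a reading applies to the run of kanji immediately before the braces is an assumption.

```python
import re

# Assumption: the annotated base is the run of kanji (basic CJK Unified
# Ideographs or 々) immediately before the braces.
EDIT_GROUP = re.compile(r"([\u4e00-\u9fff\u3005]+)\{([^{}]+)\}")


def braces_to_brackets(text):
    """Rewrite 男{おとこ} style annotations as [男|おとこ]."""
    return EDIT_GROUP.sub(r"[\1|\2]", text)
```

With this rule, `男{おとこ}の子{こ}` becomes `[男|おとこ]の[子|こ]`, and parentheses that belong to the sentence itself pass through untouched.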

sharptoothed, 2015-06-02 20:32:26 UTC

> I added a CSS trick to prevent the furigana from getting into the copy and paste buffer, but it only works on Firefox.

Take a look at this dirty and ugly CSS trick. It seems to work on any modern browser. :-)
http://j-langtools.com/tatoeba/furiganatest.html

gillux, 2015-06-03 07:25:08 UTC

That is indeed ugly. And the text isn’t correctly aligned when I zoom in or out.

sharptoothed, 2015-06-03 07:39:42 UTC

That's strange. I haven't noticed any problem with alignment in my browsers while zooming in/out during tests. Can you show me what it looks like?

gillux, 2015-06-03 08:31:41 UTC

https://i.imgur.com/THXHZdo.png
https://i.imgur.com/aZI6K3p.png

This however happens only at some of the zoom steps:
-5: correct
-4: correct
-3: incorrect
-2: correct
-1: correct
0: correct
1: incorrect
2: correct
3: incorrect (that’s what I used in the screenshots)
4: correct
5: incorrect
6: correct
7: correct
8: correct

Maybe it’s just a font problem on my side.

gillux, 2015-06-03 08:38:13 UTC

There is also a vertical alignment problem when the text wraps (regardless of zoom). Try to limit the body with width:200px. The furigana overlaps with the text.

sharptoothed, 2015-06-03 08:55:50 UTC

Ah, now I see. Thanks, gillux!

gillux, 2015-05-31 07:42:05 UTC

> 12. The warning sign gives me the impression as if the sentence itself were untrustworthy. Maybe it should be placed just on the left of the transcription.

How about a brown question mark sign instead?

TRANG, 2015-05-31 23:03:38 UTC

Here are my ideas to improve the UI.

1. I'm thinking we could use a "robot" icon for transcriptions that are auto-generated, and a "user" icon (or no icon at all) for those that were edited by real people. I haven't found a suitable robot icon yet though.

2. I have no idea what icon we could use instead for the "show transcriptions" button...
I think ideally the "show transcription" button should be some sort of global setting, something that you switch on or off for everything, and not just the sentence that you're viewing. I can at least imagine myself being frustrated at having to click it on every sentence when I search sentences or when I'm browsing all the sentences in Japanese, or the sentences in a list.
But to implement this in a good way, we will need to make it easy for users to change their settings. I've been thinking for a while now to add a settings button in the menu (next to the profile link), and this button will open a popup (kinda like the login popup) where you can see the various options and turn them on and off directly from there.

3. If we implement the global setting thing, we can display the warning text when the user turns on the option rather than displaying it everywhere where there is a transcription.

4. Just like tommy_san I would like to avoid having the click to edit behavior to avoid opening the form when I just want to copy-paste.
I'm not very happy about adding yet another button in the sentences block, because there are already so many buttons and icons. But then the only other solution I can think of is to make the transcriptions editable via the already existing edit button. Which means if a user is the owner of the sentence, they will see both the sentence textarea and the transcription textarea.
Not sure what I would prefer... But I won't be editing transcriptions so my preference doesn't really matter here.

gillux, 2015-05-31 23:37:15 UTC, edited 2015-05-31 23:39:56 UTC

I like your idea of a robot icon. Or maybe some computer icon, like when you add AI enemies in games.

> I think ideally the "show transcription" button should be some sort of global setting, something that you switch on or off for everything, and not just the sentence that you're viewing. […]
> 3. If we implement the global setting thing, we can display the warning text when the user turns on the option rather than displaying it everywhere where there is a transcription.

There is already a setting that does this (in the settings page). But if turning it off would completely hide transcriptions, I’m afraid nobody would notice their existence.

TRANG, 2015-06-07 19:00:23 UTC

I tried several icons and this one is the only one that looks somewhat fine at a small size:
http://www.flaticon.com/free-ic...d-symbol_56988
Screenshot of what it would look like: http://prntscr.com/7e8j22


> There is already a setting that does this (in the settings page).
> But if turning it off would completely hide transcriptions, I’m
> afraid nobody would notice their existence.

There's a similar problem with tags.
Transcriptions, just like tags, are extra information that can be useful for users, but that are not critical to the point that new users need to know of their existence right away.

What we could do is to have transcriptions always displayed on the sentence's page (because the sentence's page is the place where you can see all the information and perform any action on the sentence), but elsewhere, they would be displayed only if the option is turned on.

To give more visibility for the existence of such an option, we can have (once the feature is released) a tooltip box displayed below the search bar, that users can close once they've read it.