Wall (6,960 threads)

Tips

Before asking a question, make sure to read the FAQ.

We aim to maintain a healthy atmosphere for civilized discussions. Please read our rules against bad behavior.

blay_paul, July 7, 2010 at 5:44:13 PM UTC

Could we have the duplicate removal script run?

Also I sent in an email with 600+ sentence IDs that can have "British English" as a tag - what's the status of that?

TRANG, July 7, 2010 at 7:58:55 PM UTC

I'm taking care of the British tag right now.

blay_paul, July 7, 2010 at 8:14:04 PM UTC

Got your email - thanks for taking care of that for me. :-)

sysko, July 7, 2010 at 5:51:57 PM UTC

In fact, I need to adapt the duplicate removal script so that it handles tags and audio, which it doesn't yet.
Now that we have more and more audio, I really need to take care of this before re-running it, and as I've said, I have very little free time at the moment :(

tanzoniteblack, July 7, 2010 at 12:28:31 AM UTC

Is it possible to download the lists as a tab separated file or xml file rather than a CSV?

TRANG, July 7, 2010 at 9:39:18 AM UTC

Perhaps I'm misunderstanding the question, but aren't the lists already tab separated...?

If you click on download, it won't download right away, it will first lead to a page where you can choose what data to include in your list, and there's a short explanation about the structure.

=> sentence_id [tab] sentence_text [tab] translation_text

Unless there's a bug we missed, you will definitely be able to import them into Anki :)
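
For readers who want to process a downloaded list outside Anki, here is a minimal Python sketch that reads the three-column layout described above. The function name `read_list_export` and the sample rows are made up for illustration. The sketch assumes that all three columns were selected on the download page and that fields are not quoted.

```python
import csv
import io


def read_list_export(text):
    """Parse rows shaped like: sentence_id<TAB>sentence_text<TAB>translation_text.

    Hypothetical helper, not part of Tatoeba. The translation column is
    treated as optional because the download page lets you choose columns.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for fields in reader:
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        translation = fields[2] if len(fields) > 2 else ""
        rows.append((int(fields[0]), fields[1], translation))
    return rows


# Made-up sample data; commas inside sentences are harmless with tabs.
sample = "1\tHello, world.\tBonjour, le monde.\n2\tThank you.\tMerci.\n"
for row in read_list_export(sample):
    print(row)
```

Because the separator is a tab, commas inside a sentence do not split the row.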

sysko, July 7, 2010 at 9:50:22 AM UTC

you're faster than me ^^

sysko, July 7, 2010 at 12:34:18 AM UTC

No, we only export CSV for now, as it's the most commonly used format and it can easily be imported into Excel etc.
(And anyway, a CSV can be a "whatever-you-want separated file", so even with tabs it's still a CSV; only the separator changes, so with a little script you can easily convert from one kind of CSV to another.)

If you plan to reuse the data for a website, an application or the like, don't forget that the CC-BY licence obliges you to notify us of the use (except for personal use only, of course) and to say where the data comes from etc.
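
The "little script" mentioned above could look roughly like this in Python. It is only a sketch: `convert_delimiter` is a hypothetical name, and it assumes the input quotes any field that contains the source separator, which an export may or may not do.

```python
import csv
import io


def convert_delimiter(text, src=",", dst="\t"):
    """Re-emit delimited text with a different separator.

    Hypothetical helper for illustration. Relies on the input being
    well-formed CSV (fields containing `src` must be quoted).
    """
    reader = csv.reader(io.StringIO(text), delimiter=src)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=dst, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Made-up sample row with a quoted field that contains a comma.
print(convert_delimiter('1,"Hello, world.",Bonjour\n'))
```

If the input does not quote such fields, the original columns cannot be recovered unambiguously and no conversion script will help.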

tanzoniteblack, July 7, 2010 at 12:37:39 AM UTC

The reason I ask is that I have issues with the CSV if the source sentences contain commas themselves, as it appears that there are more columns in that row than there should be.
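
The symptom can be reproduced in a few lines of Python. This is an illustration with a made-up row, not the site's export code: joining fields with bare commas yields extra columns, whereas a writer that quotes fields keeps the row intact.

```python
import csv
import io

# Made-up row: id, sentence, translation. Both texts contain a comma.
row = ["1234", "Well, that was unexpected.", "Eh bien, c'était inattendu."]

# Unquoted export: the commas inside the sentences look like separators.
naive = ",".join(row)
naive_columns = naive.split(",")
print(len(naive_columns))  # more columns than the three that were written

# Quoted export: csv.writer wraps fields containing the delimiter in quotes,
# so csv.reader recovers exactly the original fields.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(row)
parsed = next(csv.reader(io.StringIO(buf.getvalue())))
print(len(parsed))
```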

sysko, July 7, 2010 at 12:43:09 AM UTC

Yep, sure, it's a known issue, but it hasn't been a priority so far. I had in mind to replace the comma with a tab to avoid the problem of commas in sentences.

The export was coded really quickly, as it was not something really vital for normal users.

We will let you know once it's fixed. (Just so you know, there are more or less only two of us, both students, making/maintaining/promoting/debugging this website in our free time, so lack of time or low priority is the main reason for most missing features and bugs here.)

tanzoniteblack, July 7, 2010 at 12:47:06 AM UTC

How is the data for the linking of the sentences stored? Would I be able to access the source files used by the project's website itself, so that I can write my own export code for a format that's much more useful for my purposes?

For the record, my own purposes are creating lists while browsing and being able to download and import said lists into my Anki deck.

sysko, July 7, 2010 at 9:43:33 AM UTC

We use a database, and our server is already too small to support many more features, so no, until we have another server, we will not allow a "total" export of the database.

But what you want (creating lists and exporting them to Anki) is already possible: create a list in the "list" section; while browsing Tatoeba, each sentence has an "add to list" icon; and when you want to export the list, go to the list section, click on your list, and then on the big green download button.

Moreover, if you check the Anki plugin list, there's a Tatoeba plugin for Japanese learners (made by one of our users).

Hope this helps.

CK, July 6, 2010 at 1:11:48 AM UTC, edited October 26, 2019 at 3:57:11 AM UTC

[not needed - removed by CK]

saeb, July 6, 2010 at 2:54:59 AM UTC

I already do ;), keep up the great work!

sysko, July 6, 2010 at 1:45:36 PM UTC

In the long, long to-do list, it's planned to have "all sentences of user X" displayed like a list, with the possibility to translate directly etc.

Scott, July 5, 2010 at 4:16:44 AM UTC

Just noticed that on the Contribute page, it says: reporting mitakes.

(if you're not logged in.)

TRANG, July 5, 2010 at 6:37:42 PM UTC

Thank you, it's corrected ^^

blay_paul, July 5, 2010 at 3:09:32 PM UTC

@Trang / Sysko

Could you make the list
http://tatoeba.org/eng/sentences_lists/show/26
public so I can remove sentences when I'm finished with them?

TRANG, July 5, 2010 at 5:28:26 PM UTC

Done. But I'm thinking I could simply assign this list to you so that it can remain private. I'll do this when I'm back home.

TRANG, July 5, 2010 at 6:23:41 PM UTC

Alright, the list is yours now.

sysko, July 5, 2010 at 12:10:55 AM UTC

Trang will certainly make a post on the blog, but I'll announce it here now.

It's now possible to filter to view only sentences with audio:
http://tatoeba.org/eng/sentence...nly-with-audio

And those who want to record on their own will be able to submit their recordings to us. We will explain this in more detail soon, but it will work more or less as follows:
* you tell us you want to record audio
* we give you a list to use with the software "swac-recorder"
* you give us back the zip with all the audio
* I upload them to the server
(Yeah, I know, we will improve this a lot in the future, but this is a first step, as ever.)

It's also now possible to send us lists of sentences (for mass adding) through our email address.
We accept the following formats:

text

and

text[tab]translation
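
A submission file in either of the two formats above could be produced with something like the following Python sketch. `format_rows` is a hypothetical helper. The post does not say which encoding or line endings are expected, nor whether both formats may be mixed in one file, so treat those choices here as assumptions.

```python
def format_rows(rows):
    """Build submission text from rows of (text,) or (text, translation).

    Hypothetical helper. Each row becomes one line; a two-item row is
    joined with a tab, matching the text[tab]translation format.
    """
    lines = []
    for row in rows:
        if len(row) not in (1, 2):
            raise ValueError("each row must hold a text and an optional translation")
        for field in row:
            # A tab or newline inside a sentence would corrupt the layout.
            if "\t" in field or "\n" in field:
                raise ValueError("fields must not contain tabs or newlines")
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


# Made-up sample sentences.
print(format_rows([("Hello.", "Bonjour."), ("Thank you.", "Merci.")]))
```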

sysko, July 5, 2010 at 12:15:18 AM UTC

And "show all in" is "reverse sorted" by default, so you will see the most recently added sentences first.
I've also fixed some problems in the Japanese romanization ( ? which were transcribed as ?[?] etc.)

Apart from bug fixes and small requests, this should be my last "normal" update for a long time. From a coding point of view, I will focus on another project and on a new version of Tatoeba.

xtofu80, June 30, 2010 at 11:15:04 AM UTC

It's not a big issue, and I don't know if it is easy to solve, but can an admin disable the transcription of punctuation marks?
これ 何[なに] ?[?]
=>
これ 何[なに] ?
The transcriptions get unnecessarily longer.
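
As an illustration of the requested behaviour, here is a small Python sketch that removes a bracketed "reading" when it merely repeats the punctuation before it. It is a post-processing idea under stated assumptions, not how the site's transcription code works; `drop_punctuation_readings` is a hypothetical name.

```python
import re

# One or more characters that are not word characters, whitespace or
# brackets, followed by a bracketed copy of the same characters.
_PUNCT_READING = re.compile(r"([^\w\s\[\]]+)\[\1\]")


def drop_punctuation_readings(transcription):
    """Turn '?[?]' into '?' while leaving readings such as '何[なに]' alone."""
    return _PUNCT_READING.sub(r"\1", transcription)


print(drop_punctuation_readings("これ 何[なに] ?[?]"))
```

Kanji and kana count as word characters in Python's Unicode-aware `\w`, so real readings are left untouched.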

sysko, July 5, 2010 at 12:12:09 AM UTC

done

xtofu80, July 5, 2010 at 1:10:13 PM UTC

? and ! seem to be fine, but the "thought full stops" "..." are annoying, e.g. nº75468.

blay_paul, June 30, 2010 at 11:27:56 AM UTC

There are a number of problems with the transcription system, but I think everybody is a little overworked right now.

sysko, June 30, 2010 at 12:02:13 PM UTC

I've just committed something; it should fix (only) the problem with ? and !. It will be available this weekend.

Rebeca, July 4, 2010 at 4:21:56 AM UTC

I can't find sentences by typing keywords.

TRANG, July 4, 2010 at 11:59:48 AM UTC

Yes, sorry, that's because the server was down yesterday, and we need to relaunch the indexing of the sentences.

It won't be available for another 6 hours at least, as I have a train to catch right now...

sysko, July 4, 2010 at 9:43:08 PM UTC

It should be working again now (I'm back from holidays).

brauliobezerra, July 4, 2010 at 1:20:47 AM UTC

Translating sentences is not working? :(

brauliobezerra, July 4, 2010 at 1:23:58 AM UTC

Huh, my bad... just a geek using the latest Opera on Linux.

CK, July 3, 2010 at 6:33:14 AM UTC, edited October 26, 2019 at 3:57:19 AM UTC

[not needed - removed by CK]

blay_paul, July 3, 2010 at 9:06:32 AM UTC

Thanks! Could come in very useful.

Note how all the "By-foo" tags clump together. ;-)