Wall (7,275 threads)

December 6, 2025 at 5:21:32 PM UTC

The content of this message goes against our rules and was therefore hidden. It is displayed only to admins and to the author of the message.

Igider, December 5, 2025 at 2:09:05 PM UTC (edited December 5, 2025 at 2:09:29 PM UTC)

Hi everyone,

I have tried to use the sentence export page on Tatoeba, but unfortunately, I am not managing to do it correctly. Could someone please help me download the following data (preferably as a CSV file, zipped or not)?

1. All Kabyle sentences
—with their translations in French, English, and Spanish,
—and with an audio recording attached.

2. All original Kabyle sentences that:
—do not have translations,
—do not have audio recordings,
—but without duplicated sentences if possible.

Any help, guidance, or explanation on how to extract these properly would be greatly appreciated.

Thank you very much in advance!

Igider



Download:
https://tatoeba.org/en/downloads

Advanced search:
https://tatoeba.org/en/sentences/advanced_search

cafoc64474, December 5, 2025 at 3:08:50 PM UTC (edited December 5, 2025 at 3:09:15 PM UTC)

I think you need to download the sentences for each language as separate files, then download the file of links between sentences, and then use a graph tool to connect the sentences (for example, NetworkX for Python).
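A minimal sketch of that approach in Python, using plain dictionaries rather than a graph library, since only direct translation links are needed here. It assumes the sentence files are tab-separated with the columns id, language code, text, and that the links file has two tab-separated columns (sentence id, translation id). The language codes `kab`, `fra`, `eng`, `spa`, the function names, and the sample rows are illustrative assumptions, so check the real layout on the Downloads page first.

```python
import csv
import io
from collections import defaultdict


def read_sentences(fileobj):
    """Read rows of (id, lang, text) into {id: (lang, text)}.

    Assumed layout: tab-separated, no quoting, at least three columns.
    """
    sentences = {}
    for row in csv.reader(fileobj, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) >= 3:
            sentences[int(row[0])] = (row[1], row[2])
    return sentences


def read_links(fileobj):
    """Read rows of (sentence_id, translation_id) into {id: set(ids)}."""
    links = defaultdict(set)
    for row in csv.reader(fileobj, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) >= 2:
            links[int(row[0])].add(int(row[1]))
    return links


def join_translations(sentences, links, source_lang, target_langs):
    """Map each source-language sentence id to (text, {lang: [translations]})."""
    joined = {}
    for sid, (lang, text) in sentences.items():
        if lang != source_lang:
            continue
        by_lang = defaultdict(list)
        for tid in sorted(links.get(sid, ())):
            if tid in sentences and sentences[tid][0] in target_langs:
                t_lang, t_text = sentences[tid]
                by_lang[t_lang].append(t_text)
        joined[sid] = (text, dict(by_lang))
    return joined


# Tiny in-memory stand-ins for the downloaded files (made-up ids and text).
sentences_tsv = "1\tkab\tAzul.\n2\tfra\tBonjour.\n3\teng\tHello.\n4\tkab\tTanemmirt.\n"
links_tsv = "1\t2\n2\t1\n1\t3\n3\t1\n"

sentences = read_sentences(io.StringIO(sentences_tsv))
links = read_links(io.StringIO(links_tsv))
joined = join_translations(sentences, links, "kab", {"fra", "eng", "spa"})

# Sentences with no translation in any of the target languages.
untranslated = [sid for sid, (_, found) in joined.items() if not found]
```

With real files, the `io.StringIO(...)` objects would be replaced by `open(path, encoding="utf-8", newline="")`. Everything is held in memory, which should be workable for a few languages but is worth keeping in mind for the full export.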

AlanF_US, December 5, 2025 at 3:19:45 PM UTC (edited December 5, 2025 at 3:22:45 PM UTC)

Note first of all that you are talking about a language with more than 777,000 sentences, and working with such a large set of sentences is going to require scripting/programming not just for the downloading of the sentences, and not just for the selection of the subset you want, but also for the management of those sentences on your side.

The download page lets you download the following with the click of a button:
- all sentences in language A with translations in language B
- all sentences in language A
- all sentences in language A that have audio (but not the audio itself, which can only be downloaded if the license says so, and needs to be downloaded via a URL; this is explained on the downloads page)

Another alternative, as the page says, is to produce a list of sentences, which can then be downloaded. However, this is impractical with the number of sentences you'd be dealing with.

Without scripting/programming knowledge, you could do three downloads, consisting of all Kabyle sentences translated into French, English, and Spanish, respectively. This would give you TSV files (tab-separated rather than comma-separated) containing the sentences and translations. (For reference, the one for French would be 17.8 MB in size and contain more than 200,000 entries.) This is not what you asked for, but it's probably the best you can do without scripting/programming. Otherwise, the help you need is most likely going to go beyond what can be provided on the Wall.
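If someone with a little Python is available, the three downloads could be merged into a single CSV along these lines. This is only a sketch: it assumes each pair file has four tab-separated columns (source id, source text, translation id, translation text), and the function names, column headers, and sample rows are made up for illustration.

```python
import csv
import io


def merge_pair_files(files_by_lang):
    """Combine per-language pair files into one entry per source sentence.

    files_by_lang maps a language label to an open text file whose rows are
    assumed to be: source id, source text, translation id, translation text.
    """
    merged = {}
    for lang, fileobj in files_by_lang.items():
        for row in csv.reader(fileobj, delimiter="\t", quoting=csv.QUOTE_NONE):
            if len(row) < 4:
                continue
            entry = merged.setdefault(row[0], {"text": row[1]})
            # Several translations in one language are joined with " | ".
            entry[lang] = f"{entry[lang]} | {row[3]}" if lang in entry else row[3]
    return merged


def write_csv(merged, langs, out):
    """Write one comma-separated row per source sentence, one column per language."""
    writer = csv.writer(out)
    writer.writerow(["id", "kabyle", *langs])
    for sid, entry in merged.items():
        writer.writerow([sid, entry["text"], *(entry.get(lang, "") for lang in langs)])


# Made-up sample rows standing in for two of the downloaded files.
fra_tsv = "1\tAzul.\t2\tBonjour.\n1\tAzul.\t5\tSalut.\n"
eng_tsv = "1\tAzul.\t3\tHello.\n"

merged = merge_pair_files({"fra": io.StringIO(fra_tsv), "eng": io.StringIO(eng_tsv)})
out = io.StringIO()
write_csv(merged, ["fra", "eng", "spa"], out)
lines = out.getvalue().splitlines()
```

Sentences that have no translation in a given language simply get an empty cell in that column.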

Igider, December 5, 2025 at 6:49:36 PM UTC

Thank you for your reply.
Since obtaining the full dataset is not practical through the current process, would you kindly provide at least the audio files related to the Kabyle sentences?

That alone would already be extremely helpful, and I would greatly appreciate your assistance with this.

Thank you again.

AlanF_US, December 5, 2025 at 8:42:17 PM UTC

In the section "Sentences with audio", the Downloads page says this:

---
File description:
Contains the ids of the sentences, in all languages, for which audio is available. Other fields indicate who recorded the audio, its license and a URL to attribute the author. If the license field is empty, you may not reuse the audio outside the Tatoeba project.

Downloading audio:
A single sentence can have one or more audio, each from a different voice. To download a particular audio, use its audio id to compute the download URL. For example, to download the audio with the id 1234, the URL is https://tatoeba.org/audio/download/1234.
---

You can use a tool like wget to download files once you know the URLs.

If that information is not enough to get you started, see if you can find someone on your side who has the technical knowledge to do this.
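As a rough starting point, the URL computation described above could look like this in Python. The URL pattern is the one quoted from the Downloads page; the column order of the audio file (sentence id, audio id, username, license, attribution URL), the treatment of `\N` as an empty license, and all names and sample rows are assumptions to verify against the actual file.

```python
import csv
import io

# URL pattern as quoted from the Downloads page.
AUDIO_URL = "https://tatoeba.org/audio/download/{audio_id}"


def audio_url(audio_id):
    """Build the download URL for one audio id."""
    return AUDIO_URL.format(audio_id=audio_id)


def reusable_audio_urls(audio_file, wanted_sentence_ids):
    """Yield (sentence_id, url) for licensed audio of the wanted sentences.

    Assumed row layout: sentence id, audio id, username, license, attribution URL.
    Rows with an empty license (or "\\N", assumed to mark a missing value) are
    skipped, because such audio may not be reused outside the Tatoeba project.
    """
    for row in csv.reader(audio_file, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 4:
            continue
        sentence_id, audio_id, license_ = int(row[0]), row[1], row[3]
        if sentence_id in wanted_sentence_ids and license_ not in ("", r"\N"):
            yield sentence_id, audio_url(audio_id)


# Made-up sample rows; the ids of the Kabyle sentences would come from the
# Kabyle sentence download.
audio_tsv = (
    "1\t1234\talice\tCC BY 4.0\thttps://example.org/alice\n"
    "4\t5678\tbob\t\t\n"
    "9\t42\tcarol\tCC0 1.0\t\n"
)
kabyle_ids = {1, 4}
urls = list(reusable_audio_urls(io.StringIO(audio_tsv), kabyle_ids))
```

The resulting URLs can be written to a text file, one per line, and passed to `wget -i`, ideally with a pause between requests so the server is not overloaded.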
