Wall (6,215 threads)

QAzaqQA, March 16, 2021 at 6:16:13 PM UTC, edited March 16, 2021 at 6:17:20 PM UTC

https://object.pouta.csc.fi/Tat...abk-eng.abk.gz
https://object.pouta.csc.fi/Tat...abk-eng.eng.gz

A corpus of more than 12,000 Abkhaz-to-English sentences.

CK, March 16, 2021 at 11:11:26 PM UTC, edited March 16, 2021 at 11:12:06 PM UTC

From here?

https://github.com/Helsinki-NLP...ranslations.md

Just quickly sorting the lines of the English file and glancing through the data makes it seem like this isn't something that would be useful for us to import if that is what you are suggesting.

The year is 1001.
The year is 1002.
The year is 1003.
The year is 1004.
The year is 1005.
The year is 1006.
The year is 1007.
The year is 1008.
The year is 1009.
[snip]

He was 88 years old.
He was 89 years old.
He was 91 years old.
He was 92 years old.
He was 93 years old.
He was 94 years old.
[snip]

He was born in 1155.
He was born in 1187.
He was born in 1188.
He was born in 1189.
[snip]

That was in 1925.
That was in 1933.
That was in 1934.
That was in 1936.
That was in 1949.
That was in 1953.
[snip]

Yes, yes.
146 B.C.E.
166 B.C.E.
172 B.C.E.
174 B.C.E.
196 B.C.E.
197 B.C.E.
198 B.C.E.
199 B.C.E.
219 B.C.E.
[snip]

B., 1976
Batman).
Folklor.
Gagba M.
Ialkaou.
Rome (c.
A little.
Abbet (c.
Abkhazia.
Agrba, E.
Annas (c.
Aurea, V.
B. (U.S.)
Children.
Degoev V.
Dewdrops.
[snip]

SEE PAGE 2.
SEE PAGE 3.
SEE PAGE 4.
SEE PAGE 5.
SEE PAGE 6.
SEE PAGE 7.
SEE PAGE 8.
SEE PAGE 9.
[snip]

Seven atoms.
The jackpot.
Turin, 1890.
Uncleanness.
Vardania, I.
In May 2011 C.
Irresen (Ager.
Belgique, ager.
Benetics (Ache.
Brotherly love.
Catovica (Apol.
Chandor Petefi.

QAzaqQA, March 17, 2021 at 5:46:29 AM UTC

There are many useful sentences in that list, along with a few that wouldn't be useful to import.

Yorwba, March 17, 2021 at 10:14:20 PM UTC

That still means that someone who is proficient in both Abkhaz and English would need to review each sentence one-by-one to determine whether it should be imported or not. (Especially important considering the dataset was apparently generated by machine-translating monolingual data, which makes both accuracy and naturalness of the translations suspect.)

I'd encourage you to take up that task if you want, but given you rate your Abkhaz knowledge with one star, it might be better to focus on sentences you can understand without machine translation. (Otherwise it's too easy to be misled by a reasonable-sounding but incorrect translation of a complex sentence.)

Cabo, March 18, 2021 at 1:04:59 PM UTC

I think if you don't understand a language, don't provide data on that language.

You already put data into the corpus, where we want to create information, and that isn't helpful.

QAzaqQA, March 18, 2021 at 4:43:33 AM UTC, edited March 18, 2021 at 12:37:37 PM UTC

This guy has collected a lot of corpus data.

https://github.com/danielinux7/...arallel-Corpus

QAzaqQA, March 18, 2021 at 5:51:14 AM UTC

Over 150,000 sentences translated from Russian to Abkhaz. However, the licence must be checked.

https://github.com/danielinux7/...-27-07.bifixed

QAzaqQA, March 18, 2021 at 4:49:19 AM UTC

Kabardian Dictionary

http://www.amaltus.com/%d0%b7%d...7%d0%ba%d0%b8/

QAzaqQA, March 17, 2021 at 5:51:55 AM UTC

25,000 Abkhaz-to-English sentences. However, the data must be checked before adding.

https://object.pouta.csc.fi/Tat...ge/abk-eng.tar

Thanuir, March 17, 2021 at 7:32:28 AM UTC

1. What license are those under?

2. If copyright does not prevent their use in this case, someone would need to go through the sentences for quality assurance and probably convert them into a form in which they can be included in Tatoeba. Are you willing to do at least one of these, or do you know someone who is?

QAzaqQA, March 17, 2021 at 5:48:15 AM UTC

Ingush audio. Muzyka kaloya - Song of Kaloy

https://omniglot.com/soundfiles...yka_kaloya.mp3

QAzaqQA, March 17, 2021 at 5:47:12 AM UTC, edited March 17, 2021 at 5:48:37 AM UTC

Ingush audio. Ingushetia gimn - Ingush anthem

https://omniglot.com/soundfiles...etiya_gimn.mp3

CK, March 16, 2021 at 11:41:14 PM UTC

** We Now Have Some Toki Pona Audio **

https://tatoeba.org/eng/sentenc...how/169301/und
audio - toki - by 2swap - CC BY-SA 4.0
298 sentences

sharptoothed, March 15, 2021 at 5:31:47 PM UTC

** Stats & Graphs **

Tatoeba Stats, Graphs & Charts have been updated:
https://tatoeba.j-langtools.com/allstats/

Guybrush88, March 16, 2021 at 7:31:18 AM UTC

thanks

CK, March 13, 2021 at 8:37:39 AM UTC

** Stats - Native Speakers **

http://tatoeba.ueuo.com/stats-2021-03-13.html

Find out who the native speakers are and the number of native-language sentences each has contributed.

I included a few more columns this week with data I personally wanted to know. Maybe you will find it interesting, too.