
Translation memory from English (EN-GB) to Afrikaans in the government domain, for use in the Autshumato ITE application.

Product details

Number of words: 359 817 translation units (tokens)
Commissioned by: Department of Arts and Culture
Funder: Department of Arts and Culture
Owner: North-West University, Centre for Text Technology (CTexT)
Annotations: UTF-8, XML, TMX
Data format: Text
Languages: Afrikaans, English
Documentation: Readme contained in download
License type: Creative Commons Attribution-NonCommercial-ShareAlike 2.5 South Africa

Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.

Product details

Commissioned by: Department of Arts and Culture
Funder: Department of Arts and Culture
Owner: Meraka Institute, CSIR
Annotations: Transcriptions: a) one utterance per file, b) ANSI (Unicode), c) txt. Audio: a) 8 kHz, b) 16-bit, c) 1 channel; telephone, d) wav
Amount of speech: 562 min
Data format: Speech
Languages: isiXhosa
Documentation: Lwazi Project Final Report "Development of a telephone-based speech-driven information service for the South African Government"
License type: Creative Commons Attribution 2.5 South Africa License
Project: Lwazi I

The AUTONOMATA name corpus (AUTONOMATA-namencorpus) is a database of about 5000 read-aloud first names, surnames, street names, place names and control words in total. The corpus consists of a Dutch part and a Flemish part.

Product details

Year: 2008
Commissioned by: NTU|STEVIN
Funder: NTU|STEVIN
Owner: Taalunie
Data format: Speech files (wav), phonetic transcriptions (txt)
Languages: Dutch, Flemish
Cite as: AUTONOMATA-namencorpus (Version 1.0) (2008) [Data set]. Available at the Dutch Language Institute:
http://hdl.handle.net/10032/tm-a2-m2
Documentation: LREC 2006 article
Project: AUTONOMATA

Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.

Product details

Commissioned by: Department of Arts and Culture
Funder: Department of Arts and Culture
Owner: Meraka Institute, CSIR
Annotations: Transcriptions: a) one utterance per file, b) ANSI (Unicode), c) txt. Audio: a) 8 kHz, b) 16-bit, c) 1 channel; telephone, d) wav
Amount of speech: 570 min
Data format: Speech
Languages: Sesotho sa Leboa (Sepedi)
Documentation: Lwazi Project Final Report "Development of a telephone-based speech-driven information service for the South African Government"
License type: Creative Commons Attribution 2.5 South Africa License
Project: Lwazi I

General phonemic pronunciations for frequently occurring words in Xitsonga. The dictionaries were developed to be practically usable in speech technology systems rather than phonetically accurate. Audio samples of all phonemes are included, as is a letter-to-sound rule set for predicting the pronunciations of generic words.

Product details

Commissioned by: Department of Arts and Culture
Funder: Department of Arts and Culture
Owner: Meraka Institute, CSIR
Amount of speech: Approx. 65,000 words
Data format: Speech
Languages: Xitsonga
Documentation: 1) M Davel and O Martirosian, "Pronunciation dictionary development in resource-scarce environments", in Proceedings of Interspeech, Brighton, UK, September 2009
License type: Creative Commons Attribution 2.5 South Africa License
Project: Lwazi I
