Public release of Haitian Creole text data by Carnegie Mellon
See the main Haitian Creole page here for an overview of all available Haitian Creole speech and text data. This page describes the text data only.
Haitian Creole text data
Data license
- Medical domain phrases and sentences in English collected at Carnegie Mellon under the NSF-funded (jointly with the EU) NESPOLE! project, translated into Haitian Creole by Eriksen Translations, Inc.
- Newswire data collected at Carnegie Mellon under the DARPA-funded DIPLOMAT project
  - English (2 MB) and Creole (2 MB) parallel files, one sentence per line
- Short phrases and other glossary entries collected and produced at Carnegie Mellon under the DIPLOMAT project
  - English (519 KB) and Creole (499 KB) parallel files, one sentence per line
Update: Several people noticed a large amount of Creole text in the English file, and vice versa. This error was corrected as of 1 p.m. EST on Wednesday, January 27, 2010. If you downloaded the files before then, please download the updated versions.
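Because the parallel files listed above hold one sentence per line, line N of the English file should correspond to line N of the Creole file. The following is a minimal Python sketch of how such a pair might be loaded, with a check that both sides have the same number of lines. The function name `read_parallel` and the UTF-8 default are assumptions made for illustration; this page does not state the files' encoding or names, so adjust both to match what you download.

```python
def read_parallel(english_path, creole_path, encoding="utf-8"):
    """Return a list of (english, creole) sentence pairs.

    Assumes one sentence per line in each file, aligned by line number.
    The encoding default is an assumption, not something this page states.
    """
    with open(english_path, encoding=encoding) as f:
        english = [line.rstrip("\n") for line in f]
    with open(creole_path, encoding=encoding) as f:
        creole = [line.rstrip("\n") for line in f]

    # A mismatch in line counts means the files are not aligned,
    # so pairing them line by line would be unreliable.
    if len(english) != len(creole):
        raise ValueError(
            f"line counts differ: {len(english)} English vs {len(creole)} Creole"
        )
    return list(zip(english, creole))
```

For example, `pairs = read_parallel("english.txt", "creole.txt")` (hypothetical file names) gives a list that can be inspected pair by pair. A matching line count does not show that each file contains only the expected language, so spot-checking a few pairs by eye is still worthwhile.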
Contact for this page: Robert Frederking.
Last updated: Wednesday, January 27, 2010, 12:55 p.m. EST by Greg Hanneman.