LexTool

LOGIOS Lexicon Tool


This tool generates a pronunciation dictionary suitable for configuring speech recognition systems that conform to ARPA-derived file formats. In particular, it creates lexicons suitable for use with the Sphinx system. It is a component of the Logios package, which lets you input a Phoenix grammar and receive a compiled grammar, an n-gram language model and a pronouncing dictionary.

The tool currently accesses cmudict.0.7b and produces pronunciations using the (currently standard) 40-item phone inventory. Please note that the dictionary may be updated from time to time and that your results may vary as a consequence, we hope in the direction of greater accuracy :-).

If you notice any errors in the output (such as a seemingly incorrect pronunciation), please report them and we will look into them. You can send reports to air:cs'cmu,|edu|.



word file:
hand file: Hand-crafted pronunciations that override sub-optimal (ok, incorrect) pronunciations.
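
The override behaviour can be pictured with a small sketch. This is not the tool's own code: it assumes the hand file uses the same "WORD PHONE PHONE ..." layout as the output dictionary, and the function names and the TOMATO entries are hypothetical, chosen only for illustration.

```python
def load_entries(lines):
    """Read 'WORD PH PH ...' lines into {word: [pronunciation, ...]}."""
    entries = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:  # ignore blank or malformed lines
            entries.setdefault(fields[0].upper(), []).append(fields[1:])
    return entries


def apply_overrides(generated, hand):
    """Return a copy of `generated` in which hand entries take precedence."""
    merged = dict(generated)
    merged.update(hand)  # a hand-crafted entry replaces the generated one
    return merged


# Hypothetical data: one generated entry is replaced, the other is kept.
generated = load_entries(["TOMATO T AH M EY T OW", "WORLD W ER L D"])
hand = load_entries(["tomato T AH M AA T OW"])
merged = apply_overrides(generated, hand)
```

In this sketch an override replaces every generated pronunciation for that word; whether the real tool replaces or appends is not stated here.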



An example

If your input file looks like the left-hand column, your output file will look something like the right-hand column:

Input file       Output file
Hello            HELLO            HH EH L OW
                 HELLO(1)         HH AH L OW
world            WORLD            W ER L D
compound_word    COMPOUND_WORD    K AA M P AW N D W ER D
hyphen-ated      HYPHEN-ATED      HH AY F AH N EY T IH D
ONE23            ONE23            OW EH N IY T UW TH R IY
2008             2008             T UW Z IY R OW Z IY R OW EY T
boom!            BOOM!            B UW M
kweezlebotter    KWEEZLEBOTTER    K W IY Z L AH B AA T AH R
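
If you want to read a dictionary in this layout from your own scripts, a minimal Python sketch follows. It assumes only what the example shows: one entry per line, the word first, the phones separated by whitespace, and alternate pronunciations marked with a numeric suffix such as (1). The function name parse_dictionary is hypothetical and not part of the tool.

```python
import re
from collections import defaultdict

# Matches an alternate-pronunciation suffix such as "(1)" at the end of a word.
_ALT_SUFFIX = re.compile(r"\(\d+\)$")


def parse_dictionary(text):
    """Map each base word to a list of pronunciations (lists of phones)."""
    lexicon = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()  # handles both tabs and runs of spaces
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        word, phones = fields[0], fields[1:]
        base = _ALT_SUFFIX.sub("", word)  # HELLO(1) is filed under HELLO
        lexicon[base].append(phones)
    return dict(lexicon)


sample = (
    "HELLO        HH EH L OW\n"
    "HELLO(1) HH AH L OW\n"
    "WORLD\tW ER L D\n"
    "BOOM!\tB UW M\n"
)
lexicon = parse_dictionary(sample)
```

Grouping alternates under the base word is a design choice for lookup convenience; keep the raw keys instead if you need to write the file back out unchanged.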

Please note the following: