DictionaryMaker


The purpose of the DictionaryMaker system is to facilitate the creation of an electronic pronunciation dictionary in a target language, as originally described in M. Davel and E. Barnard, "Bootstrapping for language resource generation". Such a pronunciation dictionary consists of a list of words, each associated with one or more phonetic pronunciations. The developed pronunciation dictionary can be formatted for use by various speech processing applications, such as speech synthesis and speech recognition systems.
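As a rough illustration of the structure described above, a pronunciation dictionary can be modelled as a mapping from each word to a list of pronunciations, where each pronunciation is a sequence of phoneme symbols. The sketch below is not DictionaryMaker's own data model or file format: the function names, the tab-separated output layout and the phoneme symbols are all assumptions chosen for the example.

```python
from collections import defaultdict


def add_pronunciation(dictionary, word, phonemes):
    """Record one pronunciation (a sequence of phoneme symbols) for a word."""
    variants = dictionary[word]
    pron = tuple(phonemes)
    if pron not in variants:  # skip exact duplicates
        variants.append(pron)


def format_entries(dictionary):
    """Yield one 'word<TAB>phonemes' line per pronunciation, sorted by word.

    The tab-separated layout is hypothetical, not a DictionaryMaker format.
    """
    for word in sorted(dictionary):
        for pron in dictionary[word]:
            yield f"{word}\t{' '.join(pron)}"


# Illustrative entries only; the phoneme symbols are made up for the example.
pron_dict = defaultdict(list)
add_pronunciation(pron_dict, "read", ["r", "i:", "d"])
add_pronunciation(pron_dict, "read", ["r", "E", "d"])  # second variant
add_pronunciation(pron_dict, "cat", ["k", "{", "t"])
lines = list(format_entries(pron_dict))
```

Keeping every variant as a separate entry under the same word reflects the "one or more phonetic pronunciations" requirement, and a separate formatting step is one way to target different speech processing applications from the same data.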

The system is designed to allow a speaker fluent in the target language to develop a pronunciation dictionary without requiring expert linguistic knowledge or programming expertise. Along with the pronunciation dictionary, a related set of grapheme-to-phoneme (g-to-p) rules is created automatically.

The system uses a bootstrapping approach: models are improved in a controlled series of increments, and at each increment the previous model is used to generate the next. The system balances machine learning and human intervention, with the aim of simplifying and minimising the human intervention required during the bootstrapping process.
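The bootstrapping loop can be sketched in a few lines. This is a toy illustration under strong assumptions, not the algorithm DictionaryMaker implements: the rule learner simply maps each grapheme to its most frequent phoneme, every word is assumed to have exactly one phoneme per letter, and the human verifier is simulated by a lookup in a reference dictionary. All names (`train_rules`, `predict`, `bootstrap`, `verify`) are hypothetical.

```python
from collections import Counter, defaultdict


def train_rules(verified):
    """Toy g-to-p model: map each grapheme to its most frequent phoneme.

    Assumes every verified pronunciation has one phoneme per letter.
    """
    counts = defaultdict(Counter)
    for word, phonemes in verified.items():
        for grapheme, phoneme in zip(word, phonemes):
            counts[grapheme][phoneme] += 1
    return {g: c.most_common(1)[0][0] for g, c in counts.items()}


def predict(rules, word):
    """Predict a pronunciation; unseen graphemes get the placeholder '?'."""
    return [rules.get(g, "?") for g in word]


def bootstrap(word_list, verify, batch_size=2):
    """Grow a verified dictionary in increments, retraining after each one."""
    verified, rules, corrections = {}, {}, 0
    for start in range(0, len(word_list), batch_size):
        for word in word_list[start:start + batch_size]:
            guess = predict(rules, word)   # previous model proposes
            final = verify(word, guess)    # human accepts or corrects
            corrections += final != guess  # count words needing correction
            verified[word] = final
        rules = train_rules(verified)      # next model from verified data
    return verified, rules, corrections


# Simulated verifier: a made-up reference dictionary stands in for the speaker.
reference = {
    "ab": ["a", "b"],
    "ba": ["b", "a"],
    "aab": ["a", "a", "b"],
    "bab": ["b", "a", "b"],
}
verified, rules, corrections = bootstrap(
    list(reference), lambda word, guess: reference[word]
)
```

In this toy run only the first increment needs corrections; once the rules have been learned from the first two words, the remaining predictions are already correct, which is the effect the bootstrapping approach aims for.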

Only a word list, a grapheme set and a phoneme set for the target language are required as inputs to the system. Once initialised with these items, the system guides the target language speaker through the dictionary creation process.
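Because the three inputs must agree with one another, a simple consistency check before starting can be useful. The following is a minimal sketch, assuming each grapheme is a single character; `check_inputs` is a hypothetical helper and not part of DictionaryMaker, and the sample sets are made up.

```python
def check_inputs(words, graphemes, phonemes):
    """Return the words that contain symbols outside the grapheme set.

    Hypothetical helper; assumes each grapheme is a single character.
    """
    if not graphemes or not phonemes:
        raise ValueError("grapheme set and phoneme set must both be non-empty")
    allowed = set(graphemes)
    return [w for w in words if not set(w) <= allowed]


# Illustrative inputs only.
graphemes = set("abcdefghijklmnopqrstuvwxyz")
phonemes = {"a", "b", "k", "t"}
rejected = check_inputs(["cat", "café", "tab"], graphemes, phonemes)
```

Words returned by the check either need to be removed from the word list or indicate that the grapheme set is incomplete for the target language.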


DictionaryMaker can be downloaded from SourceForge.net.

The User Manual for DictionaryMaker is available in PDF format as dictmaker_manual.pdf, along with the publications referred to in the Manual.


DictionaryMaker was created by the Human Language Technologies (HLT) Research Group of the Meraka Institute, South Africa.