HTK Speech Recognition Toolkit

HTK Extensions

A number of HTK users have implemented substantial extensions to the standard HTK version. On this page we provide short descriptions and links to further information. Many thanks to the authors of these extensions.

If you offer a free HTK extension for download and would like to see it listed on this page, email us.

HDecode

To download and use HDecode, you must already be registered as an HTK user and agree to the HDecode End User Licence Agreement. If you have already agreed to the licence, you can download HDecode from here.

HMM-Based Speech Synthesis Toolkit (HTS)

HTS web page

The HMM-based Speech Synthesis System (HTS) is a toolkit for HMM-based speech synthesis. It is released as a patch to HTK. The modifications we made to HTK are listed below:

Context clustering based on the MDL (minimum description length) criterion instead of the ML (maximum likelihood) criterion
Stream-dependent context clustering
Multi-space probability distribution as the state output probability (for F0 pattern modeling)
State duration modeling and clustering
Speech parameter generation from continuous density HMMs

Keiichi Tokuda (http://www.sp.nitech.ac.jp/~tokuda)

Speaker Verification Based on HTK: MASV

MASV web page

MASV stands for Munich Automatic Speaker Verification. This experimental system depends on the HTK tools (version 3.1 or later), Matlab (version 5 or later) and Perl (version 5 or later). The Perl scripts control training and testing of speaker models; the Matlab part provides various score normalization schemes and a GUI for exploring the performance of a speaker verification system. MASV is published under the GNU General Public License in the hope of helping others get started with HMM-based speaker verification. The key features are:

all HMM types provided by HTK (including GMMs) can be used.
easily adaptable to different speech databases.
easy setup of different speaker sets (customers, impostors, world speakers, development set,...).
various possibilities of seeding models before training.
parallel processing of training / testing supported.
several score normalizations possible: world model, cohort speakers, handset normalization (h-norm).
easy evaluation with Matlab GUI (including comparison between matched / mismatched conditions).

Ulrich Türk (tuerk@phonetik.uni-muenchen.de, http://www.phonetik.uni-muenchen.de/Mitarbeiter/tuerk/tuerk.html)

Tutorial on using HTK with the Speech Filing System (SFS)

HTK/SFS Tutorial web page

This tutorial describes the use of HTK in combination with the Speech Filing System (SFS). It covers installation on Cygwin, phone and phone-class recognition, phone alignment and pronunciation variation analysis.

Mark Huckvale (M.Huckvale@ucl.ac.uk)

HTK Matlab interface

Voicebox

This toolbox contains "readhtk" and "writehtk", which read and write waveform files in HTK format.
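
To illustrate the kind of file these functions handle, here is a minimal Python sketch of reading and writing an HTK waveform file. It is not Voicebox's implementation. It assumes the standard 12-byte HTK header (number of samples, sample period in 100 ns units, bytes per sample, parameter kind), big-endian byte order, and 16-bit samples. The function names read_htk_waveform and write_htk_waveform are hypothetical.

```python
import struct

# Assumed HTK header layout, big-endian:
#   nSamples (int32), sampPeriod (int32, units of 100 ns),
#   sampSize (int16, bytes per sample), parmKind (int16; base kind 0 = WAVEFORM)
HTK_HEADER = struct.Struct(">iihh")
WAVEFORM = 0


def write_htk_waveform(path, samples, samp_period=625):
    """Write 16-bit integer samples as an HTK WAVEFORM file.

    The default samp_period of 625 (x 100 ns) corresponds to 16 kHz.
    """
    with open(path, "wb") as f:
        f.write(HTK_HEADER.pack(len(samples), samp_period, 2, WAVEFORM))
        f.write(struct.pack(">%dh" % len(samples), *samples))


def read_htk_waveform(path):
    """Return (samples, samp_period) read from an HTK WAVEFORM file."""
    with open(path, "rb") as f:
        header = f.read(HTK_HEADER.size)
        n_samples, samp_period, samp_size, parm_kind = HTK_HEADER.unpack(header)
        # The low 6 bits of parmKind hold the base kind; the rest are qualifiers.
        if parm_kind & 0x3F != WAVEFORM or samp_size != 2:
            raise ValueError("not a 16-bit HTK waveform file")
        data = f.read(n_samples * samp_size)
    return list(struct.unpack(">%dh" % n_samples, data)), samp_period
```

A round trip such as write_htk_waveform("test.htk", [0, 1, -2]) followed by read_htk_waveform("test.htk") should return the same samples together with the sample period. Files written on a little-endian setup without byte swapping would need "<" in place of ">" in the format strings.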