What is HTK?
The Hidden Markov Model Toolkit (HTK) is a portable toolkit for
building and manipulating hidden Markov models. HTK is primarily used
for speech recognition research although it has been used for numerous
other applications including research into speech synthesis, character
recognition and DNA sequencing. HTK is in use at hundreds of sites.
HTK consists of a set of library modules and tools available in C
source form. The tools provide sophisticated facilities for speech
analysis, HMM training, testing and results analysis. The software
supports HMMs using both continuous density mixture Gaussians and
discrete distributions and can be used to build complex HMM systems.
The HTK release contains extensive documentation and examples.
HTK was originally developed at the Machine Intelligence Laboratory
(formerly known as the Speech Vision and Robotics Group)
of the Cambridge University Engineering Department (CUED)
where it has been used to build CUED's large vocabulary speech
recognition systems (see CUED HTK LVR).
In 1993 Entropic Research Laboratory Inc. acquired the rights to sell
HTK and the development of HTK was fully transferred to Entropic in
1995 when the Entropic Cambridge Research Laboratory Ltd was
established. HTK was sold by Entropic until 1999 when Microsoft bought
Entropic. Microsoft has now licensed HTK back to CUED and is providing
support so that CUED can redistribute HTK and provide development
support via the HTK3 web site. See History of HTK for more details.
While Microsoft retains the copyright to the original HTK code,
everybody is encouraged to make changes to the source code and
contribute them for inclusion in HTK3.
Join the HTK Team at CUED
If you are interested in joining the HTK Team to work on software
development or speech recognition algorithm research (either as an RA or PhD student),
send an email with your CV to Phil.
Cambridge University Engineering Department (CUED) is currently able
to offer a number of well-funded three year PhD research studentships.
For more details please see the MIL Laboratory jobs page.
Those interested in a research studentship working with the HTK team might also consider first applying for a place on the MPhil in Machine Learning, Speech and Language Technology.
HTK version 3.4.1 is the current stable release.
HTK version 3.5 beta is the most recent release.
HTK is available for free download but you must first agree to this
license. You must then register for a username and password
which will allow you to download the HTK Book and source code. Registration
is free but does require a valid e-mail address; your password for site
access will be sent to this address.
28 June 2016 (pcw):
31 December 2015 (pcw):
We have released another beta(3) of HTK 3.5
- some minor bug fixes
- source code changes which allow easier compilation on Windows (although we haven’t included a Visual Studio setup)
- proper integration of the RNNLM rescoring functions as discussed in the HTK Book for HTK 3.5 (alpha).
Thanks for the feedback and suggestions from various users.
We are still working on more substantial updates to HTK 3.5 to include further functionality.
Phil & the Cambridge HTK Team
24 August 2015 (pcw):
HTK 3.5 beta is released. This can be downloaded from the HTK downloads page. Note that the samples package is now included with the HTK 3.5 beta download. HDecode is still an additional download due to its separate license. HTK 3.4.1 continues to be available.
Key features of HTK 3.5 are described in the 24th August news item and the UK speech presentation and interspeech paper referenced there provide further background.
The HTK 3.5 beta source code package has been developed and tested for use on Linux. Only a simple build procedure is included which will require some manual configuration. A more automatic configuration will be available in future as well as support for other platforms. Compilation options include builds for standard CPU; use with Intel MKL libraries; and use with NVIDIA GPUs.
HTK 3.5 also includes a new version of the HTKBook. This is an alpha version of the book and so is in some places incomplete. The HTKBook for HTK 3.5 includes documentation of the new features of HTK including the new tools for acoustic modelling with neural networks and use of recurrent neural network language models. The book also includes extended tutorial information for using the new HTK features, and includes a new section of tutorial examples using the Resource Management task that illustrate new (and old) functionality. The scripts that are provided for this task may well be of use more generally.
In future we intend to both extend the functionality of HTK 3.5 with additional neural network models and also include recipes for standard current speech recognition tasks.
We are currently preparing a new major release, HTK 3.5. The key features of HTK 3.5 are the inclusion of:
- Built-in support for artificial neural network (ANN) models while maintaining compatibility with most existing functions
  - Flexible input feature configurations
  - ANN structures can be any directed acyclic graph
  - Stochastic gradient descent supporting frame/sequence training
  - CPU/GPU math kernels for ANNs
  - Decoders extended to support both tandem and hybrid systems
- Support for decoding RNN language models
  - Lattice rescoring using RNNLMs
  - Class / Full word outputs, interpolation with n-grams
- 64-bit compatible throughout
- Bug fixes
- Updated documentation and examples
More details of our plans for HTK 3.5 can be found in