HTK Speech Recognition Toolkit

What is HTK?

The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research although it has been used for numerous other applications including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.

HTK consists of a set of library modules and tools available in C source form. The tools provide sophisticated facilities for speech analysis, HMM training, testing and results analysis. The software supports HMMs using both continuous density mixture Gaussians and discrete distributions and can be used to build complex HMM systems. The HTK release contains extensive documentation and examples.

HTK was originally developed at the Machine Intelligence Laboratory (formerly known as the Speech, Vision and Robotics Group) of the Cambridge University Engineering Department (CUED), where it has been used to build CUED's large vocabulary speech recognition systems (see CUED HTK LVR). In 1993 Entropic Research Laboratory Inc. acquired the rights to sell HTK, and the development of HTK was fully transferred to Entropic in 1995, when the Entropic Cambridge Research Laboratory Ltd was established. HTK was sold by Entropic until 1999, when Microsoft bought Entropic. Microsoft has now licensed HTK back to CUED and is providing support so that CUED can redistribute HTK and provide development support via the HTK3 web site. See History of HTK for more details.

While Microsoft retains the copyright to the original HTK code, everybody is encouraged to make changes to the source code and contribute them for inclusion in HTK3.

Join the HTK Team at CUED

If you are interested in joining the HTK Team to work on software development or speech recognition algorithm research (either as an RA or PhD student), send an email with your CV to Phil Woodland <pcw@eng.cam.ac.uk>.

Cambridge University Engineering Department (CUED) is currently able to offer a number of well-funded three-year PhD research studentships. For more details, please see the MIL Laboratory jobs page.

Those interested in a research studentship working with the HTK team might also consider first applying for a place on the MPhil in Machine Learning, Speech and Language Technology http://www.mlsalt.eng.cam.ac.uk/

Current releases

HTK version 3.4.1 is the current stable release.

HTK version 3.5 beta is the most recent release.

Getting HTK

HTK is available for free download but you must first agree to this license. You must then register for a username and password which will allow you to download the HTK Book and source code. Registration is free but does require a valid e-mail address; your password for site access will be sent to this address.

HTK News

28 June 2016 (pcw):
  • We have released another beta(3) of HTK 3.5

    This includes

    • some minor bug fixes
    • source code changes to allow easier compilation on Windows (although we haven’t included a Visual Studio setup)
    • proper integration of the RNNLM rescoring functions as discussed in the HTK Book for HTK 3.5 (alpha).

    Thanks for the feedback and suggestions from various users.

    We are still working on more substantial updates to HTK 3.5 to include further functionality.

    Phil & the Cambridge HTK Team

31 December 2015 (pcw):
  • HTK 3.5 beta is released. This can be downloaded from the HTK downloads page. Note that the samples package is now included with the HTK 3.5 beta download. HDecode is still an additional download due to its separate license. HTK 3.4.1 continues to be available.

    Key features of HTK 3.5 are described in the 24 August 2015 news item; the UK Speech presentation and Interspeech paper referenced there provide further background.

    The HTK 3.5 beta source code package has been developed and tested for use on Linux. Only a simple build procedure is included which will require some manual configuration. A more automatic configuration will be available in future as well as support for other platforms. Compilation options include builds for standard CPU; use with Intel MKL libraries; and use with NVIDIA GPUs.

    HTK 3.5 also includes a new version of the HTKBook. This is an alpha version of the book and so is in some places incomplete. The HTKBook for HTK 3.5 includes documentation of the new features of HTK including the new tools for acoustic modelling with neural networks and use of recurrent neural network language models. The book also includes extended tutorial information for using the new HTK features, and includes a new section of tutorial examples using the Resource Management task that illustrate new (and old) functionality. The scripts that are provided for this task may well be of use more generally.

    In future we intend to both extend the functionality of HTK 3.5 with additional neural network models and also include recipes for standard current speech recognition tasks.

24 August 2015 (pcw):
  • We are currently preparing a new major release, HTK 3.5. The key features of HTK 3.5 are the inclusion of

    • Built-in support for artificial neural network (ANN) models while maintaining compatibility with most existing functions.
      • Flexible input feature configurations
      • ANN structures can be any directed acyclic graph
      • Stochastic gradient descent supporting frame/sequence training
      • CPU/GPU math kernels for ANNs
      • Decoders extended to support both tandem and hybrid systems
    • Support for decoding RNN language models
      • Lattice rescoring using RNNLMs
      • Class / Full word outputs, interpolation with n-grams
    • 64-bit compatible throughout
    • Bug fixes
    • Updated documentation and examples

    More details of our plans for HTK 3.5 can be found in

Comments and suggestions to htk-mgr@eng.cam.ac.uk