Proceedings DARPA Broadcast News Transcription Workshop
The 1997 HTK Broadcast News Transcription System
P.C. Woodland, T. Hain, S.E. Johnson,
T.R. Niesler, A. Tuerk, E.W.D. Whittaker and
S.J. Young
This paper presents the recent development of the HTK broadcast news
transcription system. Previously we used data-type specific
modelling based on adapted HMMs trained on Wall Street Journal data.
However, we now use data for which no manual pre-classification or
segmentation is available, so automatic techniques are required and
compatible acoustic modelling strategies must be adopted.
A number of recognition experiments are presented that compare
data-type specific and non-specific models, differing amounts of
training data, the use of gender-dependent modelling, and the effects
of automatic data-type classification. Based on these experiments, the
HTK system for the 1997 broadcast news evaluation was designed. A
detailed description of this system, which includes a class-based
language modelling component, is given. The complete system yields
an overall word error rate of 22.0% on the 1996 unpartitioned
broadcast news development test data and just 15.8% on the 1997
evaluation test set.