I ran into a problem when training a word trigram model: the HTK LM training tool sets the probability of the sentence-start symbol <s> to zero. In LPCalc.c, in the function static int CalcUniProbs(BackOffLM *lm, FLEntry *tgtFE, Boolean rebuild), the following code does this at line 178:

    /* clamp sentence start symbol prob */
    if ((nid = GetNameId(lm->htab,sstStr,FALSE)) != NULL) {
       if ((se = FindSE(unigram,0,lm->vocSize,LM_INDEX(nid))) != NULL) {
          tMass = tMass - se->prob;
          se->prob = 0.0;
       }
    }
    for (se=unigram, i=0; i<lm->vocSize; i++, se++) {
       se->prob = se->prob/tMass;
    }

As I understand it, this means every bigram or trigram probability containing <s> will also be set to zero. In a Viterbi decoder, the engine may then shut down, since <s> is pruned because of its low score. Perhaps the HTK decoder has been modified to handle word propagation after <s>.

My questions are: why is the probability of <s> changed at all? Is it to reduce the size of the LM model? And are there any other modifications in the LM tools or the HTK tools to deal with the LM score of <s>?
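For context, my understanding is that in a standard n-gram setup <s> only ever appears as history (left context) and is never predicted as a word, so its unigram probability is never actually queried during the search; the clamp above just removes its mass from tMass and renormalises the remaining unigrams. Below is a minimal, self-contained sketch of that idea (a toy bigram scorer with made-up values, not HTK code): hypotheses are initialised in the <s> history state, so P(<s>) itself is never looked up.

    #include <stdio.h>
    #include <string.h>
    #include <math.h>

    typedef struct { const char *hist, *word; double prob; } Bigram;

    /* Hypothetical toy bigram table; values are illustrative only */
    static const Bigram table[] = {
       { "<s>",   "hello", 0.6 },
       { "hello", "world", 0.5 },
       { "world", "</s>",  0.4 },
    };
    static const int nBigrams = sizeof(table)/sizeof(table[0]);

    static double BigramLogProb(const char *hist, const char *word)
    {
       int i;
       for (i = 0; i < nBigrams; i++)
          if (strcmp(table[i].hist,hist)==0 && strcmp(table[i].word,word)==0)
             return log(table[i].prob);
       return log(1e-10);   /* crude floor standing in for real back-off */
    }

    int main(void)
    {
       /* The hypothesis starts in the <s> history state; P(<s>) is
          never queried, so clamping it to zero is harmless here. */
       const char *sent[] = { "hello", "world", "</s>" };
       const char *hist = "<s>";
       double total = 0.0;
       int i;
       for (i = 0; i < 3; i++) {
          total += BigramLogProb(hist, sent[i]);
          hist = sent[i];
       }
       printf("log P(sentence) = %f\n", total);
       return 0;
    }

If the HTK decoder works this way, clamping P(<s>) would not affect any path cost, which is why I suspect the handling is in the decoder rather than the LM itself; I would appreciate confirmation.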