I ran into a problem when training a word trigram model: the HTK LM training tool sets the probability of the sentence-start symbol <s> to zero. In LPCalc.c, in the function static int CalcUniProbs(BackOffLM *lm, FLEntry *tgtFE, Boolean rebuild), the following code does this at line 178:

    /* clamp sentence start symbol prob */
    if ((nid = GetNameId(lm->htab,sstStr,FALSE)) != NULL) {
       if ((se = FindSE(unigram,0,lm->vocSize,LM_INDEX(nid))) != NULL) {
          tMass = tMass - se->prob;
          se->prob = 0.0;
       }
    }
    for (se=unigram, i=0; i<lm->vocSize; i++, se++) {
       se->prob = se->prob/tMass;
    }

As I understand it, this means every bigram or trigram probability containing <s> will also be set to zero. In a Viterbi decoder, the engine may then shut down, since <s> is pruned because of its low score. Perhaps the HTK decoder has been modified to handle word propagation after <s>.

My questions are: why is the probability of <s> changed at all? Is it to reduce the size of the LM model? And are there any other modifications in the LM tools or the HTK tools to deal with the LM score of <s>?
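For context, my understanding is that in a standard n-gram setup <s> only ever appears as history (left context) and is never predicted as a word, so its unigram probability is never actually queried during the search; the clamp above just removes its mass from tMass and renormalises the remaining unigrams. Below is a minimal, self-contained sketch of that idea (a toy bigram scorer with made-up values, not HTK code): hypotheses are initialised in the <s> history state, so P(<s>) itself is never looked up.

    #include <stdio.h>
    #include <string.h>
    #include <math.h>

    typedef struct { const char *hist, *word; double prob; } Bigram;

    /* Hypothetical toy bigram table; values are illustrative only */
    static const Bigram table[] = {
       { "<s>",   "hello", 0.6 },
       { "hello", "world", 0.5 },
       { "world", "</s>",  0.4 },
    };
    static const int nBigrams = sizeof(table)/sizeof(table[0]);

    static double BigramLogProb(const char *hist, const char *word)
    {
       int i;
       for (i = 0; i < nBigrams; i++)
          if (strcmp(table[i].hist,hist)==0 && strcmp(table[i].word,word)==0)
             return log(table[i].prob);
       return log(1e-10);   /* crude floor standing in for real back-off */
    }

    int main(void)
    {
       /* The hypothesis starts in the <s> history state; P(<s>) is
          never queried, so clamping it to zero is harmless here. */
       const char *sent[] = { "hello", "world", "</s>" };
       const char *hist = "<s>";
       double total = 0.0;
       int i;
       for (i = 0; i < 3; i++) {
          total += BigramLogProb(hist, sent[i]);
          hist = sent[i];
       }
       printf("log P(sentence) = %f\n", total);
       return 0;
    }

If the HTK decoder works this way, clamping P(<s>) would not affect any path cost, which is why I suspect the handling is in the decoder rather than the LM itself; I would appreciate confirmation.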