According to one aspect of the invention, a method is provided in which a set of probabilistic attributes in an N-gram language model is classified into a plurality of classes. Each resulting class is clustered into a plurality of segments to build a code-book for the respective class using a modified K-means clustering process that dynamically adjusts the size and centroid of each segment during each iteration. Each probabilistic attribute in a class is then represented by the centroid of the segment to which that attribute belongs.
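
By way of illustration only, the per-class code-book construction summarized above may be sketched as follows. The sketch uses a plain one-dimensional K-means loop, not the modified K-means process of the invention, because the rule by which segment sizes are adjusted is not set out in this paragraph. The function names `build_codebook` and `quantize`, the class names `unigram_logprob` and `backoff_weight`, and all sample values are hypothetical and chosen only for the example.

```python
from typing import Dict, List


def build_codebook(values: List[float], k: int, iterations: int = 20) -> List[float]:
    """Cluster scalar attribute values into k segments; return the centroids.

    Plain 1-D K-means (assumption): the modified variant described in the
    text additionally adjusts segment sizes, which is not modeled here.
    """
    if not values or k < 1:
        raise ValueError("need at least one value and k >= 1")
    lo, hi = min(values), max(values)
    # Initial centroids are spread evenly across the observed value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        segments: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            segments[quantize(v, centroids)].append(v)
        # Recompute each centroid as the mean of its segment; an empty
        # segment keeps its previous centroid.
        updated = [
            sum(seg) / len(seg) if seg else centroids[j]
            for j, seg in enumerate(segments)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def quantize(value: float, codebook: List[float]) -> int:
    """Return the index of the code-book centroid nearest to value."""
    return min(range(len(codebook)), key=lambda j: abs(value - codebook[j]))


if __name__ == "__main__":
    # Hypothetical attribute classes, each with its own code-book.
    classes: Dict[str, List[float]] = {
        "unigram_logprob": [-1.0, -1.1, -3.0, -3.2],
        "backoff_weight": [-0.2, -0.25, -0.9, -1.0],
    }
    codebooks = {name: build_codebook(vals, k=2) for name, vals in classes.items()}
    for name, vals in classes.items():
        # Each attribute is stored as a small index into its class code-book.
        indices = [quantize(v, codebooks[name]) for v in vals]
        print(name, codebooks[name], indices)
```

In such a sketch, each attribute is stored as an index into the code-book of its class rather than as a full-precision value, so the storage per attribute is determined by the number of segments in that class.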