WikiPatents - Community Patent Review
Speaker-independent label coding apparatus    
United States Patent 5182773
Link to this page: http://www.wikipatents.com/5182773.html
Inventor(s): Bahl; Lalit R. (Amawalk, NY); Picheny; Michael A. (White Plains, NY); Nahamoo; David (White Plains, NY); de Souza; Peter V. (Mahopac, NY)
Abstract: The present invention is related to speech recognition and particularly to a new type of vector quantizer and a new vector quantization technique in which the error rate of associating a sound with an incoming speech signal is drastically reduced. To achieve this end, the technique of the present invention groups the feature vectors in a space into different prototypes, at least two of which represent a class of sound. Each of the prototypes may in turn have a number of subclasses or partitions. Each of the prototypes and their subclasses may be assigned respective identifying values. To identify an incoming speech feature vector, at least one of the feature values of the incoming feature vector is compared with the different values of the respective prototypes, or the subclasses of the prototypes. The class of sound whose group of prototypes (or at least one of whose prototypes) has a combined value that most closely matches the feature value of the feature vector is deemed to be the class corresponding to the feature vector. The feature vector is then labeled with the identifier associated with that class.



Owner/Assignee: International Business Machines Corporation (Yorktown Heights, NY)
Publication Date: January 26, 1993
Application Number: 07/673,810
Filing Date: March 22, 1991
US Classification: 704/222
Int'l Classification: G10L 007/00; G10L 007/08
Examiner: Shaw; Dale M.
Assistant Examiner: Tung; Kee M.
Attorney/Law Firm: Pollock, Vande Sande & Priddy
USPTO Field of Search: 381/41; 381/43; 381/29; 381/30; 381/31; 381/32; 381/33; 381/34; 381/35
U.S. References

Reference   Inventor      US Class    Date
5046099     Nishimura     704/256.4   Sep 1991
5023912     Segawa        704/240     Jun 1991
4926488     Nadas         704/233     May 1990
4847906     Ackenhusen    704/217     Jul 1989
4837831     Gillick       -           Jun 1989
4829577     Kuroda        704/244     May 1989
4827251     Aoki          345/636     May 1989
4819271     Bahl          704/256     Apr 1989
4817156     Bahl          704/256.2   Mar 1989
4805219     Baker         704/241     Feb 1989
4802224     Shiraki       704/245     Jan 1989
4783802     Takebayashi   704/243     Nov 1988
4773093     Higgins       704/247     Sep 1988
4748670     Bahl          704/256.1   May 1988
4403114     Sakoe         704/252     Sep 1983
4032711     Sambur        704/246     Jun 1977
Claims
 


We claim:

1. A speech coding apparatus comprising:

means for storing a plurality of classes each having an identifier represented by at least two of a plurality of prototypes, each of the plurality of prototypes having at least one prototype value;

transducer means for extracting from an utterance a feature vector signal having at least one feature value;

means for establishing a match between the feature vector signal and at least one of the classes by selecting from the plurality of prototypes at least one prototype having a prototype value that best matches the feature value of the feature vector signal; and

means for coding the feature vector signal with the identifier of the class represented by the selected at least one prototype vector.
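As an illustration only, the labeling step of claim 1 can be sketched as a nearest-prototype search in which every class identifier owns two or more prototypes. The class names, the prototype values, the function name and the use of Euclidean distance as the "best match" criterion are all assumptions made for this sketch; the claim itself does not fix a distance measure.

```python
from math import dist

# Hypothetical store: each class identifier maps to two or more prototypes.
PROTOTYPES = {
    "AA": [(1.0, 1.0), (1.5, 0.5)],
    "IY": [(4.0, 4.0), (5.0, 3.5)],
}


def code_feature_vector(feature, prototypes=PROTOTYPES):
    """Return the identifier of the class that owns the closest prototype."""
    best_label, best_distance = None, float("inf")
    for label, vectors in prototypes.items():
        for prototype in vectors:
            # Euclidean distance stands in for the unspecified match measure.
            d = dist(feature, prototype)
            if d < best_distance:
                best_label, best_distance = label, d
    return best_label
```

A call such as code_feature_vector((1.4, 0.6)) returns the identifier of whichever class holds the single nearest prototype, so the output is a class label rather than a prototype index.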

2. Speech coding apparatus of claim 1, wherein the prototype value of the at least one prototype is computed from at least means, variances and a priori probabilities of a set of acoustic feature vectors associated with the prototype.

3. Speech coding apparatus of claim 1, wherein the prototype value of the at least one prototype is computed by associating location of the feature value of the one feature vector signal on a probability distribution function of the prototype.

4. Speech coding apparatus of claim 1, wherein each class of the plurality of classes is represented by a plurality of prototypes whose respective prototype values are considered as a whole against the feature value of the feature vector signal to determine whether the feature vector signal corresponds to the class.
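One possible reading of claims 2 to 4 is that each prototype is a distribution with a mean, a variance and an a priori probability, and that a class is scored by combining the scores of all its prototypes. The sketch below assumes diagonal-covariance Gaussian prototypes and a prior-weighted sum as the combination rule; both choices, and all names and numbers, are assumptions rather than details stated in the claims.

```python
import math


def gaussian_density(x, mean, variance):
    """Diagonal-covariance Gaussian density of the vector x."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, variance):
        density *= math.exp(-((xi - mi) ** 2) / (2.0 * vi)) / math.sqrt(2.0 * math.pi * vi)
    return density


def class_score(x, prototypes):
    """Combine all prototypes of one class: sum of prior-weighted densities."""
    return sum(p["prior"] * gaussian_density(x, p["mean"], p["var"]) for p in prototypes)


def label(x, classes):
    """Return the identifier of the class with the highest combined score."""
    return max(classes, key=lambda name: class_score(x, classes[name]))


# Hypothetical two-class inventory with two prototypes per class.
CLASSES = {
    "S": [{"prior": 0.3, "mean": (0.0,), "var": (1.0,)},
          {"prior": 0.2, "mean": (2.0,), "var": (1.0,)}],
    "Z": [{"prior": 0.25, "mean": (6.0,), "var": (1.0,)},
          {"prior": 0.25, "mean": (8.0,), "var": (1.0,)}],
}
```

Scoring the class as a whole means a feature vector lying between two prototypes of the same class can still be assigned to that class even if a single prototype of another class is marginally closer.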

5. Speech coding apparatus of claim 1, further comprising:

means for storing a plurality of training classes;

means for measuring and transforming training utterances into a series of training feature vectors each having a feature value; and

means for correlating each of the series of training feature vectors with one of the training classes to generate the plurality of stored classes.

6. Speech coding apparatus of claim 5, further comprising:

means for measuring and extracting from utterances over successive predetermined time periods corresponding successive sets of feature vectors, each feature vector of the successive sets of feature vectors having a dimensionality of at least one feature value;

means for merging the feature vectors in each of the successive sets of feature vectors to form a plurality of consolidated feature vectors whose respective dimensionalities being the sum of the dimensionalities of the corresponding merged feature vectors, the consolidated feature vectors being more adaptable for discrimination between the stored training classes; and

means for spatially reorienting the consolidated feature vectors to reduce their dimensionality to thereby effect easier manipulation thereof.
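The merging and reorienting steps of claim 6 can be pictured as concatenating neighbouring frames and then applying a linear projection to a smaller number of dimensions. The sketch below is one such reading: the context width, the edge padding and the projection matrix are hypothetical, and the claim does not say how the projection is derived.

```python
def splice(frames, context=1):
    """Concatenate each frame with `context` neighbours on each side.

    Edge frames are padded by repeating the first or last frame, so every
    consolidated vector has (2 * context + 1) times the original dimension.
    """
    spliced = []
    last = len(frames) - 1
    for t in range(len(frames)):
        merged = []
        for offset in range(-context, context + 1):
            neighbour = frames[min(max(t + offset, 0), last)]
            merged.extend(neighbour)
        spliced.append(merged)
    return spliced


def project(vector, matrix):
    """Apply a linear projection; each row of `matrix` is one output direction."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]
```

With two-dimensional frames and context=1, each consolidated vector has six dimensions, which is the sum of the dimensionalities of the merged vectors; a 2 x 6 matrix passed to project then brings it back down to two.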

7. Speech coding apparatus of claim 6, wherein each of the training classes is divided into training subclasses, further comprising:

means for configuring the training subclasses as respective training distribution functions having corresponding means, variances and a priori probabilities; and

means for storing the training distribution functions, each of the training distribution functions representing a training prototype.

8. Speech coding apparatus of claim 7, wherein each of the stored classes has at least one subcomponent; and

wherein the correlating means correlates the series of feature vectors with the at least one subcomponent to generate a plurality of stored component classes.

9. Speech coding apparatus of claim 8, wherein the configuring means further configures the plurality of component classes as respective distribution functions each having corresponding means, variances and a priori probabilities; further comprising:

means for storing the distribution functions representing the component classes, each of the distribution functions of the component classes representing a prototype.

10. Speech coding apparatus of claim 1, wherein the coding means comprises:

a quantizing means for outputting a label corresponding to the coded feature vector signal.

11. Speech coding apparatus of claim 1, wherein the establishing means comprises:

means for grouping a plurality of speech feature vectors into a predetermined number of prototypes each having respective means, variances and a priori probabilities; and

means for dividing each of the predetermined number of prototypes into at least two sub-prototypes to better differentiate the feature vector signal from other feature vector signals.

12. A speech coding apparatus comprising:

means for storing a plurality of prototypes representative of a plurality of classes, each class having an identifier represented by at least two of the plurality of prototypes, each of the plurality of prototypes having at least one prototype value;

transducer means for extracting from an utterance a feature vector signal having at least one feature value;

means for establishing a match between the feature vector signal and at least one class by comparing the feature value of the feature vector signal against the respective prototype values of the prototypes;

means for coding the feature vector signal with the identifier of the class represented by any of the prototypes having a prototype value most closely matching the feature value of the feature vector signal.

13. Speech coding apparatus of claim 12, wherein each class is represented by a number of prototypes of the plurality of prototypes, the respective prototype values of the prototypes of each class being considered as a whole against the feature value of the feature vector signal to determine which class of the plurality of classes the feature vector signal best corresponds to.

14. Speech coding apparatus of claim 12, wherein the prototype value of each prototype is computed from at least means, variances and a priori probabilities of a set of acoustic feature vectors associated with the prototype.

15. Speech coding apparatus of claim 12, wherein each prototype has a score value computed by associating location of the feature value of the one feature vector signal on a probability distribution function of the prototype.

16. Speech coding apparatus of claim 12, further comprising:

means for storing a plurality of training classes;

means for measuring and transforming training utterances into a series of training feature vectors each having a feature value; and

means for correlating each of the series of training feature vectors with one of the training classes to generate the plurality of stored classes.

17. Speech coding apparatus of claim 16, further comprising:

means for measuring and extracting from utterances over successive predetermined time periods corresponding successive sets of feature vectors, each feature vector of the successive sets of feature vectors each having a dimensionality and at least one feature value;

means for merging the feature vectors in each of the successive sets of feature vectors to form a plurality of consolidated feature vectors whose respective dimensionalities being the sum of the dimensionalities of the corresponding merged feature vectors, the consolidated feature vectors being more adaptable for discrimination between the stored training classes; and

means for spatially reorienting the consolidated feature vectors to reduce their dimensionality to thereby afford easier manipulation thereof.

18. Speech coding apparatus of claim 17, wherein each of the training classes is divided into training subclasses, further comprising:

means for configuring the training subclasses as respective training distribution functions having corresponding means, variances and a priori probabilities; and

means for storing the training distribution functions, each of the training distribution functions representing a training prototype.

19. Speech coding apparatus of claim 18, wherein each of the stored classes has at least one subcomponent; and

wherein the correlating means correlates the series of feature vectors with the at least one subcomponent to generate a plurality of stored component classes.

20. Speech coding apparatus of claim 19, wherein the configuring means further configures the plurality of component classes as respective distribution functions each having corresponding means, variances and a priori probabilities; further comprising:

means for storing the distribution functions representing the component classes, each of the distribution functions of the component classes representing a prototype.

21. Speech coding apparatus of claim 12, wherein the coding means comprises:

a quantizing means for outputting a label corresponding to the coded feature vector signal.

22. Speech coding apparatus of claim 12, wherein the establishing means comprises:

means for grouping a plurality of speech feature vectors into a predetermined number of prototypes each having respective means, variances and a priori probabilities; and

means for dividing each of the predetermined number of prototypes into at least two sub-prototypes to better differentiate the feature vector signal from other feature vector signals.

23. A method of coding speech comprising the steps of:

(a) storing in a memory means a plurality of classes each having an identifier represented by at least two of a plurality of prototypes, each of the plurality of prototypes having at least one prototype value;

(b) using transducer means to extract from an utterance a feature vector signal having at least one feature value;

(c) establishing a correspondence between the feature vector signal and at least one class of the plurality of classes by selecting from among a plurality of prototypes at least one prototype whose prototype value most closely matches the feature value of the feature vector signal; and

(d) coding the feature vector signal with the identifier of the class represented by the selected at least one prototype.

24. The method of claim 23, wherein prior to step (a), the method further comprising the steps of:

establishing an inventory of training classes;

extracting training feature vectors from a string of training text; and

correlating each of the feature vectors with one of the training classes.

25. The method of claim 24, further comprising the steps of:

measuring and extracting from utterances over successive predetermined periods of time corresponding successive sets of feature vectors, each feature vector of the successive sets of feature vectors having a dimensionality of at least one feature value;

merging the feature vectors in each of the successive sets of feature vectors to form a plurality of consolidated feature vectors whose respective dimensionalities being the sum of the dimensionalities of the corresponding merged feature vectors, the consolidated feature vectors being more adaptable for discrimination between the stored training classes; and

spatially reorienting the consolidated feature vectors to reduce their dimensionalities to thereby effect easier manipulation thereof.

26. The method of claim 25, further comprising the steps of:

establishing the number of prototypes required to provide adequate representation of a class; and wherein for each of the training classes, the method further comprising the steps of:

selecting a number of training prototypes;

calculating respective new training prototypes by averaging the respective values of feature vectors situated proximate to each of the training prototypes until the average distance between the feature vectors remains substantially constant; and

successively replacing the two closest new training prototypes with another new training prototype whose value is the average of the values of the replaced training prototypes until a predetermined number of another training prototypes remains.
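The two-stage procedure of claim 26 resembles a k-means style refinement followed by pairwise merging of the closest centroids. The sketch below follows that reading. It assumes Euclidean distance, takes "average distance" to mean the average distance from each feature vector to its nearest prototype, and uses hypothetical function names, tolerance and iteration limit.

```python
from math import dist


def mean(vectors):
    """Component-wise arithmetic average of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))


def refine(vectors, centroids, tolerance=1e-6, max_iter=100):
    """Re-average nearby vectors until the average distance stops changing."""
    previous = float("inf")
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        total = 0.0
        for v in vectors:
            distances = [dist(v, c) for c in centroids]
            nearest = distances.index(min(distances))
            clusters[nearest].append(v)
            total += distances[nearest]
        # An empty cluster keeps its old centroid.
        centroids = [mean(c) if c else old for c, old in zip(clusters, centroids)]
        average = total / len(vectors)
        if abs(previous - average) < tolerance:
            break
        previous = average
    return centroids


def merge_closest(centroids, target):
    """Replace the two closest centroids by their average until `target` remain."""
    centroids = list(centroids)
    while len(centroids) > target:
        pairs = [(dist(a, b), i, j)
                 for i, a in enumerate(centroids)
                 for j, b in enumerate(centroids) if i < j]
        _, i, j = min(pairs)
        merged = mean([centroids[i], centroids[j]])
        centroids = [c for k, c in enumerate(centroids) if k not in (i, j)] + [merged]
    return centroids
```

Starting with more prototypes than needed and merging downward lets the data, rather than the initial seeding, decide which regions of the feature space end up sharing a prototype.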

27. The method of claim 26, further comprising the steps of:

using a distribution analysis on the predetermined number of training prototypes to calculate a corresponding set of new training prototypes each having estimated means, variances and a priori probabilities; and

dividing each new training prototype into corresponding additional training prototypes.

28. The method of claim 24, wherein the correlating step comprises utilizing a Viterbi alignment technique.
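For orientation, a Viterbi alignment assigns each training feature vector to exactly one model in a sequence by dynamic programming. The sketch below assumes a strict left-to-right sequence with no skips, at least as many frames as models, and a precomputed table of log scores; the function name and the score table layout are hypothetical.

```python
def viterbi_align(scores):
    """Align frames to a left-to-right sequence of models.

    scores[t][s] is the log score of frame t under model s. The path must
    start in model 0, end in the last model, and may only stay in the
    current model or advance by one at each frame.
    """
    frames, states = len(scores), len(scores[0])
    neg_inf = float("-inf")
    best = [[neg_inf] * states for _ in range(frames)]
    back = [[0] * states for _ in range(frames)]
    best[0][0] = scores[0][0]
    for t in range(1, frames):
        for s in range(states):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else neg_inf
            if advance > stay:
                best[t][s], back[t][s] = advance + scores[t][s], s - 1
            else:
                best[t][s], back[t][s] = stay + scores[t][s], s
    # Trace back from the final model at the final frame.
    path = [states - 1]
    for t in range(frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]
```

The returned list gives, for every frame, the index of the model it was aligned to, which is the frame-to-class correlation the training steps need.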

29. The method of claim 23, wherein step (c) further comprises the steps of:

establishing the number of prototypes required to provide adequate representation for a class; and wherein for each of the classes, the method further comprising the steps of:

selecting a number of prototypes;

calculating respective new prototypes by averaging the respective values of feature vectors situated proximate to each of the prototypes until the average distance between the feature vectors remains substantially constant; and

successively replacing the two closest new prototypes with another new prototype whose value is the average of the values of the replaced prototypes until a predetermined number of another prototypes remains.

30. The method of claim 29, further comprising the steps of:

using a distribution analysis on the predetermined number of the another prototypes to calculate a corresponding set of prototypes each having estimated means, variances and a priori probabilities;

dividing each prototype having the estimated means, variances and a priori probabilities into additional prototypes to provide a greater number of prototypes for comparison with the feature vector signal.

31. The method of claim 23, wherein the prototype value of the at least one prototype is computed from means, variances and a priori probabilities of a set of acoustic feature vectors associated with the prototype.

32. The method of claim 23, wherein the prototype value of the at least one prototype is computed by associating the location of the feature value of the one feature vector signal on a probability distribution function of the prototype.

33. A method of coding speech comprising the steps of:

(a) storing in a memory means a plurality of prototype vectors representative of a plurality of classes, each class having an identifier represented by at least one of the plurality of prototype vectors, each of the plurality of prototype vectors having at least one prototype value;

(b) using transducer means to extract from an utterance a feature vector signal having a feature value;

(c) establishing a correspondence between the feature vector signal and at least one class by comparing the feature value of the feature vector signal against the respective prototype values of the prototype vectors;

(d) coding the feature vector signal with the identifier of the class represented by any of the prototype vectors having a prototype value that most closely matches the feature value of the feature vector signal.

34. The method of claim 33, wherein each class is represented by a number of prototype vectors of the plurality of prototype vectors, and wherein the method further comprising the step of:

considering the respective prototype values of the prototype vectors of each class as a whole against the feature value of the feature vector signal to determine which class of the plurality of classes the feature vector signal best corresponds to.

35. The method of claim 33, wherein prior to step (a), the method further comprising the steps of:

establishing an inventory of training classes;

extracting training feature vectors from a string of training text; and

correlating each of the feature vectors with one of the training classes.

36. The method of claim 35, further comprising the steps of:

measuring and extracting from utterances over successive predetermined periods of time corresponding successive sets of feature vectors, each feature vector of the successive sets of feature vectors having a dimensionality of at least one feature value;

merging the feature vectors in each of the successive sets of feature vectors to form a plurality of consolidated feature vectors whose respective dimensionalities being the sum of the dimensionalities of the corresponding merged feature vectors, the consolidated feature vectors being more adaptable for discrimination between the stored training classes; and

spatially reorienting the consolidated feature vectors to reduce their dimensionalities to thereby effect easier manipulation thereof.

37. The method of claim 36, further comprising the steps of:

establishing the number of prototype vectors required to provide adequate representation of a class; and wherein for each of the training classes, the method further comprising the steps of:

selecting a number of training prototype vectors;

calculating respective new training prototype vectors by averaging the respective values of feature vectors situated proximate to each of the training prototype vectors until the average distance between the feature vectors remains substantially constant; and

successively replacing the two closest new training prototype vectors with another new training prototype vector whose value is the average of the values of the replaced training prototype vectors until a predetermined number of another training prototype vectors remains.

38. The method of claim 37, further comprising the steps of:

using a distribution analysis on the predetermined number of training prototype vectors to calculate a corresponding set of new training prototype vectors each having estimated means, variances and a priori probabilities; and

dividing each new training prototype vector into corresponding additional training prototype vectors.

39. The method of claim 33, wherein step (c) further comprises the steps of:

establishing the number of prototype vectors required to provide adequate representation for a class; and wherein for each of the classes, the method further comprising the steps of:

selecting a number of prototype vectors;

calculating respective new prototype vectors by averaging the respective values of feature vectors situated proximate to each of the prototype vectors until the average distance between the feature vectors remains substantially constant; and

successively replacing the two closest new prototype vectors with another new prototype vector whose value is the average of the values of the replaced prototype vectors until a predetermined number of another prototype vectors remains.

40. The method of claim 39, further comprising the steps of:

using a distribution analysis on the predetermined number of the another prototype vectors to calculate a corresponding set of prototype vectors each having estimated means, variances and a priori probabilities;

dividing each prototype vector having the estimated means, variances and a priori probabilities into additional prototype vectors to provide a greater number of prototype vectors for comparison with the feature vector signal.

41. The method of claim 33, wherein the correlating step comprises utilizing a Viterbi alignment technique.

42. The method of claim 33, wherein the prototype value of the at least one prototype vector is computed from means, variances and a priori probabilities of a set of acoustic feature vectors associated with the prototype.

43. The method of claim 33, wherein the prototype value of the at least one prototype vector is computed by associating location of the feature value of the one feature vector signal on a probability distribution function of the prototype vector.

44. A speech coding apparatus comprising:

means for storing two or more prototype vector signals, each prototype vector signal representing a prototype vector having an identifier and at least two partitions, each partition having at least one partition value;

transducer means for measuring value of at least one feature of an utterance during a time interval to produce a feature vector signal representing the value of the at least one feature of the utterance;

means for calculating a match score for each partition, each partition match score representing the value of a match between the partition value of the partition and the feature value of the feature vector signal;

means for calculating a prototype match score for each prototype vector, each prototype match score representing a function of the partition match scores for all partitions in the prototype vector; and

means for coding the feature vector signal with the identifier of the prototype vector signal having a best prototype match score.

45. An apparatus as claimed in claim 44, characterized in that:

each partition match score is proportional to the joint probability of occurrence of the feature value of the feature vector signal and the partition value of the partition; and

the prototype match score represents the sum of the partition match scores for all partitions in the prototype vector.

46. An apparatus as claimed in claim 45, further comprising means for generating prototype vector signals, said prototype vector signal generating means comprising:

means for measuring the value of at least one feature of a training utterance during each of a series of successive first time intervals to produce a series of training feature vector signals, each training feature vector signal corresponding to a first time interval, each training feature vector signal representing the value of at least one feature of the training utterance during a second time interval containing the corresponding first time interval, each second time interval being greater than or equal to the corresponding first time interval;

means for providing a network of elemental models corresponding to the training utterance;

means for correlating the training feature vector signals in the series of training feature vector signals to the elemental models in the network of elemental models corresponding to the training utterance so that each training feature vector signal in the series of training feature vector signals corresponds to one elemental model in the network of elemental models corresponding to the training utterance;

means for selecting a fundamental set of all training feature vector signals which correspond to all occurrences of a first elemental model in the network of elemental models corresponding to the training utterance;

means for selecting at least first and second different subsets of the fundamental set of training feature vector signals to form a first label set of training feature vector signals;

means for calculating the centroid of the feature values of the training feature vector signals of each of the first and second subsets of the fundamental set; and

means for storing a first prototype vector signal corresponding to the first label set of training feature vector signals, said first prototype vector signal representing a first prototype vector having at least first and second partitions, each partition having at least one partition value, the first partition having a partition value equal to the value of the centroid of the feature values of the training feature vector signals in the first subset of the fundamental set, the second partition having a partition value equal to the value of the centroid of the feature values of the training feature vector signals in the second subset of the fundamental set.

47. An apparatus as claimed in claim 46, characterized in that the centroid is an arithmetic average.

48. An apparatus as claimed in claim 47, characterized in that the network of elemental models is a series of elemental models.

49. An apparatus as claimed in claim 48, characterized in that:

the fundamental set of training feature vector signals is divided into at least first, second and third subsets of training feature vector signals;

the calculating means further calculates the centroid of the feature values of the training feature vector signals in the third subset; and

the apparatus further comprises means for storing a second prototype vector signal, said second prototype vector signal representing the value of the centroid of the feature values of the training feature vector signals in the third subset of the fundamental set.

50. An apparatus as claimed in claim 49, characterized in that:

the feature values of the training feature vector signals in each subset of the fundamental set have a feature value variance and an a priori probability;

the apparatus further comprises means for calculating the variance and a priori probability of the feature values of the training feature vector signals in each subset of the fundamental set;

the first partition of the first prototype vector has a further partition value equal to the value of the variance and a priori probability of the feature values of the training feature vector signals in the first subset of the fundamental set;

the second partition of the first prototype vector has a further partition value equal to the value of the variance and a priori probability of the feature values of the training feature vector signals in the second subset of the fundamental set; and

the second prototype signal represents the value of the variance and a priori probability of the feature values of the training feature vector signals in the third subset of the fundamental set.
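The per-subset quantities named in claims 46 to 50 (centroid, variance and a priori probability) could be estimated along the following lines. This is a sketch under assumptions: the centroid is the arithmetic average as in claim 47, the variance is the per-dimension population variance, and the a priori probability is taken to be the subset's share of all training vectors, which the claims do not spell out. The function name and dictionary keys are hypothetical.

```python
def subset_statistics(subsets):
    """Return centroid, per-dimension variance and prior for each subset."""
    total = sum(len(vectors) for vectors in subsets)
    stats = []
    for vectors in subsets:
        n = len(vectors)
        columns = list(zip(*vectors))
        # Centroid as the arithmetic average of each feature dimension.
        centroid = [sum(column) / n for column in columns]
        # Population variance (divide by n) in each dimension.
        variance = [sum((x - m) ** 2 for x in column) / n
                    for column, m in zip(columns, centroid)]
        # Assumed prior: relative frequency of the subset in the training data.
        stats.append({"centroid": centroid, "variance": variance, "prior": n / total})
    return stats
```

Each resulting dictionary corresponds to one partition of a prototype vector, so a prototype with two partitions would be stored as the first two entries of the returned list.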

51. An apparatus as claimed in claim 50, characterized in that:

the apparatus further comprises means for estimating conditional probability of occurrence of each subset of the fundamental set of training feature vector signals given the occurrence of the first label set;

the apparatus further comprises means for estimating the probability of occurrence of the first label set of training feature vector signals;

the first prototype vector further represents the estimated probability of occurrence of the first label set of training feature vector signals;

the first partition of the first prototype vector has a further partition value equal to the estimated conditional probability of occurrence of the first subset of the fundamental set of training feature vector signals given the occurrence of the first label set; and

the second partition of the first prototype vector has a further partition value equal to the estimated conditional probability of occurrence of the second subset of the fundamental set of training feature vector signals given the occurrence of the first label set.

52. An apparatus as claimed in claim 51, characterized in that:

each second time interval is equal to at least two first time intervals; and

each feature vector signal comprises at least two feature values of the utterance at two different times.

53. An apparatus as claimed in claim 52, characterized in that:

each feature vector signal represents values of m features, where m is an integer greater than or equal to two;

each partition has n partition values, where n is less than m; and

the apparatus further comprises means for transforming the m values of each feature vector signal to n values prior to calculating the centroids, variances, and a priori probabilities of the subsets.

54. An apparatus as claimed in claim 53, characterized in that:

the elemental models are elemental probabilistic models; and

the correlating means comprises means for aligning the feature vector signals and the elemental probabilistic models.

55. A speech coding method comprising the steps of:

storing two or more prototype vector signals, each prototype vector signal representing a prototype vector having an identifier and at least two partitions, each partition having at least one partition value;

using transducer means to measure a value of at least one feature of an utterance during a time interval to produce a feature vector signal representing the value of the at least one feature of the utterance;

calculating a match score for each partition, each partition match score representing the value of a match between the partition value of the partition and the feature value of the feature vector signal;

calculating a prototype match score for each prototype vector, each prototype match score representing a function of the partition match scores for all partitions in the prototype vector; and

coding the feature vector signal with the identifier of the prototype vector signal having the best prototype match score.

56. A method as claimed in claim 55, characterized in that:

each partition match score is proportional to the joint probability of occurrence of the feature value of the feature vector signal and the partition value of the partition; and

the prototype match score represents the sum of the partition match scores for all partitions in the prototype vector.
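The scoring in claims 55 and 56 can be illustrated with a minimal Python sketch. It is not taken from the patent text: it assumes a single scalar feature value, a Gaussian density for each partition, and that the winning prototype is the one with the highest score. The names `partition_score`, `prototype_score`, `code_feature`, and `PROTOTYPES`, and all numeric values, are hypothetical.

```python
import math


def partition_score(x, mean, var, weight):
    """Score proportional to the joint probability of feature value x and one
    partition: weight * N(x; mean, var). The Gaussian form is an assumption."""
    density = math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return weight * density


def prototype_score(x, partitions):
    """Sum of the partition match scores over all partitions of one prototype."""
    return sum(partition_score(x, m, v, w) for (m, v, w) in partitions)


def code_feature(x, prototypes):
    """Return the identifier of the prototype with the highest match score."""
    return max(prototypes, key=lambda ident: prototype_score(x, prototypes[ident]))


# Hypothetical prototypes: identifier -> list of (mean, variance, weight) partitions.
PROTOTYPES = {
    "AA": [(0.0, 1.0, 0.3), (1.0, 1.0, 0.2)],
    "IY": [(5.0, 1.0, 0.25), (6.5, 0.5, 0.25)],
}
```

Because the prototype score sums over its partitions, a class of sound whose training data falls into several clusters can still win even when no single partition dominates.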

57. A method as claimed in claim 56, further comprising a method of generating prototype vector signals, said prototype vector signal generating method comprising:

measuring the value of at least one feature of a training utterance during each of a series of successive first time intervals to produce a series of training feature vector signals, each training feature vector signal corresponding to a first time interval, each training feature vector signal representing the value of at least one feature of the training utterance during a second time interval containing the corresponding first time interval, each second time interval being greater than or equal to the corresponding first time interval;

providing a network of elemental models corresponding to the training utterance;

correlating the training feature vector signals in the series of training feature vector signals to the elemental models in the network of elemental models corresponding to the training utterance so that each training feature vector signal in the series of training feature vector signals corresponds to one elemental model in the network of elemental models corresponding to the training utterance;

selecting a fundamental set of all training feature vector signals which correspond to all occurrences of a first elemental model in the network of elemental models corresponding to the training utterance;

selecting at least first and second different subsets of the fundamental set of training feature vector signals to form a first label set of training feature vector signals;

calculating the centroid of the feature values of the training feature vector signals of each of the first and second subsets of the fundamental set; and

storing a first prototype vector signal corresponding to the first label set of training feature vector signals, said first prototype vector signal representing a first prototype vector having at least first and second partitions, each partition having at least one partition value, the first partition having a partition value equal to the value of the centroid of the feature values of the training feature vector signals in the first subset of the fundamental set, the second partition having a partition value equal to the value of the centroid of the feature values of the training feature vector signals in the second subset of the fundamental set.

58. A method as claimed in claim 57, characterized in that the centroid is the arithmetic average.
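The prototype generation of claims 57 and 58 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the alignment of training vectors to elemental models is taken as given, the split of the fundamental set into subsets is supplied by the caller, and the names `fundamental_set`, `centroid`, and `build_prototype` are hypothetical.

```python
def fundamental_set(vectors, alignment, model):
    """Collect every training vector aligned with the given elemental model.
    `alignment[i]` is the (hypothetical) model name assigned to vectors[i]."""
    return [v for v, m in zip(vectors, alignment) if m == model]


def centroid(vectors):
    """Arithmetic average of a non-empty list of equal-length vectors."""
    count = len(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / count for d in range(dim)]


def build_prototype(identifier, subsets):
    """One prototype vector: an identifier plus one centroid per subset."""
    return {"id": identifier, "partitions": [centroid(s) for s in subsets]}
```

For example, `build_prototype("AA", [subset_1, subset_2])` yields a prototype with two partitions whose partition values are the centroids of the first and second subsets.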

59. A method as claimed in claim 58, characterized in that the network of elemental models is a series of elemental models.

60. A method as claimed in claim 59, characterized in that:

the fundamental set of training feature vector signals is divided into at least first, second and third subsets of training feature vector signals;

the calculating step further calculates the centroid of the feature values of the training feature vector signals in the third subset; and

the method further comprises the step of storing a second prototype vector signal, said second prototype vector signal representing the value of the centroid of the feature values of the training feature vector signals in the third subset of the fundamental set.

61. A method as claimed in claim 60, characterized in that:

the feature values of the training feature vector signals in each subset of the fundamental set have a feature value variance and an a priori probability;

the method further comprises the step of calculating the variance and a priori probability of the feature values of the training feature vector signals in each subset of the fundamental set;

the first prototype signal represents the values of the variance and a priori probability of the feature values of the training feature vector signals in the first and second subsets of the fundamental set; and

the second prototype signal represents the value of the variance and a priori probability of the feature values of the training feature vector signals in the third subset of the fundamental set.

62. A method as claimed in claim 61, characterized in that:

the method further comprises the step of estimating the conditional probability of occurrence of each subset of the fundamental set of training feature vector signals given the occurrence of the first label set;

the method further comprises the step of estimating the probability of occurrence of the first label set of training feature vector signals;

the first prototype vector further represents the estimated probability of occurrence of the first label set of training feature vector signals;

the first partition of the first prototype vector has a further partition value equal to the estimated conditional probability of occurrence of the first subset of the fundamental set of training feature vector signals given the occurrence of the first label set; and

the second partition of the first prototype vector has a further partition value equal to the estimated conditional probability of occurrence of the second subset of the fundamental set of training feature vector signals given the occurrence of the first label set.
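The statistics named in claims 61 and 62 can be illustrated with a short Python sketch. It assumes scalar feature values, relative-frequency estimates for the probabilities, and a population variance (divide by the count); none of these choices is stated in the claims. The function names and the `total_count` parameter are hypothetical.

```python
def mean(values):
    """Arithmetic average of a non-empty list of numbers."""
    return sum(values) / len(values)


def variance(values):
    """Population variance (divides by the count; an assumption)."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def label_set_statistics(subsets, total_count):
    """Estimate, by relative frequency, the probability of the label set and,
    for each subset, its centroid, variance, and conditional probability
    given the label set. `total_count` is the number of all training vectors."""
    label_count = sum(len(s) for s in subsets)
    label_prob = label_count / total_count
    partitions = []
    for s in subsets:
        partitions.append({
            "centroid": mean(s),
            "variance": variance(s),
            "cond_prob": len(s) / label_count,
        })
    return label_prob, partitions
```

Under these assumptions, the product of `label_prob` and a partition's `cond_prob` gives the overall weight of that subset among all training vectors.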

63. A method as claimed in claim 62, characterized in that:

each second time interval is equal to at least two first time intervals; and

each feature vector signal comprises at least two feature values of the utterance at two different times.
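One way to obtain feature vectors that span several first time intervals, as claim 63 requires, is to concatenate neighboring frames. The sketch below is an assumption for illustration only; the claim does not prescribe concatenation, the `context` parameter is hypothetical, and repeating the edge frames at the start and end of the utterance is an arbitrary choice.

```python
def splice(frames, context=1):
    """For each frame t, concatenate frames t-context .. t+context into one
    longer vector. Indices past either end are clamped to the nearest frame."""
    spliced = []
    last = len(frames) - 1
    for t in range(len(frames)):
        window = []
        for k in range(t - context, t + context + 1):
            k = min(max(k, 0), last)  # clamp to the valid frame range
            window.extend(frames[k])
        spliced.append(window)
    return spliced
```

With `context=1`, each spliced vector covers three first time intervals, so it holds feature values measured at three different times.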

64. A method as claimed in claim 63, characterized in that:

each feature vector signal represents values of m features, where m is an integer greater than or equal to two;

each partition has n partition values, where n is less than m; and

the method further comprises the step of transforming the m values of each feature vector signal to n values prior to calculating the centroids, variances, and a priori probabilities of the subsets.
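The reduction from m values to n values in claim 64 can be sketched as a linear projection. The claim does not say that the transformation is linear, so this is only one possible reading; the function `project` and the example matrix are hypothetical.

```python
def project(vector, matrix):
    """Map an m-dimensional vector to n dimensions using an n-by-m matrix
    (one row per output value). Raises ValueError on a size mismatch."""
    for row in matrix:
        if len(row) != len(vector):
            raise ValueError("each matrix row must have one weight per input value")
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hypothetical 2-by-3 matrix: keep the first value, add the last two together.
EXAMPLE_MATRIX = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]
```

Applying the projection before computing centroids and variances means that every partition stores n values rather than m.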

65. A method as claimed in claim 64, characte