WikiPatents - Community Patent Review
Method and system for voice coding based on vector quantization
United States Patent 5077798
Link to this page: http://www.wikipatents.com/5077798.html
Inventor(s): Ichikawa; Akira (Musashino, JP); Asakawa; Yoshiaki (Kawasaki, JP); Yajima; Shunichi (Hachioji, JP); Aritsuka; Toshiyuki (Higashimurayama, JP); Yamasaki; Katsuya (Fujisawa, JP)
Abstract: A system for voice coding based on vector quantization has an apparatus in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) may correspond to one domain, an apparatus for representing individual code vectors by codes specific thereto, an apparatus for converting an input voice into a vector and determining membership functions by numerically expressing the distance between the nearest code vector and each of the predetermined number of neighboring vectors, and an apparatus for transmitting, as fuzzy vector quantization information, a code of the nearest code vector and the membership functions.



Owner/Assignee: Hitachi, Ltd. (Tokyo, JP)
Publication Date: December 31, 1991
Application Number: 07/412,987
Filing Date: September 26, 1989
Examiner: Kemeny; Emanuel S.
Attorney/Law Firm: Antonelli, Terry Stout & Kraus
Priority Data: Sep 28, 1988 [JP] 63-240972; Mar 13, 1989 [JP] 1-057706; Apr 28, 1989 [JP] 1-107615; Aug 18, 1989 [JP] 1-211311
Patent Tags: voice coding based vector quantization
   

References

U.S. References

Reference   Inventor   Class     Date
4975957     Ichikawa   704/220   Dec. 1990
4860355     Copperi    704/213   Aug. 1989
4811398     Copperi    704/230   Mar. 1989
Claims
 


We claim:

1. A system for voice coding based on vector quantization comprising:

(a) means in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) having elements represented by values of said parameters may correspond to one domain;

(b) means for representing individual code vectors by codes specific thereto;

(c) means for registering as neighboring vectors code vectors in a plurality of domains which are close, in terms of vector space distance, to each code vector;

(d) means for storing said code vectors, codes and said codes of said neighboring vectors;

(e) means for converting an input voice into a vector having elements represented by values of said parameters;

(f) means for determining the distance between the converted input voice vector and each of said code vectors in said domains by reading out said neighboring vectors from said means for storing and calculating distances between each of said neighboring vectors and said converted input voice vector;

(g) means for determining a code vector having a minimum value of distance to be a nearest vector and selecting a code representing said nearest vector from said stored codes for transmitting;

(h) means for determining membership functions by numerically expressing the distance between said input voice vector and each of neighboring vectors registered in association with said selected code; and

(i) means for transmitting, as vector quantization information, said numerically expressed membership functions and selected code.
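
As an illustration of elements (e) through (i) of claim 1, the following is a minimal Python sketch of a fuzzy vector quantization encoder. It is not the patented implementation. The function names, the dictionary-based codebook and neighbor table, the full-search nearest-vector step, and the inverse-distance membership rule with fuzziness exponent `m` are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def fuzzy_vq_encode(x, codebook, neighbors, m=2.0):
    """Return (nearest_code, memberships) for input vector x.

    codebook:  dict mapping code -> code vector
    neighbors: dict mapping code -> list of codes registered as neighbors
    m:         fuzziness exponent (hypothetical parameter, must be > 1)
    """
    # Nearest code vector; a full search is used here for simplicity.
    nearest = min(codebook, key=lambda c: euclidean(x, codebook[c]))

    # Memberships are computed only over the nearest vector and the
    # neighbors registered for it in the table.
    group = [nearest] + list(neighbors[nearest])
    dists = {c: euclidean(x, codebook[c]) for c in group}

    # An exact hit gets full membership to avoid dividing by zero.
    for c, d in dists.items():
        if d == 0.0:
            return nearest, {k: (1.0 if k == c else 0.0) for k in group}

    # Assumed rule: membership proportional to d^(-2/(m-1)), normalized.
    p = 2.0 / (m - 1.0)
    inv = {c: d ** (-p) for c, d in dists.items()}
    total = sum(inv.values())
    return nearest, {c: v / total for c, v in inv.items()}
```

The encoder output is the pair that claim 1 transmits: the code of the nearest vector and one membership value per registered neighbor. Restricting the membership computation to the registered neighbors keeps the amount of side information fixed per frame.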

2. A voice coding system according to claim 1 wherein said transmitting means includes:

(a) inverse quantization means for determining a reconstructed vector from said vector quantization information through interpolation;

(b) means for determining the difference between said reconstructed vector and said input vector; and

(c) means for modifying said vector quantization information for said selected code such that the difference can be minimized.

3. A voice coding system according to claim 1 wherein said means for converting an input voice into a vector includes:

(a) means for windowing a predetermined interval of the input voice signal;

(b) means for effecting Fourier transform of a windowed signal;

(c) means for determining a power spectrum by squaring respective components resulting from the Fourier transform performed by said means for effecting;

(d) means for logarithmically converting the power spectrum;

(e) means for effecting cosine expansion of said logarithmically converted power spectrum; and

(f) means for determining said input voice vector by using coefficient values of respective components resulting from the cosine expansion as said parameter values.
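
The chain in claim 3 (window, Fourier transform, power spectrum, logarithm, cosine expansion) can be sketched in Python as follows. This is only an illustrative sketch: the Hamming window, the naive O(n^2) DFT, the small floor added before the logarithm, the normalization of the cosine expansion, and the name `cepstral_vector` are assumptions, not details taken from the patent.

```python
import cmath
import math


def cepstral_vector(frame, n_coeffs):
    """Turn one frame of samples into a vector of cosine-expansion coefficients."""
    n = len(frame)

    # (a) Window the frame (Hamming window chosen as an assumption).
    windowed = [
        s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]

    # (b) Fourier transform; only bins 0..n/2 are needed for a real signal.
    half = n // 2 + 1
    spectrum = [
        sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(half)
    ]

    # (c) Power spectrum: squared magnitude of each component.
    power = [abs(c) ** 2 for c in spectrum]

    # (d) Logarithmic conversion; the floor avoids log(0).
    log_power = [math.log(p + 1e-12) for p in power]

    # (e) Cosine expansion of the log power spectrum.
    coeffs = []
    for q in range(n_coeffs):
        acc = sum(
            lp * math.cos(math.pi * q * k / (half - 1))
            for k, lp in enumerate(log_power)
        )
        coeffs.append(acc / half)

    # (f) The coefficient values are the elements of the input voice vector.
    return coeffs
```

With this normalization the zeroth coefficient is the mean of the log power spectrum, and higher-order coefficients describe progressively finer spectral detail.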

4. A voice coding system according to claim 3 wherein said means for determining a power spectrum includes means for extracting harmonics of the pitch frequency from the results of the Fourier transform performed by said means for effecting.

5. A voice coding system according to claim 3 wherein said storage means includes:

(a) means for sequentially grouping said parameter values represented by said coefficient values in respect of individual parameter values corresponding to coefficient values of a same order by which a number of said respective components resulting from the cosine expansion is represented, beginning with a parameter value corresponding to a coefficient value of lower order of said power spectrum and reaching a parameter value corresponding to a coefficient value of higher order of said power spectrum; and

(b) means for hierarchically arranging parameter values in respective groups of respective orders.

6. A voice coding system according to claim 3 wherein said means for storing includes storage means for storing coefficient values each being of a different order as parameter values of said code vector in accordance with values of said pitch frequency.

7. A voice coding system according to claim 3 wherein said storage means includes means for limiting a range within which said storage means is retrieved, in accordance with values of pitch information.

8. A voice coding system according to claim 1 wherein said means for determining the distance includes means for weighting individual elements of each vector the distance of which is to be determined, and means for determining the distance on the basis of the weighted elements.
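
A weighted distance of the kind described in claim 8 could look like the sketch below. The weighted Euclidean form and the name `weighted_distance` are assumptions; the claim only requires that individual elements be weighted before the distance is determined.

```python
import math


def weighted_distance(x, c, weights):
    """Weighted Euclidean distance: each squared element difference is
    scaled by its own weight before summation (assumed form)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, c)))
```

For example, larger weights on the low-order coefficients would make differences in the coarse spectral shape count more than differences in fine detail.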

9. A voice coding system according to claim 8 wherein said means for determining membership functions includes:

(a) means for determining the membership functions pursuant to ##EQU6## where

l = 0, 1, ..., N

N+1: the number of code vectors

μ_lk: membership function

d_ik: the distance between input vector and each code vector

β: weighting coefficient; and

(b) means for determining values of weighting coefficients in accordance with a degree by which the weighting coefficients affect voice quality.

10. A system for voice coding based on vector quantization comprising:

(a) means in which a distribution area of first parameters representative of a voice is divided into a plurality of domains so that one vector (first code vector) having elements represented by values of said first parameters may correspond to one domain;

(b) means for representing individual first code vectors by first codes specific thereto;

(c) means for registering as neighboring vectors code vectors in a plurality of domains which are close, in terms of vector space distance, to each code vector;

(d) first storage means for storing said first code vectors, said first codes and codes of said neighboring vectors;

(e) means for converting an input voice into a vector having elements represented by values of said first parameters;

(f) means for determining the distance between the converted input voice vector and each of said first code vectors in said domains by reading out said neighboring vectors from said first storage means and calculating distances between each of said neighboring vectors and said converted input voice vector;

(g) means for determining a first code vector having a minimum value of distance to be a nearest vector and selecting a first code representing said nearest vector from said stored codes;

(h) vector inverse quantization means for determining, from said selected first code, a reconstructed vector which approximates said input voice vector;

(i) means for determining a quantization distortion representing a difference between said input voice vector and said reconstructed vector;

(j) means in which a distribution area of second parameters representative of said quantization distortion is divided into a plurality of domains so that one vector having elements represented by values of said second parameters may correspond to one domain;

(k) means for representing individual second code vectors by second codes specific thereto;

(l) means for registering as second neighboring vectors second code vectors in a plurality of domains which are close, in terms of vector space distance, to each second code vector;

(m) second storage means for storing said second code vectors, second codes and said codes for said second neighboring vectors;

(n) means for converting said quantization distortion into a second vector having elements represented by values of said second parameters;

(o) means for determining the distance between the quantization distortion vector and each of said second code vectors in said domains by reading out said second neighboring vectors from said second storage means and calculating distances between each of said second neighboring vectors and said quantization distortion vector;

(p) means for determining a second code vector having a minimum value of distance to be a nearest vector and selecting a second code representing said nearest vector from said stored second codes;

(q) means for determining membership functions by numerically expressing the distance between said quantization distortion vector and each of second neighboring vectors registered in association with said selected second code vector; and

(r) means for delivering, as vector quantization information for said input voice, said numerically expressed membership functions and selected second and first codes.
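
The two-stage structure of claim 10 (ordinary vector quantization first, then fuzzy vector quantization of the remaining quantization distortion) can be sketched in Python as below. All names are hypothetical, the inverse-squared-distance membership rule is an assumption, and the decoder is included only to show how the two codes and the memberships combine.

```python
def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_code(x, codebook):
    """Code of the code vector closest to x (full search)."""
    return min(codebook, key=lambda c: sq_dist(x, codebook[c]))


def memberships(x, vectors):
    """Assumed rule: normalized inverse squared distances."""
    d2 = [sq_dist(x, v) for v in vectors]
    if 0.0 in d2:
        hit = d2.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(d2))]
    inv = [1.0 / d for d in d2]
    total = sum(inv)
    return [w / total for w in inv]


def two_stage_encode(x, first_cb, second_cb, second_neighbors):
    # Stage 1: hard quantization with the first codebook.
    c1 = nearest_code(x, first_cb)
    # Quantization distortion left over after stage 1.
    residual = [a - b for a, b in zip(x, first_cb[c1])]
    # Stage 2: fuzzy quantization of the distortion vector.
    c2 = nearest_code(residual, second_cb)
    group = [c2] + list(second_neighbors[c2])
    mu = memberships(residual, [second_cb[c] for c in group])
    return c1, c2, mu


def two_stage_decode(c1, c2, mu, first_cb, second_cb, second_neighbors):
    # First code vector plus the membership-weighted second-stage vectors.
    group = [c2] + list(second_neighbors[c2])
    base = first_cb[c1]
    return [
        base[i] + sum(m * second_cb[c][i] for m, c in zip(mu, group))
        for i in range(len(base))
    ]
```

Splitting the work this way lets the first codebook stay small while the second stage refines only the error, which is typically much smaller in magnitude than the original vector.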

11. A system for voice coding based on vector quantization comprising:

(a) means in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) having elements represented by values of said parameters may correspond to one domain;

(b) means for representing individual code vectors by codes specific thereto;

(c) means for storing said code vectors and said codes;

(d) means for converting an input voice into a vector having elements represented by values of said parameters;

(e) means for retrieving said storage means to select code vectors as candidate vectors for vector quantization on the basis of distances between a plurality of said code vectors and said converted input voice vector;

(f) means for effecting fuzzy vector quantization of said input voice vector after each time when the candidate vectors are sequentially added one by one to a candidate vector having a minimum distance;

(g) means for comparing a quantization distortion occurring before sequential addition of candidate vectors with that occurring after said sequential addition;

(h) means responsive to a result of a comparison to decide in accordance with an increase or decrease in said quantization distortion whether the added candidate vectors would be used for the fuzzy vector quantization;

(i) means for selecting a code of said candidate vector having information of highest similarity;

(j) means for determining membership functions by numerically expressing the distance between said input voice vector and each of candidate vectors used for said fuzzy vector quantization; and

(k) means for transmitting, as vector quantization information, said numerically expressed membership functions and selected code.
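
Elements (e) through (h) of claim 11 describe adding candidate vectors one by one and keeping each only if the quantization distortion goes down. A minimal Python sketch of that loop follows. The function names, the `n_candidates` parameter, the inverse-squared-distance membership rule, and the use of Euclidean reconstruction error as the distortion measure are assumptions for illustration.

```python
import math


def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def memberships(x, vectors):
    """Assumed rule: normalized inverse squared distances."""
    d = [dist(x, v) for v in vectors]
    if 0.0 in d:
        hit = d.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(d))]
    inv = [1.0 / (v * v) for v in d]
    total = sum(inv)
    return [w / total for w in inv]


def reconstruct(vectors, mu):
    """Membership-weighted interpolation of the given code vectors."""
    return [sum(m * v[i] for m, v in zip(mu, vectors))
            for i in range(len(vectors[0]))]


def greedy_fuzzy_vq(x, codebook, n_candidates=4):
    """Return (selected_codes, memberships, distortion)."""
    # Candidate vectors, ordered from nearest to farthest.
    ranked = sorted(codebook, key=lambda c: dist(x, codebook[c]))[:n_candidates]

    # Start from the candidate with minimum distance.
    selected = [ranked[0]]
    best = dist(x, codebook[ranked[0]])

    for c in ranked[1:]:
        trial = selected + [c]
        vecs = [codebook[k] for k in trial]
        err = dist(x, reconstruct(vecs, memberships(x, vecs)))
        # Keep the added candidate only if the distortion decreased.
        if err < best:
            selected, best = trial, err

    vecs = [codebook[k] for k in selected]
    return selected, memberships(x, vecs), best
```

The first entry of `selected` is the nearest candidate, whose code would be the one transmitted together with the memberships.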

12. A voice coding system according to claim 11 wherein said means for selecting candidate vectors includes:

(a) means for determining the distance between said input voice vector and each code vector; and

(b) means for selecting a predetermined number of code vectors in accordance with a closeness of each code vector to said input voice vector, said selecting being performed in a predetermined manner.

13. A voice coding system according to claim 11 wherein said means for selecting candidate vectors includes:

(a) means for determining the distance between said input voice vector and each code vector; and

(b) means for selecting code vectors having values of the distance which are below a predetermined value.
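
Claims 12 and 13 describe two ways of choosing candidate vectors: a fixed number of the closest code vectors, or every code vector closer than a threshold. Both can be sketched in a few lines of Python. The function names and the parameters `k` and `max_distance` are hypothetical.

```python
import math


def distance(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_k_nearest(x, codebook, k):
    """Claim 12 style: the k code vectors closest to x, nearest first."""
    ranked = sorted(codebook, key=lambda c: distance(x, codebook[c]))
    return ranked[:k]


def select_within(x, codebook, max_distance):
    """Claim 13 style: every code vector whose distance is below a threshold."""
    return [c for c in codebook if distance(x, codebook[c]) < max_distance]
```

The fixed-count rule gives a predictable amount of work per frame, while the threshold rule adapts the number of candidates to how densely the codebook covers the region around the input.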

14. A voice coding system according to claim 11 wherein said transmitting means includes:

(a) inverse quantization means for determining a reconstructed vector from said vector quantization information through interpolation;

(b) means for determining the difference between said reconstructed vector and said input vector; and

(c) means for modifying said vector quantization information such that the difference can be minimized.

15. A voice coding system according to claim 11 wherein said transmitting means includes:

(a) inverse quantization means for determining a reconstructed vector from said vector quantization information through interpolation;

(b) means for storing said reconstructed vector; and

(c) means for fetching the stored reconstructed vector as one of said code vectors.

16. A voice coding system according to claim 15 wherein said candidate vector selection means includes:

(a) means for determining the distance between said input voice vector and each code vector; and

(b) means for selecting a predetermined number of code vectors in accordance with a closeness of each code vector to said input voice vector, said selecting being performed in a predetermined manner.

17. A voice coding system according to claim 15 wherein said candidate vector selection means includes:

(a) means for determining the distance between said input voice vector and each code vector; and

(b) means for selecting code vectors having values of the distance which are below a predetermined value.

18. A voice coding system according to claim 15 wherein said transmitting means further includes:

(a) means for determining the difference between said reconstructed vector and said input voice vector; and

(b) means for modifying said vector quantization information such that the difference can be minimized.

19. A system for voice coding based on vector quantization comprising:

(a) means in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) having elements represented by values of said parameters may correspond to one domain;

(b) means for representing individual code vectors by codes specific thereto;

(c) means for registering as neighboring vectors code vectors in a plurality of domains which are close, in terms of vector space distance, to each code vector;

(d) means for storing said code vectors, codes and said codes of said neighboring vectors;

(e) means for converting an input voice into a vector having elements represented by values of said parameters;

(f) means for determining the distance between the converted input voice vector and each of said code vectors in said domains by reading out said neighboring vectors from said means for storing and calculating distances between each of said neighboring vectors and said converted input voice vector;

(g) means for determining a nearest vector having a minimum value of distance;

(h) means for selecting, from said neighboring vectors registered in association with said nearest vector, candidate vectors which are combined with said nearest vector to approximate said input voice vector;

(i) means for determining a synthesis vector which approximates said input voice vector and which takes the form of a linear combination of said nearest vector and candidate vectors;

(j) means for determining a coefficient of the linear combination through weighting by which the quantization distortion is minimized; and

(k) means for transmitting, as vector quantization information, said coefficient of linear combination, said candidate vectors, said codes and said code of said nearest vector.

20. A voice coding system according to claim 19 wherein said means for determining a synthesis vector includes means for determining the synthesis vector which is positioned on a straight line connecting said nearest vector and one of said candidate vectors, and said coefficient determining means determines the coefficient pursuant to ##EQU7## where

w: coefficient

input vector: x_i = {x_1, x_2, ..., x_l}

nearest vector: u_i = {u_1, u_2, ..., u_l}

candidate vectors used for approximation: v_i = {v_1, v_2, ..., v_l}
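
The equation referenced in claim 20 is not reproduced in this text, so the following Python sketch is an assumption rather than the claimed formula. It places the synthesis vector y = w*u + (1 - w)*v on the straight line through the nearest vector u and one candidate vector v, and picks w by least squares, which is the orthogonal projection of x onto that line. The function names are hypothetical.

```python
def line_coefficient(x, u, v):
    """Least-squares w for y = w*u + (1 - w)*v approximating x (assumed form)."""
    num = sum((xi - vi) * (ui - vi) for xi, ui, vi in zip(x, u, v))
    den = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    if den == 0.0:
        # u and v coincide, so any w gives the same point; use u alone.
        return 1.0
    return num / den


def synthesize(u, v, w):
    """Point on the line through u and v selected by coefficient w."""
    return [w * ui + (1.0 - w) * vi for ui, vi in zip(u, v)]
```

Under this assumption only a single scalar w needs to be sent per candidate vector, in addition to the codes of u and v.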

21. A voice coding system according to claim 19 wherein said storage means includes:

(a) means in which a plurality of tables describing combinations of different kinds of said code vectors, codes and neighboring vectors are stored; and

(b) means for retrieving, from said plurality of tables, a table which is effective for minimization of the approximation error.

22. A system for voice decoding based on vector quantization comprising:

(a) means for storing a table having code vectors, codes of said code vectors and codes of neighboring vectors registered in association with said code vectors, the contents of said table corresponding to the contents of a second table in a transmitting station; and

(b) fuzzy vector inverse quantization means for determining a reconstructed vector representing an input voice through interpolation on the basis of codes of one or a plurality of received code vectors which are used for vector quantization, said determining operation being performed by use of received membership functions and said stored table, said received code vectors and said received membership functions being received by a receiver from said transmitting station.
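
On the receiving side, the interpolation in claim 22 can be sketched as a membership-weighted sum over the received code vector and its registered neighbors. This Python sketch assumes the memberships arrive in the order "nearest vector first, then neighbors in table order". That ordering and the function name are illustrative assumptions.

```python
def fuzzy_vq_decode(code, memberships, codebook, neighbors):
    """Reconstruct a vector from a received code and membership values.

    codebook and neighbors must match the table used by the transmitter.
    """
    group = [code] + list(neighbors[code])
    dim = len(codebook[code])
    return [
        sum(mu * codebook[c][i] for mu, c in zip(memberships, group))
        for i in range(dim)
    ]
```

Because the decoder looks up the neighbors itself, only one code needs to be transmitted. The neighbor codes are implied by the shared table.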

23. A system for voice decoding based on vector quantization comprising:

(a) first storage means for storing a table having code vectors, codes of said code vectors and codes of neighboring vectors registered in association with said code vectors, the contents of said table corresponding to the contents of a second table in a transmitting station;

(b) second storage means for storing a reconstructed vector representing an input voice produced through interpolation on the basis of codes of one or a plurality of received code vectors used for vector quantization, received membership functions and said stored table, said received code vectors and said received membership functions being received by a receiver from said transmitting station;

(c) means for reading a reconstructed vector resulting from reconstruction of a preceding input voice vector from said second storage means when a signal is received; and

(d) fuzzy vector inverse quantization means for determining a reconstructed vector of a currently received input voice vector through interpolation on the basis of a code of received code vectors, received membership functions, said stored table and said read-out reconstructed vector, said received code vectors and said received membership functions being received by said receiver from said transmitting station.
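
Claim 23 adds a memory of the previously reconstructed vector to the decoder. One way to picture this is sketched below in Python: the stored vector is treated as one extra interpolation point with its own membership value, placed last. The class name, the ordering of memberships, and the need for an initial vector are assumptions for illustration only.

```python
class FeedbackDecoder:
    """Decoder that reuses the preceding reconstructed vector (assumed scheme)."""

    def __init__(self, codebook, neighbors, initial):
        self.codebook = codebook
        self.neighbors = neighbors
        # Second storage: holds the most recent reconstructed vector.
        self.previous = list(initial)

    def decode(self, code, memberships):
        # Interpolation points: received code vector, its registered
        # neighbors, and finally the stored previous reconstruction.
        vectors = [self.codebook[code]]
        vectors += [self.codebook[c] for c in self.neighbors[code]]
        vectors.append(self.previous)

        dim = len(self.codebook[code])
        y = [sum(m * v[i] for m, v in zip(memberships, vectors))
             for i in range(dim)]

        # Store the result for use with the next received frame.
        self.previous = y
        return y
```

Since speech parameters usually change slowly between frames, the previous reconstruction often lies close to the current input, which is the motivation for keeping it available as an interpolation point.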

24. A system for voice decoding based on vector quantization comprising:

(a) means for storing a table in which code vectors, codes of said code vectors and said codes of neighboring vectors registered in association with said code vectors are contained which are the same as those used for vector quantization in the transmitting station; and

(b) fuzzy vector inverse quantization means for determining a reconstructed vector representing an input voice pursuant to

where

w: coefficient

y: reconstructed vector

input vector: x_i = {x_1, x_2, ..., x_l}

nearest vector: u_i = {u_1, u_2, ..., u_l}

candidate vectors used for approximation: v_i = {v_1, v_2, ..., v_l}

said nearest vector u_i has the minimum distance to said input vector x_i, said candidate vectors v_i correspond to some of said neighboring vectors,

on the basis of codes of a plurality of received code vectors used for the vector quantization, a received coefficient value used for vector approximation in the form of a linear combination, said received coefficient value being received by a receiver from a transmitting station, and said stored table, said received code vectors being received by said receiver from said transmitting station.

25. A voice communication system based on vector quantization comprising:

a voice encoding system including

(a) means in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) having elements represented by values of said parameters may correspond to one domain,

(b) means for representing individual code vectors by codes specific thereto,

(c) means for registering as neighboring vectors code vectors in a plurality of domains which are close, in terms of vector space distance, to each code vector,

(d) means for storing said code vectors, codes and said codes of said neighboring vectors,

(e) means for converting an input voice into a vector having elements represented by values of said parameters,

(f) means for determining the distance between the converted input voice vector and each of said code vectors in said domains by reading out said neighboring vectors from said means for storing and calculating distances between each of said neighboring vectors and said converted input voice vector,

(g) means for determining a code vector having a minimum value of distance to be a nearest vector and selecting a code representing said nearest vector from said stored codes for transmitting,

(h) means for determining membership functions by numerically expressing the distance between said input voice vector and each of neighboring vectors registered in association with said selected code, and

(i) means for transmitting, as vector quantization information, said numerically expressed membership functions and selected code; and

a voice decoding system including

(j) means for storing a table in which code vectors, codes of said code vectors and said codes of neighboring vectors registered in association with said code vectors, the contents of said table corresponding to the contents of a second table in a transmitting station, and

(k) fuzzy vector inverse quantization means for determining a reconstructed vector representing an input voice through interpolation on the basis of codes of one or a plurality of received code vectors which are used for vector quantization, said determining operation being performed by use of received membership functions and said stored table, said received code vectors and said received membership functions being received by a receiver from said transmitting station.

26. A voice communication system based on vector quantization comprising:

a voice encoding system including

(a) means in which a distribution area of first parameters representative of a voice is divided into a plurality of domains so that one vector (first code vector) having elements represented by values of said first parameters may correspond to one domain,

(b) means for representing individual first code vectors by first codes specific thereto,

(c) means for registering as neighboring vectors code vectors in a plurality of domains which are close, in terms of vector space distance, to each other,

(d) first storage means for storing said first code vectors, said first codes and codes of said neighboring vectors,

(e) means for converting an input voice into a vector having elements represented by values of said first parameters,

(f) means for determining the distance between the converted input voice vector and each of said first code vectors in said domains by reading out said neighboring vectors from said first storage means and calculating distances between each of said neighboring vectors and said converted input voice vector,

(g) means for determining a first code vector having a minimum value of distance to be a nearest vector and selecting a first code representing said nearest vector from said stored codes,

(h) vector inverse quantization means for determining, from said selected first code, a reconstructed vector which approximates said input voice vector,

(i) means for determining a quantization distortion representing a difference between said input voice vector and said reconstructed vector,

(j) means in which a distribution area of second parameters representative of said quantization distortion is divided into a plurality of domains so that one vector (second code vector) having elements represented by values of said second parameters may correspond to one domain,

(k) means for representing individual second code vectors by second codes specific thereto,

(l) means for registering as second neighboring vectors second code vectors in a plurality of domains which are close, in terms of vector space distance, to each second code vector,

(m) second storage means for storing said second code vectors, second codes and said codes of said second neighboring vectors,

(n) means for converting said quantization distortion into a second vector having elements represented by values of said second parameters,

(o) means for determining the distance between the quantization distortion vector and each of said second code vectors in said domains by reading out said second neighboring vectors from said second storage means and calculating distances between each of said second neighboring vectors and said quantization distortion vector,

(p) means for determining a second code vector having a minimum value of distance to be a nearest vector and selecting a second code representing said nearest vector from said stored second codes,

(q) means for determining membership functions by numerically expressing the distance between said quantization distortion vector and each of second neighboring vectors registered in association with said selected second code vector, and

(r) means for delivering, as vector quantization information for said input voice, said numerically expressed membership functions and selected second and first codes; and

a voice decoding system including

(s) means for storing a table in which code vectors, codes of said code vectors and said codes of neighboring vectors registered in association with said code vectors, the contents of said table corresponding to the contents of a second table in a transmitting station; and

(t) fuzzy vector inverse quantization means for determining a reconstructed vector representing an input voice through interpolation on the basis of codes of one or a plurality of received code vectors which are used for vector quantization, said determining operation being performed by use of received membership functions and said stored table, said received code vectors and said received membership functions being received by a receiver from said transmitting station.

27. A voice communication system based on vector quantization comprising:

a voice encoding system including

(a) means in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) having elements represented by values of said parameters may correspond to one domain,

(b) means for representing individual code vectors by codes specific thereto,

(c) means for storing said code vectors and said codes,

(d) means for converting an input voice into a vector having elements represented by values of said parameters,

(e) means for retrieving said storage means to select code vectors as candidate vectors for vector quantization on the basis of distances between a plurality of said code vectors and said converted input voice vector,

(f) means for effecting fuzzy vector quantization of said input voice vector after each time when other candidate vectors are sequentially added one by one to a candidate vector having a minimum distance,

(g) means for comparing a quantization distortion occurring before sequential addition of candidate vectors with that occurring after said sequential addition,

(h) means responsive to a result of a comparison to decide in accordance with an increase or decrease in said quantization distortion whether the added candidate vectors would be used for the fuzzy vector quantization,

(i) means for selecting a code of said candidate vector having information of highest similarity,

(j) means for determining membership functions by numerically expressing the distance between said input voice vector and each of candidate vectors used for said fuzzy vector quantization, and

(k) means for transmitting, as vector quantization information, said numerically expressed membership functions and selected code; and

a voice decoding system including

(l) means for storing a table in which code vectors, codes of said code vectors and said codes of neighboring vectors registered in association with said code vectors, the contents of said table corresponding to the contents of a second table in a transmitting station; and

(m) fuzzy vector inverse quantization means for determining a reconstructed vector representing an input voice through interpolation on the basis of codes of one or a plurality of received code vectors which are used for vector quantization, said determining operation being performed by use of received membership functions and said stored table, said received code vectors and said received membership functions being received by a receiver from said transmitting station.

28. A voice communication system based on vector quantization comprising:

a voice encoding system including

(a) means in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) having elements represented by values of said parameters may correspond to one domain,

(b) means for representing individual code vectors by codes specific thereto,

(c) means for storing said code vectors and said codes,

(d) means for converting an input voice into a vector having elements represented by values of said parameters,

(e) means for retrieving said storage means to select code vectors as candidate vectors for vector quantization on the basis of distances between a plurality of said code vectors and said converted input voice vector,

(f) means for effecting fuzzy vector quantization of said input voice vector after each time when other candidate vectors are sequentially added one by one to a candidate vector having a minimum distance,

(g) means for comparing a quantization distortion occurring before sequential addition of candidate vectors with that occurring after said sequential addition,

(h) means responsive to a result of a comparison to decide in accordance with an increase or decrease in said quantization distortion whether the added candidate vectors would be used for the fuzzy vector quantization,

(i) means for selecting a code of said candidate vector having information of highest similarity,

(j) means for determining membership functions by numerically expressing the distance between said input voice vector and each of candidate vectors used for said fuzzy vector quantization, and

(k) means for transmitting, as vector quantization information, said numerically expressed membership functions and selected code; and

wherein said transmitting means includes

(l) inverse quantization means for determining a reconstructed vector from said vector quantization information through interpolation,

(m) means for determining the difference between said reconstructed vector and said input vector, and

(n) means for fetching the stored reconstructed vector as one of said code vectors; and

a voice decoding system including

(o) first storage means for storing a table having code vectors, codes of said code vector and codes of neighboring vectors registered in association with said code vectors, the contents of said table corresponding to the contents of a second table in a transmitting station,

(p) second storage means for storing a reconstructed vector representing an input voice produced through interpolation on the basis of codes of one or a plurality of received code vectors used for vector quantization, received membership functions and said stored table, said received code vectors and said received membership functions being received by a receiver from said transmitting station,

(q) means for reading a reconstructed vector resulting from reconstruction of a preceding input voice vector from said second storage means when a signal is received, and

(r) fuzzy vector inverse quantization means for determining a reconstructed vector of a currently received input voice vector through interpolation on the basis of a code of received code vectors, received membership functions, said stored table and said read-out reconstructed vector, said received code vector and said received membership functions being received by said receiver from said transmitting station.

29. A voice communication system based on vector quantization comprising:

a voice encoding system including

(a) means in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) having elements represented by values of said parameters may correspond to one domain,

(b) means for representing individual code vectors by codes specific thereto,

(c) means for registering, as neighboring vectors, code vectors in a plurality of domains which are close, in terms of vector space distance, to each code vector,

(d) means for storing said code vectors, codes and said codes of said neighboring vectors,

(e) means for converting an input voice into a vector having elements represented by values of said parameters,

(f) means for determining the distance between the converted input voice vector and each of said code vectors in said domains by retrieving said storage means,

(g) means for determining a nearest vector having a minimum value of distance,

(h) means for selecting, from said neighboring vectors registered in association with said nearest vector, candidate vectors which are combined with said nearest vector to approximate said input voice vector,

(i) means for determining a synthesis vector which approximates said input voice vector and which takes the form of a linear combination of said nearest vector and candidate vectors,

(j) means for determining a coefficient of the linear combination through weighing by which the quantization distortion is minimized, and

(k) means for transmitting, as vector quantization information, said coefficient of linear combination, said candidate vectors, said codes and said code of said nearest vector;

wherein said means for determining a synthesis vector includes means for determining the synthesis vector which is positioned on a straight line connecting said nearest vector and one of said candidate vectors, and said coefficient determining means determines the coefficient pursuant to ##EQU8## where

$l = 0, 1, \ldots, N$

$N+1$: the number of code vectors

$\mu_{lk}$: membership function

$d[i]_{jk}$: the distance between input vector and code vector

$\alpha$: weighting coefficient; and

a voice decoding system including

(l) means for storing a table in which code vectors, codes of said code vector and said codes of neighboring vectors registered in association with said code vectors are contained which are the same as those used for vector quantization in the transmitting station, and

(m) fuzzy vector inverse quantization means for determining a reconstructed vector representing an input voice ##EQU9## where $w$: coefficient

input vector $X_i = \{x_1, x_2, \ldots, x_l\}$

nearest vector $u_i = \{u_1, u_2, \ldots, u_l\}$

candidate vector used for approximation $v_i = \{v_1, v_2, \ldots, v_l\}$;

said nearest vector $u_i$ has the minimum distance to said input vector $X_i$, said candidate vectors $v_i$ correspond to some of said neighboring vectors

on the basis of codes of a plurality of received code vectors used for the vector quantization, a received coefficient value used for vector approximation in the form of a linear combination, said received coefficient value being received by a receiver from a transmitting station, and said stored table, said received code vectors being received by said receiver from said transmitting station.

30. A method for voice coding based on vector quantization comprising the steps of:

(a) dividing a distribution area of parameters representative of a voice into a plurality of domains, making one vector (code vector) having elements represented by values of said parameters correspond to one domain, and storing code vectors;

(b) storing codes assigned to said code vectors;

(c) converting an input voice into a vector having elements represented by values of said parameters;

(d) determining a code vector having a minimum value of distance from said input voice vector to be a nearest vector and selecting a code representing said nearest vector from said stored codes for transmitting;

(e) selecting a predetermined number of code vectors neighboring said nearest vector as candidate vectors for approximation of said input voice vector;

(f) determining membership functions by numerically expressing the distance between said input voice vector and each of said selected code vectors and the distance between said input voice vector and said nearest vector, in accordance with ##EQU10## where

$l = 0, 1, \ldots, N$

$\mu_{lk}$: membership function

$d[i]_{lk}$: the distance between input vector and code vector

$\alpha$: weighting coefficient; and

(g) transmitting said numerically expressed membership functions and said selected code as vector quantization information.

31. A voice coding method according to claim 30 wherein said transmission step includes the steps of:

(a) determining a reconstructed vector representing an input voice vector for use as a feedback signal from said vector quantization information through interpolating;

(b) storing said reconstructed vector; and

(c) using said reconstructed vector produced precedently as a candidate vector.

32. A voice coding method according to claim 31 wherein said transmission step further includes the steps of:

(a) determining an error between a reconstructed vector representing a current input voice and said input voice vector representing an input voice inputted before said current input voice; and

(b) modifying said vector quantization information for said selected code such that the difference can be minimized.

33. A method for voice coding based on vector quantization comprising the steps of:

(a) dividing a distribution area of parameters representative of a voice into a plurality of domains, making one vector (code vector) having elements represented by values of said parameters correspond to one domain, and storing code vectors;

(b) storing codes assigned to said code vectors;

(c) converting an input voice into a vector having elements represented by values of said parameters;

(d) determining a code vector having a minimum value of distance from said input voice vector to be a nearest vector and selecting a code representing said nearest vector from said stored codes for transmitting;

(e) selecting a predetermined number of code vectors neighboring said nearest vector as candidate vectors for approximation of said input voice vector;

(f) determining a synthesis vector which approximates said input voice vector and which takes the form of a linear combination of said nearest vector and candidate vectors;

(g) determining a coefficient of the linear combination through weighing by which the quantization distortion is minimized; and

(h) transmitting, as vector quantization information, said coefficient of linear combination, said codes of said candidate vectors and said selected code of said nearest vector.

34. A voice coding method according to claim 33 wherein said synthesis vector determining step includes determining the synthesis vector which is positioned on a straight line connecting said nearest vector and one of said candidate vectors, and said coefficient determining step includes the step of determining the coefficient pursuant to ##EQU11## wherein $w$: coefficient

input vector $X_i = \{x_1, x_2, \ldots, x_l\}$

nearest vector $u_i = \{u_1, u_2, \ldots, u_l\}$

candidate vector used for approximation $v_i = \{v_1, v_2, \ldots, v_l\}$

said nearest vector $u_i$ has the minimum distance to said input vector $X_i$, said candidate vectors $v_i$ correspond to some of said neighboring vectors.
Description


BACKGROUND OF THE INVENTION

This invention relates to a system for coding voice with high efficiency, and more particularly to a method and system for voice coding which are suitable for providing a reproduced voice of high quality at a high information compression rate.

In the past, a variety of highly efficient voice coding systems have been proposed. For example, "Digital Information Compression" by Kazuo Nakada, published by Kohsaido Sampoh Shuppan, Electronic Science Series 100, gives a plain explanation of various systems, many of which belong to either the waveform coding system or the information source coding system (parameter coding system). One may also refer to "Study of Vector Coding of Voice" by Moriya et al., Papers SP86-16 (1986) of Voice Research Conference, The Institute of Electronics and Communication Engineers of Japan, and JP-A-63-285599.

Of the above conventional systems, the waveform coding system can generally ensure good voice quality but has difficulty raising the information compression efficiency, whereas the parameter coding system can provide high information compression efficiency but is disadvantageous in that, even when the amount of information is increased, improvements in voice quality are limited and sufficiently high quality cannot be obtained. Thus, in the information compression region (near 10 kb/s) lying between the bands to which the above two systems are well adapted, performance is degraded, particularly in terms of voice quality relative to the quantity of information. Under the circumstances, hybrid systems utilizing the advantages of the above two systems have recently been proposed, including a multi-pulse type (for example, B. S. Atal et al., "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates," Proc. ICASSP 82, pp. 614-617 (1982)), a CELP type (B. S. Atal et al., "Stochastic Coding of Speech Signals at Very Low Bit Rates," Proc. ICC 84, pp. 1610-1613 (1984)) and a TOR type (A. Ichikawa et al., "A Speech Coding Method Using Thinned-out Residual," Proc. ICASSP 85, pp. 961-964 (1985)), and have been studied from various viewpoints. But the hybrid systems are still unsatisfactory from the standpoint of not only voice quality but also processing expense.

In general, various highly efficient coding systems exploit the fact that voice information is localized within the range over which the parameters can vary. This idea has been developed further: a combination of a plurality of parameters is represented by a vector, and attention is paid to the localization of these vectors, so that the voice information can be represented with less information. Such a system, called a vector quantization system and disclosed in, for example, R. M. Gray, "Vector Quantization," IEEE ASSP Magazine, pp. 4-29 (April 1984), has attracted attention. To describe the vector quantization system more specifically, when a voice is expressed using suitable parameters, the parameters are distributed in a special pattern because of the structure of the human mouth. As an example, FIG. 1 graphically shows a voice expressed in terms of two parameters a and b. Most human speech can be expressed by parameter values falling within an area A. In order for the voice to undergo vector quantization, the area A is divided into a great number of domains and codes 1, 2, 3, . . . specifying individual domains are allotted thereto.

In the case of scalar quantization, when the voice represented by a point x in FIG. 1 is coded, $a_1$ as the parameter "a" and $b_1$ as the parameter "b" are transmitted independently. In the case of vector quantization, on the other hand, code 12 is transmitted. The code 12 specifies the divided region in which the point x is included.

In the case of scalar quantization, the voice information is represented with a value from $a_{min}$ to $a_{max}$ for the parameter "a" and with a value from $b_{min}$ to $b_{max}$ for the parameter "b" in order to cover the whole area in which there is voice information. Since the parameters "a" and "b" are used independently, the information used for representing the voice is allotted to every divided region within the rectangular region represented by B in FIG. 1. As a result, information is allotted to the region (B - A), even though the voice is not actually present in that region. In the case of vector quantization, on the other hand, the information used for representing the voice is allotted only to the region represented by A in FIG. 1, in which the voice is present, so the information can be compressed more than is possible with scalar quantization.
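The saving described above can be illustrated with a small counting sketch. The grid size and the shape of the occupied area are assumptions chosen only for illustration (an 8 x 8 grid standing in for region B, and a diagonal band standing in for area A); they are not taken from FIG. 1.

```python
import math

def scalar_bits(levels_a, levels_b):
    """Bits needed when parameters a and b are coded independently."""
    return math.ceil(math.log2(levels_a)) + math.ceil(math.log2(levels_b))

def vector_bits(occupied_cells):
    """Bits needed when one code is assigned to each occupied cell only."""
    return math.ceil(math.log2(len(occupied_cells)))

# Hypothetical layout: each parameter takes 8 levels, so region B has 64 cells.
LEVELS = 8
# Assumed stand-in for area A: cells near the diagonal, where a and b are similar.
occupied = [(a, b) for a in range(LEVELS) for b in range(LEVELS) if abs(a - b) <= 1]

bits_scalar = scalar_bits(LEVELS, LEVELS)   # codes every cell of B
bits_vector = vector_bits(occupied)         # codes only the cells of A
print(len(occupied), bits_scalar, bits_vector)
```

In this toy layout only 22 of the 64 cells are occupied, so one vector code needs 5 bits where the two independent scalar codes need 6. The real saving depends on how strongly the parameters of actual speech are localized.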

The method of decoding transmitted codes in the vector quantization is explained below. Each divided region is represented by a representing vector having a value for each of the parameters which represent the divided region. The representing vector is called a code vector or a centroid. This system is provided with a table, called a code book, in which the representing vectors and the corresponding codes are listed. Identical code books are provided on the transmitting side (coding side) and on the receiving side (decoding side) respectively, so that the representing vector corresponding to the transmitted code can be obtained by searching the code book. However, in general, there is a difference between the vector representing the actual input voice (hereinafter referred to as the input vector) and the representing vector which is obtained. This difference is the quantization distortion.
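A minimal sketch of the code book lookup just described is given below. The four-entry code book, the use of Euclidean distance, and the function names are assumptions made for illustration; the text does not fix a particular distance measure here.

```python
import math

# Hypothetical code book shared by the coding side and the decoding side:
# code -> representing vector (code vector).
CODE_BOOK = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 1.0), 4: (1.0, 1.0)}

def distance(x, y):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def encode(x, code_book):
    """Coding side: return the code of the representing vector nearest to x."""
    return min(code_book, key=lambda c: distance(x, code_book[c]))

def decode(code, code_book):
    """Decoding side: look the received code up in the identical code book."""
    return code_book[code]

x = (0.9, 0.2)                    # input vector
code = encode(x, CODE_BOOK)       # only this code is transmitted
rep = decode(code, CODE_BOOK)     # representing vector recovered by the receiver
distortion = distance(x, rep)     # quantization distortion
print(code, rep, round(distortion, 4))
```

The receiver can only ever return one of the listed representing vectors, so the distortion is zero only when the input vector happens to coincide with a code vector.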

In the vector quantization system, in order to realize high quality voice coding, it is necessary to prepare in advance a code book of high quality which can express a voice with as high fidelity as possible. To this end, many problems have to be solved, including the need for a sufficiently large amount of speech as training data and the decisions as to how many codes the code book should contain and what parameters should be used. As a countermeasure against problems encountered in preparation of the code book, a fuzzy vector method (for example, H. P. Tseng et al., "Fuzzy Vector Quantization Applied to Hidden Markov Modeling," ICASSP 87, 4 (1987)) has been proposed wherein membership functions are used for determining the input voice through interpolation. A membership function represents the degree of similarity between the input vector and each of the representing vectors by a numerical value. The similarity is concretely represented by the distance between the input vector and each of the representing vectors. Although the fuzzy vector method is expected to improve the voice quality in proportion to the quality of the code book, it is not used as a technique for transmission because of the large amount of membership function information that must be sent. At present, use of the fuzzy vector method has been studied at most for pre-processing in speech recognition. In addition, a KNN method (for example, "Study of Normalization of Spectrogram by Using Fuzzy Vector Quantization" by Nakamura et al., Papers SP87-123 of Voice Research Conference, Feb. 19, 1988) has been proposed wherein, with a view to decreasing the amount of information, the input voice is compared with every representing vector registered in a code book so that only the N vectors closest to the point representing the input vector are used. The KNN method, however, requires sorting in order to select the N representing vectors (code vectors) closest to the input voice point, and the amount of processing for this sorting poses a very severe problem from the practical standpoint. Further, transmitting the codes of all N representing vectors increases the amount of information to be transmitted.
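The fuzzy vector method combined with KNN selection can be sketched as follows. The membership formula and the interpolation rule used here are the ones commonly used in fuzzy c-means style quantization and are assumptions, since the cited papers' exact expressions are not reproduced in this text; the code book and function names are likewise hypothetical.

```python
import math

def distance(x, y):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def k_nearest(x, code_book, n):
    """Codes of the n code vectors closest to x. Note the sort over all codes."""
    ranked = sorted(code_book, key=lambda c: distance(x, code_book[c]))
    return ranked[:n]

def memberships(x, vectors, alpha=2.0):
    """Assumed form: mu_l = 1 / sum_j (d_l / d_j) ** (1 / (alpha - 1))."""
    d = [distance(x, v) for v in vectors]
    if 0.0 in d:  # x coincides with a code vector: give it full membership
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    p = 1.0 / (alpha - 1.0)
    return [1.0 / sum((dl / dj) ** p for dj in d) for dl in d]

def reconstruct(vectors, mu, alpha=2.0):
    """Assumed interpolation: average of code vectors weighted by mu ** alpha."""
    weights = [m ** alpha for m in mu]
    total = sum(weights)
    return tuple(
        sum(w * v[i] for w, v in zip(weights, vectors)) / total
        for i in range(len(vectors[0]))
    )

CODE_BOOK = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 1.0), 4: (1.0, 1.0)}
x = (0.5, 0.0)
codes = k_nearest(x, CODE_BOOK, 2)                       # N = 2 codes to transmit
mu = memberships(x, [CODE_BOOK[c] for c in codes])       # plus N membership values
x_hat = reconstruct([CODE_BOOK[c] for c in codes], mu)   # decoder-side interpolation
print(codes, mu, x_hat)
```

Interpolation lets the decoder reach points between code vectors, which plain vector quantization cannot do. The sketch also makes the two costs named above visible: `k_nearest` sorts the whole code book for every input vector, and all N codes must be sent along with the membership values.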

SUMMARY OF THE INVENTION

An object of the present invention is to provide a method and apparatus which can reproduce an input voice with fidelity by using a smaller amount of transmission information in voice coding based on vector quantization.

A second object of the invention is to provide a method and apparatus which are suitable for reproduction of a high-quality decoded voice and vector quantization.

According to the present invention, to accomplish the first object, first to fourth methods may be employed.

In accordance with the first method, N codes of neighboring vectors are registered in association with individual codes in the code book.
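One way to picture the first method is a neighbor table built once, when the code book is prepared. The sketch below is illustrative only: the code book contents, the Euclidean distance, and the helper names `build_neighbor_table` and `candidates_for` are assumptions, not taken from the specification.

```python
import math

def distance(x, y):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def build_neighbor_table(code_book, n):
    """For every code, register the codes of its n nearest code vectors."""
    table = {}
    for code, vec in code_book.items():
        others = [c for c in code_book if c != code]
        others.sort(key=lambda c: distance(vec, code_book[c]))
        table[code] = others[:n]
    return table

def candidates_for(x, code_book, table):
    """Nearest code by a linear scan, then its pre-registered neighbors."""
    nearest = min(code_book, key=lambda c: distance(x, code_book[c]))
    return [nearest] + table[nearest]

# Hypothetical code book; the table is computed off-line and stored with it.
CODE_BOOK = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.0, 1.0), 4: (3.0, 3.0)}
NEIGHBORS = build_neighbor_table(CODE_BOOK, 2)

selected = candidates_for((0.9, 0.1), CODE_BOOK, NEIGHBORS)
print(NEIGHBORS)
print(selected)
```

Under these assumptions the sorting is done once per code book rather than once per input vector. If the decoder holds the same table, the neighbors can be recovered from the code of the nearest vector alone, so their codes need not be transmitted individually.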

In accordance with the second method, fuzzy vector quantization is effected by selectively using representative vectors (hereinafter referred to as code vectors) in the code book in accordance with an input vector. Means for selecting the code vectors includes means for selecting candidates for the code vectors to be used, means for evaluating the relation of the candidate vectors to the input vector, and means for determining a vector to be used on the basis of results of the evaluation.
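The selection in the second method can be sketched as a greedy loop: start from the nearest candidate, add one candidate at a time, and keep it only if the fuzzy quantization distortion decreases. The membership and interpolation formulas below are the common fuzzy c-means style expressions and are assumptions, as are the function names and the sample vectors.

```python
import math

def distance(x, y):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def memberships(x, vectors, alpha=2.0):
    """Assumed form: mu_l = 1 / sum_j (d_l / d_j) ** (1 / (alpha - 1))."""
    d = [distance(x, v) for v in vectors]
    if 0.0 in d:
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    p = 1.0 / (alpha - 1.0)
    return [1.0 / sum((dl / dj) ** p for dj in d) for dl in d]

def reconstruct(vectors, mu, alpha=2.0):
    """Assumed interpolation: average of code vectors weighted by mu ** alpha."""
    weights = [m ** alpha for m in mu]
    total = sum(weights)
    return tuple(
        sum(w * v[i] for w, v in zip(weights, vectors)) / total
        for i in range(len(vectors[0]))
    )

def fuzzy_distortion(x, vectors, alpha=2.0):
    """Distance between x and its fuzzy reconstruction from the given vectors."""
    return distance(x, reconstruct(vectors, memberships(x, vectors, alpha), alpha))

def select_vectors(x, candidates, alpha=2.0):
    """Keep an added candidate only when it lowers the quantization distortion."""
    ordered = sorted(candidates, key=lambda v: distance(x, v))
    used = [ordered[0]]                    # candidate with minimum distance
    best = fuzzy_distortion(x, used, alpha)
    for v in ordered[1:]:
        trial = used + [v]
        d = fuzzy_distortion(x, trial, alpha)
        if d < best:                       # distortion decreased: use the candidate
            used, best = trial, d
    return used, best

x = (0.4, 0.0)
used, best = select_vectors(x, [(0.0, 0.0), (1.0, 0.0), (0.0, 5.0)])
print(used, round(best, 4))
```

In this example the second candidate is kept because interpolating between (0, 0) and (1, 0) brings the reconstruction closer to the input, while the distant third candidate is rejected because it pulls the reconstruction away. Only the vectors that survive the comparison need membership values.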

In accordance with the third method, results of the i