Digital encoder with dynamic quantization bit allocation
United States Patent 5621856
Link to this page: http://www.wikipatents.com/5621856.html
Inventor(s): Akagiri; Kenzo (Kanagawa, JP)
Abstract: A digital encoder for compressing a digital input signal derived from an analog signal to reduce the number of bits required to represent the analog signal with low quantizing noise. In the encoder, a digital input signal representing the analog signal is divided into three frequency ranges. The digital signal in each of the three frequency ranges is divided in time into frames, and subdivided into blocks, the time duration of which may be adaptively varied. The blocks are orthogonally transformed into spectral coefficients, which are grouped into critical bands. The total number of bits available for quantizing the spectral coefficients is allocated among the critical bands. In a first embodiment and a second embodiment, fixed bits are allocated among the critical bands according to a selected one of a plurality of predetermined bit allocation patterns and variable bits are allocated among the critical bands according to the energy in the critical bands. In the first embodiment, the apportionment between fixed bits and variable bits is fixed. In a second embodiment, the apportionment between fixed bits and variable bits is varied according to the smoothness of the spectrum of the input signal. In a third embodiment, bits are allocated among the critical bands according to a noise shaping factor that is varied according to the smoothness of the spectrum of the input signal. All three embodiments give low quantizing noise with both broad spectrum signals and highly tonal signals.
   














Owner/Assignee     Sony Corporation (JP)
Publication Date     April 15, 1997
Application Number     08/465,340
Filing Date     June 5, 1995
US Classification     704/200.1 704/226 704/229 704/230
Int'l Classification     G10L 003/02 G10L 009/00
Examiner     MacDonald; Allen R.
Assistant Examiner     Edouard; Patrick N.
Attorney/Law Firm     Limbach & Limbach L.L.P.
Parent Case     This is a divisional of application Ser. No. 08/272,872, filed Jul. 8, 1994; which is a continuation of Ser. No. 07/924,298, filed Aug. 3, 1992, now abandoned.
Priority Data     Aug 02, 1991 [JP] 3-216216; Aug 02, 1991 [JP] 3-216217; Aug 27, 1991 [JP] 3-271774
USPTO Field of Search     395/2.38 395/2.39 395/2.35 395/2.15 381/29 381/30 381/31 381/32 381/33 381/34 381/35 381/36 381/37 381/38 381/39 381/40 381/41 381/42 381/43 381/44 381/45 381/46 381/47 381/48 381/49 381/50 381/51
References

U.S. References

Patent No.   Inventor      U.S. Class    Date
5268685      Fujiwara      341/76        Dec 1993
5264846      Oikawa        341/76        Nov 1993
5235671      Mazor         -             Aug 1993
5222189      Fielder       704/229       Jun 1993
5166686      Sugiyama      -             Nov 1992
5157760      Akagiri       704/233       Oct 1992
5151941      Nishiguchi    704/233       Sep 1992
5142656      Fielder       704/229       Aug 1992
5134475      Johnston      375/240.12    Jul 1992
5125030      Nomura        704/222       Jun 1992
5117228      Fuchigami     341/200       May 1992
5115240      Fujiwara      341/51        May 1992
5109417      Fielder       704/205       Apr 1992
5049992      Citta         348/443       Sep 1991
5042069      Chhatwal      704/229       Aug 1991
4972484      Theile        704/200.1     Nov 1990
4964166      Wilson        704/229       Oct 1990
4956871      Swaminathan   704/229       Sep 1990
4949383      Koh           704/229       Aug 1990
4932062      Hamilton      704/233       Jun 1990
4896362      Veldhuis      704/200.1     Jan 1990
4535472      Tomcik        704/229       Aug 1985
4455649      Esteban       370/522       Jun 1984
4184049      Crochiere     704/229       Jan 1980
5105463      Veldhuis      704/200.1     Dec 1969
Claims
 


I claim:

1. A digital encoding apparatus for compressing a digital input signal to provide a compressed digital output signal, the digital input signal representing an audio information signal, the compressed digital output signal, after expansion, conversion to an analog signal and reproduction of the analog signal, being for perception by the human ear, the apparatus comprising:

first frequency dividing means for receiving the digital input signal and for dividing the digital input signal into a plurality of frequency ranges;

time dividing means for dividing in time at least one of the frequency ranges of the digital input signal into a plurality of blocks;

second frequency dividing means for orthogonally transforming each block to provide a plurality of spectral coefficients;

means for grouping the plurality of spectral coefficients into critical bands;

noise factor setting means for setting a noise shaping factor in response to an amplitude of the digital input signal; and

bit allocating means for allocating among the critical bands a total number of quantizing bits available for quantizing the spectral coefficients, the quantizing bits being allocated among the critical bands according to the noise-shaping factor.

2. The digital encoding apparatus of claim 1, wherein

the compressed output signal, after expansion, conversion to an analog signal and reproduction of the analog signal, has quantizing noise, the quantizing noise having a spectrum, the quantizing noise being dependent on the allocation of the quantizing bits among the critical bands, and

the noise factor setting means sets the noise shaping factor such that, as the amplitude of the digital input signal increases, the spectrum of the quantizing noise is flattened.

3. The digital encoding apparatus of claim 1, wherein

the digital input signal has a spectrum having a smoothness, and

the noise factor setting means sets the noise shaping factor in response to the smoothness of the spectrum of the digital input signal.

4. The digital encoding apparatus of claim 3, wherein

the compressed output signal, after expansion, conversion to an analog signal and reproduction of the analog signal, has quantizing noise, the quantizing noise having a spectrum, the quantizing noise being dependent on the allocation of the quantizing bits among the critical bands, and

the noise factor setting means sets the noise shaping factor such that, as the smoothness of the spectrum of the digital input signal increases, the spectrum of the quantizing noise is flattened.

5. The digital encoding apparatus of claim 4, wherein

the apparatus additionally includes a spectral smoothness index generating means for generating a spectral smoothness index in response to the smoothness of the spectrum of the digital input signal, and

the spectral smoothness index generating means derives the spectral smoothness index in response to a measured difference in energy between adjacent critical bands.

6. The digital encoding apparatus of claim 4, wherein

the apparatus additionally includes a spectral smoothness index generating means for generating a spectral smoothness index in response to the smoothness of the spectrum of the digital input signal,

the apparatus additionally comprises a floating point processing means for floating point processing the spectral components and for generating floating point data for each critical band, and

the spectral smoothness index generating means derives the spectral smoothness index in response to a difference in floating point data between adjacent critical bands.

7. The apparatus of claim 3, wherein

the digital input signal additionally has an amplitude, and

the bit allocation means changes the allocation of quantization bits in response to a signal having diminished spectral levels at high frequencies when the amplitude of the digital input signal is small.

8. The apparatus of claim 7, wherein

the digital input signal has a minimum audibility frequency, and

the high frequency spectral levels are diminished for digital input signal amplitudes that are small at frequencies not lower than the minimum audibility frequency.

9. An apparatus for decoding a compressed digital input signal to provide a digital output signal, the compressed digital input signal being derived from a non-compressed digital input signal, the non-compressed digital input signal representing an audio information signal, the compressed digital input signal, after decoding, conversion to an analog signal, and reproduction of the analog signal, being for perception by the human ear, the compressed digital input signal being derived from the non-compressed digital input signal by the steps of:

dividing the non-compressed digital input signal into a plurality of frequency ranges;

dividing in time each of the frequency ranges of the non-compressed digital input signal into a plurality of blocks;

orthogonally transforming each block to provide a plurality of spectral coefficients;

grouping the plurality of spectral coefficients into critical bands;

setting a noise shaping factor in response to an amplitude of the non-compressed digital input signal;

allocating among the critical bands a total number of quantizing bits available for quantizing the spectral coefficients, the quantizing bits being allocated among the critical bands according to the noise-shaping factor;

generating quantizing word length data indicating the number of bits used to quantize the spectral coefficients in each critical band; and

multiplexing the quantized spectral coefficients and the word length data to provide the compressed digital input signal;

the decoder comprising:

demultiplexing means for extracting the quantizing word-length data from the compressed digital input signal and for extracting the spectral coefficients from the compressed digital input signal using the quantizing word-length data,

means for grouping the extracted spectral coefficients into a plurality of frequency ranges;

means for performing an inverse orthogonal transform on the spectral coefficients in each frequency range to generate blocks of time-dependent data in each frequency range; and

means for combining the blocks of time-dependent data in each frequency range to provide the digital output signal.

10. A medium for recording compressed digital data derived from a non-compressed digital input signal by a process including the steps of:

dividing the non-compressed digital input signal into a plurality of frequency ranges;

dividing in time each of the frequency ranges of the non-compressed digital input signal into a plurality of blocks;

orthogonally transforming each block to provide a plurality of spectral coefficients;

grouping the plurality of spectral coefficients into critical bands;

setting a noise shaping factor in response to an amplitude of the non-compressed digital input signal;

allocating among the critical bands a total number of quantizing bits available for quantizing the spectral coefficients, the quantizing bits being allocated among the critical bands according to the noise-shaping factor; and

multiplexing the quantized spectral coefficients and quantizing word length data to provide the compressed digital data.

11. The medium of claim 10, wherein, in the process:

the non-compressed digital input signal has a spectrum having a smoothness, and

the step of setting a noise factor includes setting the noise shaping factor in response to the smoothness of the spectrum of the non-compressed digital input signal.

12. A method for deriving compressed digital data from a non-compressed digital input signal, the method including the steps of:

dividing the non-compressed digital input signal into a plurality of frequency ranges;

dividing in time each of the frequency ranges of the non-compressed digital input signal into a plurality of blocks;

orthogonally transforming each block to provide a plurality of spectral coefficients;

grouping the plurality of spectral coefficients into critical bands;

setting a noise shaping factor in response to an amplitude of the non-compressed digital input signal;

allocating among the critical bands a total number of quantizing bits available for quantizing the spectral coefficients, the quantizing bits being allocated among the critical bands according to the noise-shaping factor; and

multiplexing the quantized spectral coefficients and quantizing word length data to provide the compressed digital data.

13. The method of claim 12, wherein:

the non-compressed digital input signal has a spectrum having a smoothness, and

the step of setting a noise factor includes setting the noise shaping factor in response to the smoothness of the spectrum of the non-compressed digital input signal.
Description
 


FIELD OF THE INVENTION

The invention relates to a digital encoder circuit for compressing a digital input signal to reduce the number of bits required to represent an analog information signal.

BACKGROUND OF THE INVENTION

A variety of techniques exist for digitally encoding audio or speech signals using bit rates considerably lower than those required for pulse-code modulation (PCM). In sub-band coding (SBC), a filter bank divides the frequency band of the audio signal into a plurality of sub bands. In sub-band coding, the signal is not formed into frames along the time axis prior to coding. In transform encoding, a frame of digital signals representing the audio signal on the time axis is converted by an orthogonal transform into a block of spectral coefficients representing the audio signal on the frequency axis.

In a combination of sub-band coding and transform coding, digital signals representing the audio signal are divided into a plurality of frequency ranges by sub-band coding, and transform coding is independently applied to each of the frequency ranges.

Known filters for dividing a frequency spectrum into a plurality of frequency ranges include the Quadrature Mirror Filter (QMF), as discussed in, for example, R. E. Crochiere, Digital Coding of Speech in Subbands, 55 BELL SYST. TECH. J., No. 8, (1976). The technique of dividing a frequency spectrum into equal-width frequency ranges is discussed in Joseph H. Rothweiler, Polyphase Quadrature Filters--A New Subband Coding Technique, ICASSP 83 BOSTON.

Known techniques for orthogonal transform include the technique of dividing the digital input audio signal into frames of a predetermined time duration, and processing the resulting frames using a Fast Fourier Transform (FFT), discrete cosine transform (DCT) or modified DCT (MDCT) to convert the signals from the time axis to the frequency axis. A discussion of the MDCT may be found in J. P. Princen and A. B. Bradley, Subband/Transform Coding Using Filter Bank Based on Time Domain Aliasing Cancellation, ICASSP 1987.
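The time domain aliasing cancellation property of the MDCT mentioned above can be illustrated with a minimal sketch. It assumes NumPy, a sine window, 50% overlap between successive frames, and a 2/N scaling of the inverse transform; these choices and all function names are illustrative assumptions and are not taken from the patent text.

```python
import numpy as np

def mdct_basis(n_coeffs):
    """N x 2N cosine matrix of the MDCT, where N = n_coeffs."""
    n = np.arange(2 * n_coeffs)
    k = np.arange(n_coeffs)
    return np.cos(np.pi / n_coeffs * np.outer(k + 0.5, n + 0.5 + n_coeffs / 2))

def sine_window(n_coeffs):
    """Sine window of length 2N; its squared halves sum to one."""
    n = np.arange(2 * n_coeffs)
    return np.sin(np.pi * (n + 0.5) / (2 * n_coeffs))

def mdct(frame):
    """Transform 2N windowed time samples into N spectral coefficients."""
    n_coeffs = len(frame) // 2
    return mdct_basis(n_coeffs) @ (sine_window(n_coeffs) * frame)

def imdct(coeffs):
    """Transform N coefficients back into 2N windowed, aliased time samples."""
    n_coeffs = len(coeffs)
    aliased = (2.0 / n_coeffs) * (mdct_basis(n_coeffs).T @ coeffs)
    return sine_window(n_coeffs) * aliased

def analysis_synthesis(signal, n_coeffs):
    """MDCT then IMDCT on 50%-overlapped frames, followed by overlap-add."""
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - 2 * n_coeffs + 1, n_coeffs):
        frame = signal[start:start + 2 * n_coeffs]
        out[start:start + 2 * n_coeffs] += imdct(mdct(frame))
    return out
```

Each frame on its own is not invertible, because 2N samples are mapped to N coefficients. The aliasing terms of neighbouring frames cancel in the overlap-add, so every sample covered by two frames is recovered; the first and last N samples of the signal are covered by only one frame and are not.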

In a technique of quantizing the spectral coefficients resulting from an orthogonal transform, it is known to use sub bands that take advantage of the psychoacoustic characteristics of the human auditory system. In this technique, spectral coefficients representing an audio signal on the frequency axis may be divided into a plurality of critical frequency bands. The width of the critical bands increases with increasing frequency. Normally, about 25 critical bands are used to cover the audio frequency spectrum of 0 Hz to 20 kHz. In such a quantizing system, bits are adaptively allocated among the various critical bands. For example, when applying adaptive bit allocation to the spectral coefficient data resulting from a MDCT, the spectral coefficient data generated by the MDCT within each of the critical bands is quantized using an adaptively-allocated number of bits.
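The grouping of spectral coefficients into critical bands can be sketched as follows. This is a minimal illustration, assuming NumPy, a single full-band transform whose coefficients span 0 Hz to half the sampling frequency, and the commonly tabulated Zwicker critical-band edges with an assumed top edge of 20 kHz so that 25 bands result. The band table actually used by the encoder is not given in this section, and the names BAND_EDGES_HZ and band_energies are hypothetical.

```python
import numpy as np

# Commonly tabulated critical-band edges in Hz. The final 20 kHz edge is an
# assumption made here so that the table yields 25 bands covering 0-20 kHz.
BAND_EDGES_HZ = np.array([
    0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
    2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000,
    15500, 20000,
])

def band_energies(coeffs, sample_rate_hz):
    """Sum the squared spectral coefficients that fall in each critical band."""
    coeffs = np.asarray(coeffs, dtype=float)
    n_bands = len(BAND_EDGES_HZ) - 1
    # Centre frequency of coefficient k for a transform spanning 0..fs/2.
    freqs = (np.arange(len(coeffs)) + 0.5) * (sample_rate_hz / 2) / len(coeffs)
    band_index = np.searchsorted(BAND_EDGES_HZ, freqs, side="right") - 1
    # Coefficients above the top edge are folded into the last band.
    band_index = np.clip(band_index, 0, n_bands - 1)
    return np.bincount(band_index, weights=coeffs ** 2, minlength=n_bands)
```

Because the bands widen with frequency, a flat spectrum places far more coefficients, and hence more energy, in the upper bands than in the lowest ones.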

Known adaptive bit allocation techniques include that described in IEEE TRANS. ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-25, No. 4 (August, 1977) in which bit allocation is carried out on the basis of the amplitude of the signal in each critical band. This technique produces a flat quantization noise spectrum and minimizes noise energy, but the noise level perceived by the listener is not optimum because the technique does not effectively exploit the psychoacoustic masking effect.

In the bit allocation technique described in M. A. Krassner, The Critical Band Encoder--Digital Encoding of the Perceptual Requirements of the Auditory System, ICASSP 1980, the psychoacoustic masking mechanism is used to determine a fixed bit allocation that produces the necessary signal-to-noise ratio for each critical band. However, if the signal-to-noise ratio of such a system is measured using a strongly tonal signal, for example, a 1 kHz sine wave, non-optimum results are obtained because of the fixed allocation of bits among the critical bands.

It is also known that, to optimize the perceived noise level using the amplitude-based bit allocation technique discussed above, the spectrum of the quantizing noise can be adapted to the human auditory sense by using a fixed noise shaping factor. Bit allocation is carried out in accordance with the following formula:

$$b(k) = \delta + \frac{1}{2}\log_2\left\{\frac{\sigma^{2(1+\gamma)}(k)}{D}\right\} \qquad (1)$$

where $b(k)$ is the word length of the quantized spectral coefficients in the k'th critical band, $\delta$ is an optimum bias, $\sigma^2(k)$ is the signal power in the k'th critical band, $D$ is the mean quantization error power over the entire frequency spectrum, and $\gamma$ is the noise shaping factor. To find the optimum value of $b(k)$ for each critical band, the value of $\delta$ is changed so that the sum of the $b(k)$ values for all the critical bands is equal to, or just less than, the total number of bits available for quantization.

This technique does not allow bits to be concentrated sufficiently within a single critical band, so unsatisfactory results are obtained when the signal-to-noise ratio is measured using a high tonality signal, such as a 1 kHz sine wave.
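Equation (1) and the adjustment of the bias described above can be sketched as follows. This is a minimal illustration, assuming NumPy and strictly positive band powers. Rounding each word length down to an integer, clamping it at zero, and searching for the bias by bisection over an arbitrary range are assumptions made for this sketch; the text specifies only that the bias is changed until the sum of the word lengths fits the bit budget.

```python
import numpy as np

def word_lengths(band_power, delta, gamma, mean_noise_power):
    """Evaluate equation (1) for every band; band_power holds sigma^2(k)."""
    band_power = np.asarray(band_power, dtype=float)
    b = delta + 0.5 * np.log2(band_power ** (1.0 + gamma) / mean_noise_power)
    # Assumption: word lengths are non-negative integers.
    return np.maximum(np.floor(b), 0).astype(int)

def allocate_bits(band_power, total_bits, gamma=0.0, mean_noise_power=1.0):
    """Find by bisection the largest bias whose word lengths fit total_bits."""
    low, high = -64.0, 64.0  # assumed search range for the bias
    for _ in range(60):
        mid = 0.5 * (low + high)
        used = word_lengths(band_power, mid, gamma, mean_noise_power).sum()
        if used <= total_bits:
            low = mid   # still within budget, try a larger bias
        else:
            high = mid  # over budget, reduce the bias
    return word_lengths(band_power, low, gamma, mean_noise_power)
```

The bias shifts every band by the same amount, so a single dominant band can only gain bits together with all the others. This is the limitation noted above for high tonality signals.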

OBJECTS AND SUMMARY OF THE INVENTION

It can be seen from the foregoing that, if quantization noise is minimized by allocating bits among the critical bands according to the amplitude of the signal in each respective critical band, the quantization noise perceived by the listener is not minimized. It can also be seen that if fixed numbers of bits are allocated among the critical bands, taking into account psychoacoustic masking, the signal-to-noise ratio is unsatisfactory when measured using a high-tonality signal, such as a 1 kHz sine wave.

Accordingly, it is an object of the present invention to provide a circuit in which bits are allocated among the critical bands such that the quantization noise perceived by a human listener is minimized, and that a satisfactory signal-to-noise ratio can be measured using a high-tonality input signal, such as a 1 kHz sine wave.

According to a first aspect of the invention, a digital encoding apparatus for compressing a digital input signal to provide a compressed digital output signal is provided. The apparatus includes a first frequency dividing device that receives the digital input signal and divides the digital input signal into a plurality of frequency ranges. A time dividing device divides at least one of the frequency ranges of the digital input signal in time. The result of this time division is a plurality of frames. A second frequency dividing device orthogonally transforms each frame to provide a plurality of spectral coefficients. A device groups the plurality of spectral coefficients into critical bands. A bit allocating device allocates the total number of quantizing bits available for quantizing the spectral coefficients among the critical bands. The total number of bits includes fixed bits, which are allocated among the critical bands according to a selected one of a plurality of predetermined bit allocation patterns. The total number of bits also includes variable bits, which are allocated among the critical bands according to signal energy in the critical bands. Finally, the apparatus includes a device that allocates the variable bits among the critical bands in response to signal energy in a data block derived by dividing the digital input signal in time and in frequency.

In a first variation, the number of fixed bits is constant, and the number of variable bits is constant.

In a second variation, the bit allocation device includes a device for apportioning the total number of quantizing bits available for quantizing the spectral coefficients between fixed bits and variable bits. The apportionment is made in response to the smoothness of the spectrum of the digital input signal.
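The apportionment between fixed bits and variable bits in the second variation can be sketched as follows. This is a rough illustration, assuming NumPy and band energies expressed in decibels. The smoothness index is derived here from the mean energy difference between adjacent critical bands, in the spirit of claim 5, but its exact mapping, the direction of the apportionment (a smoother spectrum receives more fixed bits), the proportional rule for the variable bits, and all names are assumptions and are not taken from the patent text.

```python
import numpy as np

def smoothness_index(band_energy_db):
    """Map the mean adjacent-band energy step to (0, 1]; 1 means a flat spectrum."""
    mean_step_db = np.mean(np.abs(np.diff(band_energy_db)))
    return 1.0 / (1.0 + mean_step_db / 10.0)  # hypothetical mapping

def allocate_fixed_and_variable(total_bits, fixed_pattern, band_energy_db):
    """Split total_bits into fixed and variable parts, then spread each part."""
    band_energy_db = np.asarray(band_energy_db, dtype=float)
    pattern = np.asarray(fixed_pattern, dtype=float)
    # Assumption: the smoother the spectrum, the larger the fixed share.
    fixed_total = int(round(float(total_bits * smoothness_index(band_energy_db))))
    variable_total = total_bits - fixed_total
    # Fixed bits follow the selected predetermined pattern.
    fixed = fixed_total * pattern / pattern.sum()
    # Variable bits follow the band energy relative to the weakest band.
    weights = band_energy_db - band_energy_db.min()
    if weights.sum() == 0.0:
        weights = np.ones_like(weights)
    variable = variable_total * weights / weights.sum()
    return fixed + variable  # fractional bits; rounding is left out of this sketch
```

With a flat spectrum the whole budget follows the fixed pattern, while a single strong band draws most of the budget through the variable bits, which is the behaviour sought for highly tonal signals.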

In a second embodiment of the invention, a digital encoding apparatus for compressing a digital input signal to provide a compressed digital output signal is provided. The digital input signal represents an audio information signal, and the compressed digital output signal, after expansion, conversion to an analog signal, and reproduction of the analog signal, is for perception by the human ear. The second embodiment of the apparatus comprises a first frequency dividing device that receives the digital input signal and divides the digital input signal into a plurality of frequency ranges. A time dividing device divides in time at least one of the frequency ranges of the digital input signal. The result of the time division is a plurality of frames. A second frequency dividing device orthogonally transforms each frame to provide a plurality of spectral coefficients. A device groups the plurality of spectral coefficients into critical bands. A noise factor setting device sets a noise shaping factor in response to the digital input signal. Finally, a bit allocating device allocates the total number of quantizing bits available for quantizing the spectral coefficients among the critical bands. The quantizing bits are allocated among the critical bands according to the noise-shaping factor.

In a first method according to the invention for deriving compressed digital data from a non-compressed digital input signal, the non-compressed digital input signal is divided into a plurality of frequency ranges. Each of the frequency ranges of the non-compressed digital input signal is divided in time into a plurality of frames. Each frame is orthogonally transformed to provide a plurality of spectral coefficients. The plurality of spectral coefficients is grouped into critical bands. The total number of quantizing bits available for quantizing the spectral coefficients is allocated among the critical bands. The total number of bits includes fixed bits that are allocated among the critical bands according to a selected one of a plurality of predetermined bit allocation patterns. The total number of bits also includes variable bits that are allocated among the critical bands according to signal energy in the critical bands. Finally, the quantized spectral coefficients and quantizing word length data are multiplexed to provide the compressed digital data.

In a second method according to the invention of deriving compressed digital data from a non-compressed digital input signal, the non-compressed digital input signal is divided into a plurality of frequency ranges. Each of the frequency ranges of the non-compressed digital input signal is divided in time into a plurality of frames. Each frame is orthogonally transformed to provide a plurality of spectral coefficients. The plurality of spectral coefficients is grouped into critical bands. A noise shaping factor is set in response to the non-compressed digital input signal. The total number of quantizing bits available for quantizing the spectral coefficients is allocated among the critical bands according to the noise-shaping factor. Finally, the quantized spectral coefficients and quantizing word length data are multiplexed to provide the compressed digital data.

The invention also encompasses a medium for recording compressed digital data derived from a non-compressed digital input signal according to either of the two methods set forth above.

Finally, the invention encompasses a decoding apparatus for expanding compressed digital data derived from a non-compressed digital input signal according to either of the methods set forth above. The decoding apparatus according to the invention comprises a demultiplexer that extracts the quantizing word length data from the compressed digital data and extracts the spectral coefficients from the compressed digital input signal using the quantizing word length data. A device groups the extracted spectral coefficients into a plurality of frequency ranges. A device performs an inverse orthogonal transform on the spectral coefficients in each frequency range to generate frames of time-dependent data in each frequency range. Finally, a device combines the frames of time-dependent data in each frequency range to provide the digital output signal.
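The role of the quantizing word length data in multiplexing and demultiplexing can be sketched as follows. This is a toy illustration in plain Python, assuming a hypothetical layout in which each critical band is written as a 4-bit word length followed by its coefficients as two's complement values of that length, with the number of coefficients per band known to both sides. The actual bitstream format is not described in this section.

```python
def pack(bands):
    """bands: list of (word_length, [signed ints]); returns a string of '0'/'1'."""
    bits = []
    for word_length, values in bands:
        bits.append(format(word_length, "04b"))  # word length data for the band
        if word_length == 0:
            continue  # no bits allocated, so no coefficients are transmitted
        mask = (1 << word_length) - 1
        for value in values:
            bits.append(format(value & mask, "0{}b".format(word_length)))
    return "".join(bits)

def unpack(bitstring, coeffs_per_band):
    """Read each word length first, then use it to extract the band's coefficients."""
    position = 0
    bands = []
    for count in coeffs_per_band:
        word_length = int(bitstring[position:position + 4], 2)
        position += 4
        values = []
        for _ in range(count):
            if word_length == 0:
                values.append(0)
                continue
            raw = int(bitstring[position:position + word_length], 2)
            position += word_length
            if raw >= 1 << (word_length - 1):
                raw -= 1 << word_length  # undo two's complement
            values.append(raw)
        bands.append((word_length, values))
    return bands
```

The decoder cannot locate the coefficient boundaries without the word lengths, which is why the demultiplexer extracts the word length data before the spectral coefficients.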

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block circuit diagram of an encoding apparatus according to the present invention.

FIG. 2 shows a practical example of how the digital input signal is divided in frequency and time in the circuit shown in FIG. 1.

FIG. 3 is a block diagram illustrating the bit allocation circuit of the adaptive bit allocation and encoding circuit of FIG. 1. The bit allocation circuit has a fixed ratio between fixed bits and variable bits.

FIG. 4 shows a Bark spectrum.

FIG. 5 is a graph showing an example of how the circuit shown in FIG. 1 allocates bits to a signal having a relatively flat spectrum.

FIG. 6 is a graph showing the quantization noise spectrum for the signal shown in FIG. 5.

FIG. 7 is a