| United States Patent | 5621856 |
| Link to this page | http://www.wikipatents.com/5621856.html |
| Inventor(s) | Akagiri; Kenzo (Kanagawa, JP) |
| Abstract | A digital encoder for compressing a digital input signal derived from an
analog signal to reduce the number of bits required to represent the
analog signal with low quantizing noise. In the encoder, a digital input
signal representing the analog signal is divided into three frequency
ranges. The digital signal in each of the three frequency ranges is
divided in time into frames, and subdivided into blocks, the time duration
of which may be adaptively varied. The blocks are orthogonally transformed
into spectral coefficients, which are grouped into critical bands. The
total number of bits available for quantizing the spectral coefficients is
allocated among the critical bands. In a first embodiment and a second
embodiment, fixed bits are allocated among the critical bands according to
a selected one of a plurality of predetermined bit allocation patterns and
variable bits are allocated among the critical bands according to the
energy in the critical bands. In the first embodiment, the apportionment
between fixed bits and variable bits is fixed. In a second embodiment, the
apportionment between fixed bits and variable bits is varied according to
the smoothness of the spectrum of the input signal. In a third embodiment,
bits are allocated among the critical bands according to a noise shaping
factor that is varied according to the smoothness of the spectrum of the
input signal. All three embodiments give low quantizing noise with both
broad spectrum signals and highly tonal signals. |
| Title | Digital encoder with dynamic quantization bit allocation |
| Publication Date | April 15, 1997 |
| Parent Case |
This is a divisional of application Ser. No. 08/272,872, filed Jul. 8,
1994; which is a continuation of Ser. No. 07/924,298, filed Aug. 3, 1992,
now abandoned. |
| Priority Data |
Aug. 2, 1991 [JP] 3-216216
Aug. 2, 1991 [JP] 3-216217
Aug. 27, 1991 [JP] 3-271774 |
References  |

U.S. References

| Patent No. | Inventor | Class | Date |
| 5268685 | Fujiwara | 341/76 | Dec. 1993 |
| 5264846 | Oikawa | 341/76 | Nov. 1993 |
| 5235671 | Mazor | | Aug. 1993 |
| 5222189 | Fielder | 704/229 | Jun. 1993 |
| 5166686 | Sugiyama | | Nov. 1992 |
| 5157760 | Akagiri | 704/233 | Oct. 1992 |
| 5151941 | Nishiguchi | 704/233 | Sep. 1992 |
| 5142656 | Fielder | 704/229 | Aug. 1992 |
| 5134475 | Johnston | 375/240.12 | Jul. 1992 |
| 5125030 | Nomura | 704/222 | Jun. 1992 |
| 5117228 | Fuchigami | 341/200 | May 1992 |
| 5115240 | Fujiwara | 341/51 | May 1992 |
| 5109417 | Fielder | 704/205 | Apr. 1992 |
| 5049992 | Citta | 348/443 | Sep. 1991 |
| 5042069 | Chhatwal | 704/229 | Aug. 1991 |
| 4972484 | Theile | 704/200.1 | Nov. 1990 |
| 4964166 | Wilson | 704/229 | Oct. 1990 |
| 4956871 | Swaminathan | 704/229 | Sep. 1990 |
| 4949383 | Koh | 704/229 | Aug. 1990 |
| 4932062 | Hamilton | 704/233 | Jun. 1990 |
| 4896362 | Veldhuis | 704/200.1 | Jan. 1990 |
| 4535472 | Tomcik | 704/229 | Aug. 1985 |
| 4455649 | Esteban | 370/522 | Jun. 1984 |
| 4184049 | Crochiere | 704/229 | Jan. 1980 |
| 5105463 | Veldhuis | 704/200.1 | Dec. 1969 |
Claims  |
|
|
I claim:
1. A digital encoding apparatus for compressing a digital input signal to
provide a compressed digital output signal, the digital input signal
representing an audio information signal, the compressed digital output
signal, after expansion, conversion to an analog signal and reproduction
of the analog signal, being for perception by the human ear, the apparatus
comprising:
first frequency dividing means for receiving the digital input signal and
for dividing the digital input signal into a plurality of frequency
ranges;
time dividing means for dividing in time at least one of the frequency
ranges of the digital input signal into a plurality of blocks;
second frequency dividing means for orthogonally transforming each block to
provide a plurality of spectral coefficients;
means for grouping the plurality of spectral coefficients into critical
bands;
noise factor setting means for setting a noise shaping factor in response
to an amplitude of the digital input signal; and
bit allocating means for allocating among the critical bands a total number
of quantizing bits available for quantizing the spectral coefficients, the
quantizing bits being allocated among the critical bands according to the
noise-shaping factor.
2. The digital encoding apparatus of claim 1, wherein
the compressed output signal, after expansion, conversion to an analog
signal and reproduction of the analog signal, has quantizing noise, the
quantizing noise having a spectrum, the quantizing noise being dependent
on the allocation of the quantizing bits among the critical bands, and
the noise factor setting means sets the noise shaping factor such that, as
the amplitude of the digital input signal increases, the spectrum of the
quantizing noise is flattened.
3. The digital encoding apparatus of claim 1, wherein
the digital input signal has a spectrum having a smoothness, and
the noise factor setting means sets the noise shaping factor in response to
the smoothness of the spectrum of the digital input signal.
4. The digital encoding apparatus of claim 3, wherein
the compressed output signal, after expansion, conversion to an analog
signal and reproduction of the analog signal, has quantizing noise, the
quantizing noise having a spectrum, the quantizing noise being dependent
on the allocation of the quantizing bits among the critical bands, and
the noise factor setting means sets the noise shaping factor such that, as
the smoothness of the spectrum of the digital input signal increases, the
spectrum of the quantizing noise is flattened.
5. The digital encoding apparatus of claim 4, wherein
the apparatus additionally includes a spectral smoothness index generating
means for generating a spectral smoothness index in response to the
smoothness of the spectrum of the digital input signal, and
the spectral smoothness index generating means derives the spectral
smoothness index in response to a measured difference in energy between
adjacent critical bands.
6. The digital encoding apparatus of claim 4, wherein
the apparatus additionally includes a spectral smoothness index generating
means for generating a spectral smoothness index in response to the
smoothness of the spectrum of the digital input signal,
the apparatus additionally comprises a floating point processing means for
floating point processing the spectral components and for generating
floating point data for each critical band, and
the spectral smoothness index generating means derives the spectral
smoothness index in response to a difference in floating point data
between adjacent critical bands.
7. The apparatus of claim 3, wherein
the digital input signal additionally has an amplitude, and
the bit allocation means changes the allocation of quantization bits in
response to a signal having diminished spectral levels at high frequencies
when the amplitude of the digital input signal is small.
8. The apparatus of claim 7, wherein
the digital input signal has a minimum audibility frequency, and
the high frequency spectral levels are diminished for digital input signal
amplitudes that are small at frequencies not lower than the minimum
audibility frequency.
9. An apparatus for decoding a compressed digital input signal to provide a
digital output signal, the compressed digital input signal being derived
from a non-compressed digital input signal, the non-compressed digital
input signal representing an audio information signal, the compressed
digital input signal, after decoding, conversion to an analog signal, and
reproduction of the analog signal, being for perception by the human ear,
the compressed digital input signal being derived from the non-compressed
digital input signal by the steps of:
dividing the non-compressed digital input signal into a plurality of
frequency ranges;
dividing in time each of the frequency ranges of the non-compressed digital
input signal into a plurality of blocks;
orthogonally transforming each block to provide a plurality of spectral
coefficients;
grouping the plurality of spectral coefficients into critical bands;
setting a noise shaping factor in response to an amplitude of the
non-compressed digital input signal;
allocating among the critical bands a total number of quantizing bits
available for quantizing the spectral coefficients, the quantizing bits
being allocated among the critical bands according to the noise-shaping
factor;
generating quantizing word length data indicating the number of bits used
to quantize the spectral coefficients in each critical band; and
multiplexing the quantized spectral coefficients and the word length data
to provide the compressed digital input signal;
the decoder comprising:
demultiplexing means for extracting the quantizing word-length data from
the compressed digital input signal and for extracting the spectral
coefficients from the compressed digital input signal using the quantizing
word-length data,
means for grouping the extracted spectral coefficients into a plurality of
frequency ranges;
means for performing an inverse orthogonal transform on the spectral
coefficients in each frequency range to generate blocks of time-dependent
data in each frequency range; and
means for combining the blocks of time-dependent data in each frequency
range to provide the digital output signal.
10. A medium for recording compressed digital data derived from a
non-compressed digital input signal by a process including the steps of:
dividing the non-compressed digital input signal into a plurality of
frequency ranges;
dividing in time each of the frequency ranges of the non-compressed digital
input signal into a plurality of blocks;
orthogonally transforming each block to provide a plurality of spectral
coefficients;
grouping the plurality of spectral coefficients into critical bands;
setting a noise shaping factor in response to an amplitude of the
non-compressed digital input signal;
allocating among the critical bands a total number of quantizing bits
available for quantizing the spectral coefficients, the quantizing bits
being allocated among the critical bands according to the noise-shaping
factor; and
multiplexing the quantized spectral coefficients and quantizing word length
data to provide the compressed digital data.
11. The medium of claim 10, wherein, in the process:
the non-compressed digital input signal has a spectrum having a smoothness,
and
the step of setting a noise factor includes setting the noise shaping
factor in response to the smoothness of the spectrum of the non-compressed
digital input signal.
12. A method for deriving compressed digital data from a non-compressed
digital input signal, the method including the steps of:
dividing the non-compressed digital input signal into a plurality of
frequency ranges;
dividing in time each of the frequency ranges of the non-compressed digital
input signal into a plurality of blocks;
orthogonally transforming each block to provide a plurality of spectral
coefficients;
grouping the plurality of spectral coefficients into critical bands;
setting a noise shaping factor in response to an amplitude of the
non-compressed digital input signal;
allocating among the critical bands a total number of quantizing bits
available for quantizing the spectral coefficients, the quantizing bits
being allocated among the critical bands according to the noise-shaping
factor; and
multiplexing the quantized spectral coefficients and quantizing word length
data to provide the compressed digital data.
13. The method of claim 12, wherein:
the non-compressed digital input signal has a spectrum having a smoothness,
and
the step of setting a noise factor includes setting the noise shaping
factor in response to the smoothness of the spectrum of the non-compressed
digital input signal. |
Description  |
|
|
FIELD OF THE INVENTION
The invention relates to a digital encoder circuit for compressing a
digital input signal to reduce the number of bits required to represent an
analog information signal.
BACKGROUND OF THE INVENTION
A variety of techniques exist for digitally encoding audio or speech
signals using bit rates considerably lower than those required for
pulse-code modulation (PCM). In sub-band coding (SBC), a filter bank
divides the frequency band of the audio signal into a plurality of sub
bands. In sub-band coding, the signal is not formed into frames along the
time axis prior to coding. In transform encoding, a frame of digital
signals representing the audio signal on the time axis is converted by an
orthogonal transform into a block of spectral coefficients representing
the audio signal on the frequency axis.
In a combination of sub-band coding and transform coding, digital signals
representing the audio signal are divided into a plurality of frequency
ranges by sub-band coding, and transform coding is independently applied
to each of the frequency ranges.
Known filters for dividing a frequency spectrum into a plurality of
frequency ranges include the Quadrature Mirror Filter (QMF), as discussed
in, for example, R. E. Crochiere, Digital Coding of Speech in Subbands, 55
BELL SYST. TECH. J., No. 8, (1976). The technique of dividing a frequency
spectrum into equal-width frequency ranges is discussed in Joseph H.
Rothweiler, Polyphase Quadrature Filters--A New Subband Coding Technique,
ICASSP 83 BOSTON.
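As an illustration of the band-splitting idea described above, the following is a minimal sketch of a two-band quadrature mirror filter pair built from the shortest possible (2-tap, Haar) filters, cascaded once more on the low band to give three frequency ranges. The filter choice, the cascade layout, and the function names are assumptions made for illustration; they are not taken from the patent or from the cited papers, and practical coders use much longer filters.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # scaling that makes the 2-tap pair orthonormal


def qmf_analysis(x):
    """Split an even-length signal into low and high bands, each decimated by 2.

    The high-pass taps are the low-pass taps with alternating sign, which is
    the defining quadrature mirror relation h1[n] = (-1)^n * h0[n].
    """
    low = [_S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [_S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high


def qmf_synthesis(low, high):
    """Recombine the two decimated bands into the full-rate signal."""
    out = []
    for lo, hi in zip(low, high):
        out += [_S * (lo + hi), _S * (lo - hi)]
    return out


def split_three_ranges(x):
    """Hypothetical cascade: split once, then split the low band again.

    Returns (lowest quarter, second quarter, upper half) of the spectrum.
    len(x) must be a multiple of 4.
    """
    low, high = qmf_analysis(x)
    low_low, low_high = qmf_analysis(low)
    return low_low, low_high, high


def merge_three_ranges(low_low, low_high, high):
    """Inverse of split_three_ranges."""
    return qmf_synthesis(qmf_synthesis(low_low, low_high), high)
```

With these 2-tap filters the split is exactly invertible, which makes the sketch easy to check; the frequency selectivity is poor compared with the longer filters discussed in the cited literature.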
Known techniques for orthogonal transform include the technique of dividing
the digital input audio signal into frames of a predetermined time
duration, and processing the resulting frames using a Fast Fourier
Transform (FFT), discrete cosine transform (DCT) or modified DCT (MDCT) to
convert the signals from the time axis to the frequency axis. Discussion
of a MDCT may be found in J. P. Princen and A. B. Bradley,
Subband/Transform Coding Using Filter Bank Based on Time Domain Aliasing
Cancellation, ICASSP 1987.
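The time-domain aliasing cancellation property of the MDCT can be sketched as follows. This is a direct O(N^2) evaluation of the textbook MDCT definition with a sine window, chosen here only because it is short and easy to verify; the window, the 2/N synthesis scaling, and the function names are assumptions of this sketch, not details from the patent.

```python
import math


def sine_window(length):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Transform 2N windowed time samples into N spectral coefficients."""
    two_n = len(block)
    n = two_n // 2
    w = sine_window(two_n)
    return [
        sum(
            w[i] * block[i]
            * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i in range(two_n)
        )
        for k in range(n)
    ]


def imdct(coeffs):
    """Inverse transform: N coefficients back to 2N windowed, aliased samples."""
    n = len(coeffs)
    two_n = 2 * n
    w = sine_window(two_n)
    return [
        w[i] * (2.0 / n) * sum(
            coeffs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for k in range(n)
        )
        for i in range(two_n)
    ]


def reconstruct_middle(signal, n):
    """Overlap-add two 50%-overlapped blocks of a 3N-sample signal.

    The aliasing terms of the two blocks cancel in the overlap region, so the
    middle N samples are recovered exactly (up to rounding).
    """
    first = imdct(mdct(signal[0:2 * n]))
    second = imdct(mdct(signal[n:3 * n]))
    return [first[n + i] + second[i] for i in range(n)]
```

Each block of 2N samples yields only N coefficients, so consecutive blocks can overlap by half without increasing the number of values to be quantized.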
In a technique of quantizing the spectral coefficients resulting from an
orthogonal transform, it is known to use sub bands that take advantage of
the psychoacoustic characteristics of the human auditory system. In this,
spectral coefficients representing an audio signal on the frequency axis
may be divided into a plurality of critical frequency bands. The width of
the critical bands increases with increasing frequency. Normally, about 25
critical bands are used to cover the audio frequency spectrum of 0 Hz to
20 kHz. In such a quantizing system, bits are adaptively allocated among
the various critical bands. For example, when applying adaptive bit
allocation to the spectral coefficient data resulting from a MDCT, the
spectral coefficient data generated by the MDCT within each of the
critical bands is quantized using an adaptively-allocated number of bits.
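The grouping of spectral coefficients into critical bands can be sketched as below. The band edges are the commonly tabulated Zwicker critical band edges, with a 25th band assumed to run from 15.5 kHz to 20 kHz; the text above only says that about 25 bands cover 0 Hz to 20 kHz, so the exact edges, the handling of coefficients above 20 kHz, and the function names are assumptions of this sketch.

```python
# Assumed critical band edges in Hz (26 edges define 25 bands).
BAND_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
                 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700,
                 9500, 12000, 15500, 20000]


def band_index(freq_hz):
    """Index of the critical band containing freq_hz.

    Frequencies at or above the last edge are placed in the top band.
    """
    for b in range(len(BAND_EDGES_HZ) - 1):
        if freq_hz < BAND_EDGES_HZ[b + 1]:
            return b
    return len(BAND_EDGES_HZ) - 2


def group_into_bands(coeffs, sample_rate_hz):
    """Group coefficients spanning 0..fs/2 into critical bands.

    Returns (bands, energies): the coefficients in each band and the sum of
    their squares, which serves as the band energy.
    """
    n = len(coeffs)
    num_bands = len(BAND_EDGES_HZ) - 1
    bands = [[] for _ in range(num_bands)]
    for k, c in enumerate(coeffs):
        centre_hz = (k + 0.5) * (sample_rate_hz / 2.0) / n
        bands[band_index(centre_hz)].append(c)
    energies = [sum(c * c for c in band) for band in bands]
    return bands, energies
```

Because the bands widen with frequency, the low bands receive only a few coefficients each while the top bands receive many.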
Known adaptive bit allocation techniques include that described in IEEE
TRANS. ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-25, No. 4
(August, 1977) in which bit allocation is carried out on the basis of the
amplitude of the signal in each critical band. This technique produces a
flat quantization noise spectrum and minimizes noise energy, but the noise
level perceived by the listener is not optimum because the technique does
not effectively exploit the psychoacoustic masking effect.
In the bit allocation technique described in M. A. Krassner, The Critical
Band Encoder--Digital Encoding of the Perceptual Requirements of the
Auditory System, ICASSP 1980, the psychoacoustic masking mechanism is used
to determine a fixed bit allocation that produces the necessary
signal-to-noise ratio for each critical band. However, if the
signal-to-noise ratio of such a system is measured using a strongly tonal
signal, for example, a 1 kHz sine wave, non-optimum results are obtained
because of the fixed allocation of bits among the critical bands.
It is also known that, to optimize the perceived noise level using the
amplitude-based bit allocation technique discussed above, the spectrum of
the quantizing noise can be adapted to the human auditory sense by using a
fixed noise shaping factor. Bit allocation is carried out in accordance
with the following formula:
$$b(k) = \delta + \frac{1}{2}\log_2\left\{\frac{\sigma^{2(1+\gamma)}(k)}{D}\right\} \qquad (1)$$
where $b(k)$ is the word length of the quantized spectral coefficients in the
k'th critical band, $\delta$ is an optimum bias, $\sigma^2(k)$ is the
signal power in the k'th critical band, $D$ is the mean quantization error
power over the entire frequency spectrum, and $\gamma$ is the noise
shaping factor. To find the optimum value of $b(k)$ for each critical band,
the value of $\delta$ is changed so that the sum of the $b(k)$ values for all the
critical bands is equal to, or just less than, the total number of bits
available for quantization.
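The search for the bias in equation (1) can be sketched as a bisection. Because D is the same for every band, the term involving D is a constant that this sketch folds into the bias. Rounding word lengths down to integers, clamping them to a maximum, and weighting each band by its number of coefficients when counting bits are assumptions of the sketch, as are all names.

```python
import math


def allocate_bits(band_power, coeffs_per_band, total_bits, gamma, max_word=16):
    """Per-band word lengths b(k) following equation (1).

    band_power[k] is the signal power in band k. The bias (delta) is found by
    bisection so that the total bit count is as large as possible without
    exceeding total_bits.
    """
    def word_lengths(delta):
        lengths = []
        for power in band_power:
            # 0.5 * log2(power ** (1 + gamma)) == 0.5 * (1 + gamma) * log2(power)
            raw = delta + 0.5 * (1.0 + gamma) * math.log2(max(power, 1e-12))
            lengths.append(min(max_word, max(0, math.floor(raw))))
        return lengths

    def bits_used(delta):
        return sum(b * n for b, n in zip(word_lengths(delta), coeffs_per_band))

    lo, hi = -64.0, 64.0  # assumed search range for the bias
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if bits_used(mid) <= total_bits:
            lo = mid  # still fits, so try a larger bias
        else:
            hi = mid
    return word_lengths(lo)
```

With a noise shaping factor of 0 the word lengths follow the band power, which gives the flat noise spectrum described earlier; with a factor of -1 the power term vanishes and every band receives the same word length.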
This technique does not allow bits to be concentrated sufficiently within a
single critical band, so unsatisfactory results are obtained when the
signal-to-noise ratio is measured using a high tonality signal, such as a
1 kHz sine wave.
OBJECTS AND SUMMARY OF THE INVENTION
It can be seen from the foregoing that, if quantization noise is minimized
by allocating bits among the critical bands according to the amplitude of
the signal in each respective critical band, the quantization noise
perceived by the listener is not minimized. It can also be seen that if
fixed numbers of bits are allocated among the critical bands, taking into
account psychoacoustic masking, the signal-to-noise ratio is
unsatisfactory when measured using a high-tonality signal, such as a 1 kHz
sine wave.
Accordingly, it is an object of the present invention to provide a circuit
in which bits are allocated among the critical bands such that the
quantization noise perceived by a human listener is minimized, and that a
satisfactory signal-to-noise ratio can be measured using a high-tonality
input signal, such as a 1 kHz sine wave.
According to a first aspect of the invention, a digital encoding apparatus
for compressing a digital input signal to provide a compressed digital
output signal is provided. The apparatus includes a first frequency
dividing device that receives the digital input signal and divides the
digital input signal into a plurality of frequency ranges. A time dividing
device divides at least one of the frequency ranges of the digital input
signal in time. The result of this time division is a plurality of frames.
A second frequency dividing device orthogonally transforms each frame to
provide a plurality of spectral coefficients. A device groups the
plurality of spectral coefficients into critical bands. A bit allocating
device allocates the total number of quantizing bits available for
quantizing the spectral coefficients among the critical bands. The total
number of bits includes fixed bits, which are allocated among the critical
bands according to a selected one of a plurality of predetermined bit
allocation patterns. The total number of bits also includes variable bits,
which are allocated among the critical bands according to signal energy in
the critical bands. Finally, the apparatus includes a device that
allocates the variable bits among the critical bands in response to signal
energy in a data block derived by dividing the digital input signal in
time and in frequency.
In a first variation, the number of fixed bits is constant, and the number of
variable bits is constant.
In a second variation, the bit allocation device includes a device for
apportioning the total number of quantizing bits available for quantizing
the spectral coefficients between fixed bits and variable bits. The
apportionment is made in response to the smoothness of the spectrum of the
digital input signal.
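A minimal sketch of how an apportionment between fixed and variable bits might respond to spectral smoothness follows. The smoothness measure (mean absolute level difference between adjacent bands), the direction of the mapping (a smoother spectrum gets a larger fixed share, a tonal one a larger variable share), the share limits, and every name and constant are assumptions chosen for illustration; the text above does not specify them.

```python
import math


def roughness_db(band_energy):
    """Mean absolute level difference in dB between adjacent bands.

    A small value indicates a smooth spectrum; a large value indicates a
    tonal one.
    """
    level = [10.0 * math.log10(max(e, 1e-12)) for e in band_energy]
    diffs = [abs(a - b) for a, b in zip(level, level[1:])]
    return sum(diffs) / len(diffs)


def apportion(total_bits, band_energy, min_fixed=0.2, max_fixed=0.8,
              full_scale_db=30.0):
    """Split total_bits into (fixed, variable) bits.

    Assumed mapping: the fixed share falls linearly from max_fixed to
    min_fixed as the roughness rises from 0 dB to full_scale_db.
    """
    rough = min(1.0, roughness_db(band_energy) / full_scale_db)
    fixed_share = max_fixed - (max_fixed - min_fixed) * rough
    fixed = int(round(total_bits * fixed_share))
    return fixed, total_bits - fixed


def allocate(total_bits, band_energy, fixed_pattern):
    """Per-band bit budget: fixed pattern part plus energy-driven part.

    fixed_pattern holds relative weights of one predetermined pattern; the
    variable bits are shared in proportion to log2(1 + energy).
    """
    fixed, variable = apportion(total_bits, band_energy)
    pattern_sum = sum(fixed_pattern)
    log_energy = [math.log2(1.0 + e) for e in band_energy]
    log_sum = sum(log_energy) or 1.0
    return [fixed * p / pattern_sum + variable * le / log_sum
            for p, le in zip(fixed_pattern, log_energy)]
```

For a single strong tone, most of the budget moves into the variable part and concentrates in the band that contains the tone, which is the behavior the fixed allocation alone cannot provide.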
In a second embodiment of the invention, a digital encoding apparatus for
compressing a digital input signal to provide a compressed digital output
signal is provided. The digital input signal represents an audio
information signal, and the compressed digital output signal, after
expansion, conversion to an analog signal, and reproduction of the analog
signal, is for perception by the human ear. The second embodiment of the
apparatus comprises a first frequency dividing device that receives the
digital input signal and divides the digital input signal into a plurality
of frequency ranges. A time dividing device divides in time at least one
of the frequency ranges of the digital input signal. The result of the
time division is a plurality of frames. A second frequency dividing device
orthogonally transforms each frame to provide a plurality of spectral
coefficients. A device groups the plurality of spectral coefficients into
critical bands. A noise factor setting device sets a noise shaping factor
in response to the digital input signal. Finally, a bit allocating device
allocates the total number of quantizing bits available for quantizing the
spectral coefficients among the critical bands. The quantizing bits are
allocated among the critical bands according to the noise-shaping factor.
In a first method according to the invention for deriving compressed
digital data from a non-compressed digital input signal, the
non-compressed digital input signal is divided into a plurality of
frequency ranges. Each of the frequency ranges of the non-compressed
digital input signal is divided in time into a plurality of frames. Each
frame is orthogonally transformed to provide a plurality of spectral
coefficients. The plurality of spectral coefficients is grouped into
critical bands. The total number of quantizing bits available for
quantizing the spectral coefficients is allocated among the critical
bands. The total number of bits includes fixed bits that are allocated
among the critical bands according to a selected one of a plurality of
predetermined bit allocation patterns. The total number of bits also
includes variable bits that are allocated among the critical bands
according to signal energy in the critical bands. Finally, the quantized
spectral coefficients and quantizing word length data are multiplexed to
provide the compressed digital data.
In a second method according to the invention of deriving compressed
digital data from a non-compressed digital input signal, the
non-compressed digital input signal is divided into a plurality of
frequency ranges. Each of the frequency ranges of the non-compressed
digital input signal is divided in time into a plurality of frames. Each
frame is orthogonally transformed to provide a plurality of spectral
coefficients. The plurality of spectral coefficients is grouped into
critical bands. A noise shaping factor is set in response to the
non-compressed digital input signal. The total number of quantizing bits
available for quantizing the spectral coefficients is allocated among the
critical bands according to the noise-shaping factor. Finally, the
quantized spectral coefficients and quantizing word length data are
multiplexed to provide the compressed digital data.
The invention also encompasses a medium for recording compressed digital
data derived from a non-compressed digital input signal according to
either of the two methods set forth above.
Finally, the invention encompasses a decoding apparatus for expanding
compressed digital data derived from a non-compressed digital input signal
according to either of the methods set forth above. The decoding apparatus
according to the invention comprises a demultiplexer that extracts the
quantizing word length data from the compressed digital data and extracts
the spectral coefficients from the compressed digital input signal using
the quantizing word length data. A device groups the extracted spectral
coefficients into a plurality of frequency ranges. A device performs an
inverse orthogonal transform on the spectral coefficients in each
frequency range to generate frames of time-dependent data in each
frequency range. Finally, a device combines the frames of time-dependent
data in each frequency range to provide the digital output signal.
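The role of the quantizing word length data in multiplexing and demultiplexing can be sketched as follows. The frame layout used here (a 4-bit word length per band, followed by each band's quantized values in offset-binary fields of that length) is a hypothetical format invented for illustration, as are the function names; the patent text above does not define the bitstream layout.

```python
def pack(word_lengths, band_values):
    """Multiplex word lengths and quantized values into a list of bits.

    Values in a band with word length wl must lie in
    -2**(wl-1) .. 2**(wl-1) - 1; bands with wl == 0 carry no value bits.
    """
    bits = []
    for wl in word_lengths:
        bits += [(wl >> i) & 1 for i in range(3, -1, -1)]  # 4-bit header field
    for wl, values in zip(word_lengths, band_values):
        offset = 1 << (wl - 1) if wl else 0
        for v in values:
            u = v + offset  # signed value to unsigned field
            bits += [(u >> i) & 1 for i in range(wl - 1, -1, -1)]
    return bits


def unpack(bits, coeffs_per_band):
    """Demultiplex: read the word lengths first, then use them to cut the
    remaining bits into per-band quantized values."""
    pos = 0

    def read(count):
        nonlocal pos
        value = 0
        for _ in range(count):
            value = (value << 1) | bits[pos]
            pos += 1
        return value

    word_lengths = [read(4) for _ in coeffs_per_band]
    band_values = []
    for wl, count in zip(word_lengths, coeffs_per_band):
        offset = 1 << (wl - 1) if wl else 0
        band_values.append([read(wl) - offset for _ in range(count)])
    return word_lengths, band_values
```

The decoder cannot locate any coefficient without first reading the word lengths, which is why the word length data travels with the quantized coefficients.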
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block circuit diagram of an encoding apparatus according to the
present invention.
FIG. 2 shows a practical example of how the digital input signal is divided
in frequency and time in the circuit shown in FIG. 1.
FIG. 3 is a block diagram illustrating the bit allocation circuit of the
adaptive bit allocation and encoding circuit of FIG. 1. The bit allocation
circuit has a fixed ratio between fixed bits and variable bits.
FIG. 4 shows a Bark spectrum.
FIG. 5 is a graph showing an example of how the circuit shown in FIG. 1
allocates bits to a signal having a relatively flat spectrum.
FIG. 6 is a graph showing the quantization noise spectrum for the signal
shown in FIG. 5.
FIG. 7 is a