Claims
What is claimed is:
1. A digital system including a coder and a decoder for subband coding of a
digital audio music signal having a given sampling rate 1/T, the coder
comprising:
analysis filter means responsive to the audio music signal for generating a
number of P subband signals, the analysis filter means dividing the audio
music signal band according to the quadrature mirror filter technique,
with sampling rate reduction into successive subbands of band numbers p
(1 ≤ p ≤ P) increasing with the frequency, the bandwidth and the
sampling rate for each subband being an integral submultiple of 1/(2T) and
1/T, respectively, and the bandwidths of the subbands approximately
corresponding with the critical bandwidths of the human auditory system in
the respective frequency ranges,
means responsive to each of the subband signals for determining a
characteristic parameter G(p;m) which is representative of the signal
level in a block having a same number of M signal samples for each
subband, m being the number of the block,
means for adaptively quantizing the blocks of the respective subband
signals in response to the respective characteristic parameters G(p;m);
and the decoder comprising:
means for adaptively dequantizing the blocks of quantized subband signals
in response to the respective characteristic parameters G(p;m),
synthesis filter means responsive to the dequantized subband signals for
constructing a replica of the digital audio music signal, these synthesis
filter means merging the subbands to the audio music signal band according
to the quadrature mirror filter technique, with sampling rate increase,
characterized in that
the respective quantizing means in the coder and the respective
dequantizing means in the decoder for each of the subbands having a band
number p smaller than p_im are arranged for the respective quantizing
and dequantizing of the subband signals with a fixed number of B(p) bits,
the subband having band number p_im being situated in the portion of
the audio music signal band with the lowest thresholds for masking noise
in critical bands of the human auditory system by single music tones in
the centre of the respective critical bands,
the coder and the decoder each further include bit allocation means
responsive to the respective characteristic parameter G(p;m) of the
subbands having a band number p not smaller than p_im within an
allocation window having a duration equal to the block length for the
subband having the band number p_im, for allocating a number of B(p;m)
bits from a predetermined fixed total number of B bits for the allocation
window to the respective quantizing means in the coder and the respective
dequantizing means in the decoder for the signal block having block number
m of the subband having band number p, the bit allocation means each
comprising:
comparator means for comparing within each allocation window the
characteristic parameters G(p;m) to respective thresholds T(p) for the
subbands having band number p and for generating respective binary
comparator signals C(p;m) having a first value C(p;m)="1" for a parameter
G(p;m) not smaller than the threshold T(p) and a second value C(p;m)="0"
in the opposite case, these thresholds T(p) being related to the
thresholds of the human auditory system for just perceiving single music
tones,
means for storing a predetermined allocation pattern {B(p)} of numbers of
B(p) quantizing bits for subbands having respective band numbers p, these
numbers B(p) being related to the thresholds for masking noise in the
critical bands of the human auditory system by single music tones in the
centre of the respective critical bands,
means for determining an allocation pattern {B(p;m)} of respective numbers
of B(p;m) quantizing bits for the signal block having the block number m
of the subband having band number p, in response to the allocation pattern
stored {B(p)} and the respective characteristic parameters G(p;m) and
comparator signals C(p;m), the allocation pattern {B(p;m)} being equal to
the allocation pattern stored {B(p)} if all comparator signals C(p;m)
within an allocation window have the said first value C(p;m)="1" and, in
the opposite case, the allocation pattern {B(p;m)} being formed by not
allocating quantizing bits to blocks within an allocation window having a
comparator signal of the said second value C(p;m)="0" and by allocating
the sum S of the numbers of B(p) quantizing bits available within an
allocation window for the latter blocks in the allocation pattern stored
{B(p)} to the blocks within an allocation window having a comparator
signal of the said first value C(p;m)="1" and having the largest values of
the characteristic parameter G(p;m) for obtaining numbers of B(p;m)
quantizing bits which are greater than the corresponding numbers of B(p)
quantizing bits in the allocation pattern stored {B(p)},
means for supplying the allocation pattern {B(p;m)} determined thus to the
respective quantizing means in the coder and the respective dequantizing
means in the decoder.
2. A digital system as claimed in claim 1, characterized in that the said
bit allocation means also include means which in response to successive
characteristic parameters G(p;m) and G(p;m+1) of each subband of band
number p exceeding p_im:
do not allocate quantizing bits to block (p;m+1) and add the numbers of
B(p;m+1) quantizing bits available for this block to the said sum S, if
the ratio Q=G(p;m)/G(p;m+1) is greater than a predetermined value R(p) of
the order of 10^2 and block (p;m+1) is situated within the allocation
window;
do not allocate quantizing bits to block (p;m) and add the numbers of
B(p;m) quantizing bits available for this block to the said sum S, if the
ratio Q=G(p;m)/G(p;m+1) is smaller than the value 1/R(p) and block (p;m)
is situated within the allocation window.
3. A coder for subband coding of a digital audio music signal having a
given sampling rate 1/T, the coder comprising:
(a) analysis filter means responsive to the audio music signal for
generating a number of P subband signals, the analysis filter means
dividing the audio music signal band according to the quadrature mirror
filter technique, with sampling rate reduction into successive subbands of
band numbers p (1 ≤ p ≤ P) increasing with the frequency, the
bandwidth and the sampling rate for each subband being an integral
submultiple of 1/(2T) and 1/T, respectively, and the bandwidths of the
subbands approximately corresponding with the critical bandwidths of the
human auditory system in the respective frequency ranges,
(b) means responsive to each of the subband signals for determining a
characteristic parameter G(p;m) which is representative of the signal
level in a block having a same number of M signal samples for each
subband, m being the number of the block; and
(c) means for adaptively quantizing the blocks of the respective subband
signals in response to the respective characteristic parameters G(p;m),
wherein
said means for adaptively quantizing for each of the subbands having a band
number p smaller than p_im are arranged for the quantizing of the
subband signals with a fixed number of B(p) bits, the subband having band
number p_im being situated in the portion of the audio music signal
band with the lowest thresholds for masking noise in critical bands of the
human auditory system by single music tones in the center of the
respective critical bands, said coder further including
(d) bit allocation means responsive to the respective characteristic
parameters G(p;m) of the subbands having a band number p not smaller than
p_im within an allocation window having a duration equal to the block
length for the subband having the band number p_im, for allocating a
number of B(p;m) bits from a predetermined fixed total number of B bits for
the allocation window to the quantizing means in the coder for the signal
block having block number m of the subband having band number p, the bit
allocation means each comprising
(i) comparator means for comparing within each allocation window the
characteristic parameters G(p;m) to respective thresholds T(p) for the
subbands having band number p and for generating respective binary
comparator signals C(p;m) having a first value C(p;m)="1" for a parameter
G(p;m) not smaller than the threshold T(p) and a second value C(p;m)="0"
in the opposite case, these thresholds T(p) being related to the
thresholds of the human auditory system for just perceiving single music
tones,
(ii) means for storing a predetermined allocation pattern {B(p)} of numbers
of B(p) quantizing bits for subbands having respective band numbers p,
these numbers B(p) being related to the thresholds for masking noise in the
critical bands of the human auditory system by single music tones in the
center of the respective critical bands,
(iii) means for determining an allocation pattern {B(p;m)} of respective
numbers of B(p;m) quantizing bits for the signal block having the block
number m of the subband having band number p, in response to the
allocation pattern stored {B(p)} and the respective characteristic
parameters G(p;m) and comparator signals C(p;m), the allocation pattern
{B(p;m)} being equal to the allocation pattern stored {B(p)} if all
comparator signals C(p;m) within an allocation window have the said first
value C(p;m)="1" and, in the opposite case, the allocation pattern
{B(p;m)} being formed by not allocating quantizing bits to blocks within an
allocation window having a comparator signal of the said second value
C(p;m)="0" and by allocating the sum S of the numbers of B(p) quantizing
bits available within an allocation window for the latter blocks in the
allocation pattern stored {B(p)} to the blocks within an allocation window
having a comparator signal of the said first value C(p;m)="1" and having
the largest values of the characteristic parameter G(p;m) for obtaining
numbers of B(p;m) quantizing bits which are greater than the corresponding
numbers of B(p) quantizing bits in the allocation pattern stored {B(p)},
and
(iv) means for supplying the allocation pattern {B(p;m)} determined thus to
the respective quantizing means in the coder.
4. An encoder according to claim 1, wherein said bit allocation means
further includes means which in response to successive characteristic
parameters G(p;m) and G(p;m+1) of each subband of band number p exceeding
p_im
do not allocate quantizing bits to block (p;m+1) and add the numbers of
B(p;m+1) quantizing bits available for this block to the said sum S if
the ratio Q=G(p;m)/G(p;m+1) is greater than a predetermined value R(p) of
the order of ten to the power of two and block (p;m+1) is situated within
the allocation window, and
do not allocate quantizing bits to block (p;m) and add the numbers of
B(p;m) quantizing bits available for this block to the said sum S, if the
ratio Q=G(p;m)/G(p;m+1) is smaller than the value 1/R(p) and the block
(p;m) is situated within the allocation window.
Description
BACKGROUND OF THE INVENTION
The invention relates to a digital system including a coder and a decoder
for subband coding of a digital audio signal having a given sampling rate
1/T, the coder comprising:
analysis filter means responsive to the audio signal, for generating a
number of P subband signals, the analysis filter means dividing the audio
signal band according to the quadrature mirror filter technique, with
sampling rate reduction, into successive subbands of band numbers p
(1 ≤ p ≤ P) increasing with the frequency, the bandwidth and the
sampling rate for each subband being an integral submultiple of 1/(2T) and
1/T, respectively, and the bandwidths of the subbands approximately
corresponding with the critical bandwidths of the human auditory system in
the respective frequency ranges,
means responsive to each of the subband signals, for determining a
characteristic parameter G(p;m) which is representative of the signal level
in a block having a same number of M signal samples for each subband, m
being the number of the block,
means for adaptively quantizing the blocks of the respective subband
signals in response to the respective characteristic parameters G(p;m);
and the decoder comprising:
means for adaptively dequantizing the blocks of the quantized subband
signals in response to the respective characteristic parameters G(p;m),
synthesis filter means responsive to the dequantized subband signals for
constructing a replica of the digital audio signal, these synthesis filter
means merging the subbands to the audio signal band according to the
quadrature mirror filter technique, with sampling rate increase.
A system for subband coding of a similar structure is known from the
article entitled "The Critical Band Coder--Digital Encoding of Speech
Signals Based on the Perceptual Requirements of the Auditory System" by M.
E. Krasner, published in Proc. IEEE ICASSP 80, Vol. 1, pp. 327-331, Apr.
9-11, 1980.
In this known system, use is made of a subdivision of the speech signal
band into a number of subbands, whose bandwidths approximately correspond
with the bandwidths of the critical bands of the human auditory system in
the respective frequency ranges (compare FIG. 2 in the article by
Krasner). This subdivision has been chosen because on the basis of
psychoacoustic experiments it may be expected that in such a subband
the quantizing noise will be optimally masked by the signals within this
subband when the quantizing takes account of the noise-masking curve of
the human auditory system (this curve indicates the threshold for masking
the noise in a critical band by a single tone in the centre of the
critical band, compare FIG. 3 in the article by Krasner).
In the case of a high-quality digital music signal, represented according
to the Compact Disc standard with 16 bits per signal sample at a sampling
rate of 1/T=44.1 kHz, it appears that the use of this known subband coding
with a suitably chosen bandwidth and a suitably chosen quantizing for the
respective subbands results in quantized output signals of the coder which
can be represented with an average number of 2.5 bits per signal sample,
while the quality of the replica of the music signal does not perceptibly
differ from that of the original music signal in virtually all passages of
nearly all sorts of music signals. However, in certain passages of some
sorts of music signals the quantizing noise is still audible. The
audibility of the quantizing noise can be reduced by increasing the number
of quantizing levels, but this implies that the average number of bits per
sample of the quantized output signals of the coder then has to be
increased too.
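The figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only; it assumes a single audio channel, and the names `bit_rate`, `pcm_rate`, `coded_rate` and `reduction` are chosen here for the example and do not appear in the source.

```python
# Bit-rate arithmetic for the figures quoted in the text (single channel assumed).
SAMPLING_RATE_HZ = 44_100      # 1/T according to the Compact Disc standard
PCM_BITS_PER_SAMPLE = 16       # Compact Disc representation
CODED_BITS_PER_SAMPLE = 2.5    # average reported for the known subband coder


def bit_rate(bits_per_sample, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Return the bit rate in bit/s for one audio channel."""
    return bits_per_sample * sampling_rate_hz


pcm_rate = bit_rate(PCM_BITS_PER_SAMPLE)        # 705600 bit/s
coded_rate = bit_rate(CODED_BITS_PER_SAMPLE)    # 110250.0 bit/s
reduction = PCM_BITS_PER_SAMPLE / CODED_BITS_PER_SAMPLE  # factor 6.4
```

Under these assumptions the known coder reduces the bit rate by a factor of 6.4, and the invention aims to improve quality without giving up any of that reduction.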
SUMMARY OF THE INVENTION
The invention has for its object to provide a digital system of the type
mentioned in the opening paragraph for subband coding of high-quality
audio signals, in which the audibility of quantizing noise in the replica
of the audio signals is reduced in an effective manner without increasing
the average number of bits per sample of the quantized output signals of
the coder.
The digital system for subband coding of a digital audio signal in
accordance with the invention is characterized in that
the respective quantizing means in the coder and the respective
dequantizing means in the decoder for each of the subbands having a band
number p smaller than p_im are arranged for the respective quantizing
and dequantizing of the subband signals with a fixed number of B(p) bits,
the subband having band number p_im being situated in the portion of
the audio signal band with the lowest thresholds for masking noise in
critical bands of the human auditory system by single tones in the centre
of the respective critical bands,
the coder and the decoder each further include bit allocation means
responsive to the respective characteristic parameters G(p;m) of the
subbands having a band number p not smaller than p_im within an
allocation window having a duration equal to the block length for the
subband having the band number p_im, for allocating a number of B(p;m)
bits from a predetermined fixed total number of B bits for the allocation
window to the respective quantizing means in the coder and the respective
dequantizing means in the decoder for the signal block having block number
m of the subband having band number p, the bit allocation means each
comprising:
comparator means for comparing within each allocation window the
characteristic parameters G(p;m) to respective thresholds T(p) for the
subbands having band number p and for generating respective binary
comparator signals C(p;m) having a first value C(p;m)="1" for a parameter
G(p;m) not smaller than the threshold T(p) and a second value C(p;m)="0"
in the opposite case, these thresholds T(p) being related to the thresholds
of the human auditory system for just perceiving single tones,
means for storing a predetermined allocation pattern {B(p)} of numbers of
B(p) quantizing bits for subbands having respective band numbers p, these
numbers B(p) being related to the thresholds for masking noise in the
critical bands of the human auditory system by single tones in the centre
of the respective critical bands,
means for determining an allocation pattern {B(p;m)} of respective numbers
of B(p;m) quantizing bits for the signal-block having the block number m
of the subband having band number p, in response to the allocation pattern
stored {B(p)} and the respective characteristic parameters G(p;m) and
comparator signals C(p;m), the allocation pattern {B(p;m)} being equal to
the allocation pattern stored {B(p)} if all comparator signals C(p;m)
within an allocation window have the said first value C(p;m)="1" and, in
the opposite case, the allocation pattern {B(p;m)}
being formed by not allocating quantizing bits to blocks within an
allocation window having a comparator signal of the said second value
C(p;m)="0" and by allocating the sum S of the numbers of B(p) quantizing
bits available within an allocation window for the latter blocks in the
allocation pattern stored {B(p)} to the blocks within an allocation window
having a comparator signal of the said first value C(p;m)="1" and having
the largest values of the characteristic parameter G(p;m), for obtaining
numbers of B(p;m) quantizing bits which are greater than the corresponding
numbers of B(p) quantizing bits in the allocation pattern stored {B(p)},
means for supplying the allocation pattern {B(p;m)} determined thus to the
respective quantizing means in the coder and the respective dequantizing
means in the decoder.
The measures according to the invention are based on the recognition that
the quantizing noise is especially audible in music passages presenting
single tones. During such passages the greater part of the subbands have
very little or no signal energy from the mid-audio frequency range
onwards, whereas each of the few remaining subbands has no more than one
spectral component possessing significant signal energy. If this spectral
component is situated near the lower or upper boundary of the subband, the
critical band of the human auditory system for this spectral component
will not correspond with this subband. The quantizing noise, however, is
spread out over the entire subband, so that the quantizing noise outside
the critical band is not masked for this spectral component, as contrasted
with the case in which various spectral components possessing significant
energy occur in the subband or in adjacent subbands and the mutually
overlapping critical bands sufficiently mask the quantizing noise for the
various spectral components. In accordance with the invention no
quantizing bits are allocated to blocks of subband signals within an
allocation window which contain little or no signal energy, and the
quantizing bits "saved" thus are used for a finer quantizing of the blocks
of subband signals within the same allocation window which do contain
significant signal energy, starting with a block containing the highest
signal energy and ending when the number of remaining "saved" quantizing
bits is no longer sufficient for a further quantizing refinement or when
all blocks having significant signal energy have undergone a sufficiently
fine quantizing. The total number of quantizing bits for the allocation
window is not changed and the reallocation of any "saved" quantizing bits
is carried out in response to the characteristic parameters representing
the signal energy in a block and which are already present in both coder
and decoder. The refined quantization during music passages presenting
single tones thus results in an effective reduction of the audibility of
quantizing noise without the need of increasing the average number of
quantizing bits per output signal sample of the coder. Extensive listening
tests with widely varying sorts of music signals have shown that generally
no quantizing noise is audible any longer during music passages presenting
single tones thanks to the measures according to the invention.
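The reallocation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the routine of the flow charts in FIG. 8 and FIG. 9: the function name `allocate_bits`, the refinement step of one extra bit per sample (costing M bits per block), the round-robin order starting at the largest G(p;m), the cap `max_bits`, and the simplification of one block per subband in the window are all hypothetical choices made for the example.

```python
M = 32  # signal samples per block, as in the described embodiment


def allocate_bits(blocks, stored_bits, thresholds, max_bits=16):
    """Return ({(p, m): B(p;m)}, unused_bits) for one allocation window.

    blocks      -- list of (p, m, G) tuples inside the window
    stored_bits -- {p: B(p)}, the stored allocation pattern (bits per sample)
    thresholds  -- {p: T(p)}, the comparator thresholds
    max_bits    -- assumed upper limit for B(p;m) (not from the source)
    """
    alloc = {}
    saved = 0      # the sum S of "saved" quantizing bits
    active = []    # blocks with comparator signal C(p;m) = "1"
    for p, m, g in blocks:
        if g >= thresholds[p]:            # C(p;m) = "1"
            alloc[(p, m)] = stored_bits[p]
            active.append((g, p, m))
        else:                             # C(p;m) = "0": no bits allocated
            alloc[(p, m)] = 0
            saved += M * stored_bits[p]
    # Assumed refinement rule: hand out one extra bit per sample at a time,
    # visiting the blocks in order of decreasing G(p;m).
    active.sort(reverse=True)
    progress = True
    while saved >= M and progress:
        progress = False
        for g, p, m in active:
            if saved >= M and alloc[(p, m)] < max_bits:
                alloc[(p, m)] += 1
                saved -= M
                progress = True
    return alloc, saved


# Example window: subband 14 falls below its threshold and frees 3 * 32 bits.
example_blocks = [(13, 0, 100.0), (14, 0, 0.5), (15, 0, 20.0)]
example_pattern = {13: 4, 14: 3, 15: 3}
example_thresholds = {13: 1.0, 14: 1.0, 15: 1.0}
example_alloc, example_left = allocate_bits(
    example_blocks, example_pattern, example_thresholds)
```

In this example the total number of bits in the window is unchanged: the 96 bits freed in subband 14 are spent on subbands 13 and 15, with the strongest block served first.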
The only sporadically occurring cases of audible quantizing noise prove to
relate predominantly to passages of music in which the music signal has
strong attacks, the signal energy in substantially all subbands suddenly
changing considerably. In a preferred embodiment of the present system for
subband coding of a digital audio signal also the audibility of the
quantizing noise during passages of music with strong attacks can be
reduced effectively because the bit allocation means in the coder and the
decoder also include means which in response to successive characteristic
parameters G(p;m) and G(p;m+1) of each subband having a band number p
exceeding p_im:
do not allocate any quantizing bits to block (p;m+1) and add the numbers of
B(p;m+1) quantizing bits available for this block to the said sum S, if
the ratio Q=G(p;m)/G(p;m+1) is greater than a predetermined value
R(p) of the order of 10^2 and block (p;m+1) is situated within the
allocation window;
do not allocate any quantizing bits to block (p;m) and add the numbers of
B(p;m) quantizing bits available for this block to the said sum S, if the
ratio Q=G(p;m)/G(p;m+1) is smaller than the value 1/R(p) and block (p;m)
is situated within the allocation window.
These measures exploit the psychoacoustic effect of temporal masking, which
means the property of the human auditory system that its threshold for
perceiving signals shortly before and shortly after the occurrence of
another signal which has a relatively high signal energy appears to be
temporarily higher than in the absence of the latter signal. More
specifically, in this preferred embodiment no quantizing bits are
allocated to blocks with a relatively low signal energy which occur
shortly before and shortly after occurrence of blocks with a relatively
high signal energy, and the quantizing bits "saved" thus are used for the
more refined quantizing of these blocks having a relatively high signal
energy and the consequent reduction of the quantizing noise during these
blocks, whereas the fact that the quantizing bits are not allocated to
adjacent blocks with a relatively low signal energy does in fact not
result in audible distortion owing to the temporal masking by the human
auditory system.
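The two temporal-masking rules of the preferred embodiment can be sketched for a single subband as follows. This is an illustration under assumptions, not the routine of FIG. 11: the function name `temporal_masking`, the list-based interface, and the decision to skip pairs in which a level is zero are hypothetical; the default `ratio_limit=100.0` merely reflects the stated order of magnitude 10^2 for R(p).

```python
def temporal_masking(levels, bits, in_window, ratio_limit=100.0):
    """Apply the ratio test Q = G(p;m)/G(p;m+1) to consecutive blocks.

    levels    -- G(p;m) for consecutive block numbers m of one subband
    bits      -- B(p;m) currently assigned to those blocks
    in_window -- True where the block lies inside the allocation window
    Returns (new_bits, saved), where saved is the contribution to the sum S.
    """
    new_bits = list(bits)
    saved = 0
    for m in range(len(levels) - 1):
        g_now, g_next = levels[m], levels[m + 1]
        if g_now == 0 or g_next == 0:
            continue  # assumption: silent blocks are left to the threshold test
        q = g_now / g_next
        if q > ratio_limit and in_window[m + 1]:
            # Weak block directly after a strong one: masked afterwards.
            saved += new_bits[m + 1]
            new_bits[m + 1] = 0
        elif q < 1.0 / ratio_limit and in_window[m]:
            # Weak block directly before a strong one: masked beforehand.
            saved += new_bits[m]
            new_bits[m] = 0
    return new_bits, saved


# A strong attack in block 1, surrounded by much weaker blocks.
masked_bits, masked_saved = temporal_masking(
    [1.0, 500.0, 2.0, 3.0], [3, 3, 3, 3], [True, True, True, True])
```

Here blocks 0 and 2 lose their bits because they lie next to the attack in block 1, and the bits saved become available for a finer quantizing of the strong block.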
The invention and the advantages realized therewith will now be explained
in the following description of an embodiment with reference to the
drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1: shows a block diagram of a digital system for subband coding of a
digital audio signal in accordance with the invention;
FIG. 2A: shows a diagram of a series of band splittings and band mergings
which can be used in the filter banks of the system shown in FIG. 1;
FIG. 2B: shows a block diagram of a band splitting and a band merging
according to the quadrature mirror filter technique;
FIG. 2C: shows the amplitude response of the filters used in FIG. 2B;
FIG. 3: shows a table of data relating to the subbands obtained from
applying the diagram shown in FIG. 2A to a 0-22.05 kHz music signal band;
FIG. 4: shows a frequency diagram for qualitatively explaining how
quantizing noise sometimes becomes audible during music passages
presenting single tones;
FIG. 5: shows an example of an allocation window used according to the
invention for allocating quantizing bits in response to parameters of
subband signal levels;
FIG. 6: shows a block diagram of bit allocation means in the system shown
in FIG. 1 which are arranged in accordance with the invention;
FIG. 7: shows a block diagram of a signal processor which can be used in
the bit allocation means shown in FIG. 6;
FIG. 8 and FIG. 9: show flow charts of a possible program routine for a
module of the signal processor shown in FIG. 7;
FIG. 10: shows a table of data relating to a ranking of quantizing options
used in the flow chart shown in FIG. 9;
FIG. 11: shows a flow chart of an optional program routine for an
additional module of the signal processor shown in FIG. 7 which can be
utilized in a preferred embodiment for the subband coding according to the
invention; and
FIG. 12: shows a block diagram of a quantizer and an associated dequantizer
for a subband, in which use is made of a quantization optimized for
probability density functions.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
In FIG. 1 a simplified functional block diagram is shown of a digital
system having a coder 1 and a decoder 2 for subband coding of a digital
audio signal of a given sampling rate 1/T. The basic structure of such a
system is generally known, see the above article by Krasner and the
chapter "Subband Coding" in the book entitled "Digital Coding of
Waveforms" by N. S. Jayant and P. Noll, Prentice-Hall, Inc., Englewood
Cliffs, New Jersey, 1984, pp. 486-509. This basic structure will now be
described with reference to FIG. 1 for the case of a digital high-quality
music signal which in accordance with the Compact Disc standard is
represented with 16 bits per signal sample at a sampling rate of 1/T=44.1
kHz. In this description digital signals are denoted in a conventional
manner, x(k) being a quantized signal sample of signal x(t) at instant
t=kT_s and the relevant sampling rate 1/T_s appearing from the
context.
In coder 1 a music signal x(k) having a sampling rate 1/T_s = 1/T = 44.1
kHz is applied to an analysis filter bank 3 which divides the music signal
band of 0-22.05 kHz according to the quadrature mirror filter technique,
with sampling rate reduction, into a number of P=26 subbands of band
numbers p (1 ≤ p ≤ P=26) increasing with the frequency. For each
subband the bandwidth W(p) is an integral submultiple of the bandwidth
1/(2T)=22.05 kHz of the music signal band and the sampling rate 1/T_s(p)
is equal to the same submultiple of the sampling rate 1/T=44.1 kHz of
music signal x(k) at the input of filter bank 3. In response to this music
signal x(k) filter bank 3 generates a number of P=26 subband signals
x_p(k) which are quantized blockwise, the signal block for each
subband containing a same number of M=32 signal samples. After being
transmitted via and/or stored in a medium 4 the quantized subband signals
s_p(k) are dequantized blockwise in decoder 2 and the resulting
dequantized subband signals x̂_p(k) are applied to a synthesis filter
bank 5. The subbands obtained in filter bank 3 of coder 1 are merged in
this synthesis filter bank 5 to become the music signal band of 0-22.05
kHz according to the quadrature mirror filter technique, with sampling
rate increase. Thus the filter bank 5 constructs a replica x̂(k) of the
original music signal x(k).
For the quantizing of the subband signals x.sub.p (k) known block-adaptive
PCM methods are used. To this end, coder 1 contains for each subband a signal
buffer 6(p), in which a signal block of M=32 samples is stored
temporarily. To each signal buffer 6(p) a level detector 7(p) is connected
to determine for each block stored having block number m a characteristic
parameter G(p;m) representative of the signal level in this block. This
characteristic parameter G(p;m) is used for an optimal adjustment of an
adaptive quantizer 8(p) for quantizing the signal block stored having
block number m. The block of quantized subband signal samples s_p(k)
obtained thus is applied in decoder 2 to an adaptive dequantizer 9(p)
which is also adjusted by characteristic parameter G(p;m). As is well
known, the signal level can be represented by the average value of the
amplitude or the power of the signal samples of a block, but also by the
peak value of the amplitude of the signal samples in a block. The
representation utilized in the level detector 7(p) depends on the type of
quantizer 8(p). Since the same characteristic parameter G(p;m) is used in
quantizer 8(p) and in dequantizer 9(p), level detector 7(p) has to
quantize this parameter G(p;m), in the case of a high-quality music signal
an 8-bit logarithmic quantizing being effected.
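The interplay of level detector 7(p), quantizer 8(p) and dequantizer 9(p) can be sketched as a simple block-adaptive PCM scheme. The sketch assumes a peak-value level detector and a uniform quantizer with an odd number of levels spanning the range from -G to +G; the function names and the value of 25 levels in the example are illustrative, and the 8-bit logarithmic quantizing of G(p;m) itself is omitted.

```python
def peak_level(block):
    """Level detector: peak amplitude of the samples in one block."""
    return max(abs(x) for x in block)


def quantize_block(block, g, levels):
    """Uniform quantizer with an odd number of levels covering [-g, +g].

    Returns integer indices in the range -(levels-1)/2 .. +(levels-1)/2.
    """
    if g == 0:
        return [0] * len(block)
    half = (levels - 1) // 2
    return [max(-half, min(half, round(x / g * half))) for x in block]


def dequantize_block(indices, g, levels):
    """Dequantizer: rebuild sample values from indices with the same g."""
    half = (levels - 1) // 2
    return [i * g / half for i in indices]


# Coder and decoder use the same characteristic parameter g.
demo_block = [0.0, 0.5, -1.0, 0.25]
demo_g = peak_level(demo_block)
demo_indices = quantize_block(demo_block, demo_g, levels=25)
demo_replica = dequantize_block(demo_indices, demo_g, levels=25)
```

Because both sides scale by the same G(p;m), the reconstruction error per sample stays within half a quantizing step, that is, G/(levels-1).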
In the present system a subdivision of the music signal band of 0-22.05 kHz
is made according to a perceptual criterion, the bandwidths W(p) of the
subbands having the respective band numbers p (1 ≤ p ≤ 26)
approximately corresponding to the critical bandwidths of the human
auditory system in the respective frequency ranges (see FIG. 2 in the
above article by Krasner). In view of a simple implementation of filter
banks 3 and 5, the quadrature mirror filter technique is used for the
subdivision into subbands and the corresponding reduction of the sampling
rate and the merging of the subbands and the corresponding increase of the
sampling rate, respectively. According to this quadrature mirror filter
technique the subdivision is effected as a series of band splittings and
the reunion as a series of band mergings. For the present case of a music
signal band of 0-22.05 kHz FIG. 2A shows the diagram of the series of
splittings and mergings used in filter banks 3 and 5 for obtaining
subbands of an approximately critical bandwidth. FIG. 2B shows how each
band splitting and corresponding band merging is realized. The band of the
input signal is divided into a lower band and an upper band with the aid
of a low-pass filter 10 and a high-pass filter 11, respectively, the
amplitude responses of these filters 10 and 11 being each other's image.
This image is represented in a stylized form in FIG. 2C showing the
magnitude of frequency response H(e^(jω)) as a function of the
normalized radial frequency ω = 2πfT_s, where 1/T_s is the
sampling rate of the input signal having the bandwidth 1/(2T_s). The
sampling rate of the output signals of filters 10, 11 is subsequently
halved by means of 2:1 decimators 12, 13. At the band merging, this
halving of the sampling rate is cancelled by means of 1:2 interpolators
14, 15. As undesired periodical repetitions of the signal spectra of the
lower and upper bands occur during this interpolation, the output signals
of the 1:2 interpolators 14, 15 are applied to a low-pass filter 16 and a
high-pass filter 17, respectively, for selecting the desired lower and
upper band. The frequency responses of these filters 16 and 17 are again
each other's image, filter 16 corresponding to filter 10 and filter 17
corresponding to filter 11 (disregarding a sign inversion). The output
signals of filters 16 and 17 are added together by means of an adder 18 to
construct a replica of the input signal of filters 10 and 11. The diagram
of FIG. 2A shows that an equal number of splittings and mergings is not
required for all subbands: for the subbands of the numbers p=1-4 this is 8,
but for the subband of the number p=26 this is only 2. Since the
quadrature mirror filters 10, 11 and 16, 17 form the most important
sources of the signal delays in the filter banks 3 and 5, the signals in
the separate subbands have to be delayed by different amounts in order to
maintain in the constructed replica of the music signal the original time
relation between the signals in the respective frequency ranges.
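The band splitting and merging of FIG. 2B can be illustrated by the
following minimal Python sketch. It assumes the two-tap Haar pair as
quadrature mirror filters, which is merely the simplest pair with
mirror-image amplitude responses and is not the filter design of the
embodiment; the names H0, H1, fir, split and merge are hypothetical.

```python
import math

# Assumption: 2-tap Haar filters stand in for the quadrature mirror filters.
S = 1.0 / math.sqrt(2.0)
H0 = [S, S]    # low-pass, playing the role of filter 10
H1 = [S, -S]   # high-pass mirror image of H0, playing the role of filter 11


def fir(x, h):
    """Causal FIR filter; the output has the same length as x."""
    return [
        sum(h[j] * x[n - j] for j in range(len(h)) if n - j >= 0)
        for n in range(len(x))
    ]


def split(x):
    """Band splitting: filter, then keep every second sample (2:1 decimation)."""
    return fir(x, H0)[::2], fir(x, H1)[::2]


def upsample(v):
    """1:2 interpolation by inserting a zero after every sample."""
    out = []
    for s in v:
        out.extend([s, 0.0])
    return out


def merge(low, high):
    """Band merging: interpolate, select each band again, and add."""
    g0 = H0                  # role of filter 16: corresponds to filter 10
    g1 = [-c for c in H1]    # role of filter 17: filter 11 with sign inversion
    a = fir(upsample(low), g0)
    b = fir(upsample(high), g1)
    return [p + q for p, q in zip(a, b)]
```

With these assumed filters, merge(*split(x)) returns the input delayed by
one sample for an input of even length. Each splitting and merging stage
thus contributes a delay, which is why subbands that pass through
different numbers of stages have to be delayed by different amounts.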
FIG. 3 shows a Table of data relating to the subbands obtained from
applying the diagram of FIG. 2A to the 0-22.05 kHz music signal band. The
first column indicates the band numbers p, the second and third columns
give the values f.sub.co of the lower and upper boundary of the subband,
respectively, and the fourth column gives the value W(p) of the width of
the subband, the values in the second, third and fourth columns being
rounded to integers. The values W(p) are the result of a practical
compromise between aiming at as good an approximation as possible of the
critical bandwidth values of the human auditory system as mentioned in
publications on psychoacoustic experiments, and aiming at as little
complexity as possible of the filter banks 3 and 5 when implementing the
quadrature mirror filter technique.
The choice of a division into subbands of approximately critical bandwidths
is made because, on the basis of psychoacoustic experiments, it may be
expected that the quantizing noise in a subband will then be optimally
masked by the signals in this subband. The noise-masking curve of the
human auditory system providing the threshold for masking noise in a
critical band by a single tone in the centre of this critical band is the
starting point for the quantizing of the respective subband signals
(compare FIG. 3 in the above article by Krasner). The number of quantizing
levels L(p) for a subband of band number p is now related to this
noise-masking curve in a manner such that in each subband the
signal-to-noise ratio is sufficiently high for not perceiving the
quantizing noise. For this purpose a number of L(p)=25 quantizing levels
appears to be amply sufficient in the mid-frequency portion of the audio
signal band, where the noise-masking curve possesses its lowest values,
whilst for higher frequencies ever decreasing numbers of L(p) will
suffice. The latter also holds for the low-frequency portion of the audio
signal band, but in the present embodiment this option is not utilized as
it hardly contributes to a reduction of the number of bits required to
represent the output signals of the coder, as will be explained
hereinafter. The numbers of L(p) quantizing levels used in the present
case are shown in the fifth column of the Table in FIG. 3. As is well
known, a number of L(p) quantizing levels corresponds with a number of
B(p)=log.sub.2 [L(p)] quantizing bits per signal sample. The values of
these numbers B(p) are shown in the sixth column of the Table in FIG. 3,
these values being rounded off to two decimal places. When the quantizers
8(p) and dequantizers 9(p) are implemented in practice, the numbers B(p)
are slightly higher. For example, for quantizing a block of M=32 samples
of a subband signal x.sub.p (k) having a number of L(p)=25 quantizing
levels the theoretical number of quantizing bits per signal sample is
B(p)=log.sub.2 (25)=4.64 and the theoretically required total number of
quantizing bits for the block is 32 log.sub.2 (25)=148.60. The practically
required total number of quantizing bits for the block, however, is no
less than 149 so that in practice the number of quantizing bits per signal
sample has a value of at least B(p)=149/32=4.66.
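The arithmetic of the above example can be retraced with the following
short Python sketch. The function name block_bits is hypothetical, and
the sketch only counts bits; it makes no assumption about how quantizers
8(p) actually pack a block into an integral number of bits.

```python
import math


def block_bits(levels, block_len):
    """Return (theoretical bits/sample, theoretical bits/block,
    practical bits/block, practical bits/sample)."""
    per_sample = math.log2(levels)           # B(p) = log2 L(p)
    per_block = block_len * per_sample       # M * B(p)
    practical_block = math.ceil(per_block)   # a block needs whole bits
    return per_sample, per_block, practical_block, practical_block / block_len


per_sample, per_block, practical_block, practical_sample = block_bits(25, 32)
# per_sample is about 4.64, per_block about 148.60,
# practical_block is 149 and practical_sample about 4.66.
```

The lower bound of 149 bits can also be checked with exact integers: the
25**32 possible blocks of 32 samples with 25 levels each do not fit in
148 bits but do fit in 149 bits.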
The number of bits per second required for quantizing a subband signal
x.sub.p (k) is indicated by the product of the sampling rate 2W(p) and the
number of B(p) quantizing bits per signal sample. Then the values of W(p)
and B(p) in the Table of FIG. 3 show that the quantizing of all subband
signals x.sub.p (k) requires a theoretical bit capacity of 98.225 kbits/s.
Considering the relatively low values of the sampling rate 2W(p) for the
subbands having the lowest band numbers p, it will be evident that little
is gained by reducing the number of B(p) quantizing bits per signal sample
there, even though this would be possible without affecting the
perceptibility of quantizing noise. For quantizing the characteristic
parameters G(p;m) of each block of M=32 signal samples 8 bits are used, as
was stated before, which amounts to 8/32=0.25 bit per signal sample.
From the value of the sampling rate 1/T=44.1 kHz of the music signal it
then follows that the quantizing of all characteristic parameters G(p;m)
requires a bit capacity of 11.025 kbits/s. The overall bit capacity
required for representing all output signals of the coder 1 in FIG. 1 is
thus 109.250 kbits/s, so that these output signals can be represented with
an average number of 2.477 bits per signal sample in lieu of 16 bits per
signal sample. As stated before, the value of B(p) will be slightly
higher in practice than the value shown in the table, the representation
of the output signals of the coder 1 in practice requiring a bit capacity
of approximately 110 kbits/s and thus an average number of approximately
2.5 bits per signal sample.
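The bit capacity figures given above follow from simple arithmetic, which
the Python sketch below retraces. The theoretical subband capacity of
98.225 kbits/s is taken from the text as given, since the Table of FIG. 3
is not reproduced here; the function and parameter names are
hypothetical.

```python
def coder_bit_budget(subband_rate_bps=98_225.0, sample_rate_hz=44_100.0,
                     param_bits=8, block_len=32):
    """Return (parameter bits per sample, parameter rate in bits/s,
    overall rate in bits/s, average bits per signal sample)."""
    # 8 bits per block of M=32 samples for each parameter G(p;m).
    param_bits_per_sample = param_bits / block_len
    # The subband sampling rates 2W(p) add up to the input rate 1/T,
    # so the parameter rate scales with the 44.1 kHz input rate.
    param_rate_bps = param_bits_per_sample * sample_rate_hz
    total_rate_bps = subband_rate_bps + param_rate_bps
    average_bits = total_rate_bps / sample_rate_hz
    return param_bits_per_sample, param_rate_bps, total_rate_bps, average_bits


per_sample, param_rate, total_rate, average = coder_bit_budget()
# per_sample is 0.25, param_rate 11025.0 bits/s, total_rate 109250.0 bits/s
# and average about 2.477 bits per signal sample.
```

Calling coder_bit_budget(subband_rate_bps=110_000.0 - 11_025.0) gives the
practical average of approximately 2.5 bits per signal sample.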
When an analog version x(t) of music signal x(k) is formed at the input of
coder 1 with the aid of a 16-bit digital-to-analog converter and also an
analog version x̂(t) of replica x̂(k) at the output of the decoder 2, and
these analog versions x(t) and x̂(t) are compared with each other during
listening tests, the quality of the replica x̂(t) turns out not to differ
perceptibly from the high quality of the original music signal x(t) in
substantially all passages of nearly all kinds of music signals despite
the above significant reduction of the required bit capacity. In certain
passages of specific kinds of music signals, however, the quantizing noise
is still audible. Basically, the audibility of the quantizing noise can
always be reduced by having the number of L(p) quantizing levels for all
subbands exceed the numbers in the fifth column of the Table shown in FIG.
3, but this automatically means that the number of B(p) quantizing bits
per signal sample for all subbands exceeds the numbers in the sixth column
of this Table, resulting in the fact that the representation of the output
signals of the coder 1 requires a larger bit capacity too.
From extensive research into the causes of the occasional audibility of
quantizing noise, the Applicants have gained the recognition that the
quantizing noise is especially audible in music passages presenting single
tones. During such music passages the greater part of the subbands have
very little or no signal energy from the mid frequency portion of the
music signal band onwards, whereas only a single spectral component having
significant signal energy occurs in each of the few remaining subbands.
With reference to FIG. 4 it will be qualitatively explained how the
quantizing noise sometimes becomes audible in this case. FIG. 4 shows the
power S of a single sinusoid component X near the upper boundary of a
subband of band number p. When using a sufficiently large number of L(p)
quantizing levels for quantizing the sinusoid component X, the quantizing
noise is distributed substantially uniformly over the whole subband and
the power N of the quantizing noise is lower by an amount of approximately
20 log.sub.10 [.sqroot.(1.5) L(p)] dB
than the power S, as shown in FIG. 4. In a stylized form FIG. 4 also shows
two threshold curves for noise-masking in critical bands of the human
auditory system by a sinusoid component in the centre of this frequency
band. The curve shown in the dashed line represents a sinusoid component
having power S situated in the centre of the subband of band number p,
whilst the curve in a solid line represents sinusoid component X also
having power S but now situated near the upper boundary of the subband of
band number p. From FIG. 4 it is evident that in the case of the
dashed-line curve the quantizing noise is fully masked, but that in the
case of the solid curve the shaded part of the quantizing noise lies above
the threshold curve and is thus audible in music passages presenting
single tones. In the more general case when in addition to spectral
component X various other spectral components having significant energy
occur in the subband of band number p and/or in the neighbouring subbands,
the shaded portion of the quantizing noise in FIG. 4, however, will no
longer be audible, because in this case the overlapping threshold curves
for the respective spectral components will result in a compound threshold
curve situated above the quantizing noise and this quantizing noise will
thus be masked adequately.
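The distance between the power S and the noise power N used in this
explanation can be evaluated with the following Python sketch. It rests
on the usual textbook model, which is an assumption here: a sinusoid of
amplitude A that just fills a uniform quantizer of L(p) levels, and
quantizing noise of power equal to the squared step size divided by 12.
The function name sinusoid_snr_db is hypothetical.

```python
import math


def sinusoid_snr_db(levels):
    """Signal-to-quantizing-noise ratio in dB for a full-scale sinusoid.

    Assumed model: signal power S = A**2 / 2, step size 2*A / L,
    noise power N = step**2 / 12 = A**2 / (3 * L**2),
    hence S / N = 1.5 * L**2, i.e. 20*log10(sqrt(1.5) * L) dB.
    """
    return 20.0 * math.log10(math.sqrt(1.5) * levels)


snr_25 = sinusoid_snr_db(25)   # about 29.72 dB for L(p)=25
```

Under this model every halving of the number of quantizing levels raises
the noise floor by about 6 dB relative to the sinusoid, uniformly over
the subband and regardless of where in the subband the sinusoid lies.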
In accordance with the invention, the system of FIG. 1 is now arranged in
the following manner to combat the audibility of quantizing noise during
music passages presenting single tones without the average number of
quantizing bits per sample of the quantized output signals being
increased. The subbands are divided into a first group of band numbers p
smaller than p.sub.im (1.ltoreq.p<p.sub.im) and a second group of
band numbers p not smaller than p.sub.im (p.sub.im .ltoreq.p.ltoreq.P), in
which the subband of band number p.sub.im is situated in the portion of
the audio signal band having the lowest threshold values for masking noise
in critical bands of the human auditory system by single tones in the
centre of the respective critical bands. In the present embodiment
p.sub.im =13 is chosen, so that the dividing line between the first and
the second group of subbands is situated at the frequency f=1723 Hz. The
quantizers 8(p) and dequantizers 9(p) for each of the subbands of the
first group (1.ltoreq.p.ltoreq.12) are arranged for quantizing and
dequantizing the subband signals by a fixed number of B(p) bits per signal
sample, in the present embodiment the same values of B(p)=log.sub.2 [L(p)]
as shown in the table of FIG. 3 being chosen, thus L(p)=25 and B(p)=4.64.
For the quantizing and dequantizing of the signals in the subbands of the
second group (13.ltoreq.p.ltoreq.26) a fixed total number of B bits is
indeed predetermined for a time interval corresponding with one block of
M=32 signal samples of the signal in the subband having band number
p.sub.im =13, but the number of B(p;m) quantizing bits per signal sample
for the signal block of b