TECHNICAL FIELD OF THE INVENTION
This invention relates to a signal quantizer and more particularly to a
signal quantizer that has reduced output fluctuation.
BACKGROUND OF THE INVENTION
Digital encoding of signals is becoming increasingly important in the
modern era of digital communication and storage. One problem with many
sophisticated source coding algorithms is that they do a poor job of
reproducing the background noise when no signal is present, since the
encoding is designed to match the characteristics of typical input
signals. For example, the VSELP speech coding algorithm, which is the
North American digital cellular telephone standard IS-54, introduces
fluctuations into the background noise generated in an automobile,
resulting in an annoying perceptual effect sometimes known as "swirling".
An important step in any digital coding system involves quantization of a
signal, where the signal is represented by one of a finite number of
possible values. In simple waveform quantization, samples of the input
signal are directly quantized, while in more sophisticated coding schemes
some model parameters based on transformations of the input signal may be
quantized. A typical quantizer is represented by FIG. 1.
In traditional quantization schemes, a range of possible output values is
searched for the output which matches the input value with the least
possible error, where the error is defined as:
$$E = |Q - X|^2$$
where Q represents the quantizer output value, X represents the quantizer
input, and $|Y|^2$ represents the square of the value
of Y in the case of scalar quantization or the squared norm of the vector
Y in the case of vector quantization. The squared norm is defined as the
inner product of a vector with itself:
$$|Y|^2 = \langle Y, Y \rangle$$
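As an illustration of the traditional search described above, the following is a minimal sketch that picks the codebook entry with the least squared error. The function name `quantize` and the use of NumPy are assumptions made for this example; the text does not prescribe an implementation.

```python
import numpy as np

def quantize(x, codebook):
    """Return the codebook entry Q that minimizes E = |Q - X|^2.

    Works for scalar quantization (codebook is a list of numbers) and for
    vector quantization (codebook is a list of equal-length vectors).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # One row per candidate output value.
    codebook = np.asarray(codebook, dtype=float).reshape(len(codebook), -1)
    # Squared norm of (Q - X) for every candidate Q.
    errors = np.sum((codebook - x) ** 2, axis=1)
    return codebook[int(np.argmin(errors))]
```

For example, `quantize(0.6, [0.0, 0.5, 1.0])` selects the entry 0.5, and `quantize([0.9, 0.1], [[0, 0], [1, 0], [0, 1]])` selects the vector (1, 0).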
SUMMARY OF THE INVENTION
A new signal quantization scheme is provided which reduces fluctuation of
the output signal by modeling both the input signal and the variation of
the input signal with time. The error which is minimized by the quantizer
search algorithm is modified to include an additional term corresponding
to the difference between the current and previous input signals, so that
the quantizer is forced to match the fluctuation in the input signal as
well as the signal itself.
DESCRIPTION OF THE DRAWING
In the drawing:
FIG. 1 is a block diagram representing a typical prior art normal
quantizer.
FIG. 2 is a block diagram of a quantizer according to the present
invention.
FIG. 3 is a block diagram of a VSELP Encoder.
FIG. 4 is a block diagram of a modification to the VSELP Encoder of FIG. 3.
FIG. 5 is a flow chart of the modification to the VSELP Encoder of FIG. 4.
FIG. 6 is a block diagram of a transmitter of a cellular telephone system
with a speech/noise detector, a speech encoder including a
reduced-fluctuation quantizer, and a simple encoder for background noise.
FIG. 7 is a block diagram of the use of the receiver system in connection
with a cellular telephone which has a pre-stored comfort noise.
DESCRIPTION OF PREFERRED EMBODIMENTS
The present invention introduces a novel signal quantization scheme which
models both the input and the fluctuations in the input signal, so that
the quantizer output will not fluctuate excessively for a stationary
input. If a quantizer is to match both the input signal and the
fluctuation in the input signal, the error measure defining the optimal
quantizer output value must be modified. A simple measure of input
fluctuation is the difference between the current input and the previous
input, and a similar measure can be defined for the output fluctuation.
Then a fluctuation-matching quantizer can be designed to minimize:
$$E = (1 - w_1)\,|Q - X|^2 + w_1\,|\Delta_{out} - w_2 \Delta_{in}|^2$$
where $w_1$ and $w_2$ are weighting factors, $\Delta_{out}$ is the
difference between the current and previous quantizer outputs, and
$\Delta_{in}$ is the difference between the current and previous
quantizer inputs.
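The error measure can be evaluated directly for a candidate output. The sketch below is only an illustration under the definitions given in the text; the function and argument names (`fluctuation_error`, `q_prev`, `x_prev`) are assumptions introduced here.

```python
import numpy as np

def fluctuation_error(q, x, q_prev, x_prev, w1, w2):
    """Weighted error E for candidate output q (scalar or vector).

    E = (1 - w1) * |Q - X|^2 + w1 * |delta_out - w2 * delta_in|^2
    """
    q, x, q_prev, x_prev = (
        np.atleast_1d(np.asarray(v, dtype=float)) for v in (q, x, q_prev, x_prev)
    )
    delta_out = q - q_prev   # change in quantizer output
    delta_in = x - x_prev    # change in quantizer input
    match_err = np.sum((q - x) ** 2)
    fluct_err = np.sum((delta_out - w2 * delta_in) ** 2)
    return float((1.0 - w1) * match_err + w1 * fluct_err)
```

With `w1 = 0` the function returns the traditional squared error; with `w1 = 1` only the mismatch between output and input fluctuation is penalized.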
To better understand the properties of this new error measure, we can
re-arrange the terms as follows:
##EQU1##
Since the only term in this equation which depends on Q is the first term,
finding the best quantizer output is equivalent to using a normal
quantizer but replacing the quantizer input signal X with the new input:
$$X' = (1 - w_1) X + w_1 (Q_{prev} + w_2 \Delta_{in})$$
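The rearranged equation itself is not reproduced in this text. One rearrangement that is consistent with the error measure and with the modified input defined above can be worked out as follows; this is a reconstruction from the surrounding definitions, not a copy of the original equation. The shorthand A is introduced here for convenience.

```latex
\begin{aligned}
A &= Q_{prev} + w_2 \Delta_{in}, \qquad \Delta_{out} = Q - Q_{prev} \\
E &= (1 - w_1)\,|Q - X|^2 + w_1\,|Q - A|^2 \\
  &= |Q|^2 - 2\,\langle Q,\ (1 - w_1) X + w_1 A \rangle
     + (1 - w_1)\,|X|^2 + w_1\,|A|^2 \\
  &= |Q - X'|^2 + (1 - w_1)\,|X|^2 + w_1\,|A|^2 - |X'|^2,
     \qquad X' = (1 - w_1) X + w_1 A \\
  &= |Q - X'|^2 + w_1 (1 - w_1)\,|X - A|^2
\end{aligned}
```

Only the first term of the last line depends on Q, so minimizing E over the available output values amounts to an ordinary nearest-value search against X'.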
For simple cases of $w_1$ and $w_2$, this quantizer is easier to
analyze. For example, if $w_1$ is equal to zero, the quantizer reduces
to the traditional quantizer since then
$$X' = X$$
As another example, if $w_2$ is equal to zero then the quantizer input
becomes:
$$X' = (1 - w_1) X + w_1 Q_{prev}$$
If, in addition, the quantizer is assumed to be a "perfect" quantizer so
that Q is equal to X', we see that the new quantizer algorithm is simply
reduced to one-pole smoothing of the input signal:
$$X' = (1 - w_1) X + w_1 X'_{prev}$$
On the other hand, if both $w_1$ and $w_2$ are equal to one, the
quantizer will simply ignore the current input and match the fluctuation
of the input signal, since the quantizer input then becomes:
$$X' = Q_{prev} + \Delta_{in}$$
Based on this analysis of special cases, we can make the following general
statements about the performance of this new quantizer. First, the
quantizer minimizes a weighted sum of the error in matching the input
signal and the error in matching the fluctuation of the input signal.
Second, both weights $w_1$ and $w_2$ are between 0 and 1, but each
weighting factor performs a different function. Third, as $w_1$
approaches one, the quantizer will give increasing importance to matching
the fluctuation rather than the input signal itself. Finally, as $w_2$
approaches zero, the amount of input fluctuation is scaled down so that
the output signal will fluctuate less than the input.
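The special cases discussed above can be checked numerically. The following sketch is illustrative only; the helper names `modified_input` and `smooth_perfect` are assumptions, and the "perfect" quantizer is modeled simply by setting the output equal to the modified input.

```python
def modified_input(x, x_prev, q_prev, w1, w2):
    """Modified quantizer input X' = (1 - w1) X + w1 (Q_prev + w2 * delta_in)."""
    delta_in = x - x_prev
    return (1.0 - w1) * x + w1 * (q_prev + w2 * delta_in)

def smooth_perfect(xs, w1, q0=0.0):
    """Case w2 = 0 with an ideal quantizer (Q = X'): one-pole smoothing."""
    out, q_prev, x_prev = [], q0, 0.0
    for x in xs:
        q = modified_input(x, x_prev, q_prev, w1, 0.0)  # Q equals X' here
        out.append(q)
        q_prev, x_prev = q, x
    return out
```

With `w1 = 0` the modified input equals the input regardless of `w2`; with `w1 = w2 = 1` it equals the previous output plus the input change; and `smooth_perfect([1.0, 1.0, 1.0], 0.5)` rises gradually toward the constant input, as expected of a one-pole smoother.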
Referring to FIG. 2, there is illustrated the quantizer of the subject
invention utilizing the equations stated previously. The input signal to
be quantized is X. The previous input sample, delayed in delay 12, is
subtracted from the current input sample at summer 11 to obtain the
difference signal $\Delta_{in}$. The input signal X is multiplied by the
factor $(1 - w_1)$ at multiplier 18. The difference signal $\Delta_{in}$
is multiplied by the weight control factor $w_2$ at multiplier 16. The
weight factors $w_1$ and $w_2$ are provided by adaptive weight control 13,
to be discussed later. The output from the quantizer 10 is provided to a
one-sample delay 15 to provide the previous quantized value $Q_{prev}$ at
summer 17. Summer 17 sums the weighted input difference
$w_2 \Delta_{in}$ and the previous quantized value $Q_{prev}$. The sum at
summer 17 is then multiplied by the weight $w_1$ at multiplier 19. The
output from multiplier 19, which is $w_1 (Q_{prev} + w_2 \Delta_{in})$, is
applied to summer 21, where it is summed with the output from multiplier
18, which is $(1 - w_1) X$, to provide the new input X' to the quantizer
10, which then provides the new output Q. The weight coefficients $w_1$
and $w_2$ control the reduced-fluctuation properties of the quantizer, and
should be adjusted based on the characteristics of the input signal. They
may either be fixed for a given application, or adapted using time-varying
input signal features such as power level and spectral tilt.
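The signal flow of FIG. 2 can be sketched in software as a small stateful object. This is a scalar-only illustration under assumptions: the class name, the uniform example codebook, and the initial delay values are not taken from the text, and the weights are passed in by the caller rather than produced by the adaptive weight control.

```python
import numpy as np

class ReducedFluctuationQuantizer:
    """Scalar sketch of the FIG. 2 structure (names are illustrative)."""

    def __init__(self, codebook, x_init=0.0, q_init=0.0):
        self.codebook = np.asarray(codebook, dtype=float)
        self.x_prev = float(x_init)  # contents of input delay 12
        self.q_prev = float(q_init)  # contents of output delay 15

    def step(self, x, w1, w2):
        delta_in = x - self.x_prev                # summer 11
        target = self.q_prev + w2 * delta_in      # multiplier 16 and summer 17
        x_mod = (1.0 - w1) * x + w1 * target      # multipliers 18, 19 and summer 21
        idx = int(np.argmin((self.codebook - x_mod) ** 2))  # quantizer 10
        q = float(self.codebook[idx])
        self.x_prev, self.q_prev = x, q           # update both delays
        return q

# An input hovering around a decision boundary of the codebook.
inputs = [1.4, 1.6, 1.4, 1.6]
plain = ReducedFluctuationQuantizer([0.0, 1.0, 2.0, 3.0], q_init=1.0)
steady = ReducedFluctuationQuantizer([0.0, 1.0, 2.0, 3.0], q_init=1.0)
plain_out = [plain.step(x, 0.0, 0.0) for x in inputs]     # traditional behavior
steady_out = [steady.step(x, 0.75, 0.0) for x in inputs]  # fluctuation reduced
```

In this toy run the traditional setting (`w1 = 0`) toggles between 1.0 and 2.0, while `w1 = 0.75`, `w2 = 0` holds the output at 1.0.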
In accordance with one preferred embodiment of the present invention, this
novel quantizer is utilized in the North American digital cellular speech
coding system, VSELP, in order to improve the performance of the coder for
acoustic background noise. In the standard coder, the quality is reduced
when the speaker is not talking because the background noise is distorted
by VSELP processing. The characteristics of the encoded noise fluctuate
randomly from frame to frame, introducing a "swirling" effect to the coder
output. In experiments with a computer simulation of the algorithm, we
have found that much of the problem is due to undesirable variations in
the quantized parameters of the VSELP model from one frame to the next. We
have obtained a performance improvement when the spectral information in
VSELP is encoded with the new reduced-fluctuation quantizer described
above.
Referring to FIG. 3, there is illustrated a block diagram of the VSELP
Encoder 30. A description of the system is found in EIA/TIA Standards in
EIA/TIA Project Number 2215 entitled, "Cellular System Dual-Mode Mobile
Station-Base Station Capability Standard, IS 54," issued December 1989.
This is published by the Electronic Industries Association Engineering
Department, 2001 Eye Street, N.W., Washington, D.C. Referring to FIG. 3,
the speech signal is sampled and converted in Analog-to-Digital Converter
31. The speech coding algorithm is a member of a class of speech coders
known as Code Excited Linear Predictive Coding (CELP), Stochastic Coding,
or Vector Excited Speech Coding. These techniques use codebooks to vector
quantize the excitation (residual) signal. The speech coding algorithm is
a variation on CELP called Vector-Sum Excited Linear Predictive Coding
(VSELP). VSELP uses a codebook which has a predefined structure such that
the computations required for the codebook search process can be
significantly reduced.
For a more detailed description of an encoder, see U.S. Pat. No. 4,817,157
of Gerson, issued Mar. 28, 1989 and incorporated herein by reference. The
LPC analysis refers to the analyzer 110. As stated, for each block of
speech, a set of linear predictive coding (LPC) parameters is produced in
accordance with prior art techniques by coefficient analyzers. Applicant's
improved quantizer can be used with any of these analyzers.
After the analog-to-digital conversion, the signal passes through a fourth
order Chebyshev type II highpass filter 32 with a filter response that is
3 dB down at 120 Hz and 40 dB down at 60 Hz. The covariance, SST and LPC
analysis is discussed in Section 2.1.3.3.2.4 on pages 24-30 of the EIA
standard cited above. The frame energy value R is discussed in Section
2.1.3.3.2.5 on pages 30 and 31. If the filter is unstable in that the
reflection coefficient is equal to or greater than 1.0, then the
uninterpolated coefficients are used for that subframe's coefficients. The
uninterpolated coefficients for subframe 1 are the previous frame's
coefficients.
In the VSELP coder, the power spectrum of the output signal is largely
determined by the LPC spectrum, as represented by the reflection
coefficients, and the overall power level. Since we believe it is
variations in these parameters that introduce the swirling effect, we use
the fluctuation-reducing quantizer for these coefficients. In addition, we
control the weighting factors $w_1$ and $w_2$ with an adaptive
algorithm based on the power level and rate of spectral change of the
input speech, such as the one described below.
The adaptive algorithm computes two intermediate variables before
estimating the weighting coefficients for the quantizer. The transition
strength T is estimated from the change of the power level and spectral
tilt from the previous frame, as given by:
##EQU2##
where
##EQU3##
P and K1 are the power level estimate and the first reflection coefficient
(spectral tilt) for the input speech frame.
The signal strength S is estimated by comparing the power P for the speech
frame with a long-term estimate of the background noise level $P_{noise}$.
If the power level for the input frame is only 3 dB above the noise level,
the signal strength is zero, while if the input power is 12 dB above the
noise level then the signal strength is one. This is implemented by the
formula:
##EQU4##
The transition and signal strengths are constrained to be within the range
from 0 to 1 by a clamping algorithm. Finally, the weighting coefficients
are calculated based on the transition and signal strengths using the
formulas:
$$w_1 = 0.75\,(1 - T)(1 - S)$$
and
$$w_2 = T$$
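The weight calculation can be sketched as follows. The formulas for $w_1$ and $w_2$ and the clamping to the range 0 to 1 come from the text. The signal strength formula is not reproduced in this text, so the linear ramp in decibels between the stated 3 dB and 12 dB points is an assumption, as is treating P as a power (hence 10 log10). All function names are illustrative.

```python
import math

def clamp01(v):
    """Constrain a value to the range from 0 to 1."""
    return min(1.0, max(0.0, v))

def signal_strength(p, p_noise, lo_db=3.0, hi_db=12.0):
    """Assumed form: 0 at 3 dB above the noise level, 1 at 12 dB above it,
    with a linear ramp in dB between those two points."""
    snr_db = 10.0 * math.log10(p / p_noise)
    return clamp01((snr_db - lo_db) / (hi_db - lo_db))

def quantizer_weights(t, s):
    """Weights from transition strength T and signal strength S."""
    t, s = clamp01(t), clamp01(s)
    w1 = 0.75 * (1.0 - t) * (1.0 - s)
    w2 = t
    return w1, w2
```

For a stationary frame at the noise level (T = 0, S = 0) this gives `w1 = 0.75` and `w2 = 0`; for a strong signal (S = 1) it gives `w1 = 0`, which recovers the traditional quantizer.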
In one preferred embodiment of the present invention, the covariance SST
and LPC analysis and quantization is performed under program control using
the Encoder of FIG. 3 with the modifications of FIG. 4 in place of the
elements within the dashed line 35. In the modified Encoder of FIG. 4, the
covariance output at 41 is applied to the frame energy analysis 42 to get
Frame energy value R, to SST 43 for spectral smoothing and to quantizer
weight adaptation 44 to determine $w_1$ and $w_2$. The SST output
passes to FLAT 45 to obtain the reflection coefficients $r_i$. The
reflection coefficients $r_i$ are applied to reduced-fluctuation quantizer
A. The R output from frame energy analysis 42 is applied to the second
reduced-fluctuation quantizer B. Quantizers A and B are each like the
quantizer of FIG. 2. The quantizer weight adaptation calculates weights
$w_1$ and $w_2$ using frame energy R from frame energy analysis 42 and the
covariance matrix ac(i,j) output from 41. The weight values $w_1$ and
$w_2$ are used to quantize the reflection coefficients $r_i$ and frame
energy R. The program steps follow the flow chart of FIG. 5 and are as
follows:
1. Calculate signal covariance matrix
##EQU5##
2. Calculate frame energy,
##EQU6##
where N is the length of the summation.
3. Update noise power estimate $P_{noise}$:
$P_{noise} = P_{noise} \times 1.007$
if $P_{noise} > R$ then $P_{noise} = R$
Calculate signal strength S,
##EQU7##
Constrain S between 0 and 1.
##EQU8##
Calculate $T = 1.25\,(T' - 0.2)$.
Constrain T between 0 and 1.
Calculate weights $w_1$ and $w_2$ for the reduced-fluctuation quantizer:
$w_1 = 0.75\,(1 - T)(1 - S)$
$w_2 = T$
4. Quantize R with reduced-fluctuation quantizer B to get the quantized
value of R.
5. Perform spectral smoothing (SST).
6. Calculate reflection coefficients $r_i$ for i = 1 to 10 and quantize
using reduced-fluctuation quantizer A to get the quantized $r_i$.
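Two parts of step 3 are fully specified in the text and can be sketched directly: the noise power update and the mapping from the intermediate value T' to the transition strength T. The function names are assumptions, and T' is taken as an input because its defining equation is not reproduced here.

```python
def update_noise_power(p_noise, r, growth=1.007):
    """Let the noise estimate drift upward slowly, but never above the
    current frame energy R, so that it tracks the quietest recent frames."""
    p_noise = p_noise * growth
    if p_noise > r:
        p_noise = r
    return p_noise

def transition_strength(t_raw):
    """T = 1.25 (T' - 0.2), constrained to the range from 0 to 1."""
    return min(1.0, max(0.0, 1.25 * (t_raw - 0.2)))
```

Called once per frame, `update_noise_power` drops immediately to the frame energy of a quiet frame and otherwise rises by 0.7 percent per frame, so brief speech bursts raise the estimate only slightly.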
When this reduced-fluctuation quantizer is used for the reflection
coefficients and overall gain in the VSELP speech coder, the performance
is improved for acoustic background noise, while the speech quality is
unaffected. When the speaker is not talking, the input signal is at the
background noise level with no transitions, so that $w_1$ is 0.75 and
$w_2$ is zero and the quantizer reduces fluctuation in the spectral
parameters significantly. When speech is present, $w_1$ goes to zero so
that the quantizer matches the input signal only and performs in the same
way as the standard algorithm.
In addition, the reduced-fluctuation quantizer could be combined with the
insertion of "comfort noise" to further increase the naturalness of the
background noise. When the input clearly consists only of background
noise, the encoded signal could be replaced by artificially generated
noise. If it is not clear whether the input is only background noise, the
reduced-fluctuation quantizer can be used as described previously.
The use of comfort noise by itself leads to speech dropouts whenever an
input frame containing speech is incorrectly detected as a noise-only
frame. By combining the reduced-fluctuation quantizer with comfort noise,
we can use a conservative approach for detecting noise-only frames so as
to minimize the speech dropout problem. In cases where noise-only frames
are detected as speech frames, the use of the reduced-fluctuation
quantizer will improve the quality of the coder output. Thus, the
integrated approach involving comfort noise and the reduced-fluctuation
quantizer may lead to improved over-all performance.
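The conservative decision described above can be sketched as a simple threshold rule. Everything in this example is hypothetical: the text does not define a detector score, a threshold value, or these names. The sketch only shows the intent that comfort noise is substituted when the frame is clearly noise only, and that every other frame goes through the reduced-fluctuation quantizer.

```python
from enum import Enum

class FrameMode(Enum):
    COMFORT_NOISE = "comfort_noise"
    REDUCED_FLUCTUATION = "reduced_fluctuation"

def select_frame_mode(noise_only_score, threshold=0.95):
    """Choose how to handle a frame.

    noise_only_score is a hypothetical detector confidence in [0, 1] that
    the frame contains only background noise. A high threshold makes the
    detector conservative, which limits speech dropouts.
    """
    if noise_only_score >= threshold:
        return FrameMode.COMFORT_NOISE
    return FrameMode.REDUCED_FLUCTUATION
```

A noise-only frame that falls below the threshold is still coded with the reduced-fluctuation quantizer, so a missed detection costs bit rate rather than quality.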
Referring to FIG. 6, there is illustrated a block diagram of an encoder
system. In accordance with the system of FIG. 6, the input speech is
applied to a speech/noise detector to determine if there is speech or
noise at the input. If speech is detected, the input is applied through a
full encoder with the smoothed quantizer, as in FIGS. 2 and 3 or 4, to the
channel and to a receiver for decoding the speech and reproducing the
speech signal. In the case of the full encoder, the quantizer output is at
the normal bit rate. If the speech/noise detector determines that the
input is noise, then the simple encoder with reduced bit rate is used. In
accordance with another embodiment, shown in FIG. 7, the receiver switches
the input signal to the full decoder if speech is detected and, if no
speech is detected or noise is detected, applies the pre-stored comfort
noise to the output. The speech/noise sensor may include the detector
discussed above in connection with FIG. 6.
OTHER EMBODIMENTS
Although the present invention and its advantages have been described in
detail, it should be understood that various changes, substitutions and
alterations can be made herein without departing from the spirit and scope
of the invention as defined by the appended claims.