Claims
What is claimed is:
1. An echo canceller which comprises:
a first adaptive filter for estimating an impulse response of an echo path
between reception and transmission sides, holding the estimated impulse
response as a first filter coefficient, and convoluting the first filter
coefficient with a received input signal to thereby synthesize a first
pseud echo;
a first subtracter for subtracting the first pseud echo synthesized by said
first adaptive filter from a transmitting input signal to thereby cancel
an echo contained in the transmitting input signal;
a second adaptive filter for estimating an impulse response of the echo
path, holding the estimated impulse response as a second filter
coefficient, and convoluting the received input signal with the second
filter coefficient to thereby synthesize a second pseud echo;
a second subtracter for subtracting the second pseud echo synthesized by
said second adaptive filter from the transmitting input signal to thereby
cancel the echo;
a first input-output power ratio estimator which detects short time powers
of input and output signals of said first subtracter to calculate a first
I/O power ratio of an input-short-time-power/output-short-time-power of
said first subtracter;
a second input-output power ratio estimator which detects short time powers
of input and output signals of said second subtracter to calculate a
second I/O power ratio of an input-short-time-power/output-short-time-power
of said second subtracter;
a divider for dividing the second I/O power ratio by the first I/O power
ratio to calculate a third ratio;
a voice detector for detecting a short time power of the received input
signal to determine whether far-end speech is present or absent, wherein
said voice detector controls said second adaptive filter to execute
adaptation of said second adaptive filter for renewal of an impulse
response of an echo path when far-end speech is present and otherwise to
suspend the adaptation of said second adaptive filter when far-end speech
is absent; and
a double talk detector for controlling said first adaptive filter to
execute or suspend adaptation of said first adaptive filter according to
the determination results of said voice detector.
2. The echo canceller as claimed in claim 1, wherein said double talk
detector controls said first adaptive filter to suspend adaptation of said
first adaptive filter for renewal of an impulse response of an echo path
when said voice detector determines that far-end speech is absent, and
otherwise to execute the adaptation of said first adaptive filter when
said voice detector determines that far-end speech is present and at
least one of the following conditions (a) and (b) is satisfied:
(a) either the first or the second I/O power ratio is greater than a first
threshold; and
(b) the third ratio is greater than a fixed second threshold.
3. The echo canceller as claimed in claim 1, wherein said double talk
detector comprises:
a threshold controller for controlling the first threshold to be produced
therefrom, which receives the first I/O power ratio generated by said
first I/O power ratio estimator to thereby control the first threshold to
be produced according to an output result of said voice detector in such a
manner that, the first threshold is gradually increased so as not to
exceed the first ratio when far-end speech is present and besides the
first ratio is greater than the first threshold, while the first threshold
is gradually decreased when the first ratio is smaller than the current
threshold irrespective of the presence of far-end speech, and otherwise
the previous value of the first threshold is held when no speech is
present; and
an adaptation controller for controlling an adaptation of said first
adaptive filter in such a manner that said adaptation controller
determines to execute the adaptation of said first adaptive filter in
either case when far-end speech is present and besides at least one of the
first and second ratios is greater than the first threshold or in a case
when the third ratio is greater than the second threshold.
4. The echo canceller according to claim 3, wherein said threshold
controller comprises: a memory for storing the first threshold; a
comparator for comparing the first I/O power ratio with the first
threshold; a switch which selects a constant when the comparator has
determined that the first I/O power ratio is greater than the first
threshold and otherwise which selects a constant; a subtracter which
subtracts the first threshold from the first I/O power ratio; a multiplier
which calculates a product of a time constant selected by the switch and
an output of the subtracter; an adder for adding together an output of the
multiplier and the first threshold; a limiter for limiting the maximum and
minimum of an output of the adder; a switch which selects an output of the
limiter when far-end speech is present, and which selects an output of the
memory when no speech is present, in accordance with the output signal of
said voice detector; a delay unit for delaying an output of the switch by
one sample and the resultant output thereof is transferred to the memory.
5. The echo canceller according to claim 3, wherein said adaptation
controller comprises:
a first comparator for comparing the first threshold created by the
threshold controller with the first I/O power ratio calculated by the
first input-output power ratio estimator to yield a positive output when
the first ratio is greater than the other and otherwise to yield a
negative output;
a second comparator for comparing the second ratio calculated by the second
input-output power ratio estimator with the first threshold to yield a
positive output when the second ratio is greater than the other and
otherwise to yield a negative output;
a third comparator for comparing the third ratio calculated by the divider
with the second fixed threshold to yield a positive output when the third
ratio is greater than the other and otherwise to yield a negative output;
an OR circuit for determining a logical OR of the three comparators; and
an AND circuit for determining a logical AND between a determination result
of the voice detector and a logic output of the OR circuit.
6. The echo canceller as claimed in claim 1, wherein the renewal of the
first and second filter coefficients of said first and second adaptive
filters is performed according to a Normalized Least Mean Square
algorithm, and a step gain of the first adaptive filter is smaller than a
step gain of the second adaptive filter, where the step gain of the second
adaptive filter is 1.
7. The echo canceller as claimed in claim 1, wherein a reference input
signal of said second adaptive filter and an input signal of said second
subtracter are band-limited to 1/M, where M is an integer not less than 2,
of a pass band of said first adaptive filter by band-limiting filter means
and the resultant band-limited signals are further decimated to 1/M by
decimator means.
8. The echo canceller as claimed in claim 7, wherein said band-limiting
filter means is a low-pass filter whose cut-off frequency is lower than 1
kHz.
9. The echo canceller as claimed in claim 7, wherein said second adaptive
filter comprises:
a first X-memory for storing a received input signal corresponding to N
samples where N is an integer;
an H-memory for storing filter coefficients corresponding to N samples;
a first convoluter for convoluting the latest data corresponding to N
samples stored in the first X-memory with data stored in the H-memory to
thereby synthesize a pseud echo;
a first coefficient renewer for renewing the filter coefficients stored in
the H-memory by using an output signal of the first subtracter and the
data stored in the first X-memory when the voice detector has determined
that far-end speech is present;
a first delay for delaying the received input signal by M samples;
a second delay for delaying the transmitting input signal by M samples;
a third delay for delaying a detection result of said voice detector by M
samples;
a second X-memory for storing an output of the first delay corresponding to
N samples;
a second convoluter for convoluting the data corresponding to N samples
stored in the second X-memory with the data stored in the H-memory to
synthesize a pseud echo;
a third subtracter for subtracting an output of the second convoluter from
an output of the second delay; and
a second coefficient renewer for renewing the filter coefficients stored in
the H-memory by using an output signal and step gain of the third
subtracter and the data stored in the second X-memory when a detection
result of the voice detector corresponding to the last M samples delayed
by the third delay is determined that far-end speech is present.
10. The echo canceller as claimed in claim 1 further comprising a step gain
controller for controlling a step gain to be applied to said first
adaptive filter according to both the determination result of said voice
detector and the third ratio.
11. The echo canceller as claimed in claim 10, wherein said step gain
controller further comprises: a divider for calculating a reciprocal of
the third ratio; a subtracter for subtracting an output of said divider
from 1; a limiter for limiting an output of said subtracter within a
specified range; and an averaging circuit for averaging an output of said
limiter by a specified time constant when said voice detector has detected
speech, and otherwise for yielding a value of a previous step gain when
said voice detector has detected no speech.
12. The echo canceller as claimed in claim 10, wherein renewal of the first
and second filter coefficients of said first and second adaptive filters
is performed according to a Normalized Least Mean Square algorithm where
the step gain of said second filter is 1.
13. The echo canceller as claimed in claim 10, wherein a reference input
signal of said second adaptive filter and an input signal of said second
subtracter are band-limited to 1/M, where M is an integer not less than 2,
of a pass band of said first adaptive filter by band-limiting filter means
and the resultant band-limited signals are further decimated to 1/M by
decimator means.
14. The echo canceller as claimed in claim 13, wherein said band-limiting
filter means is a low-pass filter whose cut-off frequency is lower than 1
kHz.
15. The echo canceller as claimed in claim 13, wherein said second adaptive
filter comprises:
a first X-memory for storing a received input signal corresponding to N
samples where N is an integer;
an H-memory for storing filter coefficients corresponding to N samples;
a first convoluter for convoluting the latest data corresponding to N
samples stored in the first X-memory with data stored in the H-memory to
thereby synthesize a pseud echo;
a first coefficient renewer for renewing the filter coefficients stored in
the H-memory by using an output signal of the first subtracter and the
data stored in the first X-memory when the voice detector has determined
that far-end speech is present;
a first delay for delaying the received input signal by M samples;
a second delay for delaying the transmitting input signal by M samples;
a third delay for delaying a detection result of said voice detector by M
samples;
a second X-memory for storing an output of the first delay corresponding to
N samples;
a second convoluter for convoluting the data corresponding to N samples
stored in the second X-memory with the data stored in the H-memory to
synthesize a pseud echo;
a third subtracter for subtracting an output of the second convoluter from
an output of the second delay; and
a second coefficient renewer for renewing the filter coefficients stored in
the H-memory by using an output signal and step gain of the third
subtracter and the data stored in the second X-memory when a detection
result of the voice detector corresponding to the last M samples delayed
by the third delay is determined that far-end speech is present.
16. The echo canceller as claimed in claim 1 further comprising a loss
controller for calculating a loss to be added to the transmitting output
signal according to a determination result of the voice detector, the
first ratio, and a third ratio; and an insertion loss circuit for
attenuating an output signal of said first subtracter by the loss
calculated by said loss controller.
17. The echo canceller as claimed in claim 16, wherein said loss controller
further comprises:
a first averaging circuit for averaging the third ratio by a specified time
constant to be yielded therefrom when the voice detector has detected
speech, and otherwise holding a previous operation result to be yielded
when no speech has been detected;
a comparator for comparing an output of the first averaging circuit with a
predetermined third threshold, thereby to yield a positive output when the
output of the first averaging circuit is greater than the third threshold
and otherwise to yield a negative output;
an AND circuit for determining a logical AND between an output of said
comparator and an output of the voice detector;
a divider for dividing a predetermined maximum loss by the first ratio;
a second averaging circuit for averaging an output of said divider by a
specified time constant when the voice detector has detected speech, and
otherwise holding a previous operation result thereof when no speech has
been detected; and
a switch for selecting an output of said second averaging circuit as a loss
when the AND circuit has yielded a positive output and otherwise selecting
a numerical value 1.
18. The echo canceller as claimed in claim 16, wherein a reference input
signal of said second adaptive filter and an input signal of said second
subtracter are band-limited to 1/M, where M is an integer not less than 2,
of a pass band of said first adaptive filter by band-limiting filter means
and the resultant band-limited signals are further decimated to 1/M by
decimator means.
19. The echo canceller as claimed in claim 18, wherein said band-limiting
filter means is a low-pass filter whose cut-off frequency is lower than 1
kHz.
20. The echo canceller as claimed in claim 18, wherein said second adaptive
filter comprises:
a first X-memory for storing a received input signal corresponding to N
samples where N is an integer;
an H-memory for storing filter coefficients corresponding to N samples;
a first convoluter for convoluting the latest data corresponding to N
samples stored in the first X-memory with data stored in the H-memory to
thereby synthesize a pseud echo;
a first coefficient renewer for renewing the filter coefficients stored in
the H-memory by using an output signal of the first subtracter and the
data stored in the first X-memory when the voice detector has determined
that far-end speech is present;
a first delay for delaying the received input signal by M samples;
a second delay for delaying the transmitting input signal by M samples;
a third delay for delaying a detection result of said voice detector by M
samples;
a second X-memory for storing an output of the first delay corresponding to
N samples;
a second convoluter for convoluting the data corresponding to N samples
stored in the second X-memory with the data stored in the H-memory
to synthesize a pseud echo;
a third subtracter for subtracting an output of the second convoluter from
an output of the second delay; and
a second coefficient renewer for renewing the filter coefficients stored in
the H-memory by using an output signal and step gain of the third
subtracter and the data stored in the second X-memory when a detection
result of the voice detector corresponding to the last M samples delayed
by the third delay is determined that far-end speech is present.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an echo canceller for controlling acoustic
echoes in loudspeaker telephones and teleconference systems as well as
echoes arising in the two-wire to four-wire hybrids of telephone lines.
2. Description of the Prior Art
Acoustic echo cancellers generally used in teleconference systems are
described here as an example. A teleconference system uses loudspeakers
and microphones to implement speech communication. Far-end speech sent
from the far end to the near end is reproduced by a loudspeaker and
reaches the near-end speakers. The same reproduced speech is also
reflected by walls and the like and enters a microphone as an acoustic
echo. This acoustic echo is carried back to the far end over the line and
reproduced by the loudspeaker there. The far-end speakers therefore hear
their own speech returned as an echo, delayed by the round-trip time of
the line.
In view of the above, acoustic echo cancellers have been developed. Such a
canceller estimates the impulse response of the acoustic echo path from
the loudspeaker to the microphone, convolutes the input signal of the
loudspeaker with the estimated impulse response to synthesize a pseud
echo, and subtracts the pseud echo from an output of the microphone.
Generally, an echo canceller adaptively estimates the impulse response of
an echo path by using speech as it is spoken. To estimate the impulse
response of an acoustic echo path accurately, the impulse response must be
renewed (hereinafter also referred to as "adapted") only while a far-end
speaker alone is speaking (referred to as a single talk). Renewal is
naturally suspended when no far-end speaker is speaking or when only a
near-end speaker is speaking, and it should also be suspended during a
double-talk period, in which a near-end speaker and a far-end speaker are
speaking simultaneously. An echo canceller is therefore provided with a
double talk detection function, which detects a double talk and suspends
the renewal of the impulse response.
An acoustic echo canceller using a double talk detector, as disclosed in
U.S. Pat. No. 4,894,820, is described below with reference to FIG. 17.
As shown in FIG. 17, a subtracter 129 determines the difference in level
between the signals Lrin and Lres, and outputs a signal Acoms to an adder
130 and a threshold control section 131. The adder adds a margin to the
signal Acoms to generate a signal FLG which is applied to the threshold
control section 131. The threshold control section 131 receives Acoms and
FLG, and then generates a variable double-talk detection threshold TRIM. A
comparator 132 compares a signal Lrin with a reference signal XTH,
detects the idle state of the received signal Rin, and generates an
estimation function inhibit signal INH and a control inhibit signal S32.
When Lrin<XTH, the inhibit signal INH is generated to inhibit updating,
and the control inhibit signal S32 is generated to inhibit updating of the
double-talk detection threshold TRIM by the threshold control section 131.
Meanwhile, a comparator 133 compares a signal Lsin with a reference signal
YTH, and detects the idle state of the signal Sin when Lsin<YTH. In that
case, a clear signal CL1 is generated to clear the estimation function
inhibit signal INH to zero, and a clear signal CL2 is generated to clear
the double-talk detection threshold TRIM output from the threshold control
section 131 to zero. A comparator 134 compares the threshold TRIM
calculated by the threshold control section 131 and the signal FLG output
by the adder 130. When TRIM≥FLG, the double-talk state is detected
to inhibit the estimation function of an adaptive digital filter. When
TRIM<FLG, the single-talk state is detected to set the inhibit signal INH
to "0". The comparator 134 also generates a control signal S34 which
selects the method of control of the threshold value TRIM according to the
detected state.
Thus, the conventional double talk detector measures a short time power of
an input signal to a loudspeaker and a short time power of an output of
the echo canceller and calculates an averaged value of the ratio (Acom) of
the short time power of the loudspeaker input signal to the short time
power of the echo canceller output, thereby estimating a loss (TRIM) of
the echo path including the echo canceller. In the case of a single talk,
the value of Acom increases generally monotonically during convergence,
while the adaptive filter that estimates the impulse response of the echo
path has not yet sufficiently estimated the characteristics of the echo
path, and Acom becomes approximately constant once the adaptive filter
has sufficiently estimated those characteristics.
As Acom increases, TRIM also increases gradually. In a single-talk state,
a value (FLG) resulting from adding a slight margin to Acom becomes
greater than TRIM. Since FLG is greater than TRIM, the double talk
detector executes renewal (updating) of the filter coefficient of
the adaptive filter. In the case of a double-talk state, on the other
hand, a near-end speaker's signal is contained in the output signal of
the echo canceller, so the output of the echo canceller increases and both
Acom and FLG abruptly decrease until FLG becomes smaller than TRIM. Thus,
the double talk detector, upon detection of the fact that FLG has become
smaller than TRIM, suspends renewal of the impulse response, thereby
preventing the impulse response estimated by the adaptive filter from
being disturbed in the double-talk state.
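The comparator logic of this conventional detector can be sketched as follows. This is only an illustrative reading of the description of FIG. 17: the function name, the returned labels, and the assumption that all levels are expressed in dB (so that a level difference stands for a power ratio) are not taken from the cited patent, and the clearing of TRIM by comparator 133 is omitted.

```python
def conventional_dtd(lrin, lres, trim, margin, xth):
    """Classify one instant from signal levels (assumed here to be in dB).

    lrin   : level of the received input signal Rin
    lres   : level of the echo canceller output (residual)
    trim   : current double-talk detection threshold TRIM
    margin : margin added by adder 130
    xth    : reference level below which Rin is treated as idle
    """
    if lrin < xth:
        # Comparator 132: received signal idle, updating is inhibited.
        return "idle"
    acoms = lrin - lres      # subtracter 129: level difference
    flg = acoms + margin     # adder 130: add a slight margin
    if trim >= flg:
        # Comparator 134: residual too large, treated as double talk.
        return "double_talk"
    return "single_talk"
```

For example, with a hypothetical TRIM of 25 dB and a margin of 3 dB, a 30 dB level difference is classified as single talk, while a 10 dB difference is classified as double talk. An echo path change also lowers the level difference, which is why this detector cannot tell the two cases apart without waiting.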
When the echo path characteristic changes, for example because the
near-end speaker or the microphone moves, the impulse response estimated
by the adaptive filter no longer agrees with the impulse response of the
changed echo path. As a result, there is an increasing
residual echo so that FLG decreases as in a double-talk state, in which
case the double talk detector suspends adaptation of the echo canceller.
TRIM is gradually decreased for the period during which a received input
is present and adaptation is kept suspended, so that the echo path change
can be managed. After a while, TRIM becomes smaller than FLG, where the
double talk detector resumes adaptation of the echo canceller.
The conventional double talk detector described above may require a few
seconds to discriminate between a double talk and an echo path change,
and an echo is generated during that time. Also, if TRIM is decreased
faster in order to improve tracking of echo path changes, the echo
canceller would erroneously adapt during a long double talk, in some cases
disturbing the filter coefficient of the echo canceller. In the
literature (IEEE International Communications Conference Vol. 3 46.5 1985,
"A Double Talk Detection Method for an Echo Canceller"), the period for
suspending the renewal of the filter coefficient is limited to 3 seconds,
based on the finding that 96% of double talks in 5-member teleconferences
last no longer than 3 seconds. In this case, however, the response to an
echo path change may be delayed by up to 3 seconds.
SUMMARY OF THE INVENTION
Accordingly, an essential objective of the present invention is to provide
an echo canceller which is as stable in a double-talk state as the
conventional one and which can also adapt promptly to any echo path
change upon its occurrence.
In order to achieve the above objective, the present invention provides an
echo canceller which comprises: a first adaptive filter for estimating an
impulse response of an echo path between reception and transmission side
according to an NLMS (Normalized Least Mean Square) algorithm and
convoluting a filter coefficient and a received input signal together to
synthesize a first pseud echo only when it has been determined that it is
in a single talk state; a first subtracter for subtracting the first pseud
echo synthesized by the first adaptive filter from a transmitting input
signal to thereby cancel an echo contained in the transmitting input
signal, thus synthesizing a transmitting output signal; a second adaptive
filter which is provided to determine whether it is an echo path change or
a double talk, and which, when it has been determined that a received
signal is present, renews a filter coefficient by an NLMS algorithm and
convolutes the received input signal and the filter coefficient together
to synthesize a second pseud echo; a second subtracter for subtracting the
second pseud echo synthesized by the second adaptive filter from the
transmitting input signal; a first input-output power ratio estimator for
detecting short time power of input and output signals of the first
subtracter and calculating the ratio therebetween (hereinafter, referred
to as a first ratio); a second input-output power ratio estimator for
detecting short time power of input and output signals of the second
subtracter and calculating the ratio therebetween (hereinafter, referred
to as a second ratio); a divider for dividing the second ratio by the
first ratio; a voice detector for detecting short time power of the
received input signal to determine whether far-end speech is present or
absent; and a double talk detector for controlling to execute or suspend
renewal of the filter coefficients of the first and second adaptive
filters by using a determination result of the voice detector, the first
and second ratios, and an output of the divider (hereinafter, referred to
as a third ratio).
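As an illustration of the building blocks named above, the following is a minimal Python sketch of one NLMS adaptive filter together with its subtracter, and of a short-time input/output power ratio estimator. The class names, the regularization constant eps, the exponential smoothing used for the short-time power, and the 3-tap echo path in the demonstration are all assumptions made for this sketch; the text specifies only that NLMS is used and that short-time powers at the input and output of each subtracter are compared.

```python
import random


class NlmsFilter:
    """N-tap NLMS adaptive filter with its subtracter (illustrative sketch)."""

    def __init__(self, taps, step_gain, eps=1e-8):
        self.h = [0.0] * taps    # estimated impulse response (filter coefficients)
        self.x = [0.0] * taps    # recent received input samples, newest first
        self.step_gain = step_gain
        self.eps = eps           # assumed regularizer to avoid division by zero

    def process(self, rx, tx, adapt):
        """Return tx minus the pseudo echo; renew the coefficients if adapt is True."""
        self.x = [rx] + self.x[:-1]
        pseudo_echo = sum(h * x for h, x in zip(self.h, self.x))
        err = tx - pseudo_echo
        if adapt:
            norm = sum(x * x for x in self.x) + self.eps
            g = self.step_gain * err / norm
            self.h = [h + g * x for h, x in zip(self.h, self.x)]
        return err


class PowerRatio:
    """Short-time power ratio: subtracter input power / subtracter output power."""

    def __init__(self, alpha=0.01, eps=1e-12):
        self.alpha = alpha       # assumed smoothing constant for short-time power
        self.p_in = eps
        self.p_out = eps

    def update(self, sub_in, sub_out):
        self.p_in += self.alpha * (sub_in * sub_in - self.p_in)
        self.p_out += self.alpha * (sub_out * sub_out - self.p_out)
        return self.p_in / self.p_out


def demo(samples=4000, seed=0):
    """Single-talk run over a hypothetical 3-tap echo path; returns the final ratio."""
    rng = random.Random(seed)
    echo_path = [0.5, -0.3, 0.1]
    filt = NlmsFilter(taps=8, step_gain=1.0)
    ratio = PowerRatio()
    history = [0.0] * len(echo_path)
    r = 1.0
    for _ in range(samples):
        rx = rng.uniform(-1.0, 1.0)
        history = [rx] + history[:-1]
        # Transmitting input contains the echo only (no near-end speech).
        tx = sum(c * s for c, s in zip(echo_path, history))
        out = filt.process(rx, tx, adapt=True)
        r = ratio.update(tx, out)
    return r
```

In this noise-free single-talk run the ratio grows far above 1 as the filter converges. In the arrangement described here, two such filter and estimator pairs would run in parallel, giving the first and second ratios, and the third ratio would be their quotient.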
The second adaptive filter executes adaptation when the voice detector has
detected far-end speech. When the voice detector has detected no far-end
speech, the double talk detector suspends adaptation of the first adaptive
filter. On the other hand, when it has been determined that far-end speech
is present, the double talk detector controls to execute adaptation of the
first adaptive filter if at least one of the following conditions (1) and
(2) is satisfied:
(1) either the first or the second ratio is greater than a first
threshold; and
(2) the third ratio is greater than a second threshold.
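The decision rule stated above can be written as a small predicate. This is a sketch only: the function and parameter names are hypothetical, and the threshold values used in the usage note are illustrative numbers, not values prescribed at this point in the text.

```python
def first_filter_should_adapt(far_end_present, r1, r2, r3, th1, th2):
    """Return True when the first adaptive filter should execute adaptation.

    far_end_present : decision of the voice detector
    r1, r2          : first and second input/output power ratios
    r3              : third ratio, r2 / r1
    th1, th2        : first (variable) and second (fixed) thresholds
    """
    if not far_end_present:
        return False                   # no far-end speech: always suspend
    cond1 = r1 > th1 or r2 > th1       # condition (1)
    cond2 = r3 > th2                   # condition (2)
    return cond1 or cond2
```

With th1 = 25 and th2 = 4 as example values, a converged single talk (r1 = 30, r2 = 40) adapts through condition (1), a double talk (r1 = r2 = 2, r3 = 1) suspends adaptation, and a state in which only the second ratio has recovered (r1 = 1.5, r2 = 9, r3 = 6) adapts through condition (2).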
The first threshold has an initial value of 1 and is variably controlled
so as to increase gradually, with a time constant slower than the
converging speed of the first adaptive filter, while the first adaptive
filter is executing adaptation; its maximum is set to a value in a range
of approximately 25 to 100. The second threshold is a predetermined fixed
value of around 4 to 9, which is small compared with the first threshold.
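One possible form of the first-threshold control is a first-order tracker of the first ratio with separate rise and fall constants and a limiter, in the manner of the threshold controller recited in claim 4. The sketch below is an assumption-laden illustration: the constants k_up and k_down, the limits, and the choice to hold the threshold whenever far-end speech is absent are hypothetical values and choices, not figures given in the text.

```python
def update_first_threshold(th1, r1, far_end_present,
                           k_up=0.001, k_down=0.0005,
                           th_min=1.0, th_max=100.0):
    """Return the next value of the first threshold (one update per sample).

    th1             : current first threshold (initial value 1)
    r1              : current first input/output power ratio
    far_end_present : decision of the voice detector
    k_up, k_down    : assumed small constants giving a slow rise and fall
    th_min, th_max  : assumed limiter bounds
    """
    if not far_end_present:
        return th1                       # hold the previous value
    k = k_up if r1 > th1 else k_down     # select the constant by comparison
    th_new = th1 + k * (r1 - th1)        # move a small step toward r1
    return min(max(th_new, th_min), th_max)
```

Because each step covers only a small fraction of the gap, the threshold approaches the first ratio from below without exceeding it while rising, and it falls back slowly when the first ratio drops.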
With the above arrangement, the echo canceller of the present invention
operates as follows.
In a single talk state, the voice detector detects far-end speech, where
adaptation of the second adaptive filter is executed. The echo is
cancelled by the second subtracter, and the second ratio increases
monotonically. Meanwhile, the first threshold is controlled by the double
talk detector so as to increase slowly, so that the second ratio becomes
greater than the first threshold. Thus, the double talk detector
executes adaptation of the first adaptive filter.
In a double talk period, near-end speech is added to the echo and, passing
through the first and second subtracters, appears in their outputs, thus
causing the first and second ratios to abruptly decrease. If a double talk
takes place, the first and second ratios become smaller than the first
threshold. Further, since the first and second ratios decrease in like
manner, the third ratio becomes approximately 1, smaller than the second
threshold. As a result, the double talk detector suspends adaptation of
the first adaptive filter, so that the filter coefficient of the first
adaptive filter is no longer renewed. Thus, the coefficient of the first
adaptive filter can be prevented from being disturbed by near-end speech.
In an early stage following an echo path change, there will be an increased
difference between the impulse response of the echo path that has changed
and the impulse response estimated by the first and second adaptive
filters, so that the echo will not be cancelled by the first and second
subtracters. The moment that the echo path has changed, the first and
second ratios decrease as in a double talk, where the double talk detector
suspends adaptation of the first adaptive filter whereas the second
adaptive filter executes adaptation. As a result, the impulse response
estimated by the second adaptive filter gradually approaches the impulse
response of the echo path that has changed, allowing the second subtracter
to cancel the echo. Since adaptation of the first adaptive filter is
suspended, the first ratio remains approximately unchanged after the echo
path change, while only the second ratio is gradually increasing. Thus,
the third ratio resulting from dividing the second ratio by the first
ratio increases until it becomes greater than the second threshold. The
double talk detector detects that the third ratio has become greater than
the second threshold, and controls to immediately execute adaptation of
the coefficient of the first adaptive filter. The time required for the
third ratio to become greater than the second threshold is only the short
time that the second adaptive filter needs to raise the echo return
loss enhancement to approximately 6 to 10 dB.
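The correspondence between a second threshold of around 4 to 9 and an enhancement of roughly 6 to 10 dB follows from the usual conversion of a power ratio to decibels, assuming the ratios are power ratios as defined for the input-output power ratio estimators. The helper names below are hypothetical.

```python
import math


def ratio_to_db(power_ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_ratio)


def db_to_ratio(db):
    """Convert decibels to a power ratio."""
    return 10.0 ** (db / 10.0)


# A third ratio of 4 means the second ratio is 4 times the first: about 6.02 dB.
print(round(ratio_to_db(4.0), 2))   # 6.02
# A third ratio of 9 corresponds to about 9.54 dB.
print(round(ratio_to_db(9.0), 2))   # 9.54
```

Since the first ratio stays roughly constant while the first adaptive filter is suspended, the third ratio crosses the second threshold once the second adaptive filter alone has improved its cancellation by that many decibels.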
Whereas the prior art relied on elapsed time to discriminate between a
double talk and an echo path change, the present invention uses the second
adaptive filter to discriminate between the two fundamentally and yet in a
short time. As a result, the echo canceller of the present invention can
substantially improve the tracking performance to echo path changes.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other objects and features of the present invention will become
apparent from the following description taken in conjunction with the
preferred embodiment thereof with reference to the accompanying drawings,
in which:
FIG. 1 is a block diagram showing a construction of an echo canceller
according to a first embodiment of the present invention;
FIG. 2 is a block diagram showing a detailed construction of an
input-output power ratio estimator of the first embodiment;
FIG. 3 is a block diagram showing a detailed construction of a voice
detector of the first embodiment;
FIG. 4 is a block diagram showing a detailed construction of a double talk
detector of the first embodiment;
FIG. 5 is a block diagram showing a detailed construction of a threshold
controller of the first embodiment;
FIG. 6 is a block diagram showing a detailed construction of an adaptation
controller of the first embodiment;
FIGS. 7(a), 7(b), 7(c), 7(d), 7(e) and 7(f) are timing charts for
explaining the principle of double talk detection in the echo canceller of
the first embodiment;
FIG. 8 is a block diagram showing a construction of an echo canceller
according to a second embodiment of the present invention;
FIG. 9 is a block diagram showing a construction of a step gain controller
of the second embodiment;
FIGS. 10(a), 10(b), 10(c) and 10(d) are timing charts for explaining the
principle of step gain control in the echo canceller of the second
embodiment;
FIG. 11 is a block diagram showing a construction of an echo canceller
according to a third embodiment of the present invention;
FIG. 12 is a block diagram showing a construction of an insertion loss
circuit of the third embodiment;
FIGS. 13(a), 13(b), 13(c) and 13(d) are timing charts for explaining the
principle of loss control in transmitting output in the echo canceller of
the third embodiment;
FIG. 14 is a block diagram showing a construction of an echo canceller
according to a fourth embodiment of the present invention;
FIG. 15 is a chart showing an example of long-time spectrum of speech;
FIG. 16 is a block diagram showing a construction of the second adaptive
filter which executes renewal of the filter coefficient two times within a
sample; and
FIG. 17 is a block diagram of a conventional double talk detector.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Before the description proceeds, it is to be noted that, since the basic
structures of various preferred embodiments are in common, like parts are
designated by the same reference numerals throughout the accompanying
drawings.
A first embodiment of the present invention is described with reference to
FIGS. 1 to 7.
FIG. 1 shows a construction of an echo canceller according to a first
embodiment of the present invention. It is assumed that signal processing
in the following description is performed entirely in digital form and that
the signal is sampled at a frequency of 8 kHz. Referring to FIG. 1, the echo
canceller includes a first adaptive filter 1, a first subtracter 2, a
second adaptive filter 3, a second subtracter 4, a first input-output
power ratio estimator 5 for calculating an input-output power ratio R1 of
the first subtracter 2, and a second input-output power ratio estimator 6
for calculating an input-output power ratio R2 of the second subtracter 4,
where the first and second adaptive filters 1 and 3 renew their filter
coefficients by NLMS (Normalized Least Mean Square). In FIG. 1, the echo
canceller further includes a divider 7 for dividing the second ratio R2 by
the first ratio R1 (hereinafter, R2/R1=R3 is referred to as "a third
ratio"), and a voice detector 8 for detecting short time power of a
received input signal to detect whether far-end speech is present or
absent, where the second adaptive filter 3 executes adaptation when the
voice detector 8 has detected far-end speech. In the echo canceller, a
double talk detector 9 causes the first adaptive filter 1 to estimate the
impulse response of the echo path when the voice detector 8 has detected
far-end speech and, in addition, either one of the first and second ratios
R1 and R2 is greater than a first threshold Th1 or the third ratio R3 is
greater than a second threshold Th2 (Th2=2, fixed in the embodiments).
The first adaptive filter 1 synthesizes a pseud echo Yh by formula (1). In
more detail, the first adaptive filter 1 estimates an impulse response of
an echo path between a reception side of a signal x and a transmission
side of a signal y, holds the estimated impulse response as a filter
coefficient, and convolutes the filter coefficient with a received input
signal to thereby synthesize the pseud echo Yh. The first subtracter 2
subtracts the pseud echo synthesized by the first adaptive filter 1 from a
transmitting input signal y to cancel an echo contained in the
transmitting input signal according to formula (2), thus yielding an
output e. The first adaptive filter 1 renews a filter coefficient series
{h.sub.0, h.sub.1, . . . , h.sub.N-1 } by an NLMS (Normalized Least Mean
Square) algorithm as represented in formula (3).
The second adaptive filter 3 similarly estimates an impulse response of
the echo path, holds the estimated impulse response as a filter
coefficient, and convolutes the received input signal with the filter
coefficient to thereby synthesize a pseud echo Yhs. The second
adaptive filter 3 synthesizes the pseud echo Yhs according to formula (4)
like the first adaptive filter 1. The second subtracter 4 subtracts the
pseud echo Yhs synthesized by the second adaptive filter 3 from the
transmitting input signal to thereby cancel the echo according to formula
(5), where the second adaptive filter 3 renews a filter coefficient
series {hs.sub.0, hs.sub.1, . . . , hs.sub.N-1 } by an NLMS algorithm as
represented in formula (6). As for notation, N is the order of an
adaptive filter, j is a sample number, x is a reference input signal, y is
a transmitting input, e is an output of the first subtracter 2, and es is
an output of the second subtracter 4. It is assumed that a step gain
.alpha.1 of the NLMS algorithm of the first adaptive filter 1 is set as
small as .alpha.1=1/4 so that a greater amount of indoor echo return loss
enhancement can finally be obtained. A step gain .alpha.2 of the second
adaptive filter 3 is set to .alpha.2=1, at which the converging speed is
the fastest, so that echo path change and double talk can be discriminated
from each other in a short time. In the embodiments, it is also noted that
the echo canceling time of the first and second adaptive filters is
assumed to be 250 msec.
##EQU1##
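Formulas (1) to (6) are not reproduced in this text. As a minimal sketch only, the convolution, subtraction and coefficient renewal described above can be written in the standard textbook NLMS form, which is assumed here and may differ in detail from the formulas of the specification. The function names, the regularizer eps and the synthetic three-tap echo path are hypothetical and serve illustration only; a 250 msec span at 8 kHz would correspond to N=2000 taps.

```python
import random


def nlms_step(h, x_hist, y, alpha, eps=1e-8):
    """Run one NLMS iteration and return (residual, updated coefficients).

    h      -- list of N filter coefficients h[0..N-1]
    x_hist -- N most recent received samples, x_hist[i] = x(j - i)
    y      -- current transmitting input sample
    alpha  -- step gain (1/4 for the first filter, 1 for the second)
    eps    -- small regularizer (an assumption, avoids division by zero)
    """
    yh = sum(hi * xi for hi, xi in zip(h, x_hist))   # pseud echo (convolution)
    e = y - yh                                       # subtracter output
    norm = sum(xi * xi for xi in x_hist) + eps       # input power over N taps
    h_new = [hi + alpha * e * xi / norm for hi, xi in zip(h, x_hist)]
    return e, h_new


def identify_echo_path(true_path, alpha, n_samples=2000, seed=0):
    """Adapt against a synthetic, noise-free echo path (hypothetical test rig)."""
    rng = random.Random(seed)
    n = len(true_path)
    h = [0.0] * n
    x_hist = [0.0] * n
    for _ in range(n_samples):
        # Shift the newest far-end sample into the delay line.
        x_hist = [rng.gauss(0.0, 1.0)] + x_hist[:-1]
        # Echo only: no near-end speech is added in this sketch.
        y = sum(g * xi for g, xi in zip(true_path, x_hist))
        _, h = nlms_step(h, x_hist, y, alpha)
    return h
```

Calling identify_echo_path([0.5, -0.3, 0.1], alpha=1.0) and the same with alpha=0.25 gives one way to compare how the two step gains track the same path.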
FIG. 2 shows a detailed construction of the input-output power ratio
estimator 5, which includes a pair of power detectors 21 and 22 each
having a similar construction for measuring a short time power of signals.
A divider 23 divides input power detected by the power detector 21 by an
output power detected by the power detector 22.
The construction of the power detector 21 is now described. Referring to
FIG. 2, the power detector 21 is provided with a square circuit 211 for
calculating a square of an input signal y, a memory 212 for storing a
short time average power of the input signal y, and a comparator 213 for
comparing a previous power value Py stored in the memory 212 with an
output y.sup.2 of the square circuit 211 and yielding an output of 1 when
the output y.sup.2 of the square circuit 211 is greater than the
previously stored power Py. A switch 214 selects a constant k1 when the
comparator 213 has yielded an output of 1, and otherwise selects a
constant k2. A subtracter 215 subtracts an output of the memory 212 from
the output of the square circuit 211 while a multiplier 216 calculates a
product of an output of the subtracter 215 and the constant selected by
the switch 214. An adder 217 adds together an output of the multiplier 216
and the output of the memory 212 to yield the short time power of the
signal. The output of the adder 217 is delayed by a delay unit 218 and
transferred to the memory 212. With the above-described arrangement of the
power detectors 21 and 22 having the constant k1 set relatively large and
the constant k2 set relatively small, the power detectors 21, 22 are
capable of detecting without time lag the power of a signal, such as
speech, whose leading edge of amplitude is steep and whose trailing edge
is relatively gentle. The constants are set to, for example, k1=0.01,
k2=0.001, in the case of a sampling frequency of 8 kHz. The second
input-output power ratio estimator 6 is similar in construction to that
shown in FIG. 2.
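The power detector and divider of FIG. 2 may be sketched as follows. This is an illustrative reading of the block diagram, not a definitive implementation; the class and function names, the zero initial power and the small floor that guards the division are assumptions.

```python
class PowerDetector:
    """Short time power tracker with separate rise (k1) and fall (k2) constants."""

    def __init__(self, k1=0.01, k2=0.001):
        self.k1 = k1
        self.k2 = k2
        self.p = 0.0                      # memory 212 (initial value assumed 0)

    def update(self, sample):
        sq = sample * sample              # square circuit 211
        # Comparator 213 and switch 214: k1 while the power is rising.
        k = self.k1 if sq > self.p else self.k2
        # Subtracter 215, multiplier 216, adder 217, delay unit 218.
        self.p = self.p + k * (sq - self.p)
        return self.p


def io_power_ratio(p_in, p_out, floor=1e-12):
    """Divider 23: input power over output power (floor is an assumption)."""
    return p_in / max(p_out, floor)
```

With k1=0.01 and k2=0.001 the tracked power rises about ten times faster than it falls, which matches the steep onset and gentle decay of speech described above.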
The voice detector 8 is constructed in the same manner as a conventional
one and makes use of a conventionally available detection method as shown
in FIG. 3. Referring to FIG. 3, a power detector 31 is the same as the
power detector 21 or 22 for detecting a short time power as in FIG. 2. A
minimum holder 32, although similar in construction to the power detector
31, has its constants k1 and k2 set so as to detect the minimum Pmin of
the signal power, where k1 is set small and k2 is set large, e.g.
k1=0.00001, k2=0.001. An amplifier 33 doubles an output Pmin of the
minimum holder 32, and a comparator 34 compares received signal power Px
detected by the power detector 31 with an output 2Pmin of the amplifier
33. When the output of the power detector 31 is equal to or greater than
the output of the amplifier 33 (i.e., Px.gtoreq.2Pmin), the comparator 34
yields an output of VD=1, indicating that speech is present.
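A minimal sketch of the voice detector of FIG. 3 follows, assuming that the power detector 31 and the minimum holder 32 share one update rule and differ only in their constants. The names and the initial value p0 are hypothetical; the specification does not state how the stored powers are initialized, and with both starting at zero the comparison Px.gtoreq.2Pmin would be satisfied trivially, so the example seeds both with an assumed noise floor.

```python
class Tracker:
    """First-order tracker: constant k1 when the input exceeds the state, else k2."""

    def __init__(self, k1, k2, p0=0.0):
        self.k1 = k1
        self.k2 = k2
        self.p = p0

    def update(self, sq):
        k = self.k1 if sq > self.p else self.k2
        self.p = self.p + k * (sq - self.p)
        return self.p


class VoiceDetector:
    """Outputs VD=1 when the short time power Px is at least twice Pmin."""

    def __init__(self, p0=0.0):
        self.power = Tracker(0.01, 0.001, p0)       # power detector 31
        self.minimum = Tracker(0.00001, 0.001, p0)  # minimum holder 32

    def update(self, x):
        sq = x * x
        px = self.power.update(sq)
        pmin = self.minimum.update(sq)
        return 1 if px >= 2.0 * pmin else 0         # amplifier 33, comparator 34


# Usage: a steady noise floor followed by a louder far-end signal.
noise = 0.01
detector = VoiceDetector(p0=noise * noise)
quiet = [detector.update(noise if i % 2 else -noise) for i in range(500)]
loud = [detector.update(0.5 if i % 2 else -0.5) for i in range(500)]
```

In this run the quiet segment leaves Px equal to the seeded floor, so VD stays at 0, while the louder segment raises Px much faster than Pmin.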
FIG. 4 shows a detailed construction of the double talk detector 9.
Referring to FIG. 4, a threshold controller 41 controls the first
threshold Th1 in the following way. The first threshold Th1 is
gradually increased so as not to exceed the first ratio R1 when the voice
detector 8 has determined that far-end speech is present and the
first ratio R1 is greater than the first threshold Th1. On the other hand,
the threshold controller 41 makes the first threshold Th1 decrease very
slowly when R1 is smaller than the current threshold Th1 irrespective of
the presence of received speech. Further, the threshold controller 41 holds
the previous value of the first threshold Th1 when it has been determined
that no speech is present. An adaptation controller 42 determines to
execute adaptation of the first adaptive filter 1 when the voice detector
8 has determined that speech is present and, in addition, either one of
the first and second ratios R1 and R2 is greater than the first threshold
Th1 or the third ratio R3 is greater than the second threshold Th2.
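The decision of the adaptation controller 42 can be sketched as a single boolean expression. The grouping used here, namely speech present AND (R1 above Th1 OR R2 above Th1 OR R3 above Th2), is one reading of the sentence above; the function names and the floor guarding the division are assumptions.

```python
def third_ratio(r1, r2, floor=1e-12):
    """Divider 7: R3 = R2 / R1 (floor is an assumption to avoid division by zero)."""
    return r2 / max(r1, floor)


def should_adapt_first_filter(vd, r1, r2, th1, th2=2.0):
    """Return True when the first adaptive filter 1 should execute adaptation."""
    if not vd:
        return False                      # no far-end speech detected
    r3 = third_ratio(r1, r2)
    return r1 > th1 or r2 > th1 or r3 > th2
```

For example, should_adapt_first_filter(1, 1.0, 3.0, th1=10.0) is True because R3=3 exceeds Th2=2 even though neither ratio reaches Th1.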
FIG. 5 shows a detailed construction of the threshold controller 41.
Referring to FIG. 5, the threshold controller 41 includes a memory 51 for
storing a threshold Th1 and a comparator 52 for comparing the first ratio
R1 with the threshold Th1. A switch 53 selects a constant Kup when the
comparator 52 has determined that R1 is greater than Th1 and otherwise
selects a constant Kdn. A subtracter 54 subtracts Th1 from R1, and a
multiplier 55 calculates a product of a time constant K selected by the
switch 53 and an output (R1-Th1) of the subtracter 54. An adder 56 then
adds together an output of the multiplier 55 and the threshold Th1.
Further, a limiter 57 limits the maximum and minimum of an output of the
adder 56. A switch 58 is controlled by the voice detector 8 in response to
the output signal VD thereof. The switch 58 selects an output of the
limiter 57 when speech is present, and selects an output of the memory 51
when no speech is present. A delay unit 59 delays an output of the switch
58 by one sample and the resultant output is transferred to the memory 51.
Referring to the constants Kup and Kdn selected by the switch 53, the
constant Kup is determined by a formula Kup=Ts/Tup, where Ts is the
sampling period and Tup is a time constant. Likewise, the constant Kdn is
determined by a formula Kdn=Ts/Tdn, where Tdn is a time constant.
In the embodiments, it is assumed that Tup is relatively short, approx.
500 msec, and Tdn is a very long time constant of around 30 sec.
The threshold controller 41 with the above-described arrangement performs
the control in such a way that, when no received speech has been detected
by the voice detector 8, the switch 58 selects an output of the memory 51
to be applied to the input of the delay unit 59, so that a previous value
of the threshold stored in the memory 51 is held. When speech has been
detected by the voice detector 8, on the other hand, the first threshold
is renewed in the following way. The first ratio R1 and the first
threshold Th1 are compared with each other by the comparator 52; when
R1.gtoreq.Th1, the constant Kup is selected by the switch 53 and, when
R1<Th1, the constant Kdn is selected by the switch 53. Assuming that the
result of calculations by the subtracter 54, the multiplier 55, and the
adder 56 is Th, then the value Th can be represented by the following
formula (7):
##EQU2##
The above formula represents that the value Th is calculated so as to
gradually approach R1 with the time constant Tup or Tdn. When
R1.gtoreq.Th1, the constant Kup is selected by the switch 53, so that Th
approaches R1 with a time constant of 500 msec. When R1<Th1, the constant
Kdn is selected by the switch 53, so that Th approaches R1 with a time
constant of 30 sec. When the value Th calculated by formula (7)
is smaller than `1`, the limiter 57 limits the threshold to `1`. When the
calculated value is greater than a predetermined first threshold upper
limit Th1max, the limiter 57 limits the threshold to Th1max, thereby
keeping the first threshold Th1 within an appropriate range. The
threshold upper limit Th1max is assumed to be 25.
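Formula (7) is not reproduced in this text. The sketch below infers the update Th=Th1+K(R1-Th1) from the described chain of the subtracter 54, the multiplier 55 and the adder 56, and should be read as an assumption rather than the formula itself. It takes Ts as the sampling interval of 1/8000 sec, which gives Kup=0.00025 and Kdn of about 0.0000042; all names are hypothetical.

```python
TS = 1.0 / 8000.0          # sampling interval at 8 kHz, in seconds
KUP = TS / 0.5             # Tup = 500 msec
KDN = TS / 30.0            # Tdn = 30 sec


def update_threshold(th1, r1, vd, th1_max=25.0, th1_min=1.0):
    """Return the next value of the first threshold Th1 for one sample."""
    if not vd:
        return th1                               # switch 58 holds memory 51
    k = KUP if r1 >= th1 else KDN                # comparator 52, switch 53
    th = th1 + k * (r1 - th1)                    # subtracter 54, multiplier 55, adder 56
    return min(max(th, th1_min), th1_max)        # limiter 57


def run_threshold(th1, r1, vd, n_samples):
    """Apply update_threshold repeatedly with constant R1 and VD."""
    for _ in range(n_samples):
        th1 = update_threshold(th1, r1, vd)
    return th1
```

Starting from Th1=1 with R1 held at 10 during speech, run_threshold(1.0, 10.0, 1, 4000) covers 500 msec and moves the threshold roughly 63 percent of the way toward R1, as expected of a first-order time constant.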
FIG. 6 shows