|
Description  |
|
|
BACKGROUND OF THE INVENTION
The invention relates to a digital echo canceller with a receive path
between a receive input and a receive output, and a send path between a
send input and a send output, said echo canceller being used for
cancelling an additive echo signal at the send input which has occurred in
response to a receive input signal applied to the receive input, said echo
canceller including:
a filter combination comprising
a first digital filter with a programmable filter coefficient memory for
generating in response to the receive input signal a first replica signal
which is an estimate of the echo signal, and
a second, adaptive digital filter with a filter coefficient memory for
generating in response to the receive input signal a second replica signal
which is an estimate of the echo signal, the second digital filter further
including means for forming an error signal that is representative of the
difference between a signal applied to the send input and the second
replica signal, and an adaptation processor for adaptively modifying
filter coefficients in response to the receive input signal and the error
signal and applying the modified filter coefficients to the filter
coefficient memory of this second digital filter;
means for forming a send output signal as the difference between the signal
applied to the send input and the first replica signal;
controllable gate means for selectively applying the modified filter
coefficients to the programmable filter coefficient memory of the first
digital filter; and
control means for blockwise determining the respective levels of the error
signal and the send output signal, and for generating in response to the
levels thus found a control signal for the gate means which depends in a
predetermined manner on differences between the respective levels.
A digital echo canceller of such a structure is known from an article
entitled "Echo Canceler with Two Echo Path Models" by K. Ochiai et al.,
published in IEEE Transactions on Communications, Vol. COM-25, No. 6, June
1977, pp. 589-595.
The echo canceller described in this article is especially arranged for
combatting the disturbing effect on the echo canceller adjustment caused
by double-talk. Double-talk occurs when a desired signal to be transmitted
and an echo signal are simultaneously applied to the send input. The
superpositioning of these signals then results in that the adjustment of
the echo canceller for cancelling the echo signal can be deranged
significantly by the signal to be transmitted that is also present. This
means that the current echo signal is then no longer cancelled
sufficiently by the replica formed by the echo canceller. In the above
article a robust solution is given to the problem of a possible
misadjustment of the echo canceller caused by double-talk. For this
solution a filter combination is used consisting of a first digital filter
which comprises a programmable filter coefficient memory and which is used
for the echo cancellation proper, as well as a second, adaptive digital
filter with an associated filter coefficient memory. These two filters
each generate a replica of the echo signal and as long as the replica
generated by the adaptive filter is a better estimate of the echo signal
than the replica generated by the programmable filter the filter
coefficients of the adaptive filter are transferred to the programmable
filter. During double-talk the adjustment of the adaptive filter is
disturbed and then the transfer of filter coefficients to the programmable
filter is interrupted. This achieves that the adjustment of the adaptive
filter does not disturb the operation of the programmable filter for the
echo canceller proper during double-talk.
This known digital echo canceller is implemented completely in the
time-domain. In the field of speech and data transmission, most
applications utilize time-domain adaptive filters (TDAF) that are realized
as adaptive transversal filters using a least mean square (LMS) algorithm
for the adaptation of filter coefficients. When the length of the impulse
response of the echo path assumes large values, such as, for example, can
be found in applications in the field of acoustics, a TDAF realized as a
transversal filter presents the drawback that the complexity expressed in
terms of computational operations (multiplications and additions) per
output sample increases linearly with the number of discrete-time
components with which the impulse response of the echo path can be
represented. More specifically, for this known echo canceller it holds
that the number of operations required for computing N components is in
the order of magnitude of N.sup.2 for the programmable filter with respect
to the calculation of N samples of the first replica signal, and in the
order of magnitude of N.sup.2 for the adaptive filter with respect to the
calculation of N samples of the second replica signal and also in the
order of magnitude of N.sup.2 with respect to the calculation of
adaptations for N filter coefficients. In addition, a TDAF realized as a
transversal filter has a low convergence speed for strongly (auto)
correlated input signals, such as speech and specific kinds of data. This
is because the convergence speed decreases as the ratio between maximum
and minimum eigenvalues of the correlation matrix of the input signal
increases. In this connection reference is made to an article entitled
"Echo Cancellation Algorithms" by C. W. K. Gritton and D. W. Lin,
published in IEEE ASSP Magazine, April 1984, pp. 30-38, more specifically
the section "LMS Algorithm" on pp. 32-33.
SUMMARY OF THE INVENTION
The invention has for its object to obviate the above-mentioned drawbacks
of a digital echo canceller of the type described in the preamble of Claim
1.
Thereto, a digital echo canceller according to the invention is
characterized in that:
in the said filter combination the second digital filter is a
frequency-domain block-adaptive filter having a block length of N'
components and having, for each signal block m, a number of N'
frequency-domain filter coefficients W(p;m) with p=0,1,2, . . ., N'-1, and
the first digital filter is a time-domain programmable digital filter
having a number of N time-domain filter coefficients W(i;m) with i=0,1,2,
. . . , N-1, with N' exceeding N; and
the echo canceller further includes transforming means, cascaded with the
gate means, for forming the N time-domain filter coefficients w(i;m)
required in the first digital filter as components of an N'-point Discrete
Orthogonal Transform of a block of N' frequency-domain filter coefficients
W(p;m).
Within the scope of the present invention it should be observed that known
implementations of a frequency-domain block-adaptive filter can be used
for example, those described in an article entitled "A Unified Approach to
Time-and Frequency-Domain Realization of FIR Adaptive Digital Filters" by
G. A. Clark et al., published in IEEE Transactions on Acoustics, Speech
and Signal Processing, Vol. ASSP-31, No. 5, October 1983, pp. 1073-1083,
and in an article entitled "Unconstrained Frequency-Domain Adaptive
Filter" by D. Mansour et al., published in IEEE Transactions on Acoustics,
Speech and Signal Processing, Vol. ASSP-30, No. 5, October 1982, pp.
726-734.
An echo canceller structured in accordance with the invention will lead to
a considerable reduction of the computational complexity when for the
implementation of the Discrete Orthogonal Transforms (DOT's) or Inverse
Discrete Orthogonal Transforms (IDOT's), respectively, fast and efficient
algorithms are used. For the widely used Discrete Fourier Transforms
(DFT's) as described in the above articles, such computational efficient
algorithms are known as Fast Fourier Transform (FFT). More specifically,
in that case the order of magnitude of the number of operations required
for computing a number of N components can then be given as follows: for
the time-domain programmable first digital filter, N.sup.2 with respect to
the computation of the first replica signal; for the frequency-domain
block-adaptive second digital filter, N'logN' with respect to the
computation of the adaptations for the N' frequency-domain filter
coefficients and also N'logN' with respect to the computation of the
second replica signal.
The use of such a frequency-domain adaptive filter (FDAF) also enables to
considerably improve the convergence properties for highly
(auto)correlated input signals, as for each of the substantially
orthogonal frequency-domain components the adaption gain of the adaptation
algorithm can be normalized in a simple manner according to the power of
the frequency-domain component concerned. In this connection reference is
again made to the article "Echo Cancellation Algorithms" by C. W. K.
Gritton and D. W. Lin, in which such a normalization is briefly described
on page 36.
A further considerable advantage linked with the specific filter
combination as used in an echo canceller according to the invention is
found in the negligible delay with which the first replica signal
generated by the time-domain programmable filter (TDPF) is available for
cancelling a current echo signal. Owing to this specific filter
combination (FDAF and TDPF) the cancelling of the echo signal proper is
not accompanied with a significant delay, which would be the case if both
filters were implemented in the frequency-domain. For, such a slight delay
means that in a duplex-communication link no additional delay in the
signal path need be introduced to make a generated replica signal coincide
in time with the current echo signal to be cancelled.
After convergence, the filter coefficients of an adaptive digital filter
will still fluctuate around their end values, more specifically, due to
the presence of noise or signals of a different nature superposed on the
reference signal (current echo signal) and due to the finite precision
(that is to say, the word length or the number of bits) with which the
different signals are represented in the digital filter. With the
customary assumptions as to the statistical independence of the different
magnitudes in a filter, which assumptions appeared to hold in practice,
the separate filter coefficients have the same variance when so-called
window functions are not utilized in the adaptation loop of the filter.
This means that at a like speed of convergence of the adaptive filter
(that is to say, a like gain factor in the adaptation algorithm) the use
of N' in lieu of N filter coefficients results in an increase of the
filter bottom noise factor (also known as final misalignment factor) by
for example 3dB in the practicle event when N'=2N, since this bottom noise
factor is determined by the sum of the filter coefficient variances. In
practice the gain factor in the adaptation algorithm is selected such that
a predetermined value of the bottom noise factor is not exceeded. In order
to compensate for an increase of this bottom noise factor in a FDAF, this
gain factor should be halved in this practical event of N'=2N, resulting
in that the speed of convergence is halved as well.
In the above article by Clark et al. a solution to this problem is proposed
in which the N' filter coefficients are obtained by utilizing the window
means included in the adaptation loop for carrying out an operation of
which the time-domain equivalent is a multiplication by a rectangular
window function having the length of N' time-discrete components, and
imposing the value of zero on the last N'-N components.
In a digital echo canceller according to the invention it is not necessary,
however, that such window means be used in the FDAF, whereas the then
increased bottom noise factor nevertheless does not affect the TDPF. True
enough, the FDAF uses N' frequency-domain filter coefficients
corresponding to N' time-domain filter coefficients, but transferring
filter coefficients from the FDAF to the TDPF requires a transformation
from the frequency-domain to the time-domain. From the N' corresponding
time-domain coefficients the N'-N time-domain coefficients which are
extraneous as regards the TDPF can be eliminated in a simple manner during
this transformation, so that also the contribution of these N'-N
time-domain coefficients to the bottom noise factor of the FDAF does not
penetrate to the TDPF. In addition, the window function thus realized can
be implemented in a much simpler manner than the window function used in a
FDAF, whose implementation considerably adds to computational complexity
of a FDAF (compare the description of FIG. 3 in the above article by Clark
et al.).
Summarizing: an echo canceller according to the invention results in the
following advantages with respect to the prior art techniques:
a considerable reduction of the number of computational operations required
for cancelling echo signals; this especially plays a role in echo paths
having an impulse response of a large length, such as acoustical echo
paths in which a relatively large number (1000 to 2000) of filter
coefficients is required for the echo cancellation;
a speed of convergence that can be increased in a simple manner;
an echo cancellation with a negligible delay; and
a slight bottom noise despite the absence of window means in the FDAF.
In addition, these advantages are obtained while maintaining a robust
solution to the problem of double-talk.
Although in case of double-talk the TDPF is isolated because the supply of
filter coefficients from the FDAF is interrupted, this FDAF itself can be
misadjusted considerably during double-talk. This means that after
double-talk has ended, additional time would be required to remove such a
misadjustment (or: misalignment) and subsequently resume supplying
properly adjusted coefficients from the FDAF to the TDPF. According to a
further aspect of the invention this objection can be met. Thereto, a
preferred embodiment of an echo canceller according to the invention is
characterized in that:
the echo canceller also comprises a cascade arrangement of second
controllable gate means and second transforming means for forming the N'
frequency-domain filter coefficients W(p;m) as components of a Discrete
Orthogonal Transform of the N filter coefficients w(i;m) in the
programmable filter coefficient memory of the first digital filter and for
selectively feeding back the thus formed filter coefficients W(p;m) to the
second digital filter; and in that
the control means are also arranged for blockwise determining the
respective levels of the receive input signal and the send input signal,
and for generating a second control signal for opening the second gate
means if the level of the receive input signal thus determined is lower
than the level of the send input signal thus determined.
The properly adjusted filter coefficients present in the TDPF just prior to
the occurrence of double-talk can thus be fed back from the TDPF to the
FDAF during double-talk. It should be observed that the conveying of
filter coefficients from the FDAF to the TDPF and vice versa will take
place in mutually separated time intervals, so that the respective
transformations concerned are not effected simultaneously. This means that
in a practical implementation the available process time can be used
efficiently.
An echo canceller according to the invention can advantageously be used in
a loudspeaking telephone set. In this respect it is to be recommended,
when installing the telephone set, to preprogram the programmable filter
coefficient memory in which the filter coefficients supplied are
temporarily stored and which forms a part of the TDPF. This is quite
possible since a large portion of the echo path between loudspeaker and
microphone is determined by the set itself and, therefore, is known.
Consequently, the impulse response to be modelled (which is characteristic
of the current echo path) can be reasonably approximated in the first
instance by inputting as initial values in this programmable filter
coefficient memory filter coefficients representing such an approximated
model.
BRIEF DESCRIPTION OF THE DRAWING
The invention and its advantages will be further explained hereinbelow with
reference to the drawings, in which:
FIG. 1: shows a diagram illustrating the manner in which an echo canceller
is generally used in a set used for receiving and transmitting speech
signals;
FIG. 2: shows a diagram of a discrete-time model of a known embodiment of a
digital echo canceller;
FIG. 3: shows a general diagram of a discrete-time model of an embodiment
of a digital echo canceller according to the invention;
FIG. 4: shows a more detailed diagram of a discrete-time model of an
embodiment of a digital echo canceller according to the invention;
FIG. 5: shows a diagram of a variant of a portion of the embodiment as
shown in FIG. 4, to illustrate a different implementation of the
overlap-save method; and
FIG. 6: shows a diagram of an optional implementation of the control means
shown in FIG. 5 for generating control signals for the gate means used for
conveying filter coefficients.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
In FIG. 1 is shown a simplified conceptual block diagram of the use of an
echo canceller in a telephone set having loudspeaker reproduction of a
received speech signal. Such an echo canceller 1 comprises a receive path
2 with a receive input RI and a receive output RO, as well as a send path
3 with a send input SI and a send output SO. A receive input signal x(t),
henceforth being denoted as far-end signal, is applied to receive input RI
and transferred via receive path 2 to receive output RO which is connected
to a loudspeaker 5 via a receive amplifier 4. A microphone 6 generates a
signal to be transmitted which is applied as a send input signal s(t) to
send input SI via a send amplifier 7, the signal henceforth being denoted
as near-end signal. This near-end signal s(t) is transferred via a send
path 3 to a send output SO. Between loudspeaker 5 and microphone 6 there
is an acoustic echo path symbolically represented in FIG. 1 by an arrow 8.
Via this acoustic echo path 8 a far-end signal x(t) at receive output RO
can introduce an undesired additive echo signal e(t) at signal input SI,
so that a sum signal z(t)=s(t)+e(t) is applied to signal input SI. Echo
canceller 1 now has as its task to cancel this undesired echo signal e(t)
in the best possible way. Thereto, echo canceller 1 comprises a filter
which in response to far-end signal x(t) in receive path 2 generates a
signal e(t) that is a replica of the undesired echo signal e(t). This
replica signal e(t) is subtracted from sum signal z(t)=s(t)+e(t) at send
input SI by means of a combining circuit 10 for forming a send output
signal r(t) which can be described as
r(t)=s(t)+[e(t)-e(t)]
From this expression it appears that signal r(t) at send output SO
represents the signal to be transmitted s(t) when replica signal e(t) is a
reliable estimate of echo signal e(t), as in that case the second term in
the right-hand member of this expression will be practically equal to
zero. Generally, the transfer characteristic of the echo path between the
receive output RO and send input SI will be time-varying, in which
especially acoustic echo path 8 may show large variations. As echo signal
e(t), in a good approximation, may be considered to be the linear
convolution of far-end signal x(t) with the impulse response h(t) of the
echo path between the receive output RO and the send input SI, the shape
of time-varying impulse response h(t) will lead to corresponding
variations of echo signal e(t) at send input SI. Therefore, filter 9 in
echo canceller 1 is arranged as an adaptive filter having as its task to
adjust its impulse response w(t) in the best way possible for fitting the
impulse response h(t) of echo path RO-SI. The adaptive adjustment of this
filter 9 is controlled by signal r(t) at the output of combining circuit
10. This adaptive adjustment is continued as long as there is a
correlation between the control signal r(t) and the far-end signal x(t).
When only the far-end signal x(t) is present (and, consequently, near-end
signal s(t)=0), adaptive filter 9 will generate a replica signal e(t)
which is a reliable estimate of the echo signal e(t). However, when
far-end signal x(t) as well as near-end signal s(t) are present, a
situation will arise which is commonly referred to as double-talk. If no
proper measures are taken, adaptive filter 9 can be considerably
misadjusted during double-talk owing to the presence of near-end signal
s(t) as a disturbing factor in control signal r(t). This misadjustment of
adaptive filter 9 leads to a replica signal e(t) which is no longer a
reliable estimate of echo signal e(t), so that at send output SO a signal
r(t) will occur which is disturbed to an annoying degree by an
inadequately or even improperly cancelled echo signal.
Since the present invention relates to a digital echo canceller, a
discrete-time modelling will be utilized in the following description.
Such a modelling can be obtained in the most convenient manner by assuming
in the conceptual diagram of FIG. 1 that the signals x(t) and z(t) are
applied to receive input RI and send input SI via analog-to-digital
converters (not shown), the signals x(t) and r(t) at receive output RO and
send output SO are taken off via digital-to-analog converters (not shown),
and further that all relevant signals in echo canceller 1 are digital
signals. These digital signals are denoted in a conventional manner so
that, for example, x(k) denotes a quantized sample of continuous-time
signal x(t) at instant t=kT, in which 1/T is the sampling frequency. For
completeness it should be observed that in practice a discrete-time signal
x(k) to be applied to filter 9 is derived from continuous-time signal x(t)
in receive path 2 by means of an analog-to-digital converter between
receive path 2 and the input of filter 9 so as to achieve that no
unnecessary quantizing noise is introduced into the signal x(t) applied to
the loudspeaker 5 through a cascade arrangement of analog-to-digital and
digital-to-analog converters. The latter item is disregarded for the
further description.
FIG. 2 shows a diagram of a discrete-time model of a known embodiment for a
digital echo canceller offering a robust solution to the problem caused by
double-talk set forth hereinbefore. Corresponding elements in FIGS. 1 and
2 (and also in the further Figures) are always denoted by the same
reference symbols. In FIG. 2 the complete echo path between receive output
RO and send input SI is further shown diagrammatically by means of a
dashed line block 8, which path produces an undesired echo signal e(k) in
response to signal x(k) and is connected through an adder shown in a
dashed line to send input SI for forming the sum signal z(k)=s(k)+e(k).
The configuration of echo canceller 1 in FIG. 2 corresponds to that of FIG.
1 in the aforementioned article by Ochiai et al. This echo canceller 1
comprises a filter combination including a first time-domain programmable
filter 9 having a programmable memory 9(1) for the filter coefficients,
and a second time-domain adaptive filter 11. In response to far-end signal
x(k) the first time-domain filter 9 produces a first replica signal e(k)
which is an estimate of echo signal e(k). Thereto, filter 9 comprises a
circuit 9(2) for effecting a linear convolution of signal x(k) with the
impulse response of filter 9 which is represented with a number of N
filter coefficients in memory 9(1) which have values w(i), where i=0,1,2,
. . . ,N-1 and N is equal to the number of samples used for a sufficiently
accurate representation of impulse response h(i) of echo path 8. This
first replica signal e(k) generated by filter 9 is used for the echo
cancellation proper and is thereto subtracted in combining circuit 10 from
sum signal z(k)=s(k)+e(k) at send input SI for forming signal r(k) at send
output SO. The second time-domain adaptive filter 11 is arranged to always
have the most suitable values w(i) for the N filter coefficients of the
first time-domain filter 9 available. Thereto, filter 11 comprises a
time-domain filter section 12, an adaptation processor 13 and a combining
circuit 14. In response to far-end signal x(k) filter section 12 generates
a second replica signal e.sub.a (k) which is also an estimate of echo
signal e(k). Filter section 12 thereto comprises a memory 12(1) for a
number of N filter coefficients and a circuit 12(2) for effecting a linear
convolution of signal x(k) with the impulse response of filter 11 which is
represented with the N filter coefficients in memory 12(1). By means of
combining circuit 14 an error signal r.sub.a (k) is formed representing
the difference between sum signal z(k)=s(k)+e(k) at the send input SI and
this second replica signal e.sub.a (k). Adaptation processor 13 is
arranged for adaptively correcting each of the N coefficients of filter 11
in response to far-end signal x(k) and error signal r.sub.a (k), and also
for constantly applying these N corrected filter coefficients to memory
12(1) of filter section 12 of filter 11. In this manner recently corrected
values for the N filter coefficients of the first, programmable filter 9
are constantly available in memory 12(1) of the second, adaptive filter
11. With respect to the adaptation effected in processor 13 it should be
observed that for this purpose known algorithms of the least-mean-square
type can be used. Furthermore, it is worth mentioning that error signal
r.sub.a (k) in FIG. 2 is equal to the difference z(k)-e.sub.a (k) between
the signals at the send input SI and the output of the filter section 12,
but that it is equally possible to utilize in a known manner strongly
quantized versions of this difference as error signal r.sub.a (k), as a
result of which the implementation of adaptation processor 13 can be
simplified.
The values of the N filter coefficients in memory 12(1) of adaptive filter
11 are continuously adapted to the variations of impulse response h(i) of
echo path 8, but these continuously adapted values are applied only
selectively to programmable memory 9(1) of filter 9 via controllable gate
means 15. As long as the second replica signal e.sub.a (k) generated by
the adaptive filter 11 is a better estimate of echo signal e(k) than the
first replica signal e(k) generated by programmable filter 9, these gate
means 15 are open and the adapted filter coefficients in memory 12(1) of
adaptive filter 11 are transferred to programmable memory 9(1) of filter
9. During the double-talk both the near-end signal s(k) and the echo
signal e(k) are present at the send input SI, and adaptive filter 11 can
be misadjusted to a considerable extent by the presence of near-end signal
s(k) as a disturbing term in error signal r.sub.a (k), from which the
situation may arise that the first replica signal e(k) is a better
estimate of the echo signal e(k) than the second replica signal e.sub.a
(k). In such a situation the gate means 15 are blocked and the supply of
adapted filter coefficients to programmable memory 9(1) is interrupted.
This achieves that programmable filter 9 is rendered so to say immune to a
possible misadjustment of adaptive filter 11 during double-talk, so that
in that case too an effective cancellation of echo signal e(k) is
guaranteed as long as echo path 8 is not subject to significant changes.
As a most important criterion for controlling the gate means 15 is used the
mutual relationship between the levels of the error signal r.sub.a (k) in
adaptive filter 11 and of signal r(k) at send output SO. Thereto, the echo
canceller 1 in FIG. 2 comprises control means 16 arranged for blockwise
determining the respective levels of the error signal r.sub.a (k) and send
output signal r(k). In response to these levels, determined blockwise, of
the signals r.sub.a (k) and r(k), control means 16 produce a control
signal for gate means 15 which depends in a predetermined manner on the
differences between these levels. In accordance with the above article by
Ochiai et al. a control signal is produced for opening gate means 15 for
applying adapted filter coefficients to programmable memory 9(1) when the
following condition is satisfied
L[r.sub.a (k)]<c.sub.1 L[r(k)] (1)
In this expression the symbol L denotes the level of the relevant signal in
a block of a number of M consecutive signal samples and c.sub.1 is a
positive constant of a value smaller than 1. In the case of speech signals
having a sampling rate of 1/T=8 kHz, a possible choice for M and c.sub.1
can be, for example, the values of M=128 and c.sub.1 =0.875. As is known,
the level L of a signal in a block of M signal samples can be represented
in different ways, for example, by the average value, over one block, of
the amplitude or of the power of the signal samples, but also by the peak
value of the amplitude of the signal samples in this block. In the
following description it will be assumed that the level L is related to
the average value, over one block, of the power of the signal samples, but
self-evidently, this assumption does not imply a restriction of the idea
of signal level.
The above condition (1) indicates that the level of the error signal
r.sub.a (k) is more than -20 log c.sub.1 dB lower than the level of the
send output signal r(k). In a practical implementation of control means 16
condition (1) is further required to be satisfied for a number of D
consecutive blocks of M signal samples before the control signal opens the
gate means 15 for applying filter coefficients to programmable memory 9(1)
of filter 9. For example, the value D=3 can be selected for the said
values for M and c.sub.1.
The control of gate means 15 on the basis of condiditon (1) appears to be
sufficient in practice to avoid in most cases the disturbance of echo
cancellation during double-talk. This control can be improved by demanding
in accordance with the above article by Ochiai et al., that a control
signal for opening gate means 15 should only be generated when, in
addition, the following conditions:
L[r.sub.a (k)]<c.sub.2 L[z(k)] (2)
L[z(k)]<c.sub.3 L[x(k)] (3)
are satisfied simultaneously, in which c.sub.2 and c.sub.3 are positive
constants having a value smaller than 1. Once these conditions (1), (2)
and (3) are satisfied simultaneously for D consecutive blocks of M signal
samples, the gate means 15 are opened and in the opposite case the gate
means 15 are blocked so that the supply of adapted filter coefficients to
programmable memory 9 (1) is interrupted. Condition (2) indicates that the
level of error signal r.sub.a (k) is more than -20 log c.sub.2 dB lower
than the level of the send input signal z(k). When double-talk is absent,
thus in the case when near-end signal s(k)=0, condition (2) implies that
the cancelling of echo signal e(k) by second replica signal e.sub.a (k) is
better than -20 log c.sub.2 dB. Condition (3) implies that the transfer of
adapted filter coefficients is interrupted when the level of send input
signal z(k) is less than -20 log c.sub.3 dB lower than the level of
far-end signal x(t), so when it is evident that double-talk occurs.
As it has already extensively been explained hereinbefore, such an echo
canceller with a filter combination of two time-domain filters 9 and 11
has the inherent objections that the complexity, expressed in terms of
computational operations (multiplications and additions) per output
sample, for programmable filter 9 as well as adaptive filter 11, increases
linearly with the number of N samples used for a sufficiently accurate
representation of impulse response h(i) of echo path 8, whilst the
adaptive filter 11 usually realized as a transversal filter has a low
speed of convergence for strongly (auto)correlated far-end signals x(k),
such as speech and specific types of data. These objections weigh
especially heavily when impulse response h(i) of echo path 8 has a large
length with values N from 1000 to 2000, as is the case with the acoustic
echo paths under discussion.
FIG. 3 shows a general diagram of a discrete-time model of an embodiment
for a digital echo canceller according to the invention, by means of which
the above objections of the known echo canceller described with reference
to FIG. 2 are met.
Thereto, an echo canceller according to the invention includes a specific
filter combination, in which the first digital filter 9 is a time-domain
programmable filter (TDPF) and the second digital filter 11 is a
frequency-domain block-adaptive filter (FDAF). The choice of this specific
filter combination (9,11) is based on the consideration that TDPF 9
operates on a sample-by-sample basis so that the first replica signal e(k)
for cancelling a current echo signal e(k) at the send input SI is
available with a delay of one sample interval which is negligibly small in
practice, and the use of a FDAF 11 enables a considerable saving in
numbers of computational operations and, in addition, enables to improve
considerably in a simple manner the convergence behaviour of the echo
canceller 1, while retaining the robust solution to the problem of
double-talk which is explained with reference to FIG. 2.
In FIG. 3 the general structure of a frequency-domain block-adaptive filter
11 is represented in a general diagram. In FIG. 3 and in the next Figures
double-lined signal paths denote paths in the frequency-domain and
single-lined signal paths denote paths in the time-domain. Transformations
from the time-domain to the frequency-domain and vice versa are effected
by means of Discrete Orthogonal Transforms (DOT's) and their inverses
(IDOT's), respectively. An illustrative example of such transformations is
the Discrete Fourier Transform (DFT) and its inverse (IDFT), which are
utilized extensively, compare the above articles by Clark et al. and
Mansour et al. For practical reasons of computational complexity and
permissible signal delays these DOT's have a finite block length N' and in
the literature such transformations are known as N'-point DOT's, in which
"point" may refer both to a discrete time-domain component and to a
discrete frequency-domain component. With respect to the block length N'
the following observations are made. FDAF 11 has to generate a replica
signal e.sub.a (k) which is a good estimate of echo signal e(k). Echo
signal e(k), in its turn, may be considered to be a linear convolution of
far-end signal x(k) with impulse response h(i) of echo path 8 with
i=0,1,2, . . . ,N-1. It need not be further elucidated that FDAF 11 also
has to present an impulse response of a length N for generating a replica
signal e.sub.a (k) as the linear convolution of far-end signal x(k) with
the impulse response of FDAF 11. In FDAF 11 the operations required
thereto are effected on blocks of N' points in the frequency-domain, and
it is well known that these operations correspond with a circular
convolution in the time-domain, in which the period is equal to the block
length N'. The desired linear convolution can then be obtained by applying
a suitable sectioning of the time-domain signals which are involved in the
N'-point DOT's, in which the most common sectioning procedures are the
overlap-save method and the overlap-add method. The above implies that
generally the block length N' of the DOT's exceeds the desired length N of
the impulse response of FDAF 11. In the above article by Clark et al. it
is stated that for the most efficient implementation of FDAF 11 presenting
an impulse response having the length of N, use is made of DFT's having a
block length N'=2N, and of a sectioning of time-domain signals into blocks
of N'=2N points in which each block overlaps the previous block by N
points. For large values of N, for example N=1000 to N=2000 in the present
case of acoustic echo paths 8, the computational complexity can yet be
decreased considerably by utilizing efficient implementations of the DFT's
known as "Fast Fourier Transform" (FFT), due to which the number of
computational operations per N points of replica signal e.sub.a (k) is of
the order of N log N. Such computationally efficient implementations are
also known for other types of DOT's than the DFT, but for simplicity it
will be assumed hereinafter that the N'-point DOT is an N'-point DFT with
N'=2N. In addition, frequency-domain signals will be denoted by upper case
letters to differentiate in a simple manner between frequency-domain and
time-domain signals, whilst the time-domain signals, as was done in the
foregoing, are denoted by lower-case letters. Finally, the further
description is aimed at applying the overlap-save method as a sectioning
procedure of the time-domain signals.
The structure of FDAF 11 shown in FIG. 3 can most conveniently be described
with reference to the structure of time-domain adaptive filter 11 in FIG.
2. FDAF 11 again comprises a filter section 12, an adaptation processor 13
and a combining circuit 14, but in FIG. 3 filter section 12 and adaptation
processor 13 operate in the frequency-domain, so that three domain
transformations have to be effected, that is to say
by means of transforming means 17 and associated sectioning means 17(1): a
2N-point DOT, for transforming each block of 2N time-domain points of
far-end signal x(k) into a block of 2N frequency-domain points, which are
denoted X(p;m) with p=1,2,2, . . . ,2N-1 for a block of block number m;
by means of transforming means 18 and associated sectioning means 18(1): a
2N point IDOT, for transforming each block of 2N frequency-domain points
E.sub.a (p;m) into a block of N time-domain points of replica signal
e.sub.a (k);
by means of transforming means 19 and associated sectioning means 19(1): a
2N-point DOT, for transforming each block of N time-domain points of error
signal r.sub.a (k), after it has been augmented to a block of 2N
time-domain points, to a block of 2N frequency-domain points R.sub.a
(p;m).
The details of the overlap-save method used for the sectioning procedure
will be further explained with reference to FIG. 4. Filter section 12 of
FDAF 11 comprises a memory 12(1) for storing the 2N frequency-domain
filter coefficients W(p;m) of block m and a circuit 12(2) for multiplying
each frequency-domain point X(p;m) by an associated frequency-domain
filter coefficient W(p;m), for forming products X(p;m)W(p;m) representing
the 2N frequency-domain points E.sub.a (p;m). Adaptation processor 13 is
further arranged for obtaining in response to the 2N frequency-domain
points X(p;m) and R.sub.a (p;m) blockwise-adapted frequency-domain filter
coefficients W(p;m) which are stored in memory 12(1).
Because it is the task of FDAF 11 to always present the most suitable
values for the filter coefficients of TDPF 9, but because the filter
coefficients W(p;m) of FDAF 11 are frequency-domain filter coefficients, a
domain-transformation has to be effected for the selective transfer of
filter coefficients from FDAF 11 to TDPF 9. Thereto, echo canceller 1 in
FIG. 3 also includes transforming means 20 and associated sectioning means
20(1) for effecting a 2N-point IDOT to transform the 2N frequency-domain
filter coefficients W(p;m) at the output of memory 12(1) of FDAF 11 into N
time-domain filter coefficients w(i;m) to be applied to programmable
memory 9(1) of TDPF 9. These transforming and sectioning means 20,20(1)
are included in a cascade arrangement with controllable gate means 15
(subsequent thereto in FIG. 3). It has further been shown diagrammatically
in FIG. 3 by means of dashed lines to control means 16 that the control of
gate means 15 does not only depend on the levels of signals r.sub.a (k)
and r(k) in accordance with condition (1), but can also depend on the
levels of signals z(k) and x(k) in accordance with conditions (2 ) and (3)
as explained in the description of FIG. 2.
A drawing in further detail of a digital echo canceller according to FIG. 3
will now be given with reference to FIG. 4 in which it has been indicated
explicitly that the N'-point DOT is an efficient implementation of an
N'-point DFT with N'=2N known as 2N-point FFT, and in which the sectioning
means are shown separated from the transforming means to describe their
functioning in a clearer manner.
In FIG. 4 far-end signal x(k) is applied to sectioning means 17(1) to be
subdivided into blocks of 2N points by means of serial-to-parallel
conversion, each block overlapping its predecessor by N points, as is
shown symbolically in FIG. 4. The points of a block of block number m are
denoted x(i;m) with i=1,2, . . . ,2N-1. With the aid of transforming means
17 for effecting a 2N-point FFT the 2N time-domain points x(i;m) are
transformed into 2N frequency-domain points X(p;m) with p=1,2, . . .
,2N-1. In multiplier circuit 12(2) each point X(p;m) is multiplied by a
relevant filter coefficient W(p;m) from memory 12(1) for forming products
X(p;m)W(p;m) which represent 2N points E.sub.a (p;m). With the aid of
transforming means 18 for effecting a 2N point IFFT these 2N points
E.sub.a (p;m) are transformed into 2N points e.sub.a (i;m) in the
time-domain. Because the filter coefficients W(p;m) can be considered to
| | |