WikiPatents - Community Patent Review
Subband echo cancellation method for multichannel audio teleconference and echo canceller using the same    
United States Patent 6246760
Link to this page: http://www.wikipatents.com/6246760.html
Inventor(s): Makino; Shoji (Machida, JP); Shimauchi; Suehiro (Tokyo, JP); Haneda; Yoichi (Tokyo, JP); Nakagawa; Akira (Kokubunji, JP); Kojima; Junji (Tokyo, JP)
Abstract: In subband echo cancellation for a multichannel teleconference, received signals $x_1(k), x_2(k), \ldots, x_I(k)$ of each channel are divided into N subband signals, an echo $y(k)$ picked up by a microphone $16_j$ after propagation over an echo path is divided into N subband signals $y_0(k), \ldots, y_{N-1}(k)$, and vectors each composed of a time sequence of subband received signals $x_1(k), \ldots, x_I(k)$ are combined for each corresponding subband. The combined vector and an echo cancellation error signal in the corresponding subband are input into an estimation part $19_n$, wherein a cross-correlation variation component is extracted. The extracted component is used as an adjustment vector to iteratively adjust the impulse response of an estimated echo path. The combined vector is applied to an estimated echo path $18_n$ formed by the adjusted value to obtain an echo replica. An echo cancellation error signal $e_n(k)$ is calculated from the echo replica and a subband echo $y_n(k)$.



Owner/Assignee: Nippon Telegraph & Telephone Corporation (Tokyo, JP)
Publication Date: June 12, 2001
Application Number: 08/927,961
Filing Date: September 11, 1997
US Classification: 379/406.08
Int'l Classification: H04M 001/00 H04M 009/08 H04M 009/00
Examiner: Woo; Stella
Attorney/Law Firm: Connolly Bove Lodge & Hutz LLP
Priority Data: Sep 13, 1996 [JP] 8-243524
USPTO Field of Search: 379/410 379/411 379/406 379/407 379/408 379/409 379/201 379/202 370/286 370/287 370/291 381/66 381/94.1 381/71.1

U.S. References

Patent No.   Inventor    Date        US Class
5889857      Boudy       Mar 1999    379/406.14
5818945      Makino      Oct 1998
5774561      Nakagawa    Jun 1998    381/66
5761318      Shimauchi   Jun 1998    381/66
5721782      Piket       Feb 1998    381/66
5661813      Shimauchi   Aug 1997
5323459      Hirano      Jun 1994    379/391
5272695      Makino      Dec 1993    370/291
Claims
 


What is claimed is:

1. A subband echo cancellation method for a multichannel teleconference in which received signals of plural channels are reproduced as acoustic signals by loudspeakers corresponding to said plural channels, said acoustic signals being received by at least one microphone after propagating over each echo path thereto, an echo replica being subtracted from an echo provided from said at least one microphone, an echo cancellation error signal resulting from said subtraction and said received signal of each of said plural channels being used to calculate an adjustment vector, said adjustment vector being used iteratively to adjust an estimated value of an impulse response of each echo path, estimated echo paths having said adjusted impulse response being generated for each of said echo paths, and the corresponding one of said received signals being applied to each estimated echo path to generate said echo replica, the method further comprising the steps of:

(a) dividing said received signal and said echo into N subbands in each of said plural channels, and decimating them with predetermined decimation rates, respectively, to generate N subband received signals and N subband echoes, N being an integer equal to or greater than 2;

(b) generating N echo replicas by providing said N subband received signals to N estimated echo paths, each formed by a digital filter having a filter coefficient of a predetermined number of taps which simulates the impulse response of said echo path in each of said N subbands;

(c) subtracting said N echo replicas from corresponding N subband echoes to generate echo cancellation error signals in said N subbands;

(d) iteratively adjusting said filter coefficients of said digital filters in a manner to minimize said N echo cancellation error signals on the basis of said N echo cancellation error signals and corresponding N subband received signals; and

(e) combining said echo cancellation error signals in said N subbands into a full band send signal having said echoes suppressed; and

(f) extracting a variation component of the cross-correlation between said received signals of said channels as said adjustment vector.

2. The method of claim 1, wherein a combined received signal vector by combining received signal vectors of a sequence of received signals of each of said channels is calculated and a variation in the correlation between current and previous ones of said combined received signal vector is detected and used as said cross-correlation variation component.

3. The method of claim 2, wherein a method for detecting said variation in the cross-correlation between said current and previous combined received signal vectors in each of said channels is set optimum in said N subbands.

4. The method of claim 3, wherein said method for detecting said variation in the cross-correlation between said current and previous combined received signal vectors in each of said each channel is a projection algorithm or ESP algorithm and the projection order is set at an optimum value in each of said N subbands.

5. The method of claim 4, wherein the order of said projection algorithm or ESP algorithm is set at a minimum value at which the convergence speed of an echo return loss enhancement substantially saturates with respect to said received signal in said each subband, the number of taps of said digital filter corresponding to a lower one of said N subbands is larger than the number of taps corresponding to a higher subband.

6. The method of claim 4, wherein the order of said projection algorithm or ESP algorithm in each of said N subbands is set at a minimum value at which whitening of an estimation error vector at the time of having whitened said received signal by a linear predictive coding filter substantially saturates, the order of said projection or ES projection algorithm in said lower subband being set larger than the order of said projection or ES projection algorithm in said higher subband.

7. The method of claim 4, wherein the number of taps of said digital filter forming said estimated echo path in each of said N subbands is predetermined on the basis of at least one of the energy distribution in the frequency region of a desired one of said received signals, the room reverberation characteristic and the human psychoacoustic characteristic.

8. The method of claim 4, wherein the number of taps of said digital filter corresponding to a lower one of said N subbands is larger than the number of taps of said digital filter corresponding to a higher subband.

9. The method of claim 4, wherein the order of said projection algorithm or ESP algorithm in a lower one of said subbands is set larger than the order of said projection algorithm or ESP algorithm in a higher subband.

10. A subband echo cancellation method for a multichannel teleconference in which received signals of plural channels are reproduced as acoustic signals by loudspeakers corresponding to said plural channels, said acoustic signals being received by at least one microphone after propagating over each echo path thereto, an echo replica being subtracted from an echo provided from said at least one microphone, an echo cancellation error signal resulting from said subtraction and said received signal of each of said plural channels are used to calculate an adjustment vector, said adjustment vector is used to iteratively adjust an estimated value of an impulse response of said each echo path, estimated echo paths having said adjusted impulse responses corresponding to said each echo paths, and the corresponding one of said received signals is applied to said each estimated echo path to generate said echo replica, said method comprising the steps of:

(a) dividing said received signal and said echo into N subbands in each of said plural channels and decimating them with predetermined decimation rates, respectively, to generate N subband received signals and N subband echoes, N being an integer equal to or greater than 2;

(b) generating N echo replicas by providing said N subband received signals to N estimated echo paths each being formed by a digital filter having a filter coefficient of a predetermined number of taps which simulates the impulse response of said echo path in each of said N subbands;

(c) subtracting said N echo replicas from corresponding N subband echoes to generate echo cancellation error signals in said N subbands;

(d) iteratively adjusting said filter coefficients of said digital filters in a manner to minimize said N echo cancellation error signals on the basis of said N echo cancellation error signals and corresponding N subband received signals; and

(e) combining said echo cancellation error signals in said N subbands into a full band send signal having said echoes suppressed;

(f) adding a variation component to the cross-correlation between said received signals of said plural channels, each of said received signals being reproduced by said loudspeaker of one of said plural channels; and

(g) deriving said adjustment vector from said received signal added with said cross-correlation variation component.

11. The method of claim 10, wherein, letting the number of reproduction channels be represented by I and said received signals of said plural channels by x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) as functions of a discrete time k, said received signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) are input into time-variant filters with different time-variant characteristics for said plural channels, wherein they are convoluted, indicated by *, with impulse responses f.sub.1 (k), f.sub.2 (k), . . . , f.sub.I (k) of said filters for conversion into signals x.sub.1 (k), x.sub.2 (k), . . . ,x.sub.I (k) which satisfy

x.sub.1 (k)=f.sub.1 (k)*x.sub.1 (k)

x.sub.2 (k)=f.sub.2 (k)*x.sub.2 (k)

x.sub.I (k)=f.sub.I (k)*x.sub.I (k)

whereby said variation component of said cross-correlation between said received signal of said plural channels is added thereto.

12. The method of claim 10, wherein letting the number of reproduction channels be represented by I and said received signals of said plural channels by x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) as functions of a discrete time k, said received signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) are multiplied by different functions g.sub.1 (k), g.sub.2 (k), . . . , g.sub.I (k) for conversion into signals x.sub.1 (k), x.sub.2 (k), . . . x.sub.I (k) which satisfy

x.sub.1 (k)=g.sub.1 (k).multidot.x.sub.1 (k)

x.sub.2 (k)=g.sub.2 (k).multidot.x.sub.2 (k)

x.sub.I (k)=g.sub.I (k).multidot.x.sub.I (k)

whereby said variation component of said cross-correlation between said received signal of said plural channels is added thereto.

13. The method of claim 10, wherein letting the number of reproduction channels be represented by I and said received signals of said plural channels by x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) as functions of a discrete time k, said received signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) are added to different functions n.sub.1 (k), n.sub.2 (k), n.sub.I (k), respectively, for conversion into signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) which satisfy

x.sub.1 (k)=x.sub.1 (k)+n.sub.1 (k)

x.sub.2 (k)=x.sub.2 (k)+n.sub.2 (k)

x.sub.I (k)=x.sub.I (k)+n.sub.I (k)

whereby said variation component of said cross-correlation between said received signal of said plural channels is added thereto.

14. The method of claim 10, wherein letting the number of reproduction channels be represented by I and said received signals of said plural channels by x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) as functions of a discrete time k, said received signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) are converted into signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) by subjecting the frequency characteristic of each of said received signals to different time-variant frequency axis expansion/compression processing, whereby said variation component of said cross-correlation between said received signal of said plural channels is added thereto.

15. The method of claim 10, wherein the method of adding said variation component of said cross-correlation between said received signals of said plural channels is set optimum in each of said N subbands in a manner to reduce degradation of the psychoacoustic quality of said acoustic signal.

16. The method of any one of claims 1 to 15, wherein said subband received signals and said subband echoes are real-number signals.

17. The method of any one of claims 1 to 15, wherein said subband received signals and said subband echoes are complex signals.

18. A subband echo canceller for a multichannel teleconference in which received signals of plural channels are reproduced as acoustic signals by loudspeakers corresponding to said plural channels, said acoustic signals being received by at least one microphone after propagating over each echo path thereto, an echo replica being subtracted from an echo provided from said at least one microphone, an echo cancellation error signal resulting from said subtraction and said received signal of each of said plural channels being used to calculate an adjustment vector, said adjustment vector being used iteratively to adjust an estimated value of an impulse response of each echo path, estimated echo paths each having said adjusted impulse response being generated for each of said each echo paths, and the corresponding one of said received signals being applied to said each estimated echo path to generate said echo replica, said echo canceller comprising:

subband echo generating means for dividing said received signal and said echo into N subbands in each of said plural channels, and decimating them with predetermined decimation rates, respectively, to generate N subband received signals and N subband echoes, N being an integer equal to or greater than 2;

N estimated echo path means, each formed by a digital filter which is given a filter coefficient of a predetermined number of taps and simulates the impulse response of said echo path in each of said N subbands, said N estimated echo path means being supplied with said N subband received signals and generating N echo replicas, respectively;

error signal generating means for subtracting said N echo replicas from corresponding N subband echoes to generate echo cancellation error signals in said N subbands;

echo path estimating means for iteratively adjusting said filter coefficients of said digital filters in a manner to minimize said N echo cancellation error signals on the basis of said N echo cancellation error signals and said corresponding N subband received signals, said echo path estimation means comprising: cross-correlation variation extracting means for extracting a variation component of the cross-correlation between said received signals of said plural channels, and including adjustment means for using said variation component as said adjustment vector;

subband synthesis means for combining said echo cancellation error signals in said N subbands into a full band send signal having said echoes suppressed.

19. A subband echo canceller for a multichannel teleconference in which received signals of plural channels are reproduced as acoustic signals by loudspeakers corresponding to said plural channels, said acoustic signals being received by at least one microphone after propagating over each echo path thereto, an echo replica being subtracted from an echo provided from said at least one microphone, an echo cancellation error signal resulting from said subtraction and said received signal of each of said plural channels being used to calculate an adjustment vector, said adjustment vector being used iteratively to adjust an estimated value of an impulse response of each echo path, estimated echo paths each having said adjusted impulse response, and the corresponding one of said received signals is applied to said each estimated echo path to generate said echo replica, said echo canceller comprising:

subband echo generating means for dividing said received signal and said echo into N subbands in each of said plural channels and decimating them with predetermined decimation rates, respectively, to generate N subband received signals and N subband echoes, N being an integer equal to or greater than 2;

N estimated echo path means, each formed by a digital filter which is given a filter coefficient of a predetermined number of taps and simulates the impulse response of said echo path in each of said N subbands, said N estimated echo path means being supplied with said N subband received signals and generating N echo replicas, respectively;

error signal generating means for subtracting said N echo replicas from corresponding N subband echoes to generate echo cancellation error signals in said N subbands;

echo path estimating means for iteratively adjusting said filter coefficients of said digital filters in a manner to minimize said N echo cancellation error signals on the basis of said N echo cancellation error signals and said corresponding N subband received signals;

subband synthesis means for combining said echo cancellation error signals in said N subbands into a full band send signal having said echoes suppressed; and

cross-correlation variation adding means for adding a variation component of the cross-correlation between said received signals of said plural channels, received signals added with said cross-correlation variation component being used to derive said adjustment vector.

20. The echo canceller of claim 19, wherein said cross-correlation variation adding means is means by which, letting the number of reproduction channels be represented by I and said received signals of said plural channels by x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) as functions of a discrete time k, said received signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) are input into time-variant filters with different time-variant characteristics for said plural channels, wherein they are convoluted, indicated by *, with impulse responses f.sub.1 (k), f.sub.2 (k), . . . , f.sub.I (k) of said filters for conversion into signals x.sub.1 (k), x.sub.2 (k), . . . ,x.sub.I (k) which satisfy

x.sub.1 (k)=f.sub.1 (k)*x.sub.1 (k)

x.sub.2 (k)=f.sub.2 (k)*x.sub.2 (k)

x.sub.I (k)=f.sub.I (k)*x.sub.I (k)

whereby said variation component of said cross-correlation between said received signal of said plural channels is added thereto.

21. The echo canceller of claim 19, wherein said cross-correlation variation adding means is means by which, letting the number of reproduction channels be represented by I and said received signals of said plural channels by x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) as functions of a discrete time k, said received signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) are multiplied by different functions g.sub.1 (k), g.sub.2 (k), . . . , g.sub.I (k) for conversion into signals x.sub.1 (k), x.sub.2 (k), . . . x.sub.I (k) which satisfy

x.sub.1 (k)=g.sub.1 (k).multidot.x.sub.1 (k)

x.sub.2 (k)=g.sub.2 (k).multidot.x.sub.2 (k)

x.sub.I (k)=g.sub.I (k).multidot.x.sub.I (k)

whereby said variation component of said cross-correlation between said received signal of said plural channels is added thereto.

22. The echo canceller of claim 19, wherein said cross-correlation variation adding means is means by which, letting the number of reproduction channels be represented by I and said received signals of said plural channels by x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) as functions of a discrete time k, said received signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) are added to different functions n.sub.1 (k), n.sub.2 (k), n.sub.I (k), respectively, for conversion into signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) which satisfy

x.sub.1 (k)=x.sub.1 (k)+n.sub.1 (k)

x.sub.2 (k)=x.sub.2 (k)+n.sub.2 (k)

x.sub.I (k)=x.sub.I (k)+n.sub.I (k)

whereby said variation component of said cross-correlation between said received signal of said plural channels is added thereto.

23. The echo canceller of claim 19, wherein said cross-correlation variation adding means is means by which, letting the number of reproduction channels be represented by I and said received signals of said plural channels by x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) as functions of a discrete time k, said received signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) are converted into signals x.sub.1 (k), x.sub.2 (k), . . . , x.sub.I (k) by subjecting the frequency characteristic of each of said received signals to different time-variant frequency axis expansion/compression processing, whereby said variation component of said cross-correlation between said received signal of said plural channels is added thereto.

24. The echo canceller of any one of claims 18 to 23, wherein said subband received signals and said subband echoes are real-number signals.

25. The echo canceller of any one of claims 18-23, wherein said subband received signals and said subband echoes are complex signals.
Description
 


BACKGROUND OF THE INVENTION

The present invention relates to an echo cancellation method for cancelling room echoes which would otherwise cause howling and give rise to psychoacoustic problems in a teleconferencing system using a multi-receive-system and, more particularly, to a subband echo cancellation method and apparatus for a multichannel audio teleconference which updates or corrects an estimated impulse response of an echo path for each subband through utilization of a projection algorithm or the like.

ONE-CHANNEL ECHO CANCELLATION

An echo canceller is used to offer a hands-free telecommunication system that has an excellent double-talk function and is virtually free from echoes.

A description will be given first, with reference to FIG. 1, of a one-channel echo canceller. In hands-free communication, speech uttered by a person at a remote place is provided as a received signal to a received signal terminal 11 and is radiated from a loudspeaker 12 directly or after being subjected to some processing by a received signal processing part 13 that automatically adjusts the gain of the received signal according to its amplitude, power or similar magnitude. For this reason, the received signal $x_1(k)$ herein mentioned is not limited specifically to the received signal itself but shall refer to a processed received signal as well when the received signal processing part 13 is employed. In FIG. 1, $k$ indicates discrete time. An echo canceller 14 cancels an echo $y_1(k)$ which is produced when the received signal $x_1(k)$ radiated from the loudspeaker 12 is picked up by a microphone 16 after propagating over an echo path 15. The echo $y_1(k)$ can be modeled by the following convolution:

$$y_1(k) = \sum_{l=0}^{L-1} h_{11}(k,l)\, x_1(k-l) \tag{1}$$

where $\sum$ indicates a summation from $l=0$ to $L-1$, $h_{11}(k,l)$ is the impulse response indicating the transfer function of the echo path 15 at time $k$, and $L$ is the number of taps, which is a constant preset corresponding to the reverberation time of the echo path 15. In the first place, the received signals from the current time $k$ back to time $k-L+1$ are stored in a received signal storage and vector generating part 17. The $L$ received signals thus stored are outputted as a received signal vector $\mathbf{x}_1(k)$, that is, as

$$\mathbf{x}_1(k) = [x_1(k), x_1(k-1), \ldots, x_1(k-L+1)]^T \tag{2}$$

where $(\cdot)^T$ indicates a transposition. In an estimated echo generating part 18, the inner product of the received signal vector $\mathbf{x}_1(k)$ of Eq. (2) and an estimated echo path vector $\hat{\mathbf{h}}_{11}(k)$, which is provided from an echo path estimating part 19, is calculated as follows:

$$\hat{y}_1(k) = \hat{\mathbf{h}}_{11}^T(k)\, \mathbf{x}_1(k) \tag{3}$$

As a result, an estimated echo or echo replica $\hat{y}_1(k)$ is generated. This inner product calculation is equivalent to a convolution such as Eq. (1). In the echo path estimating part 19, the estimated echo path vector $\hat{\mathbf{h}}_{11}(k)$ is generated which is used in the estimated echo generating part 18.
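
The equivalence between the inner product of Eq. (3) and the convolution of Eq. (1) can be checked numerically. The following is a minimal sketch, assuming NumPy, a fixed (time-invariant) estimated echo path, and zero signal before time 0; the helper name `received_vector` and the random test data are hypothetical and are not part of the patent.

```python
import numpy as np

def received_vector(x, k, L):
    """Return [x(k), x(k-1), ..., x(k-L+1)] as in Eq. (2).

    Samples before time 0 are assumed to be zero.
    """
    return np.array([x[k - l] if k - l >= 0 else 0.0 for l in range(L)])

rng = np.random.default_rng(0)
L = 8                                  # number of taps (hypothetical value)
x = rng.standard_normal(64)            # hypothetical received signal x_1(k)
h_hat = rng.standard_normal(L)         # hypothetical estimated echo path vector

# Echo replica via the inner product of Eq. (3), one sample at a time.
y_hat = np.array([h_hat @ received_vector(x, k, L) for k in range(len(x))])

# Same replica via the convolution of Eq. (1), truncated to the signal length.
y_conv = np.convolve(x, h_hat)[:len(x)]
```

Both arrays hold the same echo replica; the inner-product form is the one an adaptive filter uses because the vector of Eq. (2) is also needed for the coefficient update.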

Since the impulse response $h_{11}(k,l)$ of the echo path 15 from the loudspeaker 12 to the microphone 16 varies with a sound field variation caused, for instance, by a movement of a person or object, the estimated echo path vector $\hat{\mathbf{h}}_{11}(k)$ needs to be varied following the time-varying impulse response of the echo path 15. In this example, the echo canceller 14 is formed by an adaptive FIR (Finite Impulse Response) filter. The most common algorithm for the echo path estimation is the NLMS (Normalized Least Mean Square) algorithm. With the NLMS algorithm, the received signal vector $\mathbf{x}_1(k)$ at time $k$ and a residual echo $e_1(k)$, i.e. the following error, obtained by subtracting the estimated echo signal $\hat{y}_1(k)$ from the output $y_1(k)$ of the microphone 16 by a subtractor 21,

$$e_1(k) = y_1(k) - \hat{y}_1(k) \tag{4}$$

are used to calculate an estimated echo path vector $\hat{\mathbf{h}}_{11}(k+1)$ which is used at time $k+1$, by the following equation:

$$\hat{\mathbf{h}}_{11}(k+1) = \hat{\mathbf{h}}_{11}(k) + \mu\, \frac{e_1(k)\, \mathbf{x}_1(k)}{\mathbf{x}_1^T(k)\, \mathbf{x}_1(k)} \tag{5}$$

where $\mu$ is called a step size parameter, which is used to adjust adaptation within the range of $0<\mu<2$. By repeating the above processing, the estimated echo path vector $\hat{\mathbf{h}}_{11}(k)$ in the echo path estimating part 19 can be gradually brought into agreement with a true echo path vector $\mathbf{h}_{11}(k)$ whose elements are impulse response sequences $h_{11}(k,l)$ of the true echo path 15, that is, the following echo path vector:

$$\mathbf{h}_{11}(k) = [h_{11}(k,0), h_{11}(k,1), \ldots, h_{11}(k,L-1)]^T \tag{6}$$

As a result, the residual echo $e_1(k)$ given by Eq. (4) can be reduced.
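
The NLMS iteration of Eqs. (3) to (5) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: it assumes NumPy, a noise-free and time-invariant echo path, and a small constant `eps` added to the denominator of Eq. (5) to avoid division by zero (the constant is an assumption that does not appear in the text). The function name `nlms` and the test signal are hypothetical.

```python
import numpy as np

def nlms(x, y, L, mu=0.5, eps=1e-8):
    """Estimate an L-tap echo path from received signal x and echo y."""
    h_hat = np.zeros(L)                 # estimated echo path vector
    x_vec = np.zeros(L)                 # [x(k), x(k-1), ..., x(k-L+1)], Eq. (2)
    e = np.zeros(len(x))                # residual echo
    for k in range(len(x)):
        # Shift the newest sample into the received signal vector.
        x_vec = np.concatenate(([x[k]], x_vec[:-1]))
        y_hat = h_hat @ x_vec           # echo replica, Eq. (3)
        e[k] = y[k] - y_hat             # residual echo, Eq. (4)
        # Coefficient update, Eq. (5); eps is an added safeguard (assumption).
        h_hat = h_hat + mu * e[k] * x_vec / (x_vec @ x_vec + eps)
    return h_hat, e

rng = np.random.default_rng(1)
L = 16
h_true = rng.standard_normal(L) * 0.5 ** np.arange(L)   # hypothetical echo path
x = rng.standard_normal(4000)                           # white received signal
y = np.convolve(x, h_true)[:len(x)]                     # echo per Eq. (1)
h_hat, e = nlms(x, y, L)
```

With a white input the estimate converges quickly; with strongly correlated inputs such as speech the same loop converges more slowly, which is the motivation for the projection algorithm discussed next.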

The most effective algorithm now in use for the echo path estimation is the projection algorithm or the ES projection algorithm (hereinafter referred to as the ESP algorithm). The projection algorithm is based on the idea of improving the convergence speed for correlated signals such as speech by removing the auto-correlation between input signals in the algorithm. The removal of auto-correlated components means whitening of signals in the time domain. The projection algorithm is described in detail in K. Ozeki and T. Umeda, "An Adaptive Filtering Algorithm Using an Orthogonal Projection to an Affine Subspace and Its Properties," Trans. (A), IEICE Japan, vol. J67-A, No. 2, pp. 126-132, February 1984.

In general, the p-order projection algorithm updates the estimated echo path vector $\hat{\mathbf{h}}(k)$ in such a manner as to obtain correct outputs $y(k), y(k-1), \ldots, y(k-p+1)$ for the last $p$ input signal vectors $\mathbf{x}(k), \mathbf{x}(k-1), \ldots, \mathbf{x}(k-p+1)$. That is, $\hat{\mathbf{h}}(k+1)$ is computed which satisfies the following equations:

$$\begin{aligned}
\mathbf{x}^T(k)\, \hat{\mathbf{h}}(k+1) &= y(k) \\
\mathbf{x}^T(k-1)\, \hat{\mathbf{h}}(k+1) &= y(k-1) \\
&\;\;\vdots \\
\mathbf{x}^T(k-p+1)\, \hat{\mathbf{h}}(k+1) &= y(k-p+1)
\end{aligned} \tag{7}$$

where

$$\mathbf{x}(k) = [x(k), x(k-1), \ldots, x(k-L+1)]^T \tag{8}$$

When the number $p$ of equations is smaller than the number of unknowns (the number of taps) $L$, the solution $\hat{\mathbf{h}}(k+1)$ of the simultaneous equations (7) is indeterminate. Hence, the estimated echo path vector is updated so as to minimize the magnitude of the update $\|\hat{\mathbf{h}}(k+1) - \hat{\mathbf{h}}(k)\|$. The p-order projection algorithm in such an instance is expressed by the following equation:

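$$\begin{aligned}
\hat{\mathbf{h}}(k+1) &= \hat{\mathbf{h}}(k) + \mu\, [\mathbf{X}^T(k)]^{+}\, \mathbf{e}(k) \\
&= \hat{\mathbf{h}}(k) + \mu\, \mathbf{X}(k)\, [\mathbf{X}^T(k)\mathbf{X}(k)]^{-1}\, \mathbf{e}(k) \\
&= \hat{\mathbf{h}}(k) + \mu\, \mathbf{X}(k)\, \boldsymbol{\beta}(k) \\
&= \hat{\mathbf{h}}(k) + \mu\, [\beta_1 \mathbf{x}(k) + \beta_2 \mathbf{x}(k-1) + \cdots + \beta_p \mathbf{x}(k-p+1)]
\end{aligned} \tag{9}$$

where

$$\mathbf{X}(k) = [\mathbf{x}(k), \mathbf{x}(k-1), \ldots, \mathbf{x}(k-p+1)] \tag{10}$$

$$\mathbf{e}(k) = [e(k), (1-\mu)e(k-1), \ldots, (1-\mu)^{p-1} e(k-p+1)]^T \tag{11}$$

$$e(k) = y(k) - \hat{y}(k) \tag{12}$$

$$\hat{y}(k) = \hat{\mathbf{h}}(k)^T \mathbf{x}(k) \tag{13}$$

$$\boldsymbol{\beta}(k) = [\beta_1, \beta_2, \ldots, \beta_p]^T \tag{14}$$

$^{+}$: generalized inverse matrix

$^{-1}$: inverse matrix.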

In the above, $\boldsymbol{\beta}(k)$ is the solution of the following simultaneous linear equation with $p$ unknowns:

$$[\mathbf{X}^T(k)\mathbf{X}(k)]\, \boldsymbol{\beta}(k) = \mathbf{e}(k) \tag{15}$$

To avoid instability in the inverse matrix operation, a small positive constant $\delta$ may be used as follows:

$$[\mathbf{X}^T(k)\mathbf{X}(k) + \delta \mathbf{I}]\, \boldsymbol{\beta}(k) = \mathbf{e}(k) \tag{15'}$$

where $\mathbf{I}$ is a unit matrix. The second term on the right-hand side of Eq. (9) is an updating vector, with which the estimated echo path vector is iteratively updated. $\mathbf{X}(k)\boldsymbol{\beta}(k)$ in Eq. (9) represents processing for removing the auto-correlation of the input signal. The removal of auto-correlation means suppression of input signal variations in the time domain, and hence it means whitening of the signals in the time domain. That is, the projection algorithm can be said to increase the impulse response updating speed by the whitening of the input signal in the time domain. Several fast projection algorithms have been proposed to reduce the computational complexity, and they are described in detail in Japanese Patent Application Laid-Open Gazettes Nos. 312535/95 and 92980/95. Further, setting the input and output at negative times to zero and letting $p$ go to infinity corresponds to the RLS algorithm.
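
A p-order projection update in the spirit of Eqs. (9), (10) and (15)' can be sketched as below. This is a hedged illustration, not the patented implementation: it assumes NumPy, a noise-free time-invariant echo path, and zero signals before time 0. As a simplification, the error vector is computed directly as the difference between the last p echo samples and the outputs of the current estimate, rather than through the recursive form of Eq. (11); with $\delta = 0$ the two coincide. The function names and the AR(1) test signal are hypothetical.

```python
import numpy as np

def projection_update(h_hat, X, y_vec, mu=1.0, delta=1e-6):
    """One coefficient update: X is L x p, y_vec holds [y(k), ..., y(k-p+1)]."""
    p = X.shape[1]
    e_vec = y_vec - X.T @ h_hat                       # error vector (see lead-in)
    # Solve the regularised p x p system of Eq. (15)'.
    beta = np.linalg.solve(X.T @ X + delta * np.eye(p), e_vec)
    return h_hat + mu * (X @ beta)                    # third form of Eq. (9)

def run_projection(x, y, L, p, mu=1.0, delta=1e-6):
    xp = np.concatenate((np.zeros(L + p - 2), x))     # zero history before time 0
    yp = np.concatenate((np.zeros(p - 1), y))
    h_hat = np.zeros(L)
    for k in range(len(x)):
        cols = []
        for j in range(p):
            end = L + p - 2 + k - j                   # position of x(k-j) in xp
            cols.append(xp[end - L + 1:end + 1][::-1])
        X = np.stack(cols, axis=1)                    # Eq. (10), shape L x p
        y_vec = yp[k:k + p][::-1]                     # [y(k), ..., y(k-p+1)]
        h_hat = projection_update(h_hat, X, y_vec, mu, delta)
    return h_hat

rng = np.random.default_rng(2)
L, p = 16, 4
h_true = rng.standard_normal(L) * 0.7 ** np.arange(L)  # hypothetical echo path
w = rng.standard_normal(2000)
x = np.zeros_like(w)
for k in range(len(w)):                                # correlated AR(1) input
    x[k] = w[k] + (0.9 * x[k - 1] if k > 0 else 0.0)
y = np.convolve(x, h_true)[:len(x)]
h_hat = run_projection(x, y, L, p)
```

The only matrix that has to be inverted is p x p, so the cost per sample grows with the projection order p rather than with the number of taps L.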

The ESP algorithm combines the projection algorithm with the ES algorithm, which reflects only the variation characteristic of the echo path, and permits implementation of an echo canceler of higher convergence speed than does the projection algorithm. The p-order ESP algorithm can be expressed by the following equation:

h(k+1)=h(k)+.mu.[{AX(k)}].sup.+ e(k) =h(k)+.mu.AX(k)[X.sup.T (k)AX(k)].sup.-1 e(k) =h(k)+.mu.AX(k).beta.(k) =h(k)+.mu.A[.beta..sub.1 x(k)+.beta..sub.2 x(k-1)+ . . . +.beta..sub.P x(k-p+1)] (16)

where:

A=diag[.alpha..sub.1, .alpha..sub.2, . . . ,.alpha..sub.L ]: step size matrix

.alpha..sub.i =.alpha..sub.0.lambda..sup.i-1 (i=1,2, . . . ,L)

.lambda.: attenuation rate of impulse response variation (0<.lambda.<1)

.mu.: second step size (scalar quantity)

In the above, .beta.(k) is the solution of the following simultaneous linear equation with p unknowns:

[X.sup.T (k)AX(k)].beta.(k)=e(k) (17)

To avoid instability in the inverse matrix operation, a small positive constant .delta. may be used as follows:

[X.sup.T (k)AX(k)+.delta.I].beta.(k)=e(k) (17)'

where I is a unit matrix.

When the estimated echo path 18 is formed by a digital FIR filter, its filter coefficient vector ĥ.sub.11 (k) is a direct simulation of the impulse response h.sub.11 (k) of the room echo path 15. Accordingly, the value of adjustment of the filter coefficient that is required according to variations of the room echo path 15 is equal to the variation in its impulse response h.sub.11 (k). Then, the step size matrix A, which represents the step size in the filter coefficient adjustment, is weighted using the time-varying characteristic of the impulse response. The impulse response variation in a room sound field is usually expressed as an exponential function using the attenuation rate .lambda.. As depicted in FIG. 2A, the diagonal elements .alpha..sub.i (where i=1,2, . . . ,L) of the step size matrix A attenuate exponentially from .alpha..sub.0 as i increases and gradually approach zero with the same gradient as that of the exponential attenuation characteristic of the impulse response. This algorithm utilizes the acoustic finding that when the impulse response of a room echo path varies as a person or object moves, its variation (a difference in the impulse response) exponentially attenuates with the same attenuation rate as that of the impulse response. By adjusting initial coefficients of the impulse response with large variations in large steps and the subsequent coefficients with small variations in small steps, it is possible to offer an echo canceler of fast convergence.

In the case of constructing the echo canceler with plural DSP (Digital Signal Processor) chips, the exponential decay curve of the step size .alpha..sub.i is approximated stepwise and the step size .alpha..sub.i is set in discrete steps with a fixed value for each chip as shown in FIG. 2B. This permits implementation of the ESP algorithm with the computational load and storage capacity held about the same as in the case of the conventional projection algorithm. The ESP algorithm is described in detail in S. Makino and Y. Kaneda, "Exponentially weighted step-size projection algorithm for acoustic echo cancellers", Trans. IEICE Japan, vol. E75-A, No. 11, pp. 1500-1508, November, 1992.
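The exponentially decaying step sizes of FIG. 2A and their stepwise approximation of FIG. 2B can be sketched as follows. The sketch assumes the exponential form .alpha..sub.i =.alpha..sub.0.lambda..sup.i-1 implied by the description above. The function names are hypothetical, and the choice of the block mean as the fixed value for each chip is an assumption made for illustration, since the text does not state how that value is chosen.

```python
def exponential_step_sizes(alpha0, lam, num_taps):
    """Diagonal of the step size matrix A: alpha_i = alpha0 * lam**(i-1)
    for i = 1..L (list index 0 holds alpha_1)."""
    return [alpha0 * lam ** i for i in range(num_taps)]


def stepwise_step_sizes(alpha, taps_per_chip):
    """Approximate the decay curve by one constant per block of taps.

    Each block stands for the taps handled by one DSP chip. Using the
    block mean as the constant is an assumption of this sketch.
    """
    out = []
    for start in range(0, len(alpha), taps_per_chip):
        block = alpha[start:start + taps_per_chip]
        out.extend([sum(block) / len(block)] * len(block))
    return out
```

For example, `stepwise_step_sizes(exponential_step_sizes(1.0, 0.5, 4), 2)` replaces the four values 1.0, 0.5, 0.25, 0.125 by two constants, one for each pair of taps, so that each chip needs to store only a single step size.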

In the case of adjusting the estimated echo path vector h(k) by the conventional NLMS algorithm based on Eq. (5), it is adjusted in the direction of the input signal vector x(k). On the other hand, according to the ESP algorithm based on Eqs. (9) and (16), the bracketed sum in the second term on the right side of the fourth expression of Eqs. (9) and (16) is written as follows:

v(k)=.beta..sub.1 x(k)+.beta..sub.2 x(k-1)+ . . . +.beta..sub.P x(k-p+1) (18)

and the estimated echo path vector h(k) is adjusted in the direction of the vector v(k), that is, in the direction in which the correlation (auto-correlation) with all of the previous combined input signal vectors x(k-1), . . . , x(k-p+1) has been removed from the current combined vector x(k) of input signals. In other words, the coefficients .beta..sub.1 to .beta..sub.P are determined so that components similar to the previous input signal vectors are removed as much as possible from the current adjusted input signal vector v(k). In consequence, the input signal is whitened in the time domain.
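As a worked illustration of this decorrelation (an added sketch, not taken from the patent text), take p=2 and .mu.=1, so that the error vector of Eq. (11) reduces to [e(k), 0].sup.T, and assume that x(k-1) is not the zero vector. Under these assumptions the two equations of Eq. (15) can be solved by hand:

```latex
\begin{aligned}
\mathbf{x}^T(k)\mathbf{x}(k)\,\beta_1 + \mathbf{x}^T(k)\mathbf{x}(k-1)\,\beta_2 &= e(k),\\
\mathbf{x}^T(k-1)\mathbf{x}(k)\,\beta_1 + \mathbf{x}^T(k-1)\mathbf{x}(k-1)\,\beta_2 &= 0
\;\Rightarrow\;
\beta_2 = -\beta_1\,\frac{\mathbf{x}^T(k-1)\mathbf{x}(k)}{\|\mathbf{x}(k-1)\|^2},\\
\mathbf{v}(k) = \beta_1\mathbf{x}(k)+\beta_2\mathbf{x}(k-1)
&= \beta_1\left[\mathbf{x}(k)-\frac{\mathbf{x}^T(k-1)\mathbf{x}(k)}{\|\mathbf{x}(k-1)\|^2}\,\mathbf{x}(k-1)\right],\\
\mathbf{x}^T(k-1)\,\mathbf{v}(k) &= 0 .
\end{aligned}
```

In this special case v(k) is x(k) with its projection onto x(k-1) subtracted, so the updating direction is orthogonal to the previous input vector.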

As described above, the conventional projection algorithm whitens the monaural input signal in the time domain by removing the auto-correlation component of the input signal so as to provide increased convergence speed of the echo path estimation. The aforementioned Makino et al. literature shows the results of computer simulations of convergence of ERLE (Echo-Return-Loss-Enhancement) by the ESP algorithm and by the NLMS algorithm in the case where the received signal was a male voice. According to these results, the time for the ERLE to reach 20 dB is about 1 sec in the case of the NLMS algorithm and 0.2 sec or less in the case of the ESP algorithm, and the time for substantial convergence of the ERLE is approximately in the range of 1 to 3 sec at the longest with either algorithm. This is considered to indicate the whitening effect of the input signal.

On the other hand, there is known a subband scheme that increases the convergence speed of the echo path estimation by whitening the monaural input signal in the frequency domain. This scheme divides the input signal into plural subbands, then sequentially adjusts in each subband the filter coefficient of the estimated echo path 18 based on variations of the echo path 15 by the NLMS algorithm or the like, and combines and outputs residuals in the respective subbands. This is disclosed in, for instance, U.S. Pat. No. 5,272,695; S. Gay and R. Mammone, "Fast converging subband acoustic echo cancellation using RAP on the WE® DSP16A", Proc. ICASSP90, pp. 1141-1144, April 1990; and Makino et al., "Subband Echo Canceller with an Exponentially Weighted Stepsize NLMS Adaptive Filter", Trans. IEICE Japan, Vol. J79-A, No. 6, pp. 1138-1146, June 1996. This subband scheme involves flattening, or what is called whitening, of signals in the frequency domain, which increases the convergence speed in the estimation of the filter coefficient of the estimated echo path at the time of variations of the echo path. The scheme has been used in the echo path estimation for a one-channel input signal. Because the gain in convergence speed is attributable to the whitening of the signal in each subband, it has nothing to do with the number of channels of the input signal. That is, in a teleconferencing system using plural loudspeakers and plural microphones, the application of the subband scheme to each of the multichannel input signals would produce the same whitening effect as described above. However, it has not been considered that the subband scheme could be expected to produce any further effects.

Echo Cancellation for Teleconferencing System

In general, a teleconferencing system of the type having an I (.gtoreq.2) channel loudspeaker system and a J (.gtoreq.1) channel microphone system employs, for echo cancellation, such a configuration as shown in FIG. 3. That is to say, an echo cancellation system 23 is composed of I-channel echo cancellers 22.sub.1, 22.sub.2, . . . , 22.sub.J for processing I-input-one-output time sequence signals, which are each interposed between all of I channels of the receiving (loudspeaker) side and one channel of the sending (microphone) side. In this instance, the echo cancellation system has a total of I.times.J echo paths 15.sub.ij (1.ltoreq.i.ltoreq.I, 1.ltoreq.j.ltoreq.J). The I-channel echo cancellers 22.sub.1, 22.sub.2, . . . , 22.sub.J, which are each connected between all of the I channels of the receiving side and one channel of the sending side, have such a configuration as shown in FIG. 4, which is an extended version of the configuration of the echo canceller 14 depicted in FIG. 1. This is described in detail, for example, in T. Fujii and S. Shimada, "Multichannel Adaptive Digital Filter," Trans. IEICE Japan, '86/10, Vol. J69-A, No. 10.

Now, consider the I-channel echo canceller 22.sub.j connected to the j-th channel (1.ltoreq.j.ltoreq.J) of the sending side. The echo signal that is picked up by the j-th channel microphone 16.sub.j is obtained by adding together, at the sending side, the respective received signals of all channels after propagation over the respective echo paths 15.sub.1j to 15.sub.Ij. Hence, it is necessary to devise a way of estimating the echo paths by evaluating only the one residual echo e.sub.j (k) common to all the receiving side channels. In the first place, for the received signal of each channel, the following received signal vectors are generated in the received signal storage and vector generating parts (17.sub.1, 17.sub.2, . . . , 17.sub.I):

x.sub.1 (k)=[x.sub.1 (k), x.sub.1 (k-1), . . . , x.sub.1 (k-L.sub.1 +1)].sup.T (19)

x.sub.2 (k)=[x.sub.2 (k), x.sub.2 (k-1), . . . , x.sub.2 (k-L.sub.2 +1)].sup.T (20)

x.sub.I (k)=[x.sub.I (k), x.sub.I (k-1), . . . , x.sub.I (k-L.sub.I +1)].sup.T (21)

where L.sub.1, L.sub.2, . . . , L.sub.I are the numbers of taps, which are constants preset corresponding to reverberation times of the respective echo paths 15.sub.1j, 15.sub.2j, . . . , 15.sub.Ij. The vectors thus generated are combined in a vector combining part 24 as follows:

x(k)=[x.sub.1.sup.T (k), x.sub.2.sup.T (k), . . . , x.sub.I.sup.T (k)].sup.T (22)

Also in the echo path estimating part 19.sub.j, estimated echo path vectors h.sub.1j (k), h.sub.2j (k), . . . , h.sub.Ij (k), which are used to simulate I echo paths between the respective receiving side channels and the j-th sending side channel, are combined as follows:

h.sub.j (k)=[h.sub.1j.sup.T (k), h.sub.2j.sup.T (k), . . . , h.sub.Ij.sup.T (k)].sup.T (23)

In the case of using the NLMS algorithm, the updating of the combined estimated echo path vector h.sub.j (k) is done as follows:

h.sub.j (k+1)=h.sub.j (k)+.mu.e.sub.j (k)x(k)/{x.sup.T (k)x(k)} (24)

In the estimated echo generating part 18.sub.j, an estimated echo ŷ.sub.j (k) for the echo y.sub.j (k) picked up in the j-th sending channel is generated by the following inner product calculation:

ŷ.sub.j (k)=h.sub.j.sup.T (k)x(k) (25)

By combining vectors in the respective channels into one vector, the flow of basic processing becomes the same as in the one-channel echo canceller of FIG. 1.
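The combined-vector processing of Eqs. (22) to (25) can be sketched as follows. This is an illustrative sketch only: the function names `combine`, `dot` and `nlms_step` are hypothetical, and the small constant `eps` guarding the division is an addition of the sketch that does not appear in Eq. (24).

```python
def combine(vectors):
    """Concatenate per-channel vectors into one combined vector,
    as in Eq. (22) for x(k) and Eq. (23) for h_j(k)."""
    return [value for vec in vectors for value in vec]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def nlms_step(h, x, y, mu=0.5, eps=1e-12):
    """One NLMS iteration on the combined vectors.

    h : combined estimated echo path vector h_j(k)
    x : combined received signal vector x(k)
    y : echo sample y_j(k) picked up in the j-th sending channel
    Returns (h_j(k+1), e_j(k)).
    """
    y_hat = dot(h, x)          # echo replica, Eq. (25)
    e = y - y_hat              # residual echo e_j(k)
    norm = dot(x, x) + eps     # eps is an added guard, not part of Eq. (24)
    h_next = [hi + mu * e * xi / norm for hi, xi in zip(h, x)]  # Eq. (24)
    return h_next, e
```

Because the per-channel vectors are simply concatenated, a single residual e.sub.j (k) drives the adaptation of all I estimated echo paths at once, and the update code is identical to that of a one-channel canceller with L.sub.1 +L.sub.2 + . . . +L.sub.I taps.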

Of the defects of the conventional echo cancellation system for application to the teleconferencing system composed of an I-channel speaker system and a J-channel microphone system, the defect that the present invention is to solve will be described in connection with a concrete example.

In the case of applying the conventional echo cancellation system to the stereo teleconferencing system which sends and receives signals between the points A and B over two channels as shown in FIG. 5, there is presented a problem that each time a speaker at the point A moves or changes to another, an echo from the point B by the speech at the point A increases even if the echo paths 15.sub.11 and 15.sub.21 remain unchanged. The reason for this is that the echo path