WikiPatents - Community Patent Review
Microphone array apparatus    
United States Patent 6317501
Link to this page: http://www.wikipatents.com/6317501.html
Inventor(s): Matsuo; Naoshi (Kawasaki, JP)
Abstract: A microphone array apparatus includes a microphone array including microphones, one of the microphones being a reference microphone, filters receiving output signals of the microphones, and a filter coefficient calculator which receives the output signals of the microphones, a noise and a residual signal obtained by subtracting filtered output signals of the microphones other than the reference microphone from a filtered output signal of the reference microphone, and which obtains filter coefficients of the filters in accordance with an evaluation function based on the residual signal.



Inventor     Matsuo; Naoshi (Kawasaki, JP)
Owner/Assignee     Fujitsu Limited (Kawasaki, JP)
Publication Date     November 13, 2001
Application Number     09/039,777
Filing Date     March 16, 1998
US Classification     381/92 381/66 381/91 381/94.1 381/122
Int'l Classification     H04R 003/00 H04B 003/20
Examiner     Mei; Xu
Assistant Examiner    
Attorney/Law Firm     Rosenman & Colin, LLP
Address
Parent Case    
Priority Data     Jun 26, 1997 [JP] 9-170288
USPTO Field of Search     381/66 381/71.1 381/71.11 381/71.12 381/91 381/92 381/94.1 381/94.2 381/94.4 381/122 381/94.7 381/94.5 381/71.9 381/FOR 123 381/FOR 124 379/406 379/410 379/411 708/322
Patent Tags     microphone array
   

U.S. References

Patent No.    Inventor             U.S. Class     Date
6041127       Elko                 381/92         Mar. 2000
5853607       Zhao                 (not listed)   Dec. 1998
5796819       Romesburg            (not listed)   Aug. 1998
5754665       Hosoi                (not listed)   May 1998
5740256       Castello Da Costa    381/94.7       Apr. 1998
5561598       Nowak                700/55         Oct. 1996
5471538       Sasaki               381/92         Nov. 1995
5027393       Yamamura             379/406.06     Jun. 1991
4355368       Zeidler              708/422        Oct. 1982
Claims
 


What is claimed is:

1. A microphone array apparatus comprising:

a microphone array including microphones, one of the microphones being a reference microphone;

filters receiving output signals of the microphones; and

a filter coefficient calculator which receives the output signals of the microphones, a noise and a residual signal obtained by subtracting filtered output signals of those of the microphones other than the reference microphone from a filtered output signal of the reference microphone and which obtains filter coefficients of the filters in accordance with an evaluation function represented by power of the residual signal.

2. The microphone array apparatus as claimed in claim 1, further comprising:

delay units provided in front of the filters; and

a delay calculator which calculates amounts of delays of the delay units on the basis of a maximum value of a crosscorrelation function obtained by a sum-of-product operation on the output signals of the microphones.

3. The microphone array apparatus as claimed in claim 1, wherein the noise is a signal which drives a speaker.

4. The microphone array apparatus as claimed in claim 1, further comprising a supplementary microphone which outputs a signal from a noise source.

5. The microphone array apparatus as claimed in claim 1, wherein the filter coefficient calculator includes a cyclic type low-pass filter which applies a comparatively small weight to past values of an output of said cyclic type low-pass filter.
Description
 


BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to a microphone array apparatus which has an array of microphones in order to detect the position of a sound source, emphasize a target sound and suppress noise.

The microphone array apparatus has an array of a plurality of omnidirectional microphones and equivalently defines a directivity by emphasizing a target sound and suppressing noise. Further, the microphone array apparatus is capable of detecting the position of a sound source on the basis of a relationship among the phases of output signals of the microphones. Hence, the microphone array apparatus can be applied to a video conference system in which a video camera is automatically oriented towards a speaker and a speech signal and a video signal can concurrently be transmitted. In addition, the speech of the speaker can be clarified by suppressing ambient noise, and can be emphasized by adding the speech components in phase. Such a microphone array apparatus is required to operate stably.

If the microphone array apparatus is directed to suppressing noise, filters are connected to respective microphones and filter coefficients are adaptively or fixedly set so as to minimize noise components (see, for example, Japanese Laid-Open Patent Application No. 5-111090). If the microphone array apparatus is directed to detecting the position of a sound source, the relationship among the phases of the output signals of the microphones is detected, and the distance to the sound source is detected (see, for example, Japanese Laid-Open Patent Application Nos. 63-177087 and 4-236385).

An echo canceller is known as a device which utilizes the noise suppressing technique. For example, as shown in FIG. 1, a transmit/receive interface 202 of a telephone set is connected to a network 203. An echo canceller 201 is connected between a microphone 204 and a speaker 205. Speech of a speaker is input to the microphone 204. Speech of a speaker on the other (remote) side is reproduced through the speaker 205. Hence, two-way communication can take place.

Speech transferred from the speaker 205 to the microphone 204, as indicated by a dotted line in FIG. 1, forms an echo (noise) at the other-side telephone set. Hence, the echo canceller 201 is provided, which includes a subtracter 206, an echo component generator 207 and a coefficient calculator 208. Generally, the echo component generator 207 has a filter structure which produces an echo component from the signal which drives the speaker 205. The subtracter 206 subtracts the echo component from the signal from the microphone 204. The coefficient calculator 208 updates the filter coefficients of the echo component generator 207 so that the residual signal from the subtracter 206 is minimized.

The filter coefficients c1, c2, . . . , cr of the echo component generator 207 having the filter structure can be updated by the known steepest descent (maximum drop) method. For example, the following evaluation function J is defined based on an output signal e (the residual signal from which the echo component has been subtracted) of the subtracter 206:

J = e^2 (1)

According to the above evaluation function, the filter coefficients c1, c2, . . . , cr are updated as follows: ##EQU1##

where 0.0 < α < 0.5

f_norm = (f(1)^2 + f(2)^2 + . . . + f(r)^2)^(1/2) (3)

In the above expressions, a symbol "*" denotes multiplication, and "r" denotes the filter order. Further, f(1), . . . , f(r) respectively denote the values of a memory (delay unit) of the filter (in other words, the output signals of delay units each of which delays the respective input signal by one sample). A symbol "f_norm" is defined by equation (3), and a symbol "α" is a constant which governs the speed and precision of convergence of the filter coefficients towards the optimal values.
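Equation (2) is not reproduced in this text, so the following Python sketch is only an illustration of a normalized steepest-descent update of the kind described, not the patent's exact formula. It assumes the common NLMS form c_k ← c_k + α·e·f(k)/f_norm², where dividing by the square of f_norm is an assumption. The function name, the 3-tap echo path and the test signal are hypothetical.

```python
import random

def nlms_echo_canceller(far_end, mic, order=8, alpha=0.4, eps=1e-8):
    """Return (residual, coefficients) for an adaptive echo canceller.

    far_end: samples driving the loudspeaker; mic: microphone samples.
    The update rule is an assumed NLMS form, not equation (2) itself.
    """
    c = [0.0] * order   # filter coefficients c1..cr
    f = [0.0] * order   # filter memory f(1)..f(r), newest sample first
    residual = []
    for x, d in zip(far_end, mic):
        f = [x] + f[:-1]                              # shift the delay line
        echo = sum(ck * fk for ck, fk in zip(c, f))   # echo component generator
        e = d - echo                                  # subtracter output (residual)
        f_norm_sq = sum(fk * fk for fk in f) + eps    # square of f_norm from eq. (3)
        c = [ck + alpha * e * fk / f_norm_sq for ck, fk in zip(c, f)]
        residual.append(e)
    return residual, c

# Hypothetical test: a 3-tap echo path and no near-end speech.
rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
path = [0.5, -0.3, 0.1]
mic = [sum(path[k] * far[i - k] for k in range(len(path)) if i - k >= 0)
       for i in range(len(far))]
residual, coeffs = nlms_echo_canceller(far, mic)
```

With no near-end speech the residual decays towards zero and the leading coefficients approach the echo path, which is the situation the conventional canceller handles well.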

The echo canceller 201 requires a filter order as high as 100. Hence, another echo canceller, which uses a microphone array as shown in FIG. 2, is known. There are provided an echo canceller 211, a transmit/receive interface 212, microphones 214-1 to 214-n forming a microphone array, a speaker 215, a subtracter 216, filters 217-1 to 217-n, and a filter coefficient calculator 218.

In the structure shown in FIG. 2, acoustic components from the speaker 215 to the microphones 214-1-214-n are propagated along routes indicated by broken lines and serve as echoes. Hence, the speaker 215 is a noise source. The updating control of the filter coefficients c11, c12, . . . , c1r, . . . , cn1, cn2, . . . , cnr in the case where the speaker does not make any speech is expressed by using the evaluation function (1) as follows: ##EQU2##

where p=2, 3, . . . , n

The equation (4) relates to a case where one of the microphones 214-1 to 214-n, for example, the microphone 214-1, is defined as a reference microphone, and indicates the filter coefficients c11, c12, . . . , c1r of the filter 217-1 which receives the output signal of the reference microphone 214-1. The equation (5) relates to the microphones 214-2 to 214-n other than the reference microphone, and indicates the filter coefficients c21, c22, . . . , c2r, . . . , cn1, cn2, . . . , cnr. The subtracter 216 subtracts the output signals of the filters 217-2 to 217-n, which receive the signals of the microphones 214-2 to 214-n, from the output signal of the filter 217-1, which receives the signal of the reference microphone 214-1.

FIG. 3 is a block diagram for explaining a conventional process of detecting the position of a sound source and emphasizing a target sound. The structure shown in FIG. 3 includes a target sound emphasizing unit 221, a sound source position detecting unit 222, delay units 223 and 224, a number-of-delayed-samples calculator 225, an adder 226, a crosscorrelation coefficient calculator 227, a position detection processing unit 228 and microphones 229-1 and 229-2.

The target sound emphasizing unit 221 includes the delay units 223 and 224 of Z^-da and Z^-db, the number-of-delayed-samples calculator 225 and the adder 226. The sound source position detecting unit 222 includes the crosscorrelation coefficient calculator 227 and the position detection processing unit 228. The number-of-delayed-samples calculator 225 is controlled as follows. The crosscorrelation coefficient calculator 227 of the sound source position detecting unit 222 obtains a crosscorrelation coefficient r(i) of output signals a(j) and b(j) of the microphones 229-1 and 229-2. The position detection processing unit 228 obtains the sound source position by referring to the value of i, denoted imax, at which the crosscorrelation coefficient r(i) is maximized.

The crosscorrelation coefficient r(i) is expressed as follows:

r(i) = Σ_{j=1}^{n} a(j)*b(j+i) (6)

where Σ_{j=1}^{n} denotes a summation from j=1 to j=n, n is the number of samples for the convolutional operation, and i satisfies -m ≤ i ≤ m. The symbol "m" is a value dependent on the distance between the microphones 229-1 and 229-2 and the sampling frequency, and is written as follows:

m = [(sampling frequency)*(intermicrophone distance)]/(speed of sound) (7)

The number of delayed samples da of the Z.sup.-da delay unit 223 and the number of delayed samples db of the Z.sup.-db delay unit 224 can be obtained as follows from the value imax at which the maximum value of the crosscorrelation coefficient r(i) can be obtained:

where imax ≥ 0: da = imax, db = 0

where imax < 0: da = 0, db = -imax.

Hence, the phases of the target sound from the sound source are made to coincide with each other and are added by the adder 226. Hence, the target sound can be emphasized.
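The conventional procedure of FIG. 3 can be sketched in Python as follows: evaluate equation (6) over the lag range given by equation (7), pick the lag that maximizes it, turn that lag into the delays da and db, and add the aligned signals. The function names, the zero-padding at the start of a delayed signal, the rounding-up of m and the example signals are assumptions made for illustration.

```python
import math

def crosscorrelation(a, b, i):
    """r(i) = sum over j of a(j) * b(j + i), skipping indices outside b."""
    return sum(a[j] * b[j + i] for j in range(len(a)) if 0 <= j + i < len(b))

def max_lag(sampling_hz, mic_distance_m, speed_of_sound=340.0):
    """m from equation (7), rounded up to a whole number of samples (assumed)."""
    return math.ceil(sampling_hz * mic_distance_m / speed_of_sound)

def estimate_delays(a, b, m):
    """Return (imax, da, db) from the lag that maximizes r(i) for -m <= i <= m."""
    imax = max(range(-m, m + 1), key=lambda i: crosscorrelation(a, b, i))
    if imax >= 0:
        return imax, imax, 0
    return imax, 0, -imax

def delay(signal, d):
    """Delay a signal by d samples, padding the start with zeros."""
    return [0.0] * d + signal[:len(signal) - d]

def delay_and_sum(a, b, da, db):
    """Adder 226: add the two signals after the delay units Z^-da and Z^-db."""
    return [x + y for x, y in zip(delay(a, da), delay(b, db))]

# Hypothetical example: microphone b hears the same sound 3 samples after a.
a = [0.0, 1.0, 3.0, -2.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
b = delay(a, 3)
m = max_lag(8000, 0.15)
imax, da, db = estimate_delays(a, b, m)
summed = delay_and_sum(a, b, da, db)
```

In this example the peak lies at imax = 3, so a is delayed by three samples and the sum is twice the signal seen at b.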

However, the above-mentioned conventional microphone array apparatus has the following disadvantages.

In the conventional structure directed to suppressing noise, when the speaker who is the target sound source does not speak, the echo components from the speaker 215 to the microphone array can be canceled by the echo canceller. However, when speech of the speaker and the reproduced sound from the speaker 215 are concurrently input to the microphone array, the updating of the filter coefficients for canceling the echo components (noise components) does not converge. That is, the residual signal e in the equations (4) and (5) corresponds to the sum of the components which cannot be suppressed by the subtracter 216 and the speech of the speaker. Hence, if the filter coefficients are updated so that the residual signal e is minimized, the speech of the speaker, which is the target sound, is suppressed along with the echo components (noise). Hence, the noise cannot be suppressed without also suppressing the target sound.

In the conventional structure directed to detecting the sound source position and emphasizing the target sound, the output signals a(j) and b(j) of the microphones 229-1 and 229-2 shown in FIG. 3 generally have an autocorrelation in the vicinity of the sampled values. If the sound source is white noise or pulse noise, the autocorrelation is small, while the autocorrelation for voice is large. The crosscorrelation function r(i) defined in the equation (6) varies less as a function of i for a signal having comparatively large autocorrelation than for a signal having comparatively small autocorrelation. Hence, it is very difficult to obtain the correct maximum value and to detect the position of the sound source precisely and rapidly.

In the conventional structure directed to emphasizing the target sound so that the phases of the target sounds are synchronized, the degree of emphasis depends on the number of microphones forming the microphone array. If there is a small crosscorrelation between the target sound and noise, the use of N microphones emphasizes the target sound so that the power ratio is as large as N times. If there is a large crosscorrelation between the target sound and noise, the power ratio is small. Hence, in order to emphasize a target sound which has a large crosscorrelation to the noise, it is required to use a large number of microphones. This leads to an increase in the size of the microphone array. It is also very difficult to identify, in a noisy environment, the position of the sound source by utilizing the crosscorrelation coefficient value of the equation (6).
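The N-fold power ratio mentioned above can be checked with a short derivation. This is a sketch under idealized assumptions that are not stated in the source: after phase alignment each microphone p observes z_p(j) = s(j) + n_p(j), and the noises n_p are zero-mean, mutually uncorrelated, uncorrelated with the target s, and of equal power σ².

```latex
% Synchronous addition of N aligned microphone signals (idealized assumptions).
\begin{align*}
\sum_{p=1}^{N} z_p(j) &= N\,s(j) + \sum_{p=1}^{N} n_p(j) \\
P_{\text{signal}} &= N^{2}\,\mathrm{E}\!\left[s^{2}\right] \\
P_{\text{noise}} &= N\,\sigma^{2} \\
\frac{P_{\text{signal}}}{P_{\text{noise}}} &= N \cdot \frac{\mathrm{E}\!\left[s^{2}\right]}{\sigma^{2}}
\end{align*}
```

The target adds coherently (amplitude N, power N²) while uncorrelated noise adds only in power (N), so the ratio improves by a factor of N. When the noise is correlated with the target, the cross terms no longer vanish and the gain is smaller.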

SUMMARY OF THE INVENTION

It is a general object of the present invention to provide a microphone array apparatus in which the above disadvantages are eliminated.

A more specific object of the present invention is to provide a microphone array apparatus capable of stably and precisely suppressing noise, emphasizing a target sound and identifying the position of a sound source.

The above objects of the present invention are achieved by a microphone array apparatus comprising: a microphone array including microphones (which correspond to parts indicated by reference numbers 1-1 to 1-n in the following description), one of the microphones being a reference microphone (1-1); filters (2-1 to 2-n) receiving output signals of the microphones; and a filter coefficient calculator (4) which receives the output signals of the microphones, a noise and a residual signal obtained by subtracting filtered output signals of the microphones other than the reference microphone from a filtered output signal of the reference microphone, and which obtains filter coefficients of the filters in accordance with an evaluation function based on the residual signal. With this structure, even when speech of a speaker corresponding to the sound source and the noise are concurrently applied to the microphones, the crosscorrelation function value is reduced so that the noise can be effectively suppressed and the filter coefficients can continuously be updated.

The above microphone array apparatus may be configured so that it further comprises: delay units (8-1-8-n) provided in front of the filters; and a delay calculator (9) which calculates amounts of delays of the delay units on the basis of a maximum value of a crosscorrelation function of the output signals of the microphones and the noise. Hence, the filter coefficients can easily be updated.

The microphone array apparatus may be configured so that the noise is a signal which drives a speaker. This structure is suitable for a system that has a speaker in addition to the microphones. A reproduced sound from the speaker may serve as noise. By handling the speaker as a noise source, the signal driving the speaker can be handled as the noise, and thus the filter coefficients can easily be updated.

The microphone array apparatus may further comprise a supplementary microphone (21) which outputs the noise. This structure is suitable for a system which has microphones but does not have a speaker. The output signal of the supplementary microphone can be used as the noise.

The microphone array apparatus may be configured so that the filter coefficient calculator includes a cyclic type low-pass filter (FIG. 10) which applies a comparatively small weight to memory values of a filter portion which executes a convolutional operation in an updating process of the filter coefficients.

The above objects of the present invention are also achieved by a microphone array apparatus comprising: a microphone array including microphones (51-1, 51-2); linear predictive filters (52-1, 52-2) receiving output signals of the microphones; linear predictive analysis units (53-1, 53-2) which receive the output signals of the microphones and update filter coefficients of the linear predictive filters in accordance with a linear predictive analysis; and a sound source position detector (54) which obtains a crosscorrelation coefficient value based on linear predictive residuals of the linear predictive filters and outputs information concerning the position of a sound source based on a value which maximizes the crosscorrelation coefficient. Hence, even when speech of a speaker corresponding to the sound source and the noise are concurrently applied to the microphones, autocorrelation function values of samples of the speech signal are reduced by the linear predictive analysis, so that the position of the target sound source can accurately be detected. Thus, speech from the target sound source can be emphasized and noise components other than the target sound can be suppressed.

The microphone array apparatus may be configured so that: a target sound source is a speaker; and the linear predictive analysis unit updates the filter coefficients of the linear predictive filters by using a signal which drives the speaker. Hence, the linear predictive analysis unit can be shared by the linear predictive filters corresponding to the microphones.

The above-mentioned objects of the present invention are also achieved by a microphone array apparatus comprising: a microphone array including microphones (61-1, 61-2); a signal estimator (62) which estimates positions of estimated microphones in accordance with intervals at which the microphones are arranged by using the output signals of the microphones and a velocity of sound and which outputs output signals of the estimated microphones together with the output signals of the microphones forming the microphone array; and a synchronous adder (63) which brings the output signals of the microphones and the estimated microphones into phase and then adds the output signals. Hence, even if a small number of microphones is used to form an array, the target sound can be emphasized and the position of the target sound source can precisely be detected as if a large number of microphones were used.

The microphone array apparatus may further comprise a reference microphone (71) located on an imaginary line connecting the microphones forming the microphone array and arranged at the intervals at which the microphones forming the microphone array are arranged, wherein the signal estimator corrects the estimated positions of the estimated microphones and the output signals thereof on the basis of the output signals of the microphones forming the microphone array.

The microphone array apparatus may further comprise an estimation coefficient decision unit (74) which weights an error signal, corresponding to a difference between the output signal of the reference microphone and the output signals of the signal estimator, in accordance with an acoustic sense characteristic so that the signal estimator performs a signal estimating operation with a comparatively high precision on a band having a comparatively high acoustic sensitivity.

The microphone array apparatus may be configured so that: given angles are defined which indicate directions of a sound source with respect to the microphones forming the microphone array; the signal estimator includes parts which are respectively provided to the given angles; the synchronous adder includes parts which are respectively provided to the given angles; and the microphone array apparatus further comprises a sound source position detector which outputs information concerning the position of a sound source based on a maximum value among the output signals of the parts of the synchronous adder.

The above objects of the present invention are also achieved by a microphone array apparatus comprising: a microphone array including microphones (91-1, 91-2); a sound source position detector (92) which detects a position of a sound source on the basis of output signals of the microphones; a camera (90) generating an image of the sound source; a second detector (93) which detects the position of the sound source on the basis of the image from the camera; and a joint decision processing unit (94) which outputs information indicating the position of the sound source on the basis of the information from the sound source position detector and the information from the second detector. Hence, the position of the target sound source can be rapidly and precisely detected.

BRIEF DESCRIPTION OF THE DRAWINGS

Other objects, features and advantages of the present invention will become more apparent from the following detailed description when read in conjunction with the accompanying drawings, in which:

FIG. 1 is a block diagram of a conventional echo canceller;

FIG. 2 is a diagram of a conventional echo canceller using a microphone array;

FIG. 3 is a block diagram of a structure directed to detecting the position of a sound source and emphasizing the target sound;

FIG. 4 is a block diagram of a first embodiment of the present invention;

FIG. 5 is a block diagram of a filter which can be used in the first embodiment of the present invention;

FIG. 6 is a block diagram of a second embodiment of the present invention;

FIG. 7 is a flowchart of an operation of a delay calculator used in the second embodiment of the present invention;

FIG. 8 is a block diagram of a third embodiment of the present invention;

FIG. 9 is a block diagram of a fourth embodiment of the present invention;

FIG. 10 is a block diagram of a low-pass filter used in a filter coefficient updating process executed in the embodiments of the present invention;

FIG. 11 is a block diagram of a structure using a digital signal processor (DSP);

FIG. 12 is a block diagram of an internal structure of the DSP shown in FIG. 11;

FIG. 13 is a block diagram of a delay unit;

FIG. 14 is a block diagram of a fifth embodiment of the present invention;

FIG. 15 is a block diagram of a detailed structure of the fifth embodiment of the present invention;

FIG. 16 is a diagram showing a relationship between the sound source position and imax;

FIG. 17 is a block diagram of a sixth embodiment of the present invention;

FIG. 18 is a block diagram of a seventh embodiment of the present invention;

FIG. 19 is a block diagram of a detailed structure of the seventh embodiment of the present invention;

FIG. 20 is a block diagram of an eighth embodiment of the present invention;

FIG. 21 is a block diagram of a ninth embodiment of the present invention; and

FIG. 22 is a block diagram of a tenth embodiment of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

A description will now be given, with reference to FIG. 4, of a microphone array apparatus according to a first embodiment of the present invention. The apparatus shown in FIG. 4 is made up of n microphones 1-1 to 1-n forming a microphone array, filters 2-1 to 2-n, an adder 3, a filter coefficient calculator 4, a speaker (target sound source) 5, and a speaker (noise source) 6. The speech of the speaker 5 is input to the microphones 1-1 to 1-n, which convert the received acoustic signals into electric signals, which pass through the filters 2-1 to 2-n and are then applied to the adder 3. The output signal of the adder 3 is then transmitted to a remote terminal via a network or the like. A speech signal from the remote side is applied to the speaker 6, which is thus driven to reproduce the original speech. Hence, the speaker 5 communicates with the other-side speaker. The reproduced speech is input to the microphones 1-1 to 1-n, and thus functions as noise to the speech of the speaker 5. Hence, the speaker 6 is a noise source with respect to the target sound source.

The filter coefficient calculator 4 is supplied with the output signals of the microphones 1-1 to 1-n, a noise (the input signal for driving the speaker 6 serving as the noise source), and the output signal (residual signal) of the adder 3, and thus updates the coefficients of the filters 2-1 to 2-n. In this case, the microphone 1-1 is handled as a reference microphone. The adder 3 subtracts the output signals of the filters 2-2 to 2-n from the output signal of the filter 2-1.

Each of the filters 2-1 to 2-n can be configured as shown in FIG. 5. Each filter includes Z^-1 delay units 11-1 to 11-(r-1), coefficient units 12-1 to 12-r for multiplication by filter coefficients cp1, cp2, . . . , cpr, and adders 13 and 14. A symbol "r" denotes the order of the filter.
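As a minimal sketch of the FIG. 5 structure, the following Python class models a tapped delay line with r coefficient units and r - 1 unit delays. The class name, method name and example coefficients are hypothetical; the memory list holds the current input followed by the delay-unit outputs.

```python
class FirFilter:
    """Order-r FIR filter: r coefficient units fed by r - 1 unit delays."""

    def __init__(self, coefficients):
        self.c = list(coefficients)          # cp1..cpr
        self.memory = [0.0] * len(self.c)    # current input, then delay outputs

    def step(self, sample):
        """Shift one sample in and return the weighted sum of the taps."""
        self.memory = [sample] + self.memory[:-1]
        return sum(ck * fk for ck, fk in zip(self.c, self.memory))

# Feeding a unit impulse reads the coefficients back out one per sample.
filt = FirFilter([0.5, 0.25, 0.25])
impulse_response = [filt.step(x) for x in [1.0, 0.0, 0.0, 0.0]]
```

The impulse response equals the coefficient list followed by zeros, which is a quick way to confirm that the delay line shifts in the intended direction.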

When the signal from the noise source (speaker 6) is denoted as xp(i) and the signal from the target sound source (speaker 5) is denoted as yp(i) (where i denotes the sample number and p is equal to 1, 2, . . . , n), the values fp(i) of the memories of the filters 2-1-2-n (the input signals to the filters and the output signals of the delay units 11-1-11-r-1) are defined as follows:

fp(i)=xp(i)+yp(i) (8)

The output signal e of the adder in the echo canceller using the conventional microphone array is as follows: ##EQU3##

where f1(1), f1(2), . . . , f1(r), . . . , fn(1), fn(2), . . . , fn(r) denote the values of the memories of the filters. The adder subtracts the output signals of the filters other than the reference filter from the output signal of the reference filter.

In contrast, the present invention controls the signals xp(i) in phase and performs the convolutional operation. The output signal e' of the adder thus obtained is as follows: ##EQU4##

where (p) in x(1)(p), . . . , x(q)(p) denotes signals from the noise source obtained when the microphones 1-1-1-n are in phase, and the symbol "q" denotes the number of samples on which the convolutional operation is executed.

When the signals xp(i) from the noise source and the signals yp(i) of the target sound source are concurrently input, that is, when the speaker 5 speaks at the same time as the speaker 6 outputs a reproduced speech, there is a small crosscorrelation therebetween because the coexisting speeches are uttered by different speakers. Hence, the equation (11) can be rewritten as follows: ##EQU5##

It can be seen from the above equation (12) that the influence of the signals yp(i) from the target sound source on [fp(1)', . . . , fp(r)'] is reduced. The signal e' in the equation (10) is obtained by using the equation (12), and then an evaluation function J = (e')^2 is calculated based on the obtained signal e'. Then, based on the evaluation function J = (e')^2, the filter coefficients of the filters 2-1 to 2-n are updated. That is, even in the state in which speech from the speaker (target sound source) 5 and sound from the speaker (noise source) 6 are concurrently applied to the microphones 1-1 to 1-n, the noise contained in the output signals of the microphones 1-1 to 1-n has a large crosscorrelation to the input signal which is applied to the filter coefficient calculator 4 and used to drive the speaker 6, while having a small crosscorrelation to the target sound source 5. Hence, the filter coefficients can be updated in accordance with the evaluation function J = (e')^2, and the output signal of the adder 3 is the speech signal of the speaker 5 in which the noise is suppressed.
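Equations (9) to (12) are not reproduced in this text, so the following Python sketch does not implement them. It only illustrates the statistical premise the argument rests on: when a microphone signal f = x + y is correlated against the noise reference x, the contribution of an uncorrelated target y is small compared with that of the noise itself. The Gaussian test signals, the seed and the names are assumptions.

```python
import random

def correlate(u, v):
    """Sum of products of two equally long sample sequences."""
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(1)
q = 4000                                       # number of samples correlated
x = [rng.gauss(0.0, 1.0) for _ in range(q)]    # noise reference (drives speaker 6)
y = [rng.gauss(0.0, 1.0) for _ in range(q)]    # independent target speech stand-in
f = [xi + yi for xi, yi in zip(x, y)]          # filter memory values, equation (8)

noise_term = correlate(x, x)      # grows in proportion to q
target_term = correlate(y, x)     # grows only like the square root of q
mixture_term = correlate(f, x)    # equals noise_term + target_term
```

Because the target term stays small relative to the noise term, an update driven by the correlated quantity is dominated by the noise even while the target speaker is talking.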

FIG. 6 is a block diagram of a microphone array apparatus according to a second embodiment of the present invention, in which parts that are the same as those shown in the previously described figures are given the same reference numbers. The structure shown in FIG. 6 includes delay units 8-1 to 8-n (Z^-d1 to Z^-dn), and a delay calculator 9.

The updating of the filter coefficients according to the second embodiment of the present invention is based on the following. The delay calculator 9 calculates the number of delayed samples in each of the delay units 8-1 to 8-n so that the output signals of the microphones 1-1 to 1-n are brought into phase. Further, the filter coefficient calculator 4 calculates the filter coefficients of the filters 2-1 to 2-n. The delay calculator 9 is supplied with the output signals of the microphones 1-1 to 1-n, and the input signal (noise) for driving the speaker 6. The filter coefficient calculator 4 is supplied with the output signals of the delay units 8-1 to 8-n, the output signal of the adder 3 and the input signal (noise) for driving the speaker 6.

When the output signals of the microphones 1-1-1-n are denoted as gp(j), where p=1, 2, . . . , n and j is the sample number, a crosscorrelation function Rp(i) to the signals x(j) from the noise source is as follows:

Rp(i)=.SIGMA..sup.s.sub.j=1 gp(j+i)*x(j) (13)

where .SIGMA..sup.s.sub.j=1 denotes a summation from j=1 to j=s, and s denotes the number of samples on which the convolutional operation is executed. The number s of samples may be equal to tens to hundreds of samples. When a symbol "D" denotes the maximum delayed sample corresponding to the distances between the noise source and the microphones, the term "i" in the equation (13) is such that i=0, 1, 2, . . . , D.

For example, when the maximum distance between the noise source and the furthest microphone is equal to 50 cm, and the sampling frequency is equal to 8 kHz, the speed of sound is approximately equal to 340 m/s, and thus the maximum number D of delayed samples is as follows:

D=(sampling frequency)*(maximum distance between the noise source and microphone)/(speed of sound)=8000*(50/34000)=11.76, which is approximately 12.

Hence, the symbol "i" is equal to 0, 1, 2, . . . , 12. When the maximum distance between the noise source and the microphone is equal to 1 m, the maximum number D of delayed samples is equal to 24.
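The arithmetic above can be checked with a short sketch. The function name `max_delay_samples` and the choice to round up to a whole sample are assumptions made for illustration; the text itself only states that 11.76 is taken as approximately 12.

```python
import math

SPEED_OF_SOUND_M_PER_S = 340.0  # approximate value used in the text


def max_delay_samples(sampling_hz, max_distance_m, speed=SPEED_OF_SOUND_M_PER_S):
    """Return D, the largest delay in samples between the noise source and a microphone.

    Rounding up to a whole sample is an assumption of this sketch.
    """
    return math.ceil(sampling_hz * max_distance_m / speed)


d_half_metre = max_delay_samples(8000, 0.5)  # 50 cm at 8 kHz
d_one_metre = max_delay_samples(8000, 1.0)   # 1 m at 8 kHz
print(d_half_metre, d_one_metre)
```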

The value ip (p=1, 2, . . . , n) is obtained, which is the value of i at which the absolute value of the crosscorrelation function value Rp(i) obtained by equation (13) is maximized. Further, the maximum value imax of the ip is obtained. The above process is comprised of steps (A1)-(A11) shown in FIG. 7. The term imax is set to an initial value (equal to, for example, 0) and the variable p is set equal to 1, at step A1. At step A2, the term Rpmax is set to an initial value (equal to, for example, 0.0), and the term ip is set to an initial value (equal to, for example, 0). Further, at step A2, the variable i is set equal to 0. At step A3, the crosscorrelation function value Rp(i) defined by the equation (13) is obtained.

At step A4, it is determined whether the crosscorrelation function value Rp(i) is greater than the term Rpmax. If the answer is YES, the Rp(i) obtained at that time is set to Rpmax, and the corresponding value of i is set to ip, at step A5. If the answer is NO, the variable i is incremented by 1 (i=i+1) at step A6. At step A7, it is determined whether i.ltoreq.D. If the value i is equal to or smaller than the maximum number D of delayed samples, the process returns to step A3. If the value i exceeds the maximum number D of delayed samples, the process proceeds with step A8. At step A8, it is determined whether the value ip is greater than the value imax. If the answer is YES, the value ip obtained at that time is set to imax at step A9. If the answer is NO, the variable p is incremented by 1 (p=p+1) at step A10. At step A11, it is determined whether p.ltoreq.n. If the answer of step A11 is YES, the process returns to step A2. If the answer is NO, the retrieval of the crosscorrelation function value Rp(i) ends, so that the maximum value imax of the ip within the range of i.ltoreq.D is obtained.

The number dp of delayed samples of the delay unit can be obtained as follows by using the terms ip and imax obtained by the above maximum value detection:

dp=imax-ip (14)

Hence, the numbers d1-dn of delayed samples of the delay units 8-1-8-n can be set by the delay calculator 9.
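The delay calculation described above (equation (13), the maximum value search of FIG. 7 and equation (14)) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the 0-based indexing and the list-based data layout are assumptions, and the absolute value of Rp(i) is compared, as stated in the description of ip.

```python
def crosscorrelation(g, x, lag, s):
    """Equation (13), 0-based: R(lag) = sum over j of g(j + lag) * x(j), using s samples."""
    return sum(g[j + lag] * x[j] for j in range(s))


def delay_samples(mic_signals, x, max_lag, s):
    """Return (ip_list, imax, dp_list) for the delay units.

    mic_signals: one list of samples per microphone, each of length >= s + max_lag.
    x: samples of the noise-source signal, length >= s.
    """
    ip_list = []
    for g in mic_signals:
        best_lag, best_value = 0, 0.0
        for lag in range(max_lag + 1):             # i = 0, 1, ..., D
            value = abs(crosscorrelation(g, x, lag, s))
            if value > best_value:                 # steps A4 and A5
                best_value, best_lag = value, lag
        ip_list.append(best_lag)
    imax = max(ip_list)                            # steps A8 and A9
    dp_list = [imax - ip for ip in ip_list]        # equation (14)
    return ip_list, imax, dp_list


# Hypothetical test signal: an impulse that reaches microphone 1 after
# 3 samples and microphone 2 after 5 samples.
x = [1.0] + [0.0] * 9
g1 = [0.0] * 3 + [1.0] + [0.0] * 12
g2 = [0.0] * 5 + [1.0] + [0.0] * 10
ip_list, imax, dp_list = delay_samples([g1, g2], x, max_lag=6, s=10)
print(ip_list, imax, dp_list)
```

In this example the nearer microphone receives the larger delay, so that both channels line up with the channel that hears the noise source last.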

The filters 2-1-2-n can be configured as shown in FIG. 5. The output signals of the filters 2-1-2-n are denoted as outp (p=1, 2, . . . , n) and are defined as follows:

outp=.SIGMA..sup.r.sub.i=1 cpi*fp(i) (15)

where .SIGMA..sup.r.sub.i=1 denotes a summation from i=1 to i=r, cpi denotes the filter coefficients, and fp(i) denotes the values of the memories of the filters, which are also the input signals applied to the filters.

The filter coefficient calculator 4 calculates the crosscorrelation between the present and past input signals of the filters 2-1-2-n and the signals from the noise source, and thus updates the filter coefficients. The crosscorrelation function value fp(i)' is written as follows:

fp(i)'=.SIGMA..sup.q.sub.j=1 x(j)*fp(i+j-1) (16)

where .SIGMA..sup.q.sub.j=1 denotes a summation from j=1 to j=q, and the symbol q denotes the number of samples on which the convolutional operation is carried out in order to calculate the crosscorrelation function value and is normally equal to tens to hundreds of samples.

By using the above crosscorrelation function value fp(i)', the output signal e' of the adder 3 is obtained as follows:

e'=.SIGMA..sup.r.sub.j=1 [f1(j)'*c1j]-.SIGMA..sup.n.sub.i=2 .SIGMA..sup.r.sub.j=1 [fi(j)'*cij] (17)

The above operation is the convolutional operation and can be thus implemented by a digital signal processor (DSP). In this case, the adder 3 subtracts the output signals of the microphones 1-2-1-n obtained via the filters 2-2-2-n from the output signal of the reference microphone 1-1 obtained via the filter 2-1.
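The two sum-of-product operations above can be sketched as follows. The function names and the 0-based list layout are assumptions for illustration, the index of the filter memory in equation (16) is read as i+j-1 (an assumption that agrees with the product x(1)*fp(i) used for the low-pass filter of FIG. 10), and the residual is formed as the reference channel minus all other channels, as the adder 3 is described.

```python
def crosscorrelation_values(fp, x, r, q):
    """Equation (16), 0-based: fp'(i) = sum over j of x(j) * fp(i + j), for i = 0..r-1.

    fp must hold at least r + q - 1 samples and x at least q samples.
    """
    return [sum(x[j] * fp[i + j] for j in range(q)) for i in range(r)]


def residual(f_prime, c):
    """Residual e': the reference channel (index 0) minus every other channel.

    f_prime[p][j] and c[p][j] are the crosscorrelation values and the
    filter coefficients of channel p.
    """
    channel_sums = [
        sum(f * k for f, k in zip(f_row, c_row))
        for f_row, c_row in zip(f_prime, c)
    ]
    return channel_sums[0] - sum(channel_sums[1:])


# Hypothetical values, chosen only to make the arithmetic easy to follow.
values = crosscorrelation_values([1.0, 2.0, 3.0, 4.0], [1.0, 1.0], r=3, q=2)
e_prime = residual([[1.0, 2.0], [0.5, 1.0]], [[1.0, 0.0], [1.0, 0.0]])
print(values, e_prime)
```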

The evaluation function is defined so that J=(e').sup.2 where the output signal e' of the adder 3 is handled as an error signal. By using the evaluation function J=(e').sup.2, the filter coefficients are obtained. For example, the filter coefficients can be obtained by the steepest descent method. The filter coefficients c11, c12, . . . , c1r, . . . , cn1, cn2, . . . , cnr can be obtained by using the following expressions: ##EQU6##

where the norm fp.sub.norm corresponds to the aforementioned formula (3) and can be written as follows:

fp.sub.norm =[(fp(1)').sup.2 +(fp(2)').sup.2 +. . . +(fp(r)').sup.2 ].sup.1/2 (20)

The term .alpha. in the equations (18) and (19) is a constant as has been described previously, and determines the speed and precision of convergence of the filter coefficients towards the optimal values.
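Equations (18) and (19) are not reproduced in this text, so the following is only a sketch of what one normalized steepest-descent step on J=(e').sup.2 could look like. It is derived from the definition of e' as the reference-channel sum minus the sums of the other channels and from the norm of equation (20). The function name, the data layout and the zero-norm guard are assumptions, and the update may differ in detail from the patent's own expressions.

```python
import math


def update_coefficients(c, f_prime, e, alpha):
    """One normalized steepest-descent step on J = e**2 (assumed form, not equations (18)/(19)).

    c[p][j] and f_prime[p][j] belong to channel p; channel 0 is the reference.
    Returns a new coefficient table and leaves the input unchanged.
    """
    updated = []
    for p, (c_row, f_row) in enumerate(zip(c, f_prime)):
        norm = math.sqrt(sum(f * f for f in f_row))  # equation (20)
        if norm == 0.0:
            updated.append(list(c_row))              # no information in this channel
            continue
        # dJ/dc is +2*e*f' for the reference channel and -2*e*f' for the
        # other channels, so the descent step has the opposite sign.
        sign = -1.0 if p == 0 else 1.0
        updated.append([k + sign * alpha * e * f / norm for k, f in zip(c_row, f_row)])
    return updated
```

With a small positive alpha, one such step scales e' by the factor 1 - alpha*(f1.sub.norm + . . . + fn.sub.norm), so the error shrinks as long as that factor stays between -1 and 1.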

Hence, the output signal e' of the adder 3 is obtained as follows:

e'=out1-.SIGMA..sup.n.sub.i=2 outi (21)

The delay units 8-1-8-n change the phases of the input signals applied to the filters 2-1-2-n so that the signals are aligned. Hence, the filter coefficients can easily be updated by the filter coefficient calculator 4. Even in a situation in which the speaker 5 speaks at the same time as a sound is emitted from the speaker 6, the updating of the filter coefficients can be realized. Hence, it is possible to reliably suppress the noise components that enter the microphones 1-1-1-n from the speaker 6 which serves as a noise source.

FIG. 8 is a block diagram of a third embodiment of the present invention, in which parts that are the same as those shown in FIG. 4 are given the same reference numbers. In FIG. 8, there are a noise source 16 and a supplementary microphone 21. The supplementary microphone 21 can have the same structure as that of the microphones 1-1-1-n forming the microphone array.

The structure shown in FIG. 8 differs from that shown in FIG. 4 in that the output signal of the supplementary microphone 21 can be input to the filter coefficient calculator 4 as a signal from the noise source. Hence, even in a case where the noise source 16 is an arbitrary noise source other than the speaker, such as an air conditioning system, the noise can be suppressed by using the evaluation function J=(e').sup.2 used to update the filter coefficients, as has been described with reference to FIG. 4.

FIG. 9 is a block diagram of a fourth embodiment of the present invention, in which parts that are the same as those shown in FIGS. 6 and 7 are given the same reference numbers. The structure shown in FIG. 9 is almost the same as that shown in FIG. 6 except that the output signal of the supplementary microphone 21 is applied, as the signal from a noise source, to the delay calculator 9 and the filter coefficient calculator 4. Hence, as in the case of the structure shown in FIG. 6, the numbers of delayed samples of the delay units 8-1-8-n are controlled by the delay calculator 9, and the filter coefficients of the filters 2-1-2-n are updated by the filter coefficient calculator 4. Hence, noise can be suppressed.

FIG. 10 is a block diagram of a low-pass filter used in the filter coefficient updating process used in the embodiments of the present invention. The low-pass filter shown in FIG. 10 includes coefficient units 22 and 23, an adder 24 and a delay unit 25. The structure shown in FIG. 10 is directed to calculating the aforementioned crosscorrelation function value fp(i)' in which the coefficient unit 23 has a filter coefficient .beta. and the coefficient unit 22 has a filter coefficient (1-.beta.). The value fp(i)' is obtained as follows:

fp(i)'=.beta.*fp(i)'.sub.old +(1-.beta.)*[x(1)*fp(i)] (22)

where the coefficient .beta. is set so as to satisfy 0.0<.beta.<1.0 and fp(i)'.sub.old denotes the value of a memory (delay unit 25) of the low-pass filter.

The low-pass filter shown in FIG. 10 is a cyclic type low-pass filter, in which weighting for the past signals is made comparatively light in order to prevent the convolutional operation from outputting an excessive output value and thus stably obtain the crosscorrelation function value fp(i)'.
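The recursion of equation (22) can be sketched as a one-line update. The function name and the sample values are assumptions for illustration; only the blend of the stored value with the newest product x(1)*fp(i) comes from the text.

```python
def smooth_crosscorrelation(previous, x1, fp_i, beta):
    """Equation (22): blend the stored value with the newest product x(1) * fp(i)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must satisfy 0.0 < beta < 1.0")
    return beta * previous + (1.0 - beta) * (x1 * fp_i)


# Feeding the same product three times moves the estimate towards it
# without jumping there at once.
estimate = 0.0
for x1, fp_i in [(1.0, 2.0), (1.0, 2.0), (1.0, 2.0)]:
    estimate = smooth_crosscorrelation(estimate, x1, fp_i, beta=0.5)
print(estimate)
```

A larger beta keeps more of the stored value and therefore reacts more slowly to new products, while a smaller beta tracks the newest product more closely.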

FIG. 11 is a block diagram of a structure directed to implementing the embodiments of the present invention by using a digital signal processor (DSP). Referring to FIG. 11, there are provided the microphones 1-1-1-n forming a microphone array, a DSP 30, low-pass filters (LPF) 31-1-31-n, analog-to-digital (A/D) converters 32-1-32-n, a digital-to-analog (D/A) converter 33, a low-pass filter (LPF) 34, an amplifier 35 and a speaker 36.

The aforementioned filters 2-1-2-n and the filter coefficient calculator 4 used in the structure shown in FIG. 4 and the filters 2-1-2-n, the filter coefficient calculator 4 and the delay units 8-1-8-n used in the structure shown in FIG. 6 can be realized by the combinations of a repetitive process, a sum-of-product operation and a condition branching process. Hence, the above processes can be implemented by operating functions of the DSP 30.

The low-pass filters 31-1-31-n function to eliminate signal components located outside the speech band. The A/D converters 32-1-32-n convert the output signals of the microphones 1-1-1-n obtained via the low-pass filters 31-1-31-n into digital signals and have a sampling frequency of, for example, 8 kHz. The digital signals have the number of bits which corresponds to the number of bits processed in the DSP 30. For example, the digital signals consist of 8 bits or 16 bits.

An input signal obtained via a network or the like is converted into an analog signal by the D/A converter 33. The analog signal thus obtained passes through the low-pass filter 34, and is then applied to the amplifier 35. An amplified signal drives the speaker 36. The reproduced sound emitted from the speaker 36 serves as noise with respect to the microphones 1-1-1-n. However, as has been described previously, the noise can be suppressed by updating the filter coefficients by the DSP 30.

FIG. 12 is a block diagram showing functions of the DSP that can be used in the embodiments of the present invention. In FIG. 12, parts that are the same as those shown in the previously described figures are given the same reference numbers. In FIG. 12, the low-pass filters 31-1-31-n and 34, the A/D converters 32-1-32-n, the D/A converter 33 and the amplifier 35 shown in FIG. 11 are omitted. The filter coefficient calculator 4 includes a crosscorrelation calculator 41 and a filter coefficient updating unit 42. The delay calculator 9 includes a crosscorrelation calculator 43, a maximum value detector 44 and a number-of-delayed-samples calculator 45.

The crosscorrelation calculator 43 of the delay calculator 9 receives the output signals gp(j) of the microphones 1-1-1-n and the drive signal for the speaker 36 (which functions as a noise source), and calculates the crosscorrelation function value Rp(i) defined in formula (13). The maximum value detector 44 detects the maximum value of the crosscorrelation function value Rp(i) in accordance with the flowchart of FIG. 7. The number-of-delayed-samples calculator 45 obtains the numbers dp of delayed samples of the delay units 8-1-8-n by using the ip and imax obtained during the maximum value detecting process. The numbers of delayed samples thus obtained are then set in the delay units 8-