A method and apparatus for processing audio signals utilizing reverberation in combination with directional cues to capture both the temporal and spatial dimensions of a three-dimensional natural reverberant environment. Reverberant streams are generated and directionalized to simulate a selected model environment utilizing pinna cues and other directional cues to simulate reflected sound from various spatial regions of the model environment.
A reverberation circuit comprises an amplifier, an attenuation circuit, a feedback component, and a delay circuit. The amplifier, delay circuit, and feedback component form a feedback loop. The amplifier amplifies an input signal supplied to an input terminal and outputs the amplified signal to the delay circuit. The output of the delay circuit is fed through the feedback component back to the amplifier. The feedback component controllably varies the feedback amount of the delayed signal to the amplifier in accordance with the desired depth of reverberation. The attenuation circuit is placed in the feedback loop, or before or after the feedback loop. It selectively attenuates the level of a specific frequency component of a signal passing through it; the frequency of that component may differ depending on the depth of reverberation or the feedback amount. The attenuation circuit is adjusted in interlocked relation with the feedback component such that when the feedback amount is increased, the level of the specific frequency component in the output signal of the reverberation circuit is decreased.
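The feedback loop described above can be sketched digitally as a delay line with a gain in the return path. This is only an illustration under stated assumptions, not the circuit of the abstract: the attenuated component is assumed to be the high frequencies, the attenuator is assumed to sit inside the loop as a one-pole low-pass, and the interlock rule in `damping_for_feedback` is hypothetical. All function and parameter names are invented for the example.

```python
def damping_for_feedback(feedback):
    # Hypothetical interlock rule: more feedback means stronger attenuation.
    return max(0.0, min(0.9, feedback))


def reverb(signal, delay, feedback, tail, input_gain=1.0):
    """Feedback delay loop with an interlocked attenuator in the loop.

    signal     -- list of input samples
    delay      -- loop delay in samples (delay circuit)
    feedback   -- feedback amount (depth of reverberation), 0 <= feedback < 1
    tail       -- extra output samples rendered after the input ends
    input_gain -- gain of the input amplifier
    """
    damping = damping_for_feedback(feedback)
    delay_line = [0.0] * delay
    lowpass_state = 0.0
    index = 0
    out = []
    for n in range(len(signal) + tail):
        x = input_gain * signal[n] if n < len(signal) else 0.0
        delayed = delay_line[index]
        # Attenuation (assumed): one-pole low-pass on the delayed signal.
        lowpass_state = (1.0 - damping) * delayed + damping * lowpass_state
        y = x + feedback * lowpass_state
        delay_line[index] = y          # write back into the loop
        index = (index + 1) % delay
        out.append(y)
    return out


if __name__ == "__main__":
    # Impulse response with a 4-sample loop delay and moderate feedback.
    print(reverb([1.0], delay=4, feedback=0.5, tail=8))
```

Raising `feedback` lengthens the decay and, through the assumed interlock, also increases the high-frequency damping applied on every pass around the loop.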
Spatialization of soundfields is accomplished by filtering audio signals using filters having unvarying frequency response characteristics and amplifying signals using amplifier gains adapted in response to signals representing sound source location and/or listener position. The filters are derived using a singular value decomposition process which finds the best set of component impulse responses to approximate a given target set of impulse responses corresponding to head related transfer functions. Efficient implementations for rendering reflection effects, air absorption losses and other ambient effects, and for spatializing multiple sound sources and/or generating multiple output signals are disclosed.
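The rendering structure described above (fixed filters, position-dependent gains) can be sketched as a weighted sum of filtered copies of the input. This is a minimal illustration under assumptions: the singular value decomposition step that would derive the component impulse responses is not shown, and `component_filters` and `gains` are placeholder values chosen for the example, not data from the abstract.

```python
def convolve(x, h):
    """Direct-form convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def spatialize(signal, component_filters, gains):
    """Sum of gains[k] * (signal convolved with component_filters[k]).

    The filters stay fixed; only the gains would change as the source
    or listener moves.
    """
    length = len(signal) + max(len(h) for h in component_filters) - 1
    out = [0.0] * length
    for h, g in zip(component_filters, gains):
        for n, v in enumerate(convolve(signal, h)):
            out[n] += g * v
    return out


if __name__ == "__main__":
    filters = [[1.0, 0.0], [0.0, 1.0]]   # placeholder component responses
    gains_for_position = [0.5, 0.25]     # placeholder gains for one position
    print(spatialize([1.0, 0.0], filters, gains_for_position))
```

Because only the scalar gains depend on position, moving a source in this sketch means updating a few multipliers rather than swapping whole filters.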
The illusion of distinct sound sources distributed throughout the three-dimensional space containing the listener is possible using only conventional stereo playback equipment by processing monaural sound signals prior to playback on two spaced-apart transducers. A plurality of such processed signals corresponding to different sound source positions may be mixed using conventional techniques without disturbing the positions of the individual images. Although two loudspeakers are required, the sound produced is not conventional stereo; however, each channel of a left/right stereo signal can be separately processed according to the invention and then combined for playback. The sound processing involves dividing each monaural or single-channel signal into two signals and then adjusting the differential phase and amplitude of the two channel signals on a frequency-dependent basis in accordance with an empirically derived transfer function that has a specific phase and amplitude adjustment for each predetermined frequency interval over the audio spectrum. Each transfer function is empirically derived to relate to a different sound source location, and by providing a number of different transfer functions and selecting among them accordingly, the sound source can be made to appear to move.
A sound field switcher receives a sound field signal output by a sound field producer. The input sound field signal is delayed sequentially by a plurality of delayers, the delayed signals are multiplied by coefficients in a plurality of multipliers, and the products are all summed by an adder. Coefficients stored in the area of a coefficient memory indicated by a coefficient counter are read out and set in the multipliers. The sound field switcher controls both the timing at which the sound field producer switches the sound field signal and the timing at which the coefficient counter starts counting. Assuming the time required for switching the sound field is N sampling periods, when switching from a sound field A to a sound field B, a sound field switching controller orders the coefficient counter to count up one by one from 0 to N, and on each count the coefficients stored in the corresponding area of the coefficient memory are set in the multipliers. When the count value of the coefficient counter reaches N/2, or the value nearest to N/2, the sound field switching controller orders the sound field producer to output the sound field signal of sound field B. When the count value reaches N, the switching from sound field A to sound field B is complete.
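The counting scheme described above can be sketched as a tapped delay line whose coefficients are reloaded on every count, with the source changing from A to B at the midpoint. The contents of the coefficient memory are not given in the abstract; the V-shaped fade in `make_coefficient_memory` is an assumption for illustration, as are all names and the choice of three taps.

```python
def make_coefficient_memory(n_steps, n_taps):
    # Hypothetical contents: the first tap fades from 1 down to 0 at the
    # midpoint and back up to 1; the remaining taps are left at zero.
    half = n_steps / 2
    memory = []
    for count in range(n_steps + 1):
        gain = abs(count - half) / half
        memory.append([gain] + [0.0] * (n_taps - 1))
    return memory


def switch_sound_field(field_a, field_b, n_steps, n_taps=3):
    """Switch from field_a to field_b over n_steps sampling periods.

    Both inputs need at least n_steps + 1 samples; n_steps must be >= 2.
    """
    memory = make_coefficient_memory(n_steps, n_taps)
    switch_count = n_steps // 2            # N/2, or nearest for odd N
    delay_line = [0.0] * n_taps
    out = []
    for count in range(n_steps + 1):       # coefficient counter: 0..N
        source = field_a if count < switch_count else field_b
        delay_line = [source[count]] + delay_line[:-1]
        coeffs = memory[count]             # area indicated by the counter
        out.append(sum(c * d for c, d in zip(coeffs, delay_line)))
    return out


if __name__ == "__main__":
    print(switch_sound_field([1.0] * 5, [2.0] * 5, n_steps=4))
```

With this assumed table, the output level passes through zero at the count where the producer changes fields, so the discontinuity between A and B is not heard at full level.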
To produce a three-dimensional sound, an original sound recording is duplicated, and the original and its duplicate are then recorded with a time delay in the range of 25 ms to 990 ms, preferably 100 to 500 ms. The resulting first echo recording is duplicated to produce a second echo recording, and these are recorded with another time delay, preferably less than the first but within the aforementioned range, to yield a fifth tape which has a so-called third echo, imparting to the sound a three-dimensionality that cannot be achieved by conventional reverberation and echo methods.
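The two-stage duplication above can be sketched on sampled audio, assuming that "recorded with a time delay" means summing a recording with a delayed copy of itself. That reading, the function names, and the sample-rate handling are assumptions for illustration only.

```python
def ms_to_samples(ms, sample_rate):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(ms * sample_rate / 1000)


def mix_with_delay(recording, delay_samples):
    """Sum a recording with a copy of itself delayed by delay_samples."""
    out = list(recording) + [0.0] * delay_samples
    for n, v in enumerate(recording):
        out[n + delay_samples] += v
    return out


def two_stage_echo(recording, first_ms, second_ms, sample_rate):
    """Apply the first delay, then apply the second delay to the result."""
    for ms in (first_ms, second_ms):
        if not 25 <= ms <= 990:
            raise ValueError("delays must lie in the 25 ms to 990 ms range")
    first_echo = mix_with_delay(recording, ms_to_samples(first_ms, sample_rate))
    return mix_with_delay(first_echo, ms_to_samples(second_ms, sample_rate))


if __name__ == "__main__":
    # A single impulse at 1 kHz sample rate, so 1 ms equals 1 sample.
    result = two_stage_echo([1.0], first_ms=300, second_ms=100, sample_rate=1000)
    print([n for n, v in enumerate(result) if v != 0.0])
```

In this sketch an impulse comes out as the original plus three delayed copies, located at the second delay, the first delay, and the sum of the two.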