Claims
What is claimed is:
1. A low bit rate encoder adapted for compression-encoding digital audio
signals of a plurality of channels, the low bit rate encoder comprising:
energy detecting means for detecting energies of the digital audio signals
of each respective channel;
bit allocation amount determining means for determining bit allocation
amounts for the respective channels on the basis of the detected energies;
compression-encoding means for compression-encoding the digital audio
signals on the basis of the bit allocation amounts allocated to the
respective channels; and
multiplexing means for multiplexing the compression-encoded digital audio
signals of the respective channels,
the bit allocation amount determining means operative to determine bit
allocation amounts so that the relationship between the energy and the bit
allocation amount of the digital audio signals represents a non-linear
characteristic based upon human hearing sense such that, as energy of the
digital audio signals increases, the bit allocation amounts, as a whole,
increase, wherein the non-linear characteristic of the bit allocation
amount determining means is approximated by a substantially S-shaped
curve, beginning at a low bit allocation amount for a first energy level,
increasing to a higher bit allocation amount for a second energy level
higher than the first energy level, and decreasing to a bit allocation
lower than the higher bit allocation amount for a third energy level
higher than the second energy level.
2. A low bit rate encoder adapted for compression-encoding digital audio
signals of a plurality of channels, the low bit rate encoder comprising:
energy detecting means for detecting energies of the digital audio signals
of each respective channel;
bit allocation amount determining means for determining bit allocation
amounts for the respective channels on the basis of the detected energies;
compression-encoding means for compression-encoding the digital audio
signals on the basis of the bit allocation amounts allocated to the
respective channels; and
multiplexing means for multiplexing the compression-encoded digital audio
signals of the respective channels,
the bit allocation amount determining means operative to determine bit
allocation amounts so that the relationship between the energy and the bit
allocation amount of the digital audio signals represents a non-linear
characteristic based upon human hearing sense such that, as energy of the
digital audio signals increases, the bit allocation amounts, as a whole,
increase, wherein the non-linear characteristic of the bit allocation
amount determining means is such that, when energy of the digital audio
signal is sufficiently large, the bit allocation amount decreases.
3. A low bit rate encoding method of compression-encoding digital audio
signals of a plurality of channels, the method comprising the steps of:
detecting energies of the digital audio signals of each respective channel;
determining bit allocation amounts for the respective channels on the basis
of the detected energies;
compression-encoding the digital audio signals on the basis of the bit
allocation amounts allocated to the respective channels; and
multiplexing the compression-encoded digital audio signals of the
respective channels,
wherein, in the step of determining bit allocation amounts, the
relationship between energy and bit allocation amount of the digital audio
signals represents a non-linear characteristic based upon human hearing
sense such that, as energy of the digital audio signals increases, the bit
allocation amounts, as a whole, increase, wherein the non-linear
characteristic is approximated by a substantially S-shaped curve,
beginning at a low bit allocation amount for a first energy level,
increasing to a higher bit allocation amount for a second energy level
higher than the first energy level, and decreasing to a bit allocation
lower than the higher bit allocation amount for a third energy level
higher than the second energy level.
4. A low bit rate encoding method of compression-encoding digital audio
signals of a plurality of channels, the method comprising the steps of:
detecting energies of the digital audio signals of each respective channel;
determining bit allocation amounts for the respective channels on the basis
of the detected energies;
compression-encoding the digital audio signals on the basis of the bit
allocation amounts allocated to the respective channels; and
multiplexing the compression-encoded digital audio signals of the
respective channels,
wherein, in the step of determining bit allocation amounts, the
relationship between energy and bit allocation amount of the digital audio
signals represents a non-linear characteristic based upon human hearing
sense such that, as energy of the digital audio signals increases, the bit
allocation amounts, as a whole, increase, wherein the non-linear
characteristic is such that, when energy of the digital audio signal is
sufficiently large, the bit allocation amount decreases.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a low bit rate encoder and a low bit rate
encoding method for compression-encoding audio signals of multi-channel
system, a low bit rate decoder and a low bit rate decoding method for
decoding compression-coded signals, and recording media on which signals
encoded by such encoder/encoding method are recorded, which are used for
cinema film projection systems or stereo or multi-sound acoustic systems
such as video tape recorder or video disc player, etc.
2. Description of the Related Art
Various efficient encoding techniques and devices for audio or speech
signals, etc. are known.
As an example of the efficient encoding technique, there is a blocking
frequency band division system, the so-called transform coding, in which,
e.g., an audio signal, etc. in the time domain is divided into blocks of a
predetermined unit time, the signal of each block is transformed into a
signal in the frequency domain (orthogonal transform), the result is
divided into signal components in a plurality of frequency bands, and
those signal components are encoded for every respective frequency band.
Moreover, there is sub-band coding (SBC), which is a non-blocking
frequency band division system in which an audio signal, etc. in the time
domain is divided into signal components in a plurality of frequency
bands, without blocking the signal every unit time, and the signal
components are thereafter encoded.
Further, there have been proposed efficient coding techniques and devices
in which the sub-band coding and the transform coding described above are
combined. In this case, e.g., an input signal is divided into signal
components in a plurality of frequency bands by the sub-band coding, the
signal of every respective frequency band is thereafter orthogonally
transformed into a signal in the frequency domain, and the orthogonally
transformed signal components in the frequency domain are encoded.
Here, as a filter for frequency band division of the above-described
sub-band coding, there is, e.g., a filter of QMF, etc. Such filter is
described in, e.g., the literature "Digital coding of speech in subbands"
R. E. Crochiere, Bell Syst. Tech. J., Vol. 55, No. 8, 1976. This filter of
QMF serves to halve the frequency band into bands of equal bandwidth. This
filter is characterized in that so called aliasing does not take place in
synthesizing the above-mentioned divided frequency bands at later
processing stage.
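As a minimal illustration of the halving property described above, the
following sketch splits a signal into two bands of equal bandwidth and
resynthesizes it. It assumes the shortest possible QMF pair (the two-tap
Haar filters) purely for brevity; this is not the filter of the cited
literature, and the function names are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # normalisation shared by both filters


def qmf_analysis(x):
    """Split an even-length signal into low and high half-bands,
    each decimated by two (two-tap Haar pair, assumed for brevity)."""
    low = [_S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [_S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high


def qmf_synthesis(low, high):
    """Recombine the two half-bands; the aliasing introduced by the
    decimation cancels between the two branches."""
    out = []
    for lo, hi in zip(low, high):
        out.append(_S * (lo + hi))
        out.append(_S * (lo - hi))
    return out


signal = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
low_band, high_band = qmf_analysis(signal)
rebuilt = qmf_synthesis(low_band, high_band)
```

Each band holds half as many samples as the input, and the resynthesized
signal agrees with the input up to floating-point rounding.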
Moreover, in the literature "Polyphase Quadrature filters-A new subband
coding technique", Joseph H. Rothweiler ICASSP 83, BOSTON, filter division
technique of equal bandwidth is described. This polyphase quadrature
filter is characterized in that division can be made at a time in dividing
a signal into signal components in a plurality of frequency bands of equal
bandwidth.
Further, as the above-described orthogonal transform processing, there is,
e.g., such an orthogonal transform system to divide an input audio signal
into blocks by a predetermined unit time (frame) to carry out Fast Fourier
Transform (FFT), Discrete Cosine Transform (DCT), or Modified DCT
Transform (MDCT), etc. for every respective blocks to thereby transform
signals in the time domain into those in the frequency domain.
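The block-by-block transform described above can be sketched as follows.
The sketch assumes an orthonormal DCT with non-overlapping blocks, which
is simpler than the overlapped MDCT; the block length and the function
names are hypothetical.

```python
import math


def dct_block(block):
    """Orthonormal DCT-II of one time block."""
    n = len(block)
    out = []
    for k in range(n):
        acc = sum(block[t] * math.cos(math.pi * (t + 0.5) * k / n)
                  for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * acc)
    return out


def idct_block(coeffs):
    """Inverse of dct_block (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for t in range(n):
        acc = coeffs[0] * math.sqrt(1.0 / n)
        acc += sum(coeffs[k] * math.sqrt(2.0 / n)
                   * math.cos(math.pi * (t + 0.5) * k / n)
                   for k in range(1, n))
        out.append(acc)
    return out


def transform_by_blocks(samples, block_len):
    """Divide the input into whole blocks and transform each block."""
    last_start = len(samples) - block_len
    return [dct_block(samples[i:i + block_len])
            for i in range(0, last_start + 1, block_len)]


spectra = transform_by_blocks([1.0] * 8, block_len=4)
```

For a constant input, all of the energy of each block falls into the
first (DC) coefficient, which is the kind of energy compaction that the
subsequent per-band coding relies on.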
This MDCT is described in the literature "Subband/Transform Coding Using
Filter Bank Designs Based on Time Domain Aliasing Cancellation", J. P.
Princen and A. B. Bradley, Univ. of Surrey Royal Melbourne Inst. of Tech.
ICASSP 1987.
Further, as frequency division width in the case of encoding (quantizing)
respective frequency components divided into frequency bands, there is
band division in which, e.g., hearing sense characteristic of the human
being is taken into consideration. Namely, there are instances where an
audio signal is divided into signal components in plural (e.g., 25) bands
by a bandwidth such that the bandwidth becomes broader according as
frequency shifts to higher frequency band side, which is generally called
critical band.
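As a hedged illustration of such a band division, the sketch below maps a
frequency to a band index. The first 25 edges are the commonly tabulated
critical band (Bark) edges; the final edge at 22,050 Hz is an assumption
added here only to obtain 25 bands for a 44.1 kHz sampling frequency, and
is not taken from this description.

```python
import bisect

# Commonly tabulated critical-band edges in Hz, plus one assumed top edge
# (22050 Hz) so that the table yields 25 bands.
BAND_EDGES_HZ = [
    0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
    2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000,
    15500, 22050,
]


def critical_band_index(freq_hz):
    """Return the index (0..24) of the band containing freq_hz."""
    if not 0 <= freq_hz < BAND_EDGES_HZ[-1]:
        raise ValueError("frequency outside the tabulated range")
    return bisect.bisect_right(BAND_EDGES_HZ, freq_hz) - 1


# Width of each band; it grows toward the high-frequency side.
bandwidths = [hi - lo for lo, hi in zip(BAND_EDGES_HZ, BAND_EDGES_HZ[1:])]
```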
In addition, in encoding the data of every respective band at this time,
coding by a predetermined bit allocation for every respective band or by
adaptive bit allocation for every respective band is carried out.
For example, in encoding coefficient data obtained after MDCT processing,
the MDCT coefficient data of every respective band of every respective
block are coded with an adaptively allocated number of bits.
As the bit allocation technique and device therefor, the following two
techniques and devices are known.
For example, in the literature "Adaptive Transform Coding of Speech
Signals", IEEE Transactions on Acoustics, Speech, and Signal Processing,
vol. ASSP-25, No. 4, August 1977, bit allocation is carried out on the
basis of magnitudes of signals of every respective band.
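Bit allocation on the basis of band magnitudes can be sketched with the
textbook log-energy rule below. This is an illustrative form only and is
not claimed to be the exact procedure of the cited literature; the
function name and the sample energies are hypothetical, and rounding and
clamping to non-negative integers are omitted.

```python
import math


def adaptive_bits(band_energies, mean_bits):
    """Give each band the mean word length plus half the base-2 log of
    its energy relative to the geometric mean of all band energies."""
    logs = [math.log2(e) for e in band_energies]
    mean_log = sum(logs) / len(logs)
    return [mean_bits + 0.5 * (lg - mean_log) for lg in logs]


bits = adaptive_bits([16.0, 4.0, 1.0, 0.25], mean_bits=4.0)
# bits == [5.5, 4.5, 3.5, 2.5]; the average stays at mean_bits
```

Bands with larger magnitudes receive more bits, while the average word
length over all bands is unchanged.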
Moreover, for example, in the literature "The critical band coder--digital
encoding of the perceptual requirements of the auditory system", M. A.
Krasner MIT, ICASSP 1980, there are described the technique and the
device in which necessary signal-to-noise ratios are obtained for every
frequency band by making use of the hearing sense masking to carry out
fixed bit allocation.
Meanwhile, in the efficient compression encoding system for audio signals
using subband coding, etc. as described above, such a system to compress
audio data by making use of the characteristic of the hearing sense of the
human being so that its data quantity becomes equal to about 1/5 has been
already put into practice.
It should be noted that there is a system called ATRAC (Adaptive TRansform
Acoustic Coding, a Trade Mark of SONY Corporation) used in, e.g., MD (Mini
Disc, a Trade Mark of SONY Corporation) as the efficient encoding system
of compressing audio data so that its data quantity becomes equal to about
1/5.
However, in the efficient coding system utilizing the characteristic of the
hearing sense of the human being, there are instances where a sound of a
musical instrument or a voice of a human being, etc., obtained by
compression-coding and thereafter decoding a speech signal, differs from
the original sound, although only to a slight degree. Particularly, in the
case where this efficient coding system utilizing the characteristic of
the hearing sense is used as a recording format for recording media for
which faithful reproduction of original sound is required, realization of
higher sound quality is required.
On the other hand, a format of such an efficient coding system (ATRAC
system, etc.) to compress an audio signal so that its signal (data)
quantity becomes equal to about 1/5 has been already put into practice,
and hardware employing such a format is being popularized.
Accordingly, implementation of change or expansion having no compatibility
of the format is disadvantageous not only to manufacturers (makers) which
have used the format but also to general users.
For this reason, it is expected that high sound quality be attained by
encoding or decoding device without changing the format itself.
As the method of realization of higher sound quality except for the above,
it is conceivable to mix linear PCM sound into ordinary compressed data.
However, since compressed data of the efficient coding system and linear
PCM data are different in length of frame and time length per each frame,
it is difficult to synchronize them at the time of reproduction.
Accordingly, it is very difficult to use these data of two formats at the
same time.
Further, not only in the case of ordinary audio equipment, but also in
cinema film projection system, high definition television, or stereo or
multi-sound acoustic system such as video tape recorder or video disc
player, etc., audio signals of 4 to 8 channels are being handled. It is
also expected that efficient coding to reduce the bit rate would apply to
such plural channel systems.
Particularly, in the cinema film, there are instances where digital audio
signals of 8 channels, namely of left channel, left center channel, center
channel, right center channel, right channel, surround left channel,
surround right channel and sub-woofer channel are recorded. In this case,
the above-mentioned efficient coding to reduce bit rate is required.
It is difficult to provide on cinema film an area capable of recording 8
channels of linearly quantized audio data of sampling frequency of 44.1
kHz and 16 bits as used in so called CD (Compact Disc), etc. Accordingly,
compression of the audio data is required.
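The data rate implied by these figures can be worked out as follows. The
factor of about 1/5 is the compression ratio mentioned earlier for the
efficient coding system; bits needed for error correction are not
included in this estimate.

```python
CHANNELS = 8
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16

# Rate of linearly quantized (uncompressed) audio for all channels.
linear_pcm_bps = CHANNELS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # 5_644_800

# Rate after compression to about 1/5 of the original data quantity.
compressed_bps = linear_pcm_bps // 5  # 1_128_960
```

That is, roughly 5.6 Mbit/s of linear PCM data would be reduced to
roughly 1.1 Mbit/s before error correcting code is added.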
It should be noted that the channels of the 8 channel data recorded on the
cinema film respectively correspond to left speaker, left center speaker,
center speaker, right center speaker, right speaker, surround left
speaker, surround right speaker, and sub-woofer speaker, which are
disposed on the side of the screen onto which pictures reproduced from the
picture recording areas of cinema film are projected by projector.
The center speaker is disposed in the center on the screen side, and serves
to output reproduced sound by audio data of center channel. This center
speaker outputs the most important reproduced sound, e.g., speech of
actor, etc.
The sub-woofer speaker serves to output reproduced sound by audio data of
sub-woofer channel. This sub-woofer speaker effectively outputs sound
which feels as vibration rather than sound in low frequency band, e.g.,
sound of explosion, and is frequently used effectively in scene of
explosion.
The left speaker and the right speaker are disposed on left and right sides
of the screen, and serve to output reproduced sound by audio data of left
channel and reproduced sound by audio data of right channel, respectively.
These left and right speakers exhibit stereo sound effect.
The left center speaker is disposed between the left speaker and the center
speaker, and the right center speaker is disposed between the center
speaker and the right speaker. The left center speaker outputs reproduced
sound by audio data of left center channel, and the right center speaker
outputs reproduced sound by audio data of right center channel. These left and
right center speakers perform auxiliary roles of the left and right
speakers, respectively.
Particularly, in movie-theater having large screen and large number of
persons to be admitted, etc., there is the drawback that localization of
sound image becomes unstable in dependency upon seat positions. However,
the above-mentioned left and right center speakers are added to thereby
exhibit effect in creating more realistic localization of sound image.
Further, the surround left and right speakers are disposed so as to
surround spectators' seats. These surround left and right speakers serve
to respectively output reproduced sound by audio data of surround left
channel and reproduced sound by audio data of surround right channel, and
have the effect to provide reverberation or impression surrounded by hand
clapping or shout of joy. Thus, it is possible to create sound image in
more three-dimensional manner.
In addition, since a defect is apt to take place on the surface of a
medium of cinema film, if digital data is recorded as is, missing data
occurs to a great degree. Such a recording system cannot be employed from
a practical point of view. For this reason, error correcting capability
is very important.
Accordingly, with respect to the data compression, it is necessary to carry
out compression processing to such a degree that recording can be made in
the recording area on the film by taking bits for the error correcting
code into consideration.
From facts as described above, as the compression method of compressing
digital audio data of 8 channels as described above, there is applied the
efficient coding system (e.g., the ATRAC system) to attain sound quality
comparable to CD by carrying out optimum bit allocation by taking into
consideration the characteristic of the hearing sense of the human being
as described above.
However, with this efficient coding system, sound of general musical
instrument or voice of the human being, etc. differs from the original
sound, similarly to the above, although only to a slight degree. For this
reason, in the case where such a system is employed in a recording format
for which reproduction having fidelity to original sound is required, some
means for realizing higher sound quality is required.
This problem always exists, even in the case where a system other than the
above-mentioned efficient coding system is used as the multi-channel
recording format in the cinema film, as long as an irreversible
compression system is employed in order to secure the recording area.
Moreover, in a system for implementing efficient coding to audio signals of
the multi-channel system as described above, data of respective channels
are independently caused to undergo compression processing.
For this reason, even if, e.g., a certain one channel is in unvoiced sound
state, fixed bit (byte) allocation amount is allocated to that channel.
Giving fixed bit allocation amount to the channel in unvoiced sound state
as stated above is redundant.
Moreover, since bit allocation amounts are the same also with respect to a
channel of signal of low level and a channel of signal of high level, if
bit allocation amounts are evaluated over respective channels, redundant
bits exist.
It is considered that particularly in the case where bit allocation amounts
are fixed every respective channels, redundancy as described above becomes
more conspicuous.
OBJECT AND SUMMARY OF THE INVENTION
With the above in view, an object of this invention is to provide an
encoder and an encoding method capable of eliminating redundancy of bit
allocation amount in compression-coding in the multi-channel system and of
realizing a higher quality of compression-coding, a decoder and a decoding
method corresponding thereto, and recording media on which
compression-coded signals are recorded.
To achieve the above-mentioned object, in accordance with this invention,
there is provided a low bit rate encoder for compression-encoding digital
audio signals of a plurality of channels by making use of both the
property of the audio signal and the hearing sense of the human being, the
encoder comprising: energy detecting means for detecting energies of the
digital audio signals for each of the digital audio signals of the
respective channels; bit allocation amount determining means for determining bit
allocation amounts to the respective channels on the basis of the detected
result; compression-coding means for compression-coding the digital audio
signals on the basis of bit allocation amounts allocated every respective
channels in accordance with the determined bit allocation amounts; and
multiplexing means for multiplexing the compression-coded signals every
respective channels. The bit allocation amount determining means is
operative to determine respective bit allocation amounts so that the
relationship between energy and bit allocation amount of digital audio
signal represents a non-linear characteristic such that according as
energy of the digital audio signal increases, bit allocation amount
increases as a whole. Further, variable bit allocation is carried out
between channels with respect to samples in the time region and samples in
frequency region of the audio signals of a plurality of channels.
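A non-linear characteristic of this kind can be sketched as a table with
linear interpolation between breakpoints. All breakpoint values below are
hypothetical and chosen only to give a roughly S-shaped curve with the
slight decrease at very large energy that the claims describe; the
description itself does not give these numbers, and the function name is
likewise an assumption.

```python
import bisect

# Hypothetical breakpoints: (energy in dB, estimated bits per time block).
CURVE = [(-90.0, 0.0), (-60.0, 40.0), (-30.0, 400.0),
         (-10.0, 480.0), (0.0, 440.0)]


def estimated_bits(energy_db):
    """Convert a channel energy into an estimated bit amount by
    piecewise-linear interpolation over CURVE."""
    xs = [x for x, _ in CURVE]
    if energy_db <= xs[0]:
        return CURVE[0][1]
    if energy_db >= xs[-1]:
        return CURVE[-1][1]
    i = bisect.bisect_right(xs, energy_db)
    (x0, y0), (x1, y1) = CURVE[i - 1], CURVE[i]
    return y0 + (y1 - y0) * (energy_db - x0) / (x1 - x0)
```

With these assumed values, the estimate rises slowly at low energy,
steeply in the middle range, flattens near the top, and falls back
slightly at the largest energy.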
In the low bit rate encoder of the first embodiment according to this
invention, the energy detecting means is an amplitude information
detecting means for detecting amplitude information of digital audio
signals of respective channels before undergoing compression-coding.
Further, the bit allocation amount determining means determines bit
allocation amounts to respective channels on the basis of a change with
time of the amplitude information.
In this case, the bit allocation amount determining means calculates
(determines), by a predetermined conversion formula, bit allocation
amounts with respect to peak values of amplitude information of respective
channels on the basis of the hearing sense characteristic, thus to
determine amounts of bits to be allocated to respective channels on the
basis of the conversion result.
Moreover, the bit allocation amount determining means respectively
determines estimated amounts of bit amounts to be allocated to respective
channels from the predetermined conversion formula to allocate bit
allocation amounts of respective channels in proportion to the respective
estimated amounts to thereby allow the total bit allocation quantity of
all channels to be fixed.
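The proportional allocation with a fixed total can be sketched as below.
The use of exact fractions and of a largest-remainder rule to hand out the
bits left over after rounding down is an assumption of this sketch, as are
the function name and the example figures.

```python
import math
from fractions import Fraction


def allocate_fixed_total(estimates, total_bits):
    """Share total_bits among channels in proportion to their estimated
    amounts so that the sum over all channels is exactly total_bits."""
    exact = [Fraction(e) for e in estimates]
    total_est = sum(exact)
    if total_est <= 0:
        # No channel asks for bits: fall back to an equal split.
        raw = [Fraction(total_bits, len(exact))] * len(exact)
    else:
        raw = [total_bits * e / total_est for e in exact]
    bits = [math.floor(r) for r in raw]
    leftover = total_bits - sum(bits)
    # Give the remaining bits to the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - bits[i],
                   reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits


allocation = allocate_fixed_total([220.0, 480.0, 0.0, 100.0], 1000)
# allocation == [275, 600, 0, 125]
```

A channel whose estimate is zero receives no bits, and the bits it would
have held under a fixed per-channel allocation go to the other channels.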
On the other hand, the low bit rate decoder of the first embodiment
according to this invention includes decoding means for decoding signals
of respective channels encoded by the low bit rate encoder of the first
embodiment.
Further, in the low bit rate encoder of the second embodiment according to
this invention, the energy detecting means is means for detecting change
with time of a predetermined scale factor (normalized value of
the two-dimensional areas of time and frequency (block floating units))
with respect to signals of the respective channels, and the bit allocation
amount determining means serves to carry out variable bit allocation
between channels dependent upon change of scale factors.
Also in low bit rate encoder of the second embodiment, the bit allocation
amount determining means calculates (determines), by a predetermined
conversion formula, bit allocation amount with respect to change in point
of time of sum total of scale factors of respective channels on the basis
of the characteristic of the hearing sense of the human being to determine
bit amounts to be allocated to the respective channels on the basis of the
conversion result.
Further, the bit allocation amount determining means respectively
calculates (determines) estimated amounts of bit amounts to be allocated
to respective channels from the predetermined conversion formula to
allocate bit allocation amounts of respective channels in proportion to
respective estimated amounts to thereby allow total bit allocation amount
of all channels to be fixed.
In addition, the low bit rate decoder of the second embodiment according to
this invention includes decoding means for decoding signals of respective
channels encoded by the low bit rate encoder of the second embodiment.
In accordance with this invention, in compression-coding audio data of a
plurality of channels, since there is employed an approach to determine
bit allocation amounts for respective channels on the basis of changes at
a point of time of energies of respective channels thus to carry out
compression-coding, bit allocation in correspondence with information
amounts can be carried out with respect to respective channels.
In addition, in accordance with this invention, in compression-coding audio
data of a plurality of channels, the relationships between energies and
bit allocation amounts at respective channels are made non-linear to carry
out compression-coding on the basis of the bit allocation amounts. For
this reason, it is possible to carry out bit allocation in correspondence
with information amounts with respect to respective channels.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a circuit diagram showing, in a block form, outline of the
configuration of a low bit rate encoder of a first embodiment according to
this invention.
FIG. 2 is a circuit diagram showing, in a block form, outline of the
configuration of a low bit rate decoder of first and second embodiments
according to this invention.
FIG. 3 is a circuit diagram illustrated in a block form for explaining bit
allocation in a low bit rate encoder of the ATRAC system and low bit rate
encoder of the embodiment according to this invention.
FIG. 4 is a view for explaining the state of recording of data within sound
frame.
FIG. 5 is a graph for explaining bit allocation amount in the first
embodiment.
FIG. 6 is a flowchart for explaining the operation of determination of bit
allocation amount in the first embodiment.
FIG. 7 is a circuit diagram showing, in a block form, outline of the
configuration of a low bit rate encoder of a second embodiment according
to this invention.
FIG. 8 is a graph for explaining bit allocation amount in the second
embodiment.
FIG. 9 is a flowchart for explaining the operation of determination of bit
allocation amount in the second embodiment.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Preferred embodiments of this invention will now be described with
reference to the attached drawings.
The fundamental configuration of a first embodiment according to this
invention is shown in FIGS. 1 and 2. The configuration of a low bit rate
encoder of the first embodiment is shown in FIG. 1, and the configuration
of a low bit rate decoder of the first embodiment is shown in FIG. 2.
The configuration of the encoder shown in FIG. 1 will be first described.
Audio signals of a plurality of channels (ch1, ch2, ..., chn) are sent
to sampling and quantizing elements 100₁ to 100ₙ corresponding to
respective channels via input terminals 20₁ to 20ₙ and transmission paths
1₁ to 1ₙ similarly corresponding to respective channels. At these sampling
and quantizing elements 100₁ to 100ₙ, audio signals of respective channels
are converted into quantized signals. Quantized signals from these
sampling and quantizing elements 100₁ to 100ₙ are sent to amplitude
information detecting circuit 200 and delay lines 300₁ to 300ₙ via
respective transmission lines 2₁ to 2ₙ.
The amplitude information detecting circuit 200 detects amplitude
information from quantized signals of respective channels. Namely, this
amplitude information detecting circuit 200 detects peak values of
amplitude information for every period corresponding to the number of
samples (hereinafter referred to as time blocks) of audio data processed
at a time by encoding elements 400₁ to 400ₙ, which will be described
later, and sends (transfers) these peak values to bit allocation
determining circuit 500 via transmission lines 4₁ to 4ₙ corresponding to
respective channels. It should be noted that this amplitude information
detecting circuit 200 may be of a structure to detect amplitude
information from signals on transmission lines 1₁ to 1ₙ.
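The peak detection for every time block can be sketched as follows. The
block length of four samples and the channel contents are hypothetical and
kept short for readability; the function name is also an assumption.

```python
def block_peaks(samples, block_len):
    """Return the peak absolute amplitude of each time block."""
    return [max(abs(s) for s in samples[i:i + block_len])
            for i in range(0, len(samples), block_len)]


# Quantized signals of two channels, two time blocks each.
channels = {
    "ch1": [0, 3, -7, 2, 1, -1, 0, 4],
    "ch2": [0, 0, 0, 0, -2, 2, 1, 0],
}
peaks = {name: block_peaks(x, block_len=4) for name, x in channels.items()}
# peaks == {"ch1": [7, 4], "ch2": [0, 2]}
```

In this example the first time block of ch2 is silent, so its peak value
of zero can be passed on as a request for few or no bits.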
The bit allocation determining circuit 500 determines, by conversion, bit
allocation amounts for every respective channel from the peak values of
every respective channel in a manner described later, and sends
(transfers) these bit allocation amounts to respective encoding elements
400₁ to 400ₙ via transmission lines 5₁ to 5ₙ.
Moreover, the delay lines 300₁ to 300ₙ delay signals which have been
received through transmission lines 2₁ to 2ₙ by the time blocks, and send
(transfer) these delayed signals to respective encoding elements 400₁ to
400ₙ through respective transmission lines 3₁ to 3ₙ.
Respective encoding elements 400₁ to 400ₙ carry out a compressing
operation for every time block. Bit allocation amounts received through
transmission lines 5₁ to 5ₙ at this time reflect peak information of
signals received through the transmission lines 3₁ to 3ₙ. Respective
encoding elements 400₁ to 400ₙ compress signals which have been received
through the transmission lines 3₁ to 3ₙ so that their bit allocation
amounts are equal to bit allocation amounts which have been received
through the transmission lines 5₁ to 5ₙ, and send (transfer) these
compressed signals to formatter 600 via respective transmission lines 6₁
to 6ₙ.
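The timing relationship between the delay line and the bit allocation can
be sketched for one channel as below. The sketch assumes a delay of
exactly one time block, and the actual compression is replaced by simply
pairing each delayed block with its bit amount; the class and function
names, and the conversion passed in as bits_for_peak, are hypothetical.

```python
from collections import deque


class BlockDelay:
    """Delay line that holds blocks for a fixed number of block periods."""

    def __init__(self, n_blocks=1):
        self._queue = deque([None] * n_blocks)

    def push(self, block):
        """Store a new block and return the block delayed by n_blocks."""
        self._queue.append(block)
        return self._queue.popleft()


def run_channel(blocks, bits_for_peak):
    """Pair every block with the bit amount derived from its own peak."""
    delay = BlockDelay(1)
    pending_bits = None
    paired = []
    for block in list(blocks) + [None]:  # extra step flushes the last block
        delayed = delay.push(block)
        if delayed is not None:
            # pending_bits was computed from this same (delayed) block.
            paired.append((delayed, pending_bits))
        if block is not None:
            pending_bits = bits_for_peak(max(abs(s) for s in block))
    return paired


result = run_channel([[1, -2], [5, 0], [0, 0]], lambda peak: 10 * peak)
# result == [([1, -2], 20), ([5, 0], 50), ([0, 0], 0)]
```

Because the block is held back while its peak is being measured, the bit
amount that arrives at the encoding stage always belongs to the block
being encoded rather than to the following one.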
The formatter 600 implements error correcting processing to the compressed
signals for every channel which have been received via the transmission
lines 6₁ to 6ₙ in accordance with a predetermined format to compose them
into a bit stream for transmission or for recording onto recording medium.
This bit stream is outputted from output terminal 21 via transmission
line 7.
Further, this bit stream is written into predetermined areas 28 on cinema
film 27 by laser recording unit 26, for example. In the figure, reference
numeral 29 denotes perforations adapted so that sprockets of projector
(not shown) for film feeding are engaged therewith. The recording areas 28
are provided, e.g., between the perforations 29.
The configuration of low bit rate decoder of this embodiment will now be
described with reference to FIG. 2.
Bit stream composed by the encoder (low bit rate encoder) of FIG. 1 is
transmitted or is recorded onto a recording medium. This recorded bit
stream is delivered to input terminal 22 via a predetermined reproducing
unit (not shown), and is then sent from this input terminal 22 via
transmission line 8 to deformatter 700.
This deformatter 700 decomposes the bit stream which has been sent through
the transmission line 8 into compressed signals for every respective
channel in accordance with a predetermined format. The decomposed
compressed signals of every respective channel are sent to decoding
elements 800₁ to 800ₙ via corresponding transmission lines 9₁ to 9ₙ.
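Multiplexing by a formatter and the corresponding decomposition by a
deformatter can be sketched as a round trip. The frame layout below (a
channel count, one length field per channel, the payloads, and a trailing
CRC-32) is entirely hypothetical, since the predetermined format is not
specified here; the CRC only detects errors and merely stands in for the
error correcting code.

```python
import struct
import zlib


def format_frame(payloads):
    """Multiplex per-channel compressed payloads into one frame."""
    header = struct.pack(">B", len(payloads))
    header += b"".join(struct.pack(">H", len(p)) for p in payloads)
    body = header + b"".join(payloads)
    return body + struct.pack(">I", zlib.crc32(body))


def deformat_frame(frame):
    """Check the frame and split it back into per-channel payloads."""
    body = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("corrupted frame")
    n = body[0]
    lengths = struct.unpack(f">{n}H", body[1:1 + 2 * n])
    payloads, pos = [], 1 + 2 * n
    for length in lengths:
        payloads.append(body[pos:pos + length])
        pos += length
    return payloads


# Variable-length payloads; the second channel has been given no bits.
sent = [b"abc", b"", b"\x00\x01"]
frame = format_frame(sent)
received = deformat_frame(frame)
```

Carrying a length for every channel is what allows the bit amounts to vary
between channels from frame to frame while the decoder side can still
separate the channels.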
Respective decoding elements 800₁ to 800ₙ expand the compressed signals
which have been sent via the respective transmission lines 9₁ to 9ₙ to
send them to D/A (digital/analog) converters 900₁ to 900ₙ via
corresponding respective transmission lines 10₁ to 10ₙ.
Respective D/A converters 900₁ to 900ₙ convert the expanded
signals (digital signals) which have been sent via the respe | | |