Description
BACKGROUND OF THE INVENTION
1. Field
The invention relates to the recording and reproduction of digital audio,
and particularly to an improved digital audio format, and method and
apparatus thereof, for digitally recording and retrieving audio signals
with error correction and concealment techniques.
2. Prior Art
As commonly known, the use of digital techniques has spread rapidly due to
the ease with which digital data can be manipulated, transferred,
transmitted and stored. Accordingly, as has happened in various technical
fields such as the computer, instrumentation and video recording fields,
digital techniques have recently acquired significant potential in the
field of recording and reproducing of audio signals.
The object of any recording process is to store information and then
faithfully reproduce it. However, in conventional analog recorders there
exists a number of problems which deteriorate faithful reproduction, which
problems are an inherent function of the recording medium and of the
mechanical apparatus used to transport the medium. Although the problems
have been compensated, or circumvented, by the development of very
sophisticated mediums and mechanics, it is widely recognized that
conventional analog recording/reproducing techniques are rapidly
approaching theoretical operational limits.
Typical of problems encountered in analog recording/reproducing techniques
are inadequate dynamic range, i.e., low signal-to-noise ratio, inherent
phase distortion, inherent harmonic distortion, insufficient transient
response, modulation noise, cross talk, print through, multi-copy
degradation, flutter and wow, inherent limitations in noise reduction
systems, storage degradation with time and limited low-end frequency
response.
On the other hand, digital recording/reproducing techniques provide either
an improvement in, or total elimination of, each of the above problems.
Some of the problems, such as modulation noise, print through, inadequate
dynamic range, harmonic distortion and low end limitations, are eliminated
or significantly improved because they do not exist in the digital domain.
Other problems, such as
phase distortion, transient response, flutter and wow, and storage and
multi-copy degradation, are eliminated or significantly improved due to
the ease with which the signal can be handled once it is converted to the
digital domain.
However, the use of digital audio techniques in turn gives rise to various
problems and disadvantages. For example, poor transmission conditions that
conventionally would only degrade an analog signal may completely destroy
the equivalent digital signal, and even a small discontinuity, such as a
single bit error, may cause serious audio degradation and unpleasant
sounds if the bit error occurs at a significant bit position. That is,
digital signal systems characteristically fail abruptly, usually without
the gradual warning which is typical of deterioration in analog systems.
Thus, it has been found that digital audio techniques require the use of
special error correction, concealment and/or muting techniques to minimize
the effects of the various types of dropouts and data errors arising
during the reproduction of the recorded digital audio.
In order to effect efficient correction and/or concealment of errors, it is
first necessary to detect that an error has occurred. A first level of
error indication is provided by observing the playback RF signal envelope.
However, such a technique fails to provide the requisite degree of detail
required for a reliable error detection system.
Thus, in a high performance digital audio system, an optimum error
detection technique includes the process of recording additional
information along with the normal audio signal data. This information,
termed "overhead", may be in the form of parity bits and/or special error
checking characters, which are capable of providing detection of any error
which may occur during the record or playback processes.
Upon detection, the errors may be concealed and/or corrected.
Concealment techniques may employ a zero order interpolation concealment
where the last accurate data sample is held, or a first order
interpolation concealment where an interpolation is made between the last
accurate data sample and the next occurring accurate data sample.
The most desirable technique for eliminating errors is to correct them.
This requires knowledge of the data recorded during the time that the
error occurred. Thus, error correction techniques require the addition of
the overhead information of previous mention during the recording process.
Since errors generally are not randomly scattered but exist in bursts
lasting from a few to several hundred bits, the error correction
information must be dispersed and recorded on the recording medium to
prevent the burst type errors from precluding precise operation of the
error correction system. Thus, it follows that the more effective and
reliable an error concealment and correction technique is, the more
overhead information must be added to the audio data during recording.
This additional overhead increases the data storage requirements of the
recorder and either increases the packing density on the medium or causes
a corresponding undesirable increase in tape speed and usage.
Thus, the method and format used to intersperse the overhead information
with audio data is important in providing error concealment or correction
of a gradual deterioration of the recorded data bit stream while
precluding the total failure of the correction and thus of the digital
audio recorder/reproducer system.
A fairly comprehensive list of articles on digital audio
recorder/reproducer systems is compiled in the list of references and
bibliography of an article by M. Willcocks entitled "A Review of Digital
Audio Techniques", Journal of The AES, Jan-Feb, 1978, Volume 26, pages
56-64. Typical of such prior art are the systems described in Bellis &
Brookhart AES preprint no. 1298 (M-2) Nov. 4-7, 1977; BBC Research
Department report, Bellis and Smith BBC RD 1974/39, Nov. 1974; N. Sato,
"PCM Recorder, A New Type of Audio Magnetic Tape Recorder", Journal AES,
V. 21, No. 7, Sept. 1973; U.S. Pat. No. 3,930,234 to Queisser, et al; U.S.
Pat. No. 3,994,014 to S. G. Burgiss.
SUMMARY OF THE INVENTION
Accordingly, it is an object of the invention to provide an improved error
concealment and correction method and apparatus for a digital audio
recorder/reproducer.
Another object is to provide an improved digital audio data format for a
digital audio error concealment and correction system.
A further object is to provide an improved digital audio data format
wherein the digital data is selectively interspersed with error detection,
correction and synchronizing information in a given block/sub-block
configuration.
Still another object of the invention is to provide an improved digital
audio data format uniquely applicable to automatic error correction and to
manual and/or automatic editing techniques.
To this end, the invention provides an improved format, method and
apparatus for interspersing audio data, sync and error detection and
correction information which circumvents the problems and disadvantages
presently existing in the prior art digital audio record/reproduce systems
mentioned above. The recorded data is formatted into blocks with selected
inter-block gaps to allow going into, and out of, the record mode without
irretrievably destroying data. Each block is independent of all others,
and is divided into a selected arrangement of sub-blocks of data and
sub-blocks of parity information, wherein each sub-block contains its own
error detection and sync information. In addition, the blocks of data
corresponding to the data stream, and the error and sync information,
i.e., overhead, are simultaneously recorded in alternate tracks on the
recording medium to further enhance the efficiency and accuracy of the
error detection and correction technique relative to those of the prior
art.
In an exemplary embodiment, the digital audio data in each successive block
is divided into thirty sub-blocks, each containing its own error
detection, correction and sync information. Twenty alternate 16 bit
samples from the audio waveform are placed into two of these (data)
sub-blocks, which are then used to generate a third (parity) sub-block
which may be the bit-by-bit parity of the first two data sub-blocks. By
way of example, parity may be generated by adding two data words together
in modulo-2, or by adding the two data words as the 2's complement. In
either case the result is a data "triad" where even numbered samples are
in one sub-block, odd numbered samples are in another, with the parity
information in the third sub-block. The three sub-blocks, or "triad", are
then specifically dispersed, along with the other 27 sub-blocks, to define
the major block. In addition, the data sub-blocks of a triad are then
simultaneously recorded along alternate tracks in the recording medium
while the parity sub-block of the triad is divided and recorded in both
tracks following the respective data sub-blocks. Such a block/sub-block
arrangement generally prevents any one error event, such as dropout or
burst errors, from causing errors in more than two of the sub-blocks in
any data triad. If an error occurs during playback in one of the three
sub-blocks in a data triad, the original data in that sub-block is
correctly re-constructed from the remaining data and the parity sub-blocks
in accordance with the error correction technique. If an error occurs in
two sub-blocks, error masking or concealment techniques are used to mask
the error.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1A, 1B and 1C are pictorial representations of an electrical process
used to format the basic audio data.
FIG. 2 is a pictorial representation of an embodiment of the
block/sub-block format of the invention.
FIGS. 3A, 3B and 3C are pictorial representations of the arrangement of the
data and parity sub-blocks, and of the inter-block gap (IBG),
respectively.
FIGS. 4 and 5 are pictorial representations of the format of FIG. 2
depicting the manner of re-generating (correcting) data sub-blocks, and of
interpolating (concealing) a data sub-block, respectively, in the event of
dropouts.
FIG. 6 is a block diagram of a digital audio recorder/reproducer system
embodying the method and apparatus of the invention combination.
FIG. 7 is a more detailed block diagram of the portions of the system of
FIG. 6 which depict the apparatus for generating the format of FIG. 2
while recording and reproducing audio data.
FIGS. 8A, 8B, 8C and 8D are schematic diagrams exemplifying one
implementation of the formatter encoder of the system of FIG. 7.
FIGS. 9A and 9B are schematic diagrams exemplifying one implementation of
the format controller for controlling the system of FIG. 7.
FIGS. 10A, 10B, 10C, 10D, 10E, 10F and 10G are schematic diagrams
exemplifying one implementation of the decoder/de-formatter of the system
of FIG. 7.
FIG. 11 is a schematic diagram exemplifying one implementation of the read
address controller for the system of FIG. 7.
FIG. 12 is a pictorial view depicting the layout, pin numbers, etc., of
various integrated circuits used in the schematics of FIGS. 8-11.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
It has been found that a digital audio recorder format must provide, among
other requirements mentioned above and further described below, the
capability of handling manually-directed machine edits, such as punch-ins,
as well as edits directed by an automatic system. That is, the entrance
and exit points of edits must not irretrievably destroy or disturb
existing data adjacent the data boundaries. Thus, the invention formats
the recorded data into blocks which are independent from all other blocks,
with well defined inter-block gaps therebetween which provide the entrance
and exit points. The blocks in turn are formed of a selected plurality and
arrangement of sub-blocks of audio data interspersed with sync and error
detection and correction information. The arrangement of three sub-blocks
is herein termed a "triad".
In accordance with the invention, each block is physically long enough, on
the recording medium, to allow proper dispersion of the data within the
block such that dropouts cannot destroy the error correction mechanisms of
the system. Further, the blocks occur often enough such that there are at
least two blocks on the medium between the positions of the reproduce and
record heads. The latter condition provides the capability of electronic
cancellation of real and apparent variations in head-to-head spacing due
to mechanical tolerances, including machine-to-machine interchange, and
tape dynamic characteristics. It also allows time for the processing of
the data contained in a block, either within the recorder or in a
peripheral electronic processor, for subsequent re-recording into the same
block space as it passes the record head. This maintains absolute timing
between channels of a multi-channel recorder during editing procedures.
In the present embodiment, by way of example only, for a tape speed of 30
inches per second (in/s) a block rate of 250 Hertz was selected in view of
limitations imposed by the need to have simple synchronization
relationships with the various world television and film standards. The
rate provides an on-medium block-to-block pitch of 120 mils and an
inter-block gap of 9.6 mils. In the present example, five blocks occur
between the reproduce and record heads, which equals a spacing of 600
mils.
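The figures in the example above follow from simple arithmetic. A minimal
sketch, using only the values stated in the text (the variable names are
illustrative and not part of the specification):

```python
# Block geometry from the example values given in the text.
TAPE_SPEED_IN_PER_S = 30.0   # tape speed, inches per second
BLOCK_RATE_HZ = 250.0        # blocks per second
BLOCKS_BETWEEN_HEADS = 5     # blocks between reproduce and record heads
IBG_MILS = 9.6               # inter-block gap length

# 1 inch = 1000 mils
block_pitch_mils = TAPE_SPEED_IN_PER_S * 1000.0 / BLOCK_RATE_HZ
head_spacing_mils = BLOCKS_BETWEEN_HEADS * block_pitch_mils
block_period_ms = 1000.0 / BLOCK_RATE_HZ
ibg_fraction_of_pitch = IBG_MILS / block_pitch_mils

print(block_pitch_mils, head_spacing_mils, block_period_ms, ibg_fraction_of_pitch)
```

With these values the block pitch comes out at 120 mils, the head spacing at
600 mils and the block period at 4 ms.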
Since it is possible that a major dropout will cause the data retrieving
electronics to lose sync, it is necessary to regain sync as soon as
possible to minimize the additional loss of data. Thus, the minimum
frequency of sync occurrence is related to dropout length probability.
However, the maximum frequency of sync information occurrence is limited
by the need to minimize the amount of overhead added to the recorded data.
In the instant embodiment, the format repeats the 12 bit sync information
approximately every 0.25 milliseconds (ms).
It is also necessary to quickly and un-ambiguously detect the data errors
resulting from dropouts. The format herein repeats error detection
information at the same 0.25 ms rate as the sync information. The error
detection information herein is in the form of a cyclic redundancy check
character (CRCC) pattern which yields excellent error detection
characteristics with the addition of only 12 detection bits for every 172
bits which are to be protected.
The format is arranged to provide error masking and concealment, as well as
error correction capabilities. The error correcting technique always
corrects errors in a data triad when those errors are contained within one
sub-block of the triad. If the errors are contained within two sub-blocks
and the remaining good sub-block of the triad contains sampled audio data,
the errors are masked or concealed in a very effective manner by
interpolating between the alternate data samples contained in the good
sub-block. This is generally termed first order interpolation concealment.
When both of the sub-blocks in a triad which contain sampled audio data
have errors, concealment using data holding or muting is performed, where
the value of the last good data sample is held until the next good sample.
This is generally termed zero order interpolation concealment. An
alternative to zero order interpolation is muting during uncorrected
errors.
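The choice between correction, first order interpolation and zero order
holding described above can be sketched as follows. This is only an
illustration: the function names are hypothetical, the CRCC result of each
sub-block is assumed to be available as a boolean, and the handling of the
final sample (held, because the next triad is not available in this sketch)
is an assumption.

```python
def choose_action(odd_ok, even_ok, parity_ok):
    """Map the CRCC status of a triad's three sub-blocks to an action."""
    if odd_ok and even_ok:
        return "pass"            # audio data intact; parity is not needed
    if parity_ok and (odd_ok or even_ok):
        return "correct"         # one bad data sub-block, rebuilt from parity
    if odd_ok or even_ok:
        return "interpolate"     # only one good data sub-block remains
    return "hold_or_mute"        # both data sub-blocks are bad


def conceal_first_order(good):
    """Estimate each missing alternate sample as the mean of the good
    samples on either side of it (the last good sample is held)."""
    missing = []
    for i, sample in enumerate(good):
        nxt = good[i + 1] if i + 1 < len(good) else sample
        missing.append((sample + nxt) // 2)
    return missing


def conceal_zero_order(last_good, count):
    """Hold the last good sample for `count` missing samples."""
    return [last_good] * count


print(choose_action(True, False, True))    # one bad data sub-block
print(conceal_first_order([0, 10, 20]))
print(conceal_zero_order(7, 3))
```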
In the embodiment described herein, by way of example only, the sampling
rate of the audio signal is 50 kHz. The format provides a 16 bit word to
represent each data sample, whereby the basic serial audio data rate is
800 kilobits per second (kb/s) per channel. To provide error correction,
the added overhead is 50% of the basic data rate, and error detection and
synchronization requires an additional 16% of overhead. The inter-block
gap configuration requires an additional 8.7% overhead, resulting in a
total data rate per channel of 1.5 megabits per second (Mb/s).
In order to allow recording such a data rate at conventional medium speeds
of, for example, 30 in/s, the system herein divides the audio data
stream into two paths which then are recorded into two separate tracks on
the medium. This allows the use of a recording speed of 30 in/s with an
acceptable recorded bit density of 25 kb/in, considering currently
available recording media.
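The total data rate and the recorded bit density can be cross-checked from
the format parameters given in this description (thirty sub-blocks per
block, each with a 12 bit sync pattern, 160 data or parity bits and a 12 bit
CRCC). The sketch below is illustrative only; it assumes the inter-block gap
is counted as the equivalent number of bit cells, and the variable names are
not taken from the specification.

```python
SAMPLE_RATE_HZ = 50_000
BITS_PER_SAMPLE = 16
BLOCK_RATE_HZ = 250
SUB_BLOCKS_PER_BLOCK = 30
SYNC_BITS, DATA_BITS, CRCC_BITS = 12, 160, 12
BLOCK_PITCH_MILS, IBG_MILS = 120.0, 9.6
TRACKS = 2
TAPE_SPEED_IN_PER_S = 30

basic_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE                  # bits/s per channel
sub_block_bits = SYNC_BITS + DATA_BITS + CRCC_BITS
recorded_bits_per_block = SUB_BLOCKS_PER_BLOCK * sub_block_bits

# The recorded bits must fit in the part of the block pitch not taken by the gap.
gap_overhead = IBG_MILS / (BLOCK_PITCH_MILS - IBG_MILS)
bit_cells_per_block = recorded_bits_per_block * (1.0 + gap_overhead)

total_rate = bit_cells_per_block * BLOCK_RATE_HZ               # bits/s per channel
density = total_rate / TRACKS / TAPE_SPEED_IN_PER_S            # bits/inch per track

print(basic_rate, round(total_rate), round(density), round(gap_overhead, 3))
```

Under these assumptions the sketch reproduces the 800 kb/s basic rate, the
1.5 Mb/s total rate and the 25 kb/in density quoted above.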
More particularly, in discussing the digital audio format of the invention
combination, and particularly the generation thereof, FIG. 1 depicts the
electrical process used to initiate the format of the basic audio data in
real time. An analog-to-digital (A/D) converter, either within the
recorder as depicted in FIG. 6 below, or peripheral to the system, samples
the incoming audio signal every 20 microseconds (50 kHz) and generates a
16 bit binary number representing each sample. FIG. 1A represents the
continuous generation of the 16 bit binary numbers representing the
consecutively sampled audio signal. To aid the explanation, the samples
are numbered consecutively from S1 through S20, which represent the first
series of 20 samples. The first sample S1 is placed in an odd sample
sub-block, O-1 of FIG. 1B. The second sample, S2, is placed in the even
data sub-block, E-1 of FIG. 1B. Likewise, the sample S3 is placed in odd
sub-block O-1, the sample S4 is placed in even sub-block E-1, and the
sampling continues until all 20 samples have been divided between odd and
even sub-blocks O-1 and E-1, respectively. Each data sample contains 16
bits, and each sub-block contains 10 samples, whereby each
sub-block contains 160 bits of digitized audio data.
A third sub-block, termed the parity sub-block and shown in FIG. 1C, is
created by sequentially comparing the bits in data sub-block O-1 with
those in data sub-block E-1. For example, the first bit in O-1 is compared
with the first bit in E-1. As known, binary bits can have only two values,
a "1" or a "0", whereby if both bits compared are of the same value, a "0"
is placed in the first bit position of the word in parity sub-block P-1.
If the bits have a different value, a "1" is placed in the first bit
position of the word in parity sub-block P-1. Such process continues on a
bit-position by bit-position basis until all 160 bits of audio data have
been compared and all 160 positions within parity sub-block P-1 have been
filled. The result is a sub-block triad consisting of two data sub-blocks
O-1, E-1 and parity sub-block P-1.
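The formation of a triad, and the recovery of a lost data sub-block from the
two surviving sub-blocks, can be illustrated with a short sketch. It assumes
each sample is held as an unsigned 16 bit integer and uses exclusive-OR for
the bit-by-bit comparison described above; the function names are
hypothetical.

```python
def make_triad(samples):
    """Split 20 sixteen-bit samples into odd and even sub-blocks and
    derive the bit-by-bit (modulo-2) parity sub-block."""
    if len(samples) != 20:
        raise ValueError("a triad is built from exactly 20 samples")
    odd = samples[0::2]     # S1, S3, ..., S19  -> sub-block O-1
    even = samples[1::2]    # S2, S4, ..., S20  -> sub-block E-1
    parity = [o ^ e for o, e in zip(odd, even)]   # 0 if bits match, 1 if not
    return odd, even, parity


def rebuild(good, parity):
    """Regenerate the missing data sub-block from the good one and parity."""
    return [g ^ p for g, p in zip(good, parity)]


samples = list(range(1000, 1020))
odd, even, parity = make_triad(samples)
print(rebuild(odd, parity) == even, rebuild(even, parity) == odd)
```

Because exclusive-OR is its own inverse, the same operation serves both to
generate parity and to regenerate either data sub-block.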
The next 20 samples are also divided into a triad configuration such as
shown in FIGS. 1A-1C, whereupon ten of the triads are then combined to
form a single data block.
Although parity is generated above employing a modulo-2 addition, parity
may be obtained by the 2's complement addition. Thus, two words are
summed, and the 2's complement sum is formed; i.e., there is formed a
17-bit word which represents the 2's complement sum of the two sixteen bit
words. Then the most significant sixteen bits are recorded as parity. In
reproducing the data, the top sixteen bits are retrieved and subtracted
from parity, whereby the least significant bit is not derived, i.e., in
the event of an error, only the first fifteen bits of the missing data
word are recovered. However, this technique provides a more accurate
masking technique than when using the modulo-2 addition to generate
parity, since the parity may be divided by two to yield a fifteen bit
approximation instead of a linear interpolation.
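One possible reading of the 2's complement alternative is sketched below. It
assumes signed 16 bit samples, takes the recorded parity to be the upper
sixteen bits of the 17-bit sum, and clamps the recovered value to the 16 bit
range; these details, and the function names, are assumptions made for
illustration and are not fixed by the text.

```python
def sum_parity(a, b):
    """Upper 16 bits of the 17-bit two's-complement sum of two signed
    16-bit words (the least significant bit of the sum is dropped)."""
    return (a + b) >> 1


def recover(good, parity):
    """Estimate the missing word from the good word and the parity.
    The dropped bit leaves the estimate within one LSB of the original."""
    estimate = 2 * parity - good
    return max(-32768, min(32767, estimate))


a, b = 100, -37
p = sum_parity(a, b)
print(p, recover(a, p), recover(b, p))
```

Under this reading the recorded parity word is also the mean of the two
samples it protects, which is why it can serve directly as a masking value.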
Referring to FIG. 2, the ten sub-block triads that make up one data block
are divided between alternate recording medium tracks, track A and track
B. The tracks are spaced apart on the order of one track width to ensure
that typical single event dropouts only affect one track of the two track
pair.
Track A contains the odd data sub-blocks (O-1, O-2, etc.) and track B
contains the even data sub-blocks (E-1, E-2, etc.). Thus, it may be seen
that alternate samples of the audio signal as sampled in FIG. 1 are
recorded in alternate tracks of the recording medium.
Note that the parity sub-blocks are shared between the tracks, with the odd
parity sub-blocks (P-1, P-3, P-5, P-7 and P-9) recorded on odd track A,
and with the even parity sub-blocks (P-2, P-4, P-6, P-8 and P-10) recorded
on even track B. Such an arrangement of parity sub-blocks improves the
accuracy of error correction as is further explained hereinbelow.
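The assignment of sub-blocks to tracks just described can be expressed as a
small lookup. The sketch covers only which track carries each sub-block; the
order of the sub-blocks along each track is defined by FIG. 2 and is not
modelled here. The function name and the 'O'/'E'/'P' labels are illustrative.

```python
def track_for(kind, triad_number):
    """Return 'A' or 'B' for a sub-block of kind 'O', 'E' or 'P'
    belonging to triad 1..10."""
    if kind == "O":
        return "A"                                   # odd samples
    if kind == "E":
        return "B"                                   # even samples
    if kind == "P":
        return "A" if triad_number % 2 == 1 else "B" # parity alternates
    raise ValueError("unknown sub-block kind: %r" % kind)


layout = {(k, n): track_for(k, n) for k in "OEP" for n in range(1, 11)}
per_track = {t: sum(1 for v in layout.values() if v == t) for t in "AB"}
print(per_track)
```

Each track thus carries fifteen of the thirty sub-blocks in a block.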
The data block of FIG. 2 also depicts the inclusion of synchronization and
error detection information at specific locations within the block, and
particularly at specific locations within each sub-block. As previously
mentioned, it is possible that when a major dropout occurs, the recorder's
electronics may lose synchronism with the format on the recording medium.
Synchronism must be regained as soon as possible to minimize any
additional loss of data. To ensure this rapid recovery, a 12 bit pattern
is inserted at the beginning of each sub-block as depicted in expanded
detail of specific sub-blocks along the bottom of FIG. 2. This pattern
is unique and cannot naturally occur in the audio, parity or error
detection data. By way of example, an encoding scheme may be used wherein
the synchronization pattern may comprise a self-clocking, DC free pattern
of seven bits which does not occur in data, with a five bit suffix to
indicate which sub-block is under consideration. An example of an encoding
scheme which may be used is the Miller squared (M²) type code. It may
be seen that a synchronization pattern occurs approximately every 0.25 ms.
Just as it is necessary to re-synchronize after a dropout as soon as
possible, it is also necessary to quickly and unambiguously detect the
data errors resulting from dropouts. Obviously, it is only after detection
of a dropout error that such errors may be corrected or concealed.
Accordingly, a 12 bit error detection character is added to the end of
each sub-block and thus occurs at the same rate as the synchronization
pattern. This character is in the form of a cyclic redundancy check
character (CRCC), which character is the result of arithmetically dividing
the data in the sub-block by a binary polynomial. More particularly, the
CRCC is a code wherein the data stream is successively divided, i.e., the
160 bits of a sub-block are divided by a selected polynomial employing a
modulo-2 scheme. The number is subtracted and is shifted to the right,
subtracted again, and again shifted to the right. This results in a
remainder much as in the process of long division, which is stored as the
CRCC code. Since the polynomial used to generate the remainder character
when the data was received is known, the division may be performed again
in playback, whereupon the check characters may be compared to provide
error detection. If the remainder from this division matches the remainder
represented by the CRCC, there is an extremely high probability that no
errors occurred during playback in either the data or the CRCC. If the
check characters are not the same, then it is known that an error has
occurred in the block of information. If an error burst occurred and that
burst was less than 12 bits in length, the errors will be unconditionally
detected. If the burst error is exactly 12 bits long, the probability of
the error going undetected is 1 in 2,048. For burst errors longer than 12
bits, the probability of undetected errors is 1 in 4,096. Thus, it may be
seen that the scheme provides a potential to improve the recorder's basic
bit error rate by 5,000 to 1 if all detected errors are corrected.
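The modulo-2 division that produces a 12 bit check character can be sketched
as a shift-and-subtract loop. The text does not name the polynomial, so the
sketch assumes the commonly used CRC-12 polynomial
x^12 + x^11 + x^3 + x^2 + x + 1; the function names are hypothetical, and
the bits are processed most significant first as a further assumption.

```python
POLY = 0x180F   # x^12 + x^11 + x^3 + x^2 + x + 1 (assumed, not from the text)


def crcc12(bits):
    """Remainder of the modulo-2 long division of `bits`, followed by
    twelve zero bits, by POLY."""
    reg = 0
    for bit in list(bits) + [0] * 12:
        reg = (reg << 1) | bit
        if reg & 0x1000:      # degree-12 term present: subtract (XOR) divisor
            reg ^= POLY
    return reg


def sub_block_ok(bits, stored_crcc):
    """Repeat the division on playback and compare the check characters."""
    return crcc12(bits) == stored_crcc


data = [(i * 7 + 3) % 5 % 2 for i in range(160)]   # arbitrary test pattern
check = crcc12(data)

damaged = list(data)
for i in range(40, 49):                            # a 9 bit burst error
    damaged[i] ^= 1

print(sub_block_ok(data, check), sub_block_ok(damaged, check))
```

Because the remainder has twelve bits, any burst shorter than twelve bits
leaves a non-zero difference in the remainder and is therefore detected.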
The data block of FIG. 2 further includes a selected blank space or
inter-block gap (IBG) at the beginning of each block of data, which gap is
reserved for the nonrecording of information. More particularly, in the
embodiments herein described, the IBG contains only transitions relating
to clock extraction, and physically separates the data into blocks to
allow the digital audio recorder to enter and exit the magnetic history on
the recording medium during the recording, editing, etc., processes
without irretrievably destroying the recorded audio data. The IBG may be
used to supply total block information, editing information, etc.
The block/sub-block configuration of the instant format, wherein the blocks
are separated by inter-block gaps, allows a unique reproduce/record head
configuration and method of operation, which, in turn, provides unique
advantages not available in prior art audio recorder/reproducers. That is,
in the digital audio system described herein, the reproduce head is
located first or upstream on the tape, and the record head is spaced
therefrom down the tape or downstream from the reproduce head. More
particularly, the reproduce head is spaced ahead of the record head a
distance of five blocks, i.e., 600 mils, and a delay circuit is provided
which has a delay corresponding to the distance between the heads. Such a
configuration allows the information to be reproduced and subsequently
recorded in the same position on the tape as long as the exact distance
between the heads is known. Likewise, the configuration allows the system
to drop into record at the center of the inter-block gap, and allows
dynamically varying the length of the inter-block gap in order to make
certain that all the gaps are of the same length. Additionally, in
editing, the magnetic history on the recording medium can be reproduced
from the medium, processed, corrected, etc., and then replaced on the
medium by the record head in exactly the same position at which it was
initially recorded. The circuit of the application (FIGS. 7-11) provides
for dynamically varying the delay distance (between the heads) such that
the reproduced data in one signal channel may be processed in one manner,
while the data in another signal channel may undergo a different type of
processing.
As may be seen in FIG. 2, the improved format hereof provides a minimum
distance between the data sub-blocks and the parity sub-blocks, which
improves the error correcting capabilities of the system. Since most tape
dropouts are 10 mils or less in length, the CRCC codes located at the end
of each sub-block are approximately 7 1/2 mils apart. This allows the
system to rapidly recover after a dropout, which in turn allows recovery
of the data and synchronization. The parity blocks should thus be greater
than 10 mils apart, and they are located within the format described
herein a minimum of 30 mils from the respective data that they protect.
Such arrangement optimizes the chances of surviving catastrophic type
dropouts that might occur such as, for example, if there are fingerprints
or dirt on the tape, manufacturer's defects, etc. The rate of occurrence
of the inter-block gaps is also selected to allow for synchronization to
any of the television broadcast standards, i.e., NTSC, PAL, etc.
FIGS. 3A, 3B, and 3C show the construction of the data and parity
sub-blocks, and the inter-block gap, respectively, in greater detail. The
data sub-block of FIG. 3A includes ten audio samples of 16 bits each, is
preceded by the sync code and is followed by the cyclic redundancy check
character. The parity sub-block of FIG. 3B is similar to that of the data
sub-block, and includes parity for 20 audio samples of two data
sub-blocks, wherein the combination of two data sub-blocks and the
associated parity sub-block defines the "triad" of previous mention.
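The layout of a data sub-block (sync code, ten 16 bit samples, CRCC) and the
resulting sub-block duration and length on tape can be sketched as follows.
The bit ordering (most significant bit first) and the sync and CRCC values
used in the example are placeholders, not values taken from the
specification.

```python
SYNC_BITS, SAMPLE_BITS, SAMPLES, CRCC_BITS = 12, 16, 10, 12


def to_bits(value, width):
    """Return `value` as a list of `width` bits, most significant first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def pack_sub_block(sync, samples, crcc):
    """Concatenate sync code, ten 16-bit samples and CRCC into one bit list."""
    if len(samples) != SAMPLES:
        raise ValueError("a data sub-block holds exactly ten samples")
    bits = to_bits(sync, SYNC_BITS)
    for sample in samples:
        bits += to_bits(sample & 0xFFFF, SAMPLE_BITS)
    bits += to_bits(crcc, CRCC_BITS)
    return bits


sub_block_bits = SYNC_BITS + SAMPLES * SAMPLE_BITS + CRCC_BITS
TRACK_RATE_BPS = 750_000       # 1.5 Mb/s per channel shared by two tracks
DENSITY_BITS_PER_IN = 25_000   # recorded bit density per track

duration_ms = sub_block_bits / TRACK_RATE_BPS * 1000.0
length_mils = sub_block_bits / DENSITY_BITS_PER_IN * 1000.0

frame = pack_sub_block(0xABC, list(range(10)), 0x123)   # placeholder values
print(len(frame), round(duration_ms, 3), round(length_mils, 2))
```

The resulting figures, roughly 0.25 ms and 7 1/2 mils per sub-block, agree
with the sync repetition interval and CRCC spacing quoted in the text.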
The inter-block gap of FIG. 3C separates the data blocks and is used to go
into and out of record without destroying audio data. The IBG also
contains the synchronization pattern preceding the gap which identifies it
as an IBG, and the cyclic redundancy check character for error detection
following the gap. The IBG may be utilized to record non-critical and
generally repetitive information such as time code, data block
identification, or editing information. Thus, the IBG may be used, for
example, to label each specific block of data for editing purposes whereby
determination may be made in terms of hours, minutes, seconds, frames and
then blocks. This allows the system to detect a specific block, whereupon
the system may count down inside the block and perform, for example, an
edit within the block on a word-by-word basis.
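Locating an edit point in terms of hours, minutes, seconds, frames, blocks
and then words within a block might be sketched as below. The frame rate of
25 frames per second is purely an assumption chosen so that a whole number
of blocks fits in a frame; the text does not specify it, and the function
name is hypothetical.

```python
BLOCKS_PER_SECOND = 250
FRAMES_PER_SECOND = 25                                      # assumed
BLOCKS_PER_FRAME = BLOCKS_PER_SECOND // FRAMES_PER_SECOND   # 10 under this assumption
SAMPLES_PER_BLOCK = 200                                     # ten triads of 20 samples


def locate(sample_index):
    """Convert an absolute sample index into
    (hours, minutes, seconds, frame, block, word)."""
    block_index, word = divmod(sample_index, SAMPLES_PER_BLOCK)
    frame_index, block = divmod(block_index, BLOCKS_PER_FRAME)
    second_index, frame = divmod(frame_index, FRAMES_PER_SECOND)
    minute_index, second = divmod(second_index, 60)
    hour, minute = divmod(minute_index, 60)
    return hour, minute, second, frame, block, word


print(locate(50_000))                       # exactly one second of audio
print(locate(50_000 * 3600 + 200 * 11 + 7)) # one hour, eleven blocks, seven words
```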
Such format allows the advantage of non-destructive recording which
precludes muting of the signal during times in which some sort of edit is
completed, and it allows for an instantaneous data transfer. That is, when
moving from one sample to the next sample, the system can select the next
sample from a source which is different from the sample source which would
normally be used. Thus, it follows that the limit of resolution is down to
the sample rate which is 20 microseconds in the instant configuration as
opposed to the delay times on the order of several milliseconds for prior
art digital audio recorders.
For any professional record