|
Description  |
|
|
BACKGROUND OF THE INVENTION
The present invention relates to a data compression and decompression
processor and corresponding methods therefor.
Data compression/decompression techniques are well known in the prior art.
Examples of prior art approaches are disclosed in U.S. Pat. No. 4,464,650
(the '650 patent), issued in August 1984 to Eastman, and U.S. Pat. No.
4,558,302 (the '302 patent), issued in December 1985 to Welch.
The above patents present implementations of a data compression algorithm
described by Lempel and Ziv in "Compression of Individual Sequences Via
Variable-Rate Coding," IEEE Transactions on Information Theory, IT-24-5,
September 1977, pp. 530-537, and "A Universal Algorithm for Sequential
Data Compression," IEEE Transactions on Information Theory, IT-23-3, May
1977, pp. 337-343. These two papers presented the original ideas and
principles of compression algorithm. However, no implementation method was
given.
Data compression algorithms are known as a discipline which design the
shortest representation of the original data string. Reversible data
compression implies that a given data string being compressed and then
decompressed should be identical to the original data string, and not a
single bit of information is allowed to be distorted or lost. This
reversibility is identical to the concept of noiseless data compression in
the context of information theory. Hence, noiseless data compression is
quite different from speech compression and video compression where
distortions are allowed. For typical applications of data compression for
data being stored or transmitted, noiseless data compression is the only
kind of compression which can be used.
The benefits of data compression are tremendous. The benefits of data
compression technology to communications have been the increase in
bandwidth available to the users and a savings of the cost of the
communication device. For example, to transmit 9600 bps digital
information over existing telephone lines requires conditioned lines and
9600 bps high-speed modems. If one applied advanced data compression
technology with 4:1 compression ratio, one may use 2400 bps modems and
unconditioned lines for such transmissions. In this example, the savings
is not only the difference in modem cost between 9600 bps modems and 2400
bps modems, also the monthly charges for the conditioned lines. As a
further example, a typical branch of a bank requires four leased lines for
operation (one for the ATM machine, one for news broadcast, and two for
data processing). If a multiplexer has a built-in compressor with 4 to 1
compression ratio, then only one line will be needed to communicate with
the outside world, and a savings of monthly charges of three lines can be
achieved.
As to back-up systems, data compression technology can greatly reduce the
back-up time, increase the capacity and save the back-up device cost. For
example, for the high-end 1/2" reel to reel tape back-up systems,
typically four tapes are required to be mounted and dismounted from the
tape drive system. It normally takes 30 minutes to back up one tape. Due
to this sequential process, it becomes a very-time-consuming process to
back up computer data.
If a back-up system has a built-in compression capability and compression
ratio of 4 to 1, then one may use one tape instead of four tapes for the
back-up purpose, and the back-up process becomes an unattended process.
The other major benefit of data compression technology for back-up is that
the person performing the back-up can simply mount the tape and leave the
system unattended. By the same token, data compression technology can
increase the capacity of the disk drive, reduce the data access time, and
save the disk cost. Again, the savings are especially significant for
high-end disk drive systems where multiple heads and multiple platters are
required. With the help of data compression technology, only single head
and single platter can be used to achieve the same capacity requirement.
Due to fewer parts, this approach also improves the reliability of the
system. However, these benefits will be realized only if the data
compression technology meets each of the following requirements:
1. Its encoding process should be string-oriented rather than
character-oriented in order to achieve good compression ratios.
2. The algorithm must be adaptive to the data source so that good
compression ratios can be achieved for all data files. This capability is
also called "universal" data compression.
3. Its data processing rate must be at least as high as the system data
transfer rate so that system throughput will not be degraded.
4. The cost of data compression hardware implementation must be low.
Preferably, the implementation of the data compression algorithm on a
single VLSI chip is feasible.
5. The integration of the data compression technology into the system must
be simple, and the data compression technology cannot cause any system
problems.
6. The compression capability must be totally transparent to the users.
Currently, the most popular noiseless data compression method being used in
the industry is the traditional Huffman coding or its modified versions.
Huffman coding is a two-pass coding method. The first pass performs
statistics collection of the data source; the second pass constructs a
codetable based on the statistics collected. The underlying principle is
to replace the more frequent characters by shorter codewords and less
frequent characters by longer codewords. This coding method normally
provides poor compression ratios because it is character encoding rather
than string encoding.
To elaborate this point, consider encoding the more frequent words by
shorter codewords and less frequent words by longer words. It is intuitive
that the compression ratios to be achieved will be even better than
compression ratios achieved by character encoding. For example, it is
known that as the English word "the" is the most frequent word, so that it
may be encoded by the shortest codeword. This is what is normally called
encoding on the extended information source or multiple symbol encoding.
In general, the complexity and memory requirement of Huffman coding based
on extended symbols (multiple symbols) source grow exponentially. For
example, if the symbol size in the Huffman coding system is a byte, then
the codetable requires 256 entries for character encoding. If the symbol
size in the Huffman coding system is two bytes, then the codetable
requires 65,536 entries for two byte encoding. For the particular example
mentioned earlier, three-character encoding for "the" requires 16,777,216
possible entries for the codetable. Hence, practical implementation of
Huffman coding has always been limited to character encoding only.
Typically, Huffman coding requires the same codetable to be shared or
transmitted between the sender's and receiver's ends for successful
compression and decompression. In practice, several classes of codetables
are preassigned for data compression and decompression purposes. The
compressor/decompressor switches to different codetables according to the
closest match between the data source and the codetable. The resulting
compression ratios are further deteriorated due to the mismatch between
the source statistics and the codetable being used. Although it is
possible to construct an adaptive version of Huffman coding, the
implementation is cumbersome and the improvement on compression ratios is
limited.
Since Huffman coding is a fixed to variable coding technique, the
decompressor parses the compressed strings on a bit-by-bit basis until a
legal codeword has been recognized. Because the length of the legal
codeword is not known beforehand, the decompression process could be slow.
The principle of Huffman coding is described in the article of D. A.
Huffman entitled "A Method for the Construction of Minimum Redundancy
Codes," Proceedings of IRE 40, pp. 1098-1100 (September 1952). In summary,
Huffman coding suffers from several limitations:
1. It is a two-pass algorithm. The first pass involves the statistics
collection needed to construct the codetable to be used by the compressor
and decompressor. For many real life applications, this is unacceptable.
2. The compression ratio is less than optimal, since the complexity of
multiple symbol encoding or string encoding is exponential, and practical
implementation of Huffman coding has been limited to character encoding
only.
3. The compression ratio is poor. Frequently, the codetable is mismatched
to the information source. On the other hand, the implementation of
adaptive Huffman coding is cumbersome and complex, and the convergence of
the adaptive statistic collection is slow.
4. Since Huffman encoding is a fixed to variable length encoding, the
compression algorithm itself is sensitive to error propagation and the
decoding process could be slow.
5. The same codetable needs to be stored or transmitted between the
compressor and the decompressor. This represents additional overheads for
the compression purpose.
The first major effort to alleviate these limitations was the disclosure of
the Lempel-Ziv algorithm identified above. Basic principles of Lempel-Ziv
algorithm were described in the two papers cited above. The basic
principles of Lempel-Ziv algorithm work as follows:
The compressor first prepares for a dictionary where variable length
dictionary words will be stored. Initially, the dictionary could be empty
or partially filled. Then the incoming string will be parsed and examined
on a character by character basis, so that the longest match between the
input string and the words in the dictionary will be identified. This is
also called "greedy" parsing. As soon as the greedy parsing is completed,
the extended substring of the longest match by one character will be
entered into the dictionary as a new word, and the index of the word in
the dictionary with a match to the longest substring of the incoming
string will be output as a compressed word.
The decompressor will also prepare for a dictionary with the same principle
to build and fill the dictionary as the compressor does. By examining the
received indexes, the decompressor reconstructs the dictionary and
performs the decoding process without physically storing or transmitting
the actual contents of the dictionary from the compressor to the
decompressor.
The preeminent features of the Lempel-Ziv algorithm are its ability to
adapt to any type of incoming data source, and it extracts the longer and
more frequent substrings from the data source as words stored in the
dictionary with virtually no limitation on the length of the word as long
as the local memory can accommodate them.
Indirectly, the Lempel-Ziv algorithm is performing dictionary construction
on the fly and carrying out data compression in the sense of extended
source encoding or string encoding. For example, if the word
"international" is stored in the dictionary being matched to the incoming
string and the size of the dictionary is 4,096, then only 12 bits are
required to represent 13 byte word "international." Hence, a tremendous
amount of compression has been achieved.
The Lempel-Ziv algorithm provides distinct advantages over Huffman coding
for the following reasons:
1. It is a single-pass encoding technique and statistics collection is not
needed.
2. It is string-oriented rather than character-oriented, so it achieves
good compression ratios.
3. It is adaptive to the data source so that it becomes a good general
purpose data compression algorithm.
4. The decompressor is able to locally generate the replica of the
dictionary used by the compressor, based on the compressed strings, so
that it is not needed to share the dictionary between the compressors and
the decompressors.
The '650 and '302 patents identified above provide two different
implementation methodologies for the Lempel-Ziv algorithm. In the
implementation of Lempel-Ziv algorithm, two key components are the data
structure used to store the input characters/ substrings for the purpose
of dictionary construction and hashing function used to perform the
string-matching algorithm. In the '650 patent, the data structure is based
on tree-structures, and data is processed two bits (or a fraction of a
byte) at a time for a reasonable memory requirement, but the processing
speed is slow. In the '302 patent, the data structure is based on a large
hash array structure, and data is processed one byte at a time, and the
processing speed has been somewhat improved. Both patents cover only the
implementation of the compression/decompression apparatus but not
applications.
The '650 patent made the first attempt to make the Lempel-Ziv algorithm
practical and realizable in implementation. However, since the data
structure is based on Tree structure, the compression apparatus only
processes a fraction of a byte at a time, and the compression process
requires numerous RAM cycles. Furthermore, the algorithm itself utilizes
complex mathematics formulas and arithmetic operations such as multiply
and divide.
The '302 patent made further progress on the practical implementation of
the Lempel-Ziv algorithm. In essence, the compression processing is done
one character (one byte) at a time. More importantly, the storing strings
of input data character signals comprise prefix code signals and the
extended character only. The storing strings in the dictionary are
compared to the input data stream to determine the longest match.
It will be helpful to re-examine the six requirements for data compression
technology to be useful for system applications. The Lempel-Ziv algorithm
and the '650 and '302 patents only meet the first two requirements.
Notice that the compression throughput requirement varies according to
applications. For example, high-speed modem applications require no more
than 20 kilobits/sec. On the other hand, for today's typical peripheral
applications, the Small Computer System Interface (SCSI) II requires data
transfer rate up to ten megabytes/sec for the synchronous operation.
Currently, the SCSI data transfer has become a common standard for
peripheral applications.
As an example, assume that the nominal clock rate is 40 megahertz. This
implies that the compressor and decompressor based on prior arts will have
to process data at one character (byte) per four clock cycles. This also
assumes extremely high-speed, expensive memories are being used in order
to keep up with the fast clock rate. Hence, there will be very stringent
compression throughput requirements for real-life peripheral applications.
Besides, current bus structures being used in most computer systems are
either 16-bit or 32-bit wide. This implies that more than a byte of
information will be available at any given cycle time. Therefore, parallel
data compression processing will be needed to meet such requirements in
order to achieve high-speed throughput rate memory implementation cost to
be manageable.
The other possibility to achieve high speed implementation based on prior
arts is to de-interleave the input data string and use multiple 8-bit
oriented compressors for compression processing. However, this
de-interleaving process destroys the input data correlation structure, and
hence degrades the compressor ratios significantly.
The application of compression apparatus based on the Lempel-Ziv algorithm
and the '650 and '302 patents are limited in terms of their applications
due to the following reasons:
1. They are single byte or a fraction of a byte processing. In the string
parsing process, the hashing is done based on the next extended character
and the content of the hash address. Hence, it is impossible to perform
multi-character processing due to the sequential nature of the hashing
process, and it is impossible to perform parallel processing based on
these existing approaches.
2. They are sequential file compression. In other words, data blocks are
chained together sequentially due to the use and construction of the
dictionary. For example in order to read the 50th block of a 1 megabyte
file, one needs to decompress the file from the first character until the
50th block is reached. The sequential file decompression implies that
these techniques lack the random block access capability. This is
absolutely unacceptable for disk random access environment.
3. The N-time hashing technique used in the string-matching process is
non-guaranteed. Subsequently the compression ratio suffers and becomes
poorer.
4. The dictionary has to be reset at the beginning of a given file. For
block-oriented applications, such as synchronous communications, packet
switching and disk environment, if a given block is treated as a file,
then the dictionary is always reset at the beginning of every block, and
the compression ratio suffers greatly. Since the dump dictionary is also
partially filled and only shorter words are being stored in the
dictionary. It is quite clear that adaptation and efficiency of
compression for short blocks are incompatible; the learning cycle
necessarily entails the initial construction in the dictionary of short
strings peculiar to the incoming data, during which time even shorter
strings are transmitted. It is not until long strings have been
constructed that the system achieves high compression efficiency.
5. During the decompression processing, data is read in the reverse order
from a stack. This also implies that numerous RAM cycles are needed just
to decompress data.
In view of the foregoing, it would be very desirable to provide an improved
data compression/decompression processor and corresponding methods which
overcome the deficiencies and limitations of prior art approaches.
SUMMARY OF THE INVENTION
It is one object of the present invention to provide an improved data
compression and decompression processor and method.
A data compression technique according to the present invention is able to
process multiple characters (parallel processing) concurrently with no
extra memory requirement and no degradation in compression ratio. The
novel implementation of the present invention provides a low cost, high
speed reversible implementation of data compression that provides an
information processing capability that can be integrated as part of disk,
tape back-up and communication systems.
According to one aspect of the present invention, a data compression
processor and corresponding method provides the capability of parallel
symbol processing. The present invention processes multiple bytes of data
simultaneously by computing all of the dictionary addresses for possible
matching substrings. The post processing phase of the data compression
system provides the parsing of the input data stream and the output
indices in parallel.
According to another aspect of the present invention, a parallel data
decompression processor and method provides parallel output symbols
without a stacking requirement. This eliminates numerous RAM cycles in
data decoding and significantly improves the decompression throughput. The
decompressor accepts one index at a time and traces multiple dictionary
trees from the top down, producing multiple bytes at each cycle in the
right order.
Other aspects of the present invention include the following:
New compression data structure and new hashing function (a function of
character and address) to make parallel symbol processing possible with no
extra memory requirement.
Guaranteed successful hashing function due to the uniqueness of the data
structure. The significant impacts are faster compression processing rates
and better compression ratios.
New decompression data structure links the bytes in the right order instead
of from the leaves to the root. This data structure utilizes common
substrings in the dictionary words to reduce memory requirements.
Block-oriented compression algorithms are made possible due to the use of
special codewords so that successive blocks are compressed and
decompressed independently without sacrificing the compression ratios or
resetting the dictionary.
Provide means to indicate the dictionary is full, hence future compression
is merely a table look-up approach. This feature gives random block
compression capability. This is especially important for database
applications.
Provide guidelines and means to reset the dictionary if the compression
ratio is poor.
Provide an inherent means for error detection due to the channel error for
error protection purposes.
Provide a free encryption capability which is a function of the user's key
via either an enlarged initial seed dictionary as a function of the user's
key and/or randomized code signal generation based on the user's key.
Capability to perform block reread and retransmit by only having to save
the index counter value between subsequent blocks. Only the block oriented
approach makes this possible.
Capability to handle multi-tasking environment by keeping pointers to
different dictionary locations either in memory or on the disk/tape. Each
block will point to the location of the beginning of its dictionary and a
table look-up scheme will determine whether or not the dictionary is
already in memory.
Capability to selectively compress blocks of a given file and leave other
blocks uncompressed to guarantee that the compression ratio will always be
better than or equal to 1.
Capability to handle multiple file processing environment. For random
compression/decompression, store the starting address of each dictionary
and keep track of which dictionary is currently used.
The present invention provides solutions for the integration of data
compression into system controllers and applications (disk applications,
multi-tasking environment, communications, tape back-ups and CDROM.
The present invention provides a dramatic improvement over approaches such
as taught by Lempel and Ziv. The end result is a significantly superior
compression algorithm to provide a faster (higher processing throughput),
better (better compression ratio) and cheaper (a single VLSI chip
solution) compression solution. Due to this new invention, it is feasible
to implement a low-cost, high-speed, multi-application versatile data
compression apparatus for disks, communications, tape back-ups, CDROM and
printer applications.
Other objects, advantages and features of the present invention will become
apparent from the following detailed description when taken in conjunction
with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings which are incorporated in and form a part of this
specification illustrate an embodiment of the invention and, together with
the description, serve to explain the principles of the invention.
FIG. 1 depicts a block diagram of a parallel data compression system which
can be utilized with the present invention.
FIG. 2 depicts a generalized flow chart illustrating an improved data
compression technique which can be utilized with the present invention.
FIG. 3 depicts a block diagram of a parallel data decompression technique
which can be utilized with the present invention.
FIG. 4 depicts a general parallel data decompression flow chart.
FIG. 5 depicts a multi-file operation block diagram.
FIG. 6 depicts a block diagram of a system interface application including
a data compression/decompression processor according to the present
invention.
FIGS. 7A and 7B depict a more detailed diagram of the improved
compression/decompression processor according to the present invention.
FIG. 8 depicts a pinout configuration of a single chip according to the
present invention.
FIG. 9 depicts a diagram of a disk device driver architecture.
FIG. 10 depicts a diagram of a file system mapping according to the present
invention.
FIG. 11 depictes logical and physical allocation according to the present
invention.
FIGS. 12A and 12B depict file allocation strategies without compression and
with compression, respectively.
FIG. 13A depicts cluster random access.
FIG. 13B depicts a diagram of random access according to the present
invention.
FIGS. 14 and 15 depict diagrams of software architecture according to the
present invention.
DETAILED DESCRIPTION OF THE DRAWINGS
Reference will now be made in detail to the preferred embodiment of the
invention, an example of which is illustrated in the accompanying
drawings. While the invention will be described in conjunction with the
preferred embodiment, it will be understood that it is not intended to
limit the invention to that embodiment. On the contrary, it is intended to
cover alternatives, modifications and equivalents as may be included with
in the spirit and scope of the invention as defined by the appended
claims.
Before going into detail with respect to a data compressor/decompressor
processor, reference will first be made to an improved
compression/decompression system and method. The details of such an
improved compression/decompression system and method are set forth in the
above identified cross-referenced and co-pending application, the details
of which are hereby incorporated by reference.
Referring now to FIG. 1, a parallel compression system 10 which could be
utilized with the present invention is depicted as a block diagram. In
FIG. 1, input symbols 12 are input into N-input symbol registers 14.
Typically, there are N symbols being input into register 14 at the same
time. Common examples are 2 or 4 bytes of data. The input symbol register
14 feeds the input symbols to several places. One input is to the
dictionary address generator 16 that generates all the addresses to N
dictionary tables 20. Dictionary address generator 16 receives other
inputs, as will be discussed.
An address is generated for each of these tables 20. The dictionary tables
are indexed by levels 1, 2 up to level number 0, modulo N. So there are N
of these tables and each one has its own address register 18. So there is
address register #1 for levels number 1, modulo | | |