WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Data compression/decompression processor    
United States Patent5410671   
Link to this pagehttp://www.wikipatents.com/5410671.html
Inventor(s)Elgamal; Taher A. (Palo Alto, CA); Claxton; Daniel D. (Sausalito, CA); Honea; Robert F. (Santa Clara, CA)
AbstractA data compression/decompression processor (a single-chip VLSI data compression/decompression engine) for use in applications including but not limited to data storage and communications. The processor is highly versatile such that it can be used on a host bus or housed in host adapters, so that all devices such as magnetic disks, tape drives, optical drives and the like connected to it can have substantial expanded capacity and/or higher data transfer rate. The processor employs an advanced adaptive data compression algorithm with string-matching and link-list techniques so that it is completely adaptive, and a dictionary is constructed on the fly. No prior knowledge of the statistics of the characters in the data is needed. During decompression, the dictionary is reconstructed at the same time as the decoding occurs. The compression converges very quickly and the compression ratio approaches the theoretical limit. The processor is also insensitive to error propagation.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5410671
Data compression/decompression processor - US Patent 5410671 Drawing
Data compression/decompression processor
Inventor     Elgamal; Taher A. (Palo Alto, CA); Claxton; Daniel D. (Sausalito, CA); Honea; Robert F. (Santa Clara, CA)
Owner/Assignee     Cyrix Corporation (Richardson, TX)
Patent assignment
All assignments
Publication Date     April 25, 1995
Application Number     08/066,317
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     May 21, 1993
US Classification     711/202
Int'l Classification     G06F 012/02
Examiner     Harvey; Jack B.
Assistant Examiner     Tall; Michael H.
Attorney/Law Firm     Viger; Andrew S. Maxin; John L. ,
Address
Parent Case     This is a continuation of application Ser. No. 07/951,033 filed Sep. 24, 1992, now abandoned which is a continuation of application Ser. No. 07/568,363 filed Aug. 16, 1990, now abandoned, which is a continuation in part of Ser. No. 07/517,630 filed May 1, 1990, now abandoned.
Priority Data    
USPTO Field of Search     395/425 395/400
Patent Tags     data compression/decompression processor
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5077654
Ohtsuki

Dec,1991

[0 after 0 votes]
4989137
Oxley
711/203
Jan,1991

[0 after 0 votes]
4942541
Hoel
358/1.16
Jul,1990

[0 after 0 votes]
4929946
O'Brien
341/87
May,1990

[0 after 0 votes]
4906991
Fiala
341/51
Mar,1990

[0 after 0 votes]
4899147
Schiavo
341/60
Feb,1990

[0 after 0 votes]
4881075
Weng
341/87
Nov,1989

[0 after 0 votes]
4876541
Storer
341/51
Oct,1989

[0 after 0 votes]
4870415
Van Maren
341/94
Sep,1989

[0 after 0 votes]
4847619
Kato
341/106
Jul,1989

[0 after 0 votes]
4843389
Lisle
341/106
Jun,1989

[0 after 0 votes]
4814746
Miller
341/95
Mar,1989

[0 after 0 votes]
4800489
Moyer
711/206
Jan,1989

[0 after 0 votes]
4701745
Waterworth
341/63
Oct,1987

[0 after 0 votes]
4680700
Hester
711/206
Jul,1987

[0 after 0 votes]
4667325
Kitano
714/743
May,1987

[0 after 0 votes]
4654777
Nakamura
711/206
Mar,1987

[0 after 0 votes]
4558302
Welch
341/51
Dec,1985

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed is:

1. A method of accessing files of compressed data on a mass storage device comprising the steps of:

allocating one or more logical data blocks to each file using a first allocation table;

allocating one or more physical data blocks for storing file data using a second file allocation table; and

translating between said first file allocation table and said second file allocation table in response to a request to access the storage device.

2. The method of claim 1 wherein said translating step comprises the step of translating between said first file allocation table and said second file allocation table using a mapping table.

3. The method of claim 2 wherein said first file allocation table comprises a plurality of entries, each entry associated with a respective logical data block.

4. The method of claim 3 wherein said step of allocating one or more logical data blocks comprises the step of associating each file with one of said entries corresponding a first logical data block associated with that file and creating a linked list of entries in said first allocation file indicating the logical data blocks associated with the file.

5. The method of claim 4 wherein each entry of the first allocation table is associated with a respective entry in the mapping table which indicates a physical data block allocated by said second allocation table.

6. A system of storing and retrieving compressed data comprising:

a first file allocation table for allocating one or more logical data blocks to each of a plurality of files;

a second allocation file for allocating physical data blocks for storing file data; and

a mapping table for translating between said first and second file allocation tables in response to a request to access the storage device.

7. The system of claim 6 and further comprising a processor for compressing the file data in response to a write operation to the storage device and decompressing the data responsive to a read operation from the storage device.

8. The system of claim 7 wherein said first file allocation table comprises a plurality of entries, each entry associated with a respective logical data block.

9. The system of claim 8 wherein said mapping table comprises a plurality of entries corresponding to respective of said first file allocation table entries, each entry of said mapping table indicating a physical data block allocated by said second allocation table.

10. The system of claim 6 wherein said first allocation table maintains a linked list of logical data blocks for each file.

11. A method of accessing files of compressed data on a mass storage device comprising the steps of:

maintaining a first file allocation table which indicates one or more logical data blocks;

maintaining a second file allocation table which indicates one or more physical data blocks on the mass storage device where the compressed data associated with each file is stored;

mapping between the first file allocation table and the second file allocation table in response to a request to access the mass storage device.

12. The method of claim 11 wherein each block of said second file allocation table comprises a plurality of allocation units, and said mapping step comprises the step of associating each block of said first file allocation table with one or more of said allocation units.

13. The method of claim 11 wherein said mapping step comprises the step of associating each block of said first file allocation table with a block of said second file allocation table and an offset into said block of said second file allocation table.

14. The method of claim 11 and further comprising the step of allocating a number of logical data blocks in excess of the number of physical data blocks to said first allocation table.

15. The method of claim 11 wherein said mapping step comprises the step of mapping a plurality of logical data blocks into a single physical data block.

16. The method of claim 11 wherein said mapping step comprises the step of mapping each logical data block to a physical block and an offset into the physical block.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

The present invention relates to a data compression and decompression processor and corresponding methods therefor.

Data compression/decompression techniques are well known in the prior art. Examples of prior art approaches are disclosed in U.S. Pat. No. 4,464,650 (the '650 patent), issued in August 1984 to Eastman, and U.S. Pat. No. 4,558,302 (the '302 patent), issued in December 1985 to Welch.

The above patents present implementations of a data compression algorithm described by Lempel and Ziv in "Compression of Individual Sequences Via Variable-Rate Coding," IEEE Transactions on Information Theory, IT-24-5, September 1977, pp. 530-537, and "A Universal Algorithm for Sequential Data Compression," IEEE Transactions on Information Theory, IT-23-3, May 1977, pp. 337-343. These two papers presented the original ideas and principles of compression algorithm. However, no implementation method was given.

Data compression algorithms are known as a discipline which design the shortest representation of the original data string. Reversible data compression implies that a given data string being compressed and then decompressed should be identical to the original data string, and not a single bit of information is allowed to be distorted or lost. This reversibility is identical to the concept of noiseless data compression in the context of information theory. Hence, noiseless data compression is quite different from speech compression and video compression where distortions are allowed. For typical applications of data compression for data being stored or transmitted, noiseless data compression is the only kind of compression which can be used.

The benefits of data compression are tremendous. The benefits of data compression technology to communications have been the increase in bandwidth available to the users and a savings of the cost of the communication device. For example, to transmit 9600 bps digital information over existing telephone lines requires conditioned lines and 9600 bps high-speed modems. If one applied advanced data compression technology with 4:1 compression ratio, one may use 2400 bps modems and unconditioned lines for such transmissions. In this example, the savings is not only the difference in modem cost between 9600 bps modems and 2400 bps modems, also the monthly charges for the conditioned lines. As a further example, a typical branch of a bank requires four leased lines for operation (one for the ATM machine, one for news broadcast, and two for data processing). If a multiplexer has a built-in compressor with 4 to 1 compression ratio, then only one line will be needed to communicate with the outside world, and a savings of monthly charges of three lines can be achieved.

As to back-up systems, data compression technology can greatly reduce the back-up time, increase the capacity and save the back-up device cost. For example, for the high-end 1/2" reel to reel tape back-up systems, typically four tapes are required to be mounted and dismounted from the tape drive system. It normally takes 30 minutes to back up one tape. Due to this sequential process, it becomes a very-time-consuming process to back up computer data.

If a back-up system has a built-in compression capability and compression ratio of 4 to 1, then one may use one tape instead of four tapes for the back-up purpose, and the back-up process becomes an unattended process. The other major benefit of data compression technology for back-up is that the person performing the back-up can simply mount the tape and leave the system unattended. By the same token, data compression technology can increase the capacity of the disk drive, reduce the data access time, and save the disk cost. Again, the savings are especially significant for high-end disk drive systems where multiple heads and multiple platters are required. With the help of data compression technology, only single head and single platter can be used to achieve the same capacity requirement. Due to fewer parts, this approach also improves the reliability of the system. However, these benefits will be realized only if the data compression technology meets each of the following requirements:

1. Its encoding process should be string-oriented rather than character-oriented in order to achieve good compression ratios.

2. The algorithm must be adaptive to the data source so that good compression ratios can be achieved for all data files. This capability is also called "universal" data compression.

3. Its data processing rate must be at least as high as the system data transfer rate so that system throughput will not be degraded.

4. The cost of data compression hardware implementation must be low. Preferably, the implementation of the data compression algorithm on a single VLSI chip is feasible.

5. The integration of the data compression technology into the system must be simple, and the data compression technology cannot cause any system problems.

6. The compression capability must be totally transparent to the users.

Currently, the most popular noiseless data compression method being used in the industry is the traditional Huffman coding or its modified versions. Huffman coding is a two-pass coding method. The first pass performs statistics collection of the data source; the second pass constructs a codetable based on the statistics collected. The underlying principle is to replace the more frequent characters by shorter codewords and less frequent characters by longer codewords. This coding method normally provides poor compression ratios because it is character encoding rather than string encoding.

To elaborate this point, consider encoding the more frequent words by shorter codewords and less frequent words by longer words. It is intuitive that the compression ratios to be achieved will be even better than compression ratios achieved by character encoding. For example, it is known that as the English word "the" is the most frequent word, so that it may be encoded by the shortest codeword. This is what is normally called encoding on the extended information source or multiple symbol encoding.

In general, the complexity and memory requirement of Huffman coding based on extended symbols (multiple symbols) source grow exponentially. For example, if the symbol size in the Huffman coding system is a byte, then the codetable requires 256 entries for character encoding. If the symbol size in the Huffman coding system is two bytes, then the codetable requires 65,536 entries for two byte encoding. For the particular example mentioned earlier, three-character encoding for "the" requires 16,777,216 possible entries for the codetable. Hence, practical implementation of Huffman coding has always been limited to character encoding only.

Typically, Huffman coding requires the same codetable to be shared or transmitted between the sender's and receiver's ends for successful compression and decompression. In practice, several classes of codetables are preassigned for data compression and decompression purposes. The compressor/decompressor switches to different codetables according to the closest match between the data source and the codetable. The resulting compression ratios are further deteriorated due to the mismatch between the source statistics and the codetable being used. Although it is possible to construct an adaptive version of Huffman coding, the implementation is cumbersome and the improvement on compression ratios is limited.

Since Huffman coding is a fixed to variable coding technique, the decompressor parses the compressed strings on a bit-by-bit basis until a legal codeword has been recognized. Because the length of the legal codeword is not known beforehand, the decompression process could be slow. The principle of Huffman coding is described in the article of D. A. Huffman entitled "A Method for the Construction of Minimum Redundancy Codes," Proceedings of IRE 40, pp. 1098-1100 (September 1952). In summary, Huffman coding suffers from several limitations:

1. It is a two-pass algorithm. The first pass involves the statistics collection needed to construct the codetable to be used by the compressor and decompressor. For many real life applications, this is unacceptable.

2. The compression ratio is less than optimal, since the complexity of multiple symbol encoding or string encoding is exponential, and practical implementation of Huffman coding has been limited to character encoding only.

3. The compression ratio is poor. Frequently, the codetable is mismatched to the information source. On the other hand, the implementation of adaptive Huffman coding is cumbersome and complex, and the convergence of the adaptive statistic collection is slow.

4. Since Huffman encoding is a fixed to variable length encoding, the compression algorithm itself is sensitive to error propagation and the decoding process could be slow.

5. The same codetable needs to be stored or transmitted between the compressor and the decompressor. This represents additional overheads for the compression purpose.

The first major effort to alleviate these limitations was the disclosure of the Lempel-Ziv algorithm identified above. Basic principles of Lempel-Ziv algorithm were described in the two papers cited above. The basic principles of Lempel-Ziv algorithm work as follows:

The compressor first prepares for a dictionary where variable length dictionary words will be stored. Initially, the dictionary could be empty or partially filled. Then the incoming string will be parsed and examined on a character by character basis, so that the longest match between the input string and the words in the dictionary will be identified. This is also called "greedy" parsing. As soon as the greedy parsing is completed, the extended substring of the longest match by one character will be entered into the dictionary as a new word, and the index of the word in the dictionary with a match to the longest substring of the incoming string will be output as a compressed word.

The decompressor will also prepare for a dictionary with the same principle to build and fill the dictionary as the compressor does. By examining the received indexes, the decompressor reconstructs the dictionary and performs the decoding process without physically storing or transmitting the actual contents of the dictionary from the compressor to the decompressor.

The preeminent features of the Lempel-Ziv algorithm are its ability to adapt to any type of incoming data source, and it extracts the longer and more frequent substrings from the data source as words stored in the dictionary with virtually no limitation on the length of the word as long as the local memory can accommodate them.

Indirectly, the Lempel-Ziv algorithm is performing dictionary construction on the fly and carrying out data compression in the sense of extended source encoding or string encoding. For example, if the word "international" is stored in the dictionary being matched to the incoming string and the size of the dictionary is 4,096, then only 12 bits are required to represent 13 byte word "international." Hence, a tremendous amount of compression has been achieved.

The Lempel-Ziv algorithm provides distinct advantages over Huffman coding for the following reasons:

1. It is a single-pass encoding technique and statistics collection is not needed.

2. It is string-oriented rather than character-oriented, so it achieves good compression ratios.

3. It is adaptive to the data source so that it becomes a good general purpose data compression algorithm.

4. The decompressor is able to locally generate the replica of the dictionary used by the compressor, based on the compressed strings, so that it is not needed to share the dictionary between the compressors and the decompressors.

The '650 and '302 patents identified above provide two different implementation methodologies for the Lempel-Ziv algorithm. In the implementation of Lempel-Ziv algorithm, two key components are the data structure used to store the input characters/ substrings for the purpose of dictionary construction and hashing function used to perform the string-matching algorithm. In the '650 patent, the data structure is based on tree-structures, and data is processed two bits (or a fraction of a byte) at a time for a reasonable memory requirement, but the processing speed is slow. In the '302 patent, the data structure is based on a large hash array structure, and data is processed one byte at a time, and the processing speed has been somewhat improved. Both patents cover only the implementation of the compression/decompression apparatus but not applications.

The '650 patent made the first attempt to make the Lempel-Ziv algorithm practical and realizable in implementation. However, since the data structure is based on Tree structure, the compression apparatus only processes a fraction of a byte at a time, and the compression process requires numerous RAM cycles. Furthermore, the algorithm itself utilizes complex mathematics formulas and arithmetic operations such as multiply and divide.

The '302 patent made further progress on the practical implementation of the Lempel-Ziv algorithm. In essence, the compression processing is done one character (one byte) at a time. More importantly, the storing strings of input data character signals comprise prefix code signals and the extended character only. The storing strings in the dictionary are compared to the input data stream to determine the longest match.

It will be helpful to re-examine the six requirements for data compression technology to be useful for system applications. The Lempel-Ziv algorithm and the '650 and '302 patents only meet the first two requirements.

Notice that the compression throughput requirement varies according to applications. For example, high-speed modem applications require no more than 20 kilobits/sec. On the other hand, for today's typical peripheral applications, the Small Computer System Interface (SCSI) II requires data transfer rate up to ten megabytes/sec for the synchronous operation. Currently, the SCSI data transfer has become a common standard for peripheral applications.

As an example, assume that the nominal clock rate is 40 megahertz. This implies that the compressor and decompressor based on prior arts will have to process data at one character (byte) per four clock cycles. This also assumes extremely high-speed, expensive memories are being used in order to keep up with the fast clock rate. Hence, there will be very stringent compression throughput requirements for real-life peripheral applications. Besides, current bus structures being used in most computer systems are either 16-bit or 32-bit wide. This implies that more than a byte of information will be available at any given cycle time. Therefore, parallel data compression processing will be needed to meet such requirements in order to achieve high-speed throughput rate memory implementation cost to be manageable.

The other possibility to achieve high speed implementation based on prior arts is to de-interleave the input data string and use multiple 8-bit oriented compressors for compression processing. However, this de-interleaving process destroys the input data correlation structure, and hence degrades the compressor ratios significantly.

The application of compression apparatus based on the Lempel-Ziv algorithm and the '650 and '302 patents are limited in terms of their applications due to the following reasons:

1. They are single byte or a fraction of a byte processing. In the string parsing process, the hashing is done based on the next extended character and the content of the hash address. Hence, it is impossible to perform multi-character processing due to the sequential nature of the hashing process, and it is impossible to perform parallel processing based on these existing approaches.

2. They are sequential file compression. In other words, data blocks are chained together sequentially due to the use and construction of the dictionary. For example in order to read the 50th block of a 1 megabyte file, one needs to decompress the file from the first character until the 50th block is reached. The sequential file decompression implies that these techniques lack the random block access capability. This is absolutely unacceptable for disk random access environment.

3. The N-time hashing technique used in the string-matching process is non-guaranteed. Subsequently the compression ratio suffers and becomes poorer.

4. The dictionary has to be reset at the beginning of a given file. For block-oriented applications, such as synchronous communications, packet switching and disk environment, if a given block is treated as a file, then the dictionary is always reset at the beginning of every block, and the compression ratio suffers greatly. Since the dump dictionary is also partially filled and only shorter words are being stored in the dictionary. It is quite clear that adaptation and efficiency of compression for short blocks are incompatible; the learning cycle necessarily entails the initial construction in the dictionary of short strings peculiar to the incoming data, during which time even shorter strings are transmitted. It is not until long strings have been constructed that the system achieves high compression efficiency.

5. During the decompression processing, data is read in the reverse order from a stack. This also implies that numerous RAM cycles are needed just to decompress data.

In view of the foregoing, it would be very desirable to provide an improved data compression/decompression processor and corresponding methods which overcome the deficiencies and limitations of prior art approaches.

SUMMARY OF THE INVENTION

It is one object of the present invention to provide an improved data compression and decompression processor and method.

A data compression technique according to the present invention is able to process multiple characters (parallel processing) concurrently with no extra memory requirement and no degradation in compression ratio. The novel implementation of the present invention provides a low cost, high speed reversible implementation of data compression that provides an information processing capability that can be integrated as part of disk, tape back-up and communication systems.

According to one aspect of the present invention, a data compression processor and corresponding method provides the capability of parallel symbol processing. The present invention processes multiple bytes of data simultaneously by computing all of the dictionary addresses for possible matching substrings. The post processing phase of the data compression system provides the parsing of the input data stream and the output indices in parallel.

According to another aspect of the present invention, a parallel data decompression processor and method provides parallel output symbols without a stacking requirement. This eliminates numerous RAM cycles in data decoding and significantly improves the decompression throughput. The decompressor accepts one index at a time and traces multiple dictionary trees from the top down, producing multiple bytes at each cycle in the right order.

Other aspects of the present invention include the following:

New compression data structure and new hashing function (a function of character and address) to make parallel symbol processing possible with no extra memory requirement.

Guaranteed successful hashing function due to the uniqueness of the data structure. The significant impacts are faster compression processing rates and better compression ratios.

New decompression data structure links the bytes in the right order instead of from the leaves to the root. This data structure utilizes common substrings in the dictionary words to reduce memory requirements.

Block-oriented compression algorithms are made possible due to the use of special codewords so that successive blocks are compressed and decompressed independently without sacrificing the compression ratios or resetting the dictionary.

Provide means to indicate the dictionary is full, hence future compression is merely a table look-up approach. This feature gives random block compression capability. This is especially important for database applications.

Provide guidelines and means to reset the dictionary if the compression ratio is poor.

Provide an inherent means for error detection due to the channel error for error protection purposes.

Provide a free encryption capability which is a function of the user's key via either an enlarged initial seed dictionary as a function of the user's key and/or randomized code signal generation based on the user's key.

Capability to perform block reread and retransmit by only having to save the index counter value between subsequent blocks. Only the block oriented approach makes this possible.

Capability to handle multi-tasking environment by keeping pointers to different dictionary locations either in memory or on the disk/tape. Each block will point to the location of the beginning of its dictionary and a table look-up scheme will determine whether or not the dictionary is already in memory.

Capability to selectively compress blocks of a given file and leave other blocks uncompressed to guarantee that the compression ratio will always be better than or equal to 1.

Capability to handle multiple file processing environment. For random compression/decompression, store the starting address of each dictionary and keep track of which dictionary is currently used.

The present invention provides solutions for the integration of data compression into system controllers and applications (disk applications, multi-tasking environment, communications, tape back-ups and CDROM.

The present invention provides a dramatic improvement over approaches such as taught by Lempel and Ziv. The end result is a significantly superior compression algorithm to provide a faster (higher processing throughput), better (better compression ratio) and cheaper (a single VLSI chip solution) compression solution. Due to this new invention, it is feasible to implement a low-cost, high-speed, multi-application versatile data compression apparatus for disks, communications, tape back-ups, CDROM and printer applications.

Other objects, advantages and features of the present invention will become apparent from the following detailed description when taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings which are incorporated in and form a part of this specification illustrate an embodiment of the invention and, together with the description, serve to explain the principles of the invention.

FIG. 1 depicts a block diagram of a parallel data compression system which can be utilized with the present invention.

FIG. 2 depicts a generalized flow chart illustrating an improved data compression technique which can be utilized with the present invention.

FIG. 3 depicts a block diagram of a parallel data decompression technique which can be utilized with the present invention.

FIG. 4 depicts a general parallel data decompression flow chart.

FIG. 5 depicts a multi-file operation block diagram.

FIG. 6 depicts a block diagram of a system interface application including a data compression/decompression processor according to the present invention.

FIGS. 7A and 7B depict a more detailed diagram of the improved compression/decompression processor according to the present invention.

FIG. 8 depicts a pinout configuration of a single chip according to the present invention.

FIG. 9 depicts a diagram of a disk device driver architecture.

FIG. 10 depicts a diagram of a file system mapping according to the present invention.

FIG. 11 depictes logical and physical allocation according to the present invention.

FIGS. 12A and 12B depict file allocation strategies without compression and with compression, respectively.

FIG. 13A depicts cluster random access.

FIG. 13B depicts a diagram of random access according to the present invention.

FIGS. 14 and 15 depict diagrams of software architecture according to the present invention.

DETAILED DESCRIPTION OF THE DRAWINGS

Reference will now be made in detail to the preferred embodiment of the invention, an example of which is illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiment, it will be understood that it is not intended to limit the invention to that embodiment. On the contrary, it is intended to cover alternatives, modifications and equivalents as may be included with in the spirit and scope of the invention as defined by the appended claims.

Before going into detail with respect to a data compressor/decompressor processor, reference will first be made to an improved compression/decompression system and method. The details of such an improved compression/decompression system and method are set forth in the above identified cross-referenced and co-pending application, the details of which are hereby incorporated by reference.

Referring now to FIG. 1, a parallel compression system 10 which could be utilized with the present invention is depicted as a block diagram. In FIG. 1, input symbols 12 are input into N-input symbol registers 14. Typically, there are N symbols being input into register 14 at the same time. Common examples are 2 or 4 bytes of data. The input symbol register 14 feeds the input symbols to several places. One input is to the dictionary address generator 16 that generates all the addresses to N dictionary tables 20. Dictionary address generator 16 receives other inputs, as will be discussed.

An address is generated for each of these tables 20. The dictionary tables are indexed by levels 1, 2 up to level number 0, modulo N. So there are N of these tables and each one has its own address register 18. So there is address register #1 for levels number 1, modulo