WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Data compression apparatus and method    
United States Patent5016009   
Link to this pagehttp://www.wikipatents.com/5016009.html
Inventor(s)Whiting; Douglas L. (South Pasadena, CA); George; Glen A. (Pasadena, CA); Ivey; Glen E. (Pasadena, CA)
AbstractAn apparatus and method for converting an input data character stream into a variable length encoded data stream in a data compression system. The data compression system includes a history array means. The history array means has a plurality of entries and each entry of the history array means is for storing a portion of the input data stream. The method for converting the input data character stream includes the following steps. Performing a search in a history array means for the longest data string which matches the input data string. If the matching data string is found within the history buffer means, the next step includes encoding the longest matching data string found by appending to the encoded data stream a tag indicating the longest matching data string was found and a string substitution code. If the matching data string is not found within the history array means, the next step includes encoding the first character of the input data string by appending to the encoded data stream a raw data tag indicating that no matching data string was found and the first character of the input data string.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5016009
Data compression apparatus and method - US Patent 5016009 Drawing
Data compression apparatus and method
Inventor     Whiting; Douglas L. (South Pasadena, CA); George; Glen A. (Pasadena, CA); Ivey; Glen E. (Pasadena, CA)
Owner/Assignee     Stac, Inc. (Pasadena, CA)
Patent assignment
All assignments
Publication Date     May 14, 1991
Application Number     07/297,152
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     January 13, 1989
US Classification     341/67 341/95 341/106 370/506 375/363
Int'l Classification     H03M 007/40 H03L 007/00
Examiner     Pellinen; A. D.
Assistant Examiner     Logan; Sharon D.
Attorney/Law Firm     Irell & Manella
Address
Parent Case    
Priority Data    
USPTO Field of Search     375/27 375/112 358/261.1 364/715.02 341/51 341/67 341/106 341/95 370/102
Patent Tags     data compression
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
4876541
Storer
341/51
Oct,1989

[0 after 0 votes]
4814746
Miller
341/95
Mar,1989

[0 after 0 votes]
4701745
Waterworth
341/63
Oct,1987

[0 after 0 votes]
4612532
Bacon
341/51
Sep,1986

[0 after 0 votes]
4558302
Welch
341/51
Dec,1985

[0 after 0 votes]
4491934
Heinz
341/55
Jan,1985

[0 after 0 votes]
4464650
Eastman
341/51
Aug,1984

[0 after 0 votes]
4412306
Moll
708/203
Oct,1983

[0 after 0 votes]
4054951
Jackson
714/48
Oct,1977

[0 after 0 votes]
4021782
Hoerning
341/51
May,1977

[0 after 0 votes]
3976844
Betz
375/240
Aug,1976

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


We claim:

1. A method for converting an input data character stream into a variable length encoded data stream in a data compression system, said data compression system comprising a history array means, said history array means having a plurality of entries, said entries of said history array means for storing said input data stream, a history array pointer, said history array pointer indicating a latest entry in said history array means, a hash table means having a plurality of entries, each entry of said hash table means for storing a pointer indicating one of said entries of said history array means, and an offset array means, said offset array means having a plurality of entries, each entry of said offset array means providing a link, if any, from one of said entries in said history array means to one or more other entries of said history array means, said method comprising the steps of:

performing a search in said history array means for a longest matching data string which matches said input data stream, said step of performing said search including the steps of:

performing a hashing function on said input data stream, said hashing function providing a pointer indicating one of said entries of said hash table means,

obtaining said pointer stored at said hash table entry pointed to by said hashing function,

calculating a difference between said history array pointer and said pointer obtained from said hash table means,

storing said difference into said offset array means entry pointed to by said history array pointer, and

storing said history array pointer into said hash table entry pointed to by said hashing function,

operative when said matching data string is found within said history array means, encoding said matching data string found in said history array means by assigning a tag indicating that said matching data string was found and a string substitution code including a variable length indicator of the length of said matching data string and a pointer indicating the location within said history array means of said matching data string,

operative when said matching data string is not found within said history array means, encoding a first character of said input data stream by assigning a "raw" data tag indicating that no matching data string was found in said history array means and said first character of said input data stream,

whereby said input data stream is converted into a variable length encoded data stream.

2. The method of claim 1 wherein said step of performing said search in said history array means for said longest matching data string further includes the step of limiting said search to a predefined number of inquiries in said history array means for said longest matching data string.

3. The method of claim 2 further including the steps of:

discarding input data characters of said input data stream which have been encoded and performing one or more steps of said method for further encoding said input data stream.

4. The method of claim 1 or 3 wherein said step of appending said pointer and said length indicator further includes the step of representing said pointer and said length indicator by an encoding scheme according to a predetermined strategy that ensures that a matching string of two characters of said input data stream is compressed to less than said two characters of said input data stream.

5. The method of claim 1 further including the step of incrementing said history array pointer to the next entry in said history array means and performing one or more elements of said method for hashing said next entry of said history array means.

6. The method of claim 1 wherein said step of performing a search further includes the step of periodically examining the pointer stored at each said hash table entry for determining whether the difference between said pointer of each said entry and said history array pointer is greater than a predetermined amount.

7. The method of claim 8 wherein said step further includes the step of:

operative when said difference is greater than said predetermined amount, replacing said entry in said hash table by an invalid value, thereby reinitializing said entry.

8. The method of claim 1 further including the step of:

completing said method of encoding by appending to said encoded data an end of compressed data marker which is detected during decompression of said variable length encoded data stream.

9. The method of claim 1 further including the step of:

initializing said hash table by replacing all entries of said hash table by invalid values.

10. The method of claim 1 further including the step of decoding said variable length encoded data stream, said step including the steps of:

parsing said variable length encoded data stream into separate portions, each said separate portion starting with one of said tags,

evaluating said tag of each said separate portion for determining whether said tag is said raw data tag or said tag indicating said encoded matching data string.

11. The method of claim 10 further including the step:

operative when said tag indicates said encoded matching data string, interpreting said length indicator and said pointer of said substitution code for generating said matching data string, thereby reconstructing a portion of the original input data stream.

12. The method of claim 10 further including the step:

operative when said tag indicates said raw data tag, obtaining said first character of said encoded input data stream, thereby reconstructing a portion of the original input data stream.

13. An apparatus for converting an input data character stream into a variable length encoded data stream in a data compression system, said data compression system comprising a history array means, said history array means having a plurality of entries, said entries of said history array means for storing said input data stream, a history array pointer, said history array pointer indicating a latest entry in said history array means, a hash table means having a plurality of entries, each entry of said hash table means for storing a pointer indicating one of said entries of said history array means, and an offset array means, said offset array means having a plurality of entries, each entry of said offset array means providing a link, if any, from one of said entries in said history array means to one or more other entries of said history array means, said apparatus comprising:

means for performing a search in said history array means for a longest matching data string which matches said input data stream, said means for performing said search including:

means for performing a hashing function on said input data stream, said hashing function providing a pointer indicating one of said entries of said hash table means,

means for obtaining said pointer stored at said hash table entry pointed to by said hashing function,

means for calculating a difference between said history array pointer and said pointer obtained from said hash table means,

means for storing said difference into said offset array means entry pointed to by said history array pointer, and

means for storing said history array pointer into said hash table entry pointed to by said hashing function,

means for encoding said matching data string found in said history array means by assigning a tag indicating that said matching data string was found and a string substitution code including a variable length indicator of the length of said matching data string and a pointer indicating the location within said history array means of said matching data string,

means for encoding a first character of said input data stream by assigning a "raw" data tag indicating that no matching data string was found is said history array means and said first character of said input data stream,

whereby said input data stream is converted into a variable length encoded data stream.

14. The apparatus of claim 13 wherein said means for performing said search in said history array means for said longest matching data string further includes means for limiting said search to a predefined number of inquiries in said history array means for said longest matching data string.

15. The apparatus of claim 14 further including:

means for discarding input data characters of said input data stream which have been encoded.

16. The apparatus of claim 13 or 15 wherein said means for appending said pointer and said length indicator further includes means for representing said pointer and said length indicator by an encoding scheme according to a predetermined strategy that ensures that a matching string of two characters of said input data stream is compressed to less than said two characters of said input data stream.

17. The apparatus of claim 13 further including means for incrementing said history array pointer to the next entry in said history array means.

18. The apparatus of claim 13 wherein said means for performing a search further includes means for periodically examining the pointer stored at each said hash table entry for determining whether the difference between said pointer of each said entry and said history array pointer is greater than a predetermined amount.

19. The apparatus of claim 18 wherein said means further includes:

operative when said difference is greater than said predetermined amount, means for replacing said entry in said hash table by an invalid value, thereby reinitializing said entry.

20. The apparatus of claim 13 further including:

means for appending to said encoded data an end of compressed data marker which is detected during the decompression of said variable length encoded data stream.

21. The apparatus of claim 13 further including:

means for initializing said hash table by replacing all entries of said hash table by invalid values.

22. The apparatus of claim 13 further including means for decoding said variable length encoded data stream, said means including:

means for parsing said variable length encoded data stream into separate portions, each said separate portion starting with one of said tags,

means for evaluating said tag of said separate portion for determining whether said tag is said raw data tag or said tag indicating said encoded matching data string.

23. The apparatus of claim 22 further including:

operative when said tag indicates said encoded matching data string, means for interpreting said length indicator and said pointer of said substitution code for generating said matching data string, thereby reconstructing a portion of the original input data stream.

24. The apparatus of claim 22 further including:

operative when said tag indicates said raw data tag, means for obtaining said first character of said encoded input data stream thereby reconstructing a portion of the original input data stream.

25. The apparatus of claim 24 further including a RAM, said RAM including said history array means, said history table means and said offset array means, and said encoded or decoded data streams temporarily stored in one or more FIFO buffers, and portions of said FIFO buffers also maintained by said RAM, thereby enhancing data transfer performance.

26. A method for converting an input data character stream into a variable length encoded data stream, said method comprising the steps of:

storing previously processed input data characters into a storage means for reference,

performing a search in said data storage means for a longest data string of said previously processed input data characters which matches said input data stream,

operative when said matching data string is found within said storage means, encoding said matching data string by assigning a tag indicating that said matching data string was found, a variable length indicator of the length of said matching data string, and a pointer indicating the location within said storage means of said matching data string, and said step of encoding said matching data string including the step of representing said pointer and said variable length indicator by an encoding scheme according to a predetermined strategy, said predetermined strategy ensuring that a matching string of two characters of said input data stream is compressed to less than said two characters of said input data stream,

operative when said matching input data string is not found within said storage means, encoding a first character of said input data stream by assigning a "raw" data tag indicating that no matching data string was found in said storage means and said first character of said input data stream,

whereby said input data stream is converted into a variable length encoded data stream.

27. The method of claim 26 wherein said step of performing said search in said storage means for said longest matching data string further includes the step of limiting said search to a predefined number of inquiries in said storage means for said longest matching data string.

28. The method of claim 26 further including the steps of:

discarding input data characters of said input data stream which have been encoded and performing one or more steps of said method for further encoding said input data stream.

29. The method of claim 26 wherein said step of appending said pointer and said length indicator further includes the step of representing said pointer either by a long form or a short form depending upon the location within said storage means of said matching data string.

30. The method of claim 26 further including the step of defining said length indicator of a two character matching data string as the binary word "00".

31. The method of claim 29 further including the step of providing a single digit binary word for indicating whether said pointer is said long or said short form.

32. The method of claim 31 further including the step of encoding said short form as a seven digit binary word, so that said pointer may indicate one hundred and twenty-seven different locations within said storage means.

33. The method of claim 31 further including the step of encoding said long form of said pointer as an eleven digit binary word, so that said pointer may indicate two thousand and forty-seven different locations within said storage means.

34. The method of claim 26 or 32 or 33 further including the step of encoding said variable length indicator according to the following table:

______________________________________ 00 = 2 01 = 3 10 = 4 11 00 = 5 11 01 = 6 11 10 = 7 11 11 0000 = 8 11 11 0001 = 9 11 11 0010 = 10 11 11 0011 = 11 11 11 0100 = 12 11 11 0101 = 13 11 11 0110 = 14 11 11 0111 = 15 11 11 1000 = 16 11 11 1001 = 17 11 11 1010 = 18 11 11 1011 = 19 11 11 1100 = 20 11 11 1101 = 21 11 11 1110 = 22 11 11 1111 0000 = 23 11 11 1111 0001 = 24 11 11 1111 0010 = 25 11 11 1111 1110 = 37 11 11 1111 1111 0000 = 38 11 11 1111 1111 0001 = 39 etc. ______________________________________

35. The method of claim 26 wherein said step of representing said length indicator by said encoding scheme further includes the steps of:

defining a first group of words for defining the length of said matching data string where said matching data string is a length of 2, 3, and 4 characters,

said first group and a second group of words together defining lengths of said matching data string where said matching data string is a length of 5, 6, and 7 characters,

said first, said second and a third group of words together defining lengths of said matching data string where said matching data string is a length of 8 through 22 characters, and

said first, said second, said third and a fourth group of words together defining lengths of said matching data string where said matching data string is a length of 23 through 37 characters.

36. The method of claim 35 wherein said step of defining said first group of words further includes the step of representing said lengths of 2, 3, and 4 characters by the following table:

______________________________________ 00 2 characters 01 3 characters 10 4 characters. ______________________________________

37. The method of claim 35 wherein said step of defining said first and said second group of words further includes the step of representing said lengths of 5, 6, and 7 characters by the following table:

______________________________________ 11 00 5 characters 11 01 6 characters 11 10 7 characters. ______________________________________

38. The method of claim 35 wherein said step of defining said first, said second and said third group of words further includes the step of representing said lengths of 8 through 22 characters by the following table:

______________________________________ 11 11 0000 8 characters 11 11 0001 9 characters 11 11 0010 10 characters 11 11 0011 11 characters 11 11 0100 12 characters 11 11 0101 13 characters 11 11 0110 14 characters 11 11 0111 15 characters 11 11 1000 16 characters 11 11 1001 17 characters 11 11 1010 18 characters 11 11 1011 19 characters 11 11 1100 20 characters 11 11 1101 21 characters 11 11 1110 22 characters. ______________________________________

39. The method of claim 26 wherein said step of performing said search in said storage means for said longest matching data string further includes the step of performing a hashing function.

40. The method of claim 39 wherein said conversion of said input data stream into said variable length encoded data stream occurs in a data compression system and said data compression system includes said storage means, said storage means having a plurality of entries, said entries of said storage means for storing said input data stream, a storage means pointer, said storage means pointer indicating the latest entry in said storage means, a hash table means having a plurality of entries, each entry of said hash table means for storing a pointer for indicating one of said entries of said storage means, and an offset array means, said offset array means having a plurality of entries, each entry providing a link from one of said entries of said storage means to one or more other entries of said storage means and said step of performing said search in said storage means further including the steps of:

said hashing function providing a pointer to one of said entries of said hash table means,

obtaining said pointer stored at said hash table entry pointed to by said result of said hashing function,

calculating the difference between said storage means pointer and said pointer read from said hash table means,

storing said difference into said offset array means entry pointed to by said storage means pointer, and

storing said storage means pointer into said hash table entry pointed to by said hashing function.

41. The method of claim 40 further including the step of incrementing said storage means pointer to the next entry in said storage means and performing one or more elements of said method for hashing said next entry of said storage means.

42. The method of claim 40 wherein said step of performing said search in said storage means further includes the step of periodically examining the pointer stored at each said hash table entry for determining whether the difference between said pointer of each said entry and said storage means pointer is greater than a predetermined amount.

43. The method of claim 42 wherein said step further includes the step of:

operative when said difference is greater than said predetermined amount, replacing said entry in said hash table by an invalid value, thereby reinitializing said entry.

44. The method of claim 43 further including the step of:

completing said method of encoding by appending to said encoded data an end of compressed data marker which is detected during decompression of said variable length encoded data stream.

45. The method of claim 44 further including the step of:

initializing said hash table by replacing all entries of said hash table by invalid values.

46. The method of claim 26 further including the step of decoding said variable length encoded data stream, said step including the steps of:

parsing said variable length encoded data stream into separate portions, each said separate portion starting with one of said tags,

evaluating said tag of each said separate portion for determining whether said tag is said raw data tag or said tag indicating said encoded matching data string.

47. The method of claim 46 further including the step:

operative when said tag indicates said encoded matching data string, interpreting said variable length indicator and said pointer for generating said matching data string, thereby reconstructing a portion of the original input data stream.

48. The method of claim 46 further including the step:

operative when said tag indicates said raw data tag, obtaining said first character of said encoded input data stream, thereby reconstructing a portion of the original input data stream.

49. A method for converting a variable length encoded input data stream into an output data character stream in a data decompression system, said data decompression system comprising a storage means, said storage means having a plurality of entries, said entries of said storage means for storing said output data character stream, a storage means pointer, said storage means pointer indicating a latest entry in said storage means, said method comprising the steps of:

parsing said variable length encoded input data stream into separate portions, each said separate portion starting with a tag,

evaluating said tag of each said separate portion for determining whether said tag is a raw data tag or a tag indicating an encoded matching data string,

operative when said tag is said raw data tag, parsing a raw data byte from said separate portion, outputting said raw data byte, placing said raw data byte in said storage means, and

operative when said tag is said tag indicating an encoded matching data string, parsing a variable length indicator of the length of said matching data string and an offset indicating the location within said storage means of said matching data string, said separate portion being less than two characters of said output data character stream when said separate portion represents a matching data string of two characters, outputting said matching data string at said location within said storage means for said length, and placing said matching data string in said storage means.

50. An apparatus for converting an input data character stream into a variable length encoded data stream, said apparatus comprising:

storage means for storing previously processed input data characters for reference,

means for performing a search in said storage means for a longest data stream of said previously processed input data characters which matches said input data stream,

means for encoding said matching data stream by assigning a tag indicating that said matching data string was found, a variable length indicator of the length of said matching data string, and a pointer indicating the location within said storage means of said matching data string, and means for representing said pointer and said variable length indicator by an encoding scheme according to a predetermined strategy, said predetermined strategy ensuring that a matching string of two characters of said input data stream is compressed to less than said two characters of said input data stream, and

means for encoding a first character of said input data stream by assigning a "raw" data tag indicating that no matching data string was found in said storage means and said first character of said input data stream,

whereby said input data stream is converted into a variable length encoded data stream.

51. The apparatus of claim 50 further comprising means for limiting said search to a predefined number of inquiries in said storage means for said longest matching data string.

52. The apparatus of claim 50 further comprising:

means for discarding input data characters of said input data stream which have been encoded.

53. The apparatus of claim 50 further comprising means for representing said pointer either by a long form or a short form depending upon the location within said storage means of said matching data stream.

54. The apparatus of claim 50 further comprising means for defining said length indicator of a two character matching data string as the binary word "00".

55. The apparatus of claim 53 further comprising means for providing a single digit binary word for indicating whether said pointer is said long or said short form.

56. The apparatus of clam 55 further comprising means for encoding said short form as seven digit binary word, so that said pointer may indicate one hundred and twenty-seven different locations within said storage means.

57. The apparatus of claim 55 further comprising means for encoding said long form of said pointer as an eleven digit binary word, so that said pointer may indicate two thousand and forty-seven different locations within said storage means.

58. The apparatus of claim 50 or 56 or 57 further comprising means for encoding said variable length indicator according to the following table:

______________________________________ 00 = 2 01 = 3 10 = 4 11 00 = 5 11 01 = 6 11 10 = 7 11 11 0000 = 8 11 11 0001 = 9 11 11 0010 = 10 11 11 0011 = 11 11 11 0100 = 12 11 11 0101 = 13 11 11 0110 = 14 11 11 0111 = 15 11 11 1000 = 16 11 11 1001 = 17 11 11 1010 = 18 11 11 1011 = 19 11 11 1100 = 20 11 11 1101 = 21 11 11 1110 = 22 11 11 1111 0000 = 23 11 11 1111 0001 = 24 11 11 1111 0010 = 25 11 11 1111 1110 = 37 11 11 1111 1111 0000 = 38 11 11 1111 1111 0001 = 39 etc. ______________________________________

59. The apparatus of claim 50 further comprising:

means for defining a first group of words for defining the length of said matching data string where said matching data string is a length of 2, 3, and 4 characters,

means for said first group and a second group of words together defining lengths of said matching data string where said matching data string is a length of 5, 6, and 7 characters,

means for said first, said second and a third group of words together defining lengths of said matching data string where said matching data string is a length of 8 through 22 characters, and

means for said first, said second, said third and a fourth group of words together defining lengths of said matching data string where said matching data string is a length of 23 through 37 characters.

60. The apparatus of claim 59 further comprising means for representing said lengths of 2, 3, and 4 characters by the following table:

______________________________________ 00 2 characters 01 3 characters 10 4 characters. ______________________________________

61. The apparatus of claim 59 further comprising means for representing said lengths of 5, 6, and 7 characters by the following table:

______________________________________ 11 00 5 characters 11 01 6 characters 11 10 7 characters. ______________________________________

62. The apparatus of claim 59 further comprising means for representing said lengths of 8 through 22 characters by the following table:

______________________________________ 11 11 0000 8 characters 11 11 0001 9 characters 11 11 0010 10 characters 11 11 0011 11 characters 11 11 0100 12 characters 11 11 0101 13 characters 11 11 0110 14 characters 11 11 0111 15 characters 11 11 1000 16 characters 11 11 1001 17 characters 11 11 1010 18 characters 11 11 1011 19 characters 11 11 1100 20 characters 11 11 1101 21 characters 11 11 1110 22 characters. ______________________________________

63. The apparatus of claim 50 further comprising means for performing a hashing function.

64. The apparatus of claim 63 wherein said storage means has a plurality of entries, said entries of said storage means for storing said input data stream, further comprising:

a storage means pointer, said storage means pointer indicating the latest entry in said storage means,

a hash table means having a plurality of entries, each entry of said hash table means for storing a pointer for indicating one of said entries of said storage means,

an offset array means, said offset array means having a plurality of entries, each entry providing a link from one of said entries of said storage means to one or more other entries of said storage means,

said means for performing a hashing function providing a pointer to one of said entries of said hash table means,

means for obtaining said pointer stored at said hash table entry pointed to by said result of said hashing function,

means for calculating the difference between said storage means pointer and said pointer read from said hash table means,

means for storing said difference into said offset array means entry pointed to by said storage means pointer, and

means for storing said storage means pointer into said hash table entry pointed to by said hashing function.

65. The apparatus of claim 64 further comprising means for incrementing said storage means pointer to the next entry in said storage means.

66. The apparatus of claim 64 further comprising means for periodically examining the pointer stored at each said hash table entry for determining whether the difference between said pointer of each said entry and said storage means pointer is greater than a predetermined amount.

67. The apparatus of claim 66 further comprising:

means for replacing said entry in said hash table by an invalid value, thereby reinitializing said entry.

68. The apparatus of claim 67 further comprising:

means for appending to said encoded data an end of compressed data marker which is detected during decompression of said variable length encoded data stream.

69. The apparatus of claim 68 further comprising:

means for initializing said hash table by replacing all entries of said hash table by invalid values.

70. The apparatus of claim 50 further comprising:

means for decoding said variable length encoded data stream,

means for parsing said variable length encoded data stream into separate portions, each said separate portion starting with one of said tags, and

means for evaluating said tag of each said separate portion for determining whether said tag is said raw data tag or said tag indicating said encoded matching data string.

71. The apparatus of claim 70 further comprising:

means for interpreting said variable length indicator and said pointer for generating said matching data string, thereby reconstructing a portion of the original input data stream.

72. The apparatus of claim 70 further comprising:

means for obtaining said first character of said encoded input data stream, thereby reconstructing a portion of the original input data stream.

73. An apparatus for converting a variable length encoded input data stream into an output data character stream in a data decompression system, said data decompression system comprising:

a storage means, said storage means having a plurality of entries, said entries of said storage means for storing said output data character stream,

a storage means pointer, said storage means pointer indicating a latest entry in said storage means,

means for parsing said variable length encoded input data stream into separate portions, each said separate portion starting with a tag,

means for evaluating said tag of each said separate portion for determining whether said tag is said a raw data tag or a tag indicating an encoded matching data string,

means for parsing a raw data byte from said separate portion, means for outputting said raw data byte, means for placing said raw data byte in said storage means, and

means for parsing a variable length indicator of the length of said matching data string and an offset indicating the location within said storage means of said matching data string, said separate portion being less than two characters of said output data character stream when said separate portion represents a matching data string of two characters, means for outputting said matching data string at said location within said storage means for said length, and means for placing said matching data string in said storage means.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to data storage and communication systems, and more particularly to data compression systems and methods which improve the capacity of data storage and communication.

2. Description of the Prior Art

Due to the insignificant differences between data compression in data storage and data communication systems, only data storage systems are referred to; particularly the data files stored in such systems. However, all data storage systems can easily be extended to cover data communications systems and other applications as well. A file is assumed to be a sequential stream of bytes or characters, where a byte consists of some fixed number of bits (typically 8), and the compression system transforms this input byte stream into a "compressed" output stream of bytes from which the original file contents can be reconstructed by a decompression unit.

It is well-established that computer data files typically contain a significant amount of redundancy. Many techniques have been applied over the years to "compress" these files so that they will occupy less space on the disk or tape storage medium or so that they can be transmitted in less time over a communications channel such as a 1200 baud modem line. For example, there are several widely used commercial programs available for personal computers (e.g., ARC Software by Systems Enhancement Associates, Inc., Wayne, N.J., 1985) which perform the compression and decompression functions on files. It is not uncommon for such programs to reduce the size of a given file by a 2:1 ratio (or better), although the amount of reduction varies widely depending on the contents of the file.

There are many approaches in the prior art for compressing data. Some of these approaches make implicit assumptions about certain types of files or data within the files. For example, a bit image of a page produced using a scanner typically has most of its pixels blank, and this tendency can be exploited by a compress