WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Data compression using hashing    
United States Patent5371499   
Link to this pagehttp://www.wikipatents.com/5371499.html
Inventor(s)Graybill; Mark D. (Agoura Hills, CA); Harris; Donald R. (Quartz Hill, CA); Gibson; Dean K. (Harbor City, CA)
AbstractCompressing a sequence of characters drawn from an alphabet uses string substitution with no a priori information. An input data block is processed into an output data block comprised of variable length incompressible data sections and variable length compressed token sections. Multiple hash tables are used based on different subblock sizes for string matching, and this improves the compression ratio and rate of compression. The plurality of uses of the multiple hash tables allows for selection of an appropriate compression data rate and/or compression factor in relation to the input data. Using multiple hashing tables with a recoverable hashing method further improves compression ratio and compression rate. Each incompressible data section contains means to distinguish it from compressed token sections.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5371499
Data compression using hashing - US Patent 5371499 Drawing
Data compression using hashing
Inventor     Graybill; Mark D. (Agoura Hills, CA); Harris; Donald R. (Quartz Hill, CA); Gibson; Dean K. (Harbor City, CA)
Owner/Assignee     Intersecting Concepts, Inc. (Agoura Hills, CA)
Patent assignment
All assignments
Publication Date     December 6, 1994
Application Number     07/965,331
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     October 23, 1992
US Classification     341/51 341/67 341/106
Int'l Classification     H03M 007/30
Examiner     Young; Brian K.
Assistant Examiner    
Attorney/Law Firm     Merchant & Gould
Address
Parent Case     RELATED APPLICATION This application is a continuation-in-part of U.S. Ser. No. 843,982 filed February 28, 1992. The contents of this application are incorporated by reference herein.
Priority Data    
USPTO Field of Search     341/51 341/65 341/87 341/95 341/106 341/107 395/425 365/49 365/230.03
Patent Tags     data compression hashing
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5132992
Yurt
375/240
Jul,1992

[0 after 0 votes]
5129074
Kikuchi
711/173
Jul,1992

[0 after 0 votes]
5049881
Gibson
341/95
Sep,1991

[0 after 0 votes]
5032987
Broder
711/221
Jul,1991

[0 after 0 votes]
5016009
Whiting
341/67
May,1991

[0 after 0 votes]
5010553
Scheller
714/751
Apr,1991

[0 after 0 votes]
5003307
Whiting
341/51
Mar,1991

[0 after 0 votes]
4988998
O'Brien
341/55
Jan,1991

[0 after 0 votes]
4979039
Kisor
375/240.22
Dec,1990

[0 after 0 votes]
4961139
Hong
707/1
Oct,1990

[0 after 0 votes]
4929946
O'Brien
341/87
May,1990

[0 after 0 votes]
4906991
Fiala
341/51
Mar,1990

[0 after 0 votes]
4899147
Schiavo
341/60
Feb,1990

[0 after 0 votes]
4876541
Storer
341/51
Oct,1989

[0 after 0 votes]
4872009
Tsukiyama
341/95
Oct,1989

[0 after 0 votes]
4853696
Mukherjee
341/65
Aug,1989

[0 after 0 votes]
4814746
Miller
341/95
Mar,1989

[0 after 0 votes]
4809350
Shimoni
382/238
Feb,1989

[0 after 0 votes]
4758899
Tsukiyama
360/8
Jul,1988

[0 after 0 votes]
4730348
MacCrisken
375/240
Mar,1988

[0 after 0 votes]
4701745
Waterworth
341/63
Oct,1987

[0 after 0 votes]
4682150
Mathes
235/431
Jul,1987

[0 after 0 votes]
4677649
Kunishi
375/240
Jun,1987

[0 after 0 votes]
4612532
Bacon
341/51
Sep,1986

[0 after 0 votes]
4588985
Carter
341/95
May,1986

[0 after 0 votes]
4586027
Tsukiyama
341/95
Apr,1986

[0 after 0 votes]
4560976
Finn
341/51
Dec,1985

[0 after 0 votes]
4558302
Welch
341/51
Dec,1985

[0 after 0 votes]
4538240
Carter
708/492
Aug,1985

[0 after 0 votes]
4491934
Heinz
341/55
Jan,1985

[0 after 0 votes]
4464650
Eastman
341/51
Aug,1984

[0 after 0 votes]
4412306
Moll
708/203
Oct,1983

[0 after 0 votes]
4295124
Roybal
341/86
Oct,1981

[0 after 0 votes]
4087788
Johannesson
382/241
May,1978

[0 after 0 votes]
4054951
Jackson
714/48
Oct,1977

[0 after 0 votes]
4021782
Hoerning
341/51
May,1977

[0 after 0 votes]
3976844
Betz
375/240
Aug,1976

[0 after 0 votes]
3914586
McIntosh
341/68
Oct,1975

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


We claim:

1. A method of compressing a stream of input data into a compressed stream of output data comprising:

(a) maintaining multiple hash tables, each respectively for a different data subblock size, and each hash table having entries having respective hash keys,

(b) hashing a string of input characters of the input data for each of the different data subblock sizes to obtain the hash keys, and using these hash keys to address hash table entries containing hash information to facilitate location of string matches,

(c) hashing, for each of the subblock sizes, subsequent strings of data and searching for a match of prior strings related to the information addressed by hash keys in at least one of the hashing tables, and

(d) if a hash match occurs in at least one hash table, outputting the subsequent occurrence of the string as compressed data, and if a hash match for each of the subblocks does not occur in any hash table, outputting at least the first character of an input subblock as uncompressed data.

2. A method as claimed in claim 1 wherein the hashed information in relation to the input data is stored in at least one of the hash tables and is selectively only a source pointer to prior hashed subblocks or the source pointer representative of the hashed subblocks and additional data.

3. A method as claimed in claim 1 wherein selectively either one or multiple of the hash table entries contain selectively only a source pointer to prior hashed subblocks, or a source pointer to prior hashed subblocks and additional data.

4. A method as claimed in claim 1 wherein the hashed information in the hash entries in at least one of the hash tables is selectively unconditionally replaced, or conditionally replaced or removed on the occurrence of subsequent hashed information being directed by a hash key to the hash table entry containing information representative of prior hashed subblocks.

5. A method as claimed in claim 1 wherein hashing is effected on a data subblock size of three for a first hash table and a data subblock size of four for a second hash table.

6. A method of compressing a stream of input data into a compressed stream of output data comprising:

(a) maintaining multiple hash tables, each respectively for a different data subblock size, and each hash table having entries having respective hash keys,

(b) hashing a string of input characters of the input data for each of the different data subblock sizes to obtain the hash keys, and using these hash keys to address hash table entries containing hash information to facilitate location of string matches,

(c) hashing, for each of the subblock sizes, subsequent strings of data and searching for a match of prior strings related to the information addressed by hash keys in at least one of the hashing tables,

(d) if a hash match occurs in at least one hash table, outputting the subsequent occurrence of the string as compressed data, and if a hash match for each of the subblocks does not occur in any hash table, outputting at least the first character of an input subblock as uncompressed data, and

(e) determining at least part of a hash key for one of the hash tables from the hash key of another of the hash tables.

7. A method as claimed in claim 6 wherein the hashed keys for a first hash table are determined on a first data subblock size and wherein a second hash table contains keys obtained from the second subblock size, and wherein the hash keys for the first subblock size are obtained at least in part from the second hash table keys.

8. A method as claimed in claim 6 wherein the hashed information in relation to the input data is stored in at least one of the hash tables and is selectively only a source pointer to prior hashed subblocks or the source pointer representative of the hashed subblocks and additional data.

9. A method as claimed in claim 6 wherein selectively either one or multiple of the hash table entries contain selectively only a source pointer to prior hashed subblocks, or a source pointer to prior hashed subblocks and additional data.

10. A method as claimed in claim 6 wherein the hashed information in the hash entries in at least one of the hash tables is selectively unconditionally replaced, or conditionally replaced or removed on the occurrence of subsequent hashed information being directed by a hash key to the hash table entry containing information representative of prior hashed subblocks.

11. A method as claimed in claim 6 wherein hashing is effected on a data subblock size of three for a first hash table and a data subblock size of four for a second hash table.

12. A method of compressing a stream of input data into a compressed string of output data comprising:

(a) maintaining multiple hash tables, each respectively for a different data subblock size, and each hash table having entries having respective hash keys,

(b) hashing a string of input characters of the input data for each of the different data subblock sizes to obtain the hash keys, and using these hash keys to address hash table entries containing hash information to facilitate location of string matches,

(c) hashing, for each of the subblock sizes, subsequent strings of data and searching for a match of prior strings related to the information addressed by hash keys in at least one of the hashing tables,

(d) if a hash match for each of the subblocks does not occur in any hash table, outputting at least the first character of an input subblock as uncompressed data, and

(e) if a hash match occurs in at least one hash table, outputting the subsequent occurrence of the string as compressed data, such that when a hash match occurs in a table related to a larger subblock size and a hash match occurs in a table related to a smaller subblock size, the hash match in the hash table of the larger subblock size is selected for the compressed data output.

13. A method as claimed in claim 12 wherein the hashed information in relation to the input data is stored in at least one of the hash tables and is selectively only a source pointer to prior hashed subblocks or the source pointer representative of the hashed subblocks and additional data.

14. A method as claimed in claim 12 wherein selectively either one or multiple of the hash table entries contain selectively only a source pointer to prior hashed subblocks, or a source pointer to prior hashed subblocks and additional data.

15. A method as claimed in claim 12 wherein the hashed information in the hash entries in at least one of the hash tables is selectively unconditionally replaced, or conditionally replaced or removed on the occurrence of subsequent hashed information being directed by a hash key to the hash table entry containing information representative of prior hashed subblocks.

16. A method as claimed in claim 12 wherein hashing is effected on a data subblock size of three for a first hash table and a data subblock size of four for a second hash table.

17. A method of compressing a stream of input data into a compressed string of output data comprising:

(a) maintaining multiple hash tables, each respectively for a different data subblock size, and each hash table having entries having respective hash keys,

(b) hashing a string of input characters of the input data for each of the different data subblock sizes to obtain the hash keys, and using these hash keys to address hash table entries containing hash information to facilitate location of string matches,

(c) hashing, for each of the subblock sizes, subsequent strings of data and searching for a match of prior strings related to the information addressed by hash keys in at least one of the hashing tables,

(d) if a hash match for each of the subblocks does not occur in any hash table, outputting at least the first character of an input subblock as uncompressed data, and

(e) if a hash match occurs in a hash table, outputting the subsequent occurrence of the string as compressed data, such that where a hash match occurs for more than one hash table, determining the longer match length of the string of input data for each respective hash table, and outputting the longer match length as the compressed data output.

18. A method of compressing a stream of input data into a compressed string of output data comprising:

(a) maintaining multiple hash tables, each respectively for a different data subblock size, and each hash table having entries having respective hash keys,

(b) hashing a string of input characters of the input data for each of the different data subblock sizes to obtain the hash keys, and using these hash keys to address hash table entries containing hash information to facilitate location of string matches,

(c) hashing, for each of the subblock sizes, subsequent strings of data and searching for a match of prior strings related to the information addressed by hash keys in at least one of the hashing tables,

(d) if a hash match for each of the subblocks does not occur in any hash table, outputting at least the first character of an input subblock as uncompressed data,

(e) if a hash match occurs in a hash table, outputting the subsequent occurrence of the string as compressed data, such that where a hash match occurs in a table related to a larger subblock size relative to a hash match in a table related to a smaller subblock size, the hash match of the larger subblock size is selected as the compressed data output, and

(f) wherein the hash keys for the larger subblock size are computed using at least part of the value of hash keys from the smaller subblock size.

19. A method of compressing a stream of input data into a compressed string of output data comprising:

(a) maintaining multiple hash tables, each respectively for a different data subblock size, and each hash table having entries having respective hash keys,

(b) hashing a string of input characters of the input data for each of the different data subblock sizes to obtain the hash keys, and using these hash keys to address hash table entries containing hash information to facilitate location of string matches,

(c) hashing, for each of the subblock sizes, subsequent strings of data and searching for a match of prior strings related to the information addressed by hash keys in at least one of the hashing tables,

(d) if a hash match for each of the subblocks does not occur in any hash table, outputting at least the first character of an input subblock as uncompressed data,

(e) wherein when a hash match occurs in a table related to a larger subblock size relative to a hash match in a table related to a smaller subblock size, selecting the hash match of the larger subblock size as the compressed data output, and

(f) if a hash match occurs in a hash table, outputting the subsequent occurrence of the string as compressed data, such that when a larger subblock size does not match on a minimum match length, selecting the smaller subblock size hash value if such smaller subblock value matches for its respective minimum length.

20. A method as claimed in claim 17 wherein the hashed information in relation to the input data is stored in at least one of the hash tables and is selectively only a source pointer to prior hashed subblocks or the source pointer representative of the hashed subblocks and additional data.

21. A method as claimed in claim 17 wherein selectively either one or multiple of the hash table entries contain selectively only a source pointer to prior hashed subblocks, or a source pointer to prior hashed subblocks and additional data.

22. A method as claimed in claim 17 wherein the hashed information in the hash entries in at least one of the hash tables is selectively unconditionally replaced, or conditionally replaced or removed on the occurrence of subsequent hashed information being directed by a hash key to the hash table entry containing information representative of prior hashed subblocks.

23. A method as claimed in claim 17 wherein hashing is effected on a data subblock size of three for a first hash table and a data subblock size of four for a second hash table.

24. A method as claimed in claim 18 wherein the hashed information in relation to the input data is stored in at least one of the hash tables and is selectively only a source pointer to prior hashed subblocks or the source pointer representative of the hashed subblocks and additional data.

25. A method as claimed in claim 18 wherein selectively either one or multiple of the hash table entries contain selectively only a source pointer to prior hashed subblocks, or a source pointer to prior hashed subblocks and additional data.

26. A method as claimed in claim 18 wherein the hashed information in the hash entries in at least one of the hash tables is selectively unconditionally replaced, or conditionally replaced or removed on the occurrence of subsequent hashed information being directed by a hash key to the hash table entry containing information representative of prior hashed subblocks.

27. A method as claimed in claim 18 wherein hashing is effected on a data subblock size of three for a first hash table and a data subblock size of four for a second hash table.

28. A method as claimed in claim 19 wherein the hashed information in relation to the input data is stored in at least one of the hash tables and is selectively only a source pointer to prior hashed subblocks or the source pointer representative of the hashed subblocks and additional data.

29. A method as claimed in claim 19 wherein selectively either one or multi