|
Claims  |
|
|
We claim:
1. A method for converting an input data character stream into a variable
length encoded data stream in a data compression system, said data
compression system comprising a history array means, said history array
means having a plurality of entries, said entries of said history array
means for storing said input data stream, a history array pointer, said
history array pointer indicating a latest entry in said history array
means, a hash table means having a plurality of entries, each entry of
said hash table means for storing a pointer indicating one of said entries
of said history array means, and an offset array means, said offset array
means having a plurality of entries, each entry of said offset array means
providing a link, if any, from one of said entries in said history array
means to one or more other entries of said history array means, said
method comprising the steps of:
performing a search in said history array means for a longest matching data
string which matches said input data stream, said step of performing said
search including the steps of:
performing a hashing function on said input data stream, said hashing
function providing a pointer indicating one of said entries of said hash
table means,
obtaining said pointer stored at said hash table entry pointed to by said
hashing function,
calculating a difference between said history array pointer and said
pointer obtained from said hash table means,
storing said difference into said offset array means entry pointed to by
said history array pointer, and
storing said history array pointer into said hash table entry pointed to by
said hashing function,
operative when said matching data string is found within said history array
means, encoding said matching data string found in said history array
means by assigning a tag indicating that said matching data string was
found and a string substitution code including a variable length indicator
of the length of said matching data string and a pointer indicating the
location within said history array means of said matching data string,
operative when said matching data string is not found within said history
array means, encoding a first character of said input data stream by
assigning a "raw" data tag indicating that no matching data string was
found in said history array means and said first character of said input
data stream,
whereby said input data stream is converted into a variable length encoded
data stream.
2. The method of claim 1 wherein said step of performing said search in
said history array means for said longest matching data string further
includes the step of limiting said search to a predefined number of
inquiries in said history array means for said longest matching data
string.
3. The method of claim 2 further including the steps of:
discarding input data characters of said input data stream which have been
encoded and performing one or more steps of said method for further
encoding said input data stream.
4. The method of claim 1 or 3 wherein said step of appending said pointer
and said length indicator further includes the step of representing said
pointer and said length indicator by an encoding scheme according to a
predetermined strategy that ensures that a matching string of two
characters of said input data stream is compressed to less than said two
characters of said input data stream.
5. The method of claim 1 further including the step of incrementing said
history array pointer to the next entry in said history array means and
performing one or more elements of said method for hashing said next entry
of said history array means.
6. The method of claim 1 wherein said step of performing a search further
includes the step of periodically examining the pointer stored at each
said hash table entry for determining whether the difference between said
pointer of each said entry and said history array pointer is greater than
a predetermined amount.
7. The method of claim 8 wherein said step further includes the step of:
operative when said difference is greater than said predetermined amount,
replacing said entry in said hash table by an invalid value, thereby
reinitializing said entry.
8. The method of claim 1 further including the step of:
completing said method of encoding by appending to said encoded data an end
of compressed data marker which is detected during decompression of said
variable length encoded data stream.
9. The method of claim 1 further including the step of:
initializing said hash table by replacing all entries of said hash table by
invalid values.
10. The method of claim 1 further including the step of decoding said
variable length encoded data stream, said step including the steps of:
parsing said variable length encoded data stream into separate portions,
each said separate portion starting with one of said tags,
evaluating said tag of each said separate portion for determining whether
said tag is said raw data tag or said tag indicating said encoded matching
data string.
11. The method of claim 10 further including the step:
operative when said tag indicates said encoded matching data string,
interpreting said length indicator and said pointer of said substitution
code for generating said matching data string, thereby reconstructing a
portion of the original input data stream.
12. The method of claim 10 further including the step:
operative when said tag indicates said raw data tag, obtaining said first
character of said encoded input data stream, thereby reconstructing a
portion of the original input data stream.
13. An apparatus for converting an input data character stream into a
variable length encoded data stream in a data compression system, said
data compression system comprising a history array means, said history
array means having a plurality of entries, said entries of said history
array means for storing said input data stream, a history array pointer,
said history array pointer indicating a latest entry in said history array
means, a hash table means having a plurality of entries, each entry of
said hash table means for storing a pointer indicating one of said entries
of said history array means, and an offset array means, said offset array
means having a plurality of entries, each entry of said offset array means
providing a link, if any, from one of said entries in said history array
means to one or more other entries of said history array means, said
apparatus comprising:
means for performing a search in said history array means for a longest
matching data string which matches said input data stream, said means for
performing said search including:
means for performing a hashing function on said input data stream, said
hashing function providing a pointer indicating one of said entries of
said hash table means,
means for obtaining said pointer stored at said hash table entry pointed to
by said hashing function,
means for calculating a difference between said history array pointer and
said pointer obtained from said hash table means,
means for storing said difference into said offset array means entry
pointed to by said history array pointer, and
means for storing said history array pointer into said hash table entry
pointed to by said hashing function,
means for encoding said matching data string found in said history array
means by assigning a tag indicating that said matching data string was
found and a string substitution code including a variable length indicator
of the length of said matching data string and a pointer indicating the
location within said history array means of said matching data string,
means for encoding a first character of said input data stream by assigning
a "raw" data tag indicating that no matching data string was found is said
history array means and said first character of said input data stream,
whereby said input data stream is converted into a variable length encoded
data stream.
14. The apparatus of claim 13 wherein said means for performing said search
in said history array means for said longest matching data string further
includes means for limiting said search to a predefined number of
inquiries in said history array means for said longest matching data
string.
15. The apparatus of claim 14 further including:
means for discarding input data characters of said input data stream which
have been encoded.
16. The apparatus of claim 13 or 15 wherein said means for appending said
pointer and said length indicator further includes means for representing
said pointer and said length indicator by an encoding scheme according to
a predetermined strategy that ensures that a matching string of two
characters of said input data stream is compressed to less than said two
characters of said input data stream.
17. The apparatus of claim 13 further including means for incrementing said
history array pointer to the next entry in said history array means.
18. The apparatus of claim 13 wherein said means for performing a search
further includes means for periodically examining the pointer stored at
each said hash table entry for determining whether the difference between
said pointer of each said entry and said history array pointer is greater
than a predetermined amount.
19. The apparatus of claim 18 wherein said means further includes:
operative when said difference is greater than said predetermined amount,
means for replacing said entry in said hash table by an invalid value,
thereby reinitializing said entry.
20. The apparatus of claim 13 further including:
means for appending to said encoded data an end of compressed data marker
which is detected during the decompression of said variable length encoded
data stream.
21. The apparatus of claim 13 further including:
means for initializing said hash table by replacing all entries of said
hash table by invalid values.
22. The apparatus of claim 13 further including means for decoding said
variable length encoded data stream, said means including:
means for parsing said variable length encoded data stream into separate
portions, each said separate portion starting with one of said tags,
means for evaluating said tag of said separate portion for determining
whether said tag is said raw data tag or said tag indicating said encoded
matching data string.
23. The apparatus of claim 22 further including:
operative when said tag indicates said encoded matching data string, means
for interpreting said length indicator and said pointer of said
substitution code for generating said matching data string, thereby
reconstructing a portion of the original input data stream.
24. The apparatus of claim 22 further including:
operative when said tag indicates said raw data tag, means for obtaining
said first character of said encoded input data stream thereby
reconstructing a portion of the original input data stream.
25. The apparatus of claim 24 further including a RAM, said RAM including
said history array means, said history table means and said offset array
means, and said encoded or decoded data streams temporarily stored in one
or more FIFO buffers, and portions of said FIFO buffers also maintained by
said RAM, thereby enhancing data transfer performance.
26. A method for converting an input data character stream into a variable
length encoded data stream, said method comprising the steps of:
storing previously processed input data characters into a storage means for
reference,
performing a search in said data storage means for a longest data string of
said previously processed input data characters which matches said input
data stream,
operative when said matching data string is found within said storage
means, encoding said matching data string by assigning a tag indicating
that said matching data string was found, a variable length indicator of
the length of said matching data string, and a pointer indicating the
location within said storage means of said matching data string, and said
step of encoding said matching data string including the step of
representing said pointer and said variable length indicator by an
encoding scheme according to a predetermined strategy, said predetermined
strategy ensuring that a matching string of two characters of said input
data stream is compressed to less than said two characters of said input
data stream,
operative when said matching input data string is not found within said
storage means, encoding a first character of said input data stream by
assigning a "raw" data tag indicating that no matching data string was
found in said storage means and said first character of said input data
stream,
whereby said input data stream is converted into a variable length encoded
data stream.
27. The method of claim 26 wherein said step of performing said search in
said storage means for said longest matching data string further includes
the step of limiting said search to a predefined number of inquiries in
said storage means for said longest matching data string.
28. The method of claim 26 further including the steps of:
discarding input data characters of said input data stream which have been
encoded and performing one or more steps of said method for further
encoding said input data stream.
29. The method of claim 26 wherein said step of appending said pointer and
said length indicator further includes the step of representing said
pointer either by a long form or a short form depending upon the location
within said storage means of said matching data string.
30. The method of claim 26 further including the step of defining said
length indicator of a two character matching data string as the binary
word "00".
31. The method of claim 29 further including the step of providing a single
digit binary word for indicating whether said pointer is said long or said
short form.
32. The method of claim 31 further including the step of encoding said
short form as a seven digit binary word, so that said pointer may indicate
one hundred and twenty-seven different locations within said storage
means.
33. The method of claim 31 further including the step of encoding said long
form of said pointer as an eleven digit binary word, so that said pointer
may indicate two thousand and forty-seven different locations within said
storage means.
34. The method of claim 26 or 32 or 33 further including the step of
encoding said variable length indicator according to the following table:
______________________________________
00 = 2
01 = 3
10 = 4
11 00 = 5
11 01 = 6
11 10 = 7
11 11 0000 = 8
11 11 0001 = 9
11 11 0010 = 10
11 11 0011 = 11
11 11 0100 = 12
11 11 0101 = 13
11 11 0110 = 14
11 11 0111 = 15
11 11 1000 = 16
11 11 1001 = 17
11 11 1010 = 18
11 11 1011 = 19
11 11 1100 = 20
11 11 1101 = 21
11 11 1110 = 22
11 11 1111 0000 = 23
11 11 1111 0001 = 24
11 11 1111 0010 = 25
11 11 1111 1110 = 37
11 11 1111 1111 0000 = 38
11 11 1111 1111 0001 = 39
etc.
______________________________________
35. The method of claim 26 wherein said step of representing said length
indicator by said encoding scheme further includes the steps of:
defining a first group of words for defining the length of said matching
data string where said matching data string is a length of 2, 3, and 4
characters,
said first group and a second group of words together defining lengths of
said matching data string where said matching data string is a length of
5, 6, and 7 characters,
said first, said second and a third group of words together defining
lengths of said matching data string where said matching data string is a
length of 8 through 22 characters, and
said first, said second, said third and a fourth group of words together
defining lengths of said matching data string where said matching data
string is a length of 23 through 37 characters.
36. The method of claim 35 wherein said step of defining said first group
of words further includes the step of representing said lengths of 2, 3,
and 4 characters by the following table:
______________________________________
00 2 characters
01 3 characters
10 4 characters.
______________________________________
37. The method of claim 35 wherein said step of defining said first and
said second group of words further includes the step of representing said
lengths of 5, 6, and 7 characters by the following table:
______________________________________
11 00
5 characters
11 01
6 characters
11 10
7 characters.
______________________________________
38. The method of claim 35 wherein said step of defining said first, said
second and said third group of words further includes the step of
representing said lengths of 8 through 22 characters by the following
table:
______________________________________
11 11 0000 8 characters
11 11 0001 9 characters
11 11 0010 10 characters
11 11 0011 11 characters
11 11 0100 12 characters
11 11 0101 13 characters
11 11 0110 14 characters
11 11 0111 15 characters
11 11 1000 16 characters
11 11 1001 17 characters
11 11 1010 18 characters
11 11 1011 19 characters
11 11 1100 20 characters
11 11 1101 21 characters
11 11 1110 22 characters.
______________________________________
39. The method of claim 26 wherein said step of performing said search in
said storage means for said longest matching data string further includes
the step of performing a hashing function.
40. The method of claim 39 wherein said conversion of said input data
stream into said variable length encoded data stream occurs in a data
compression system and said data compression system includes said storage
means, said storage means having a plurality of entries, said entries of
said storage means for storing said input data stream, a storage means
pointer, said storage means pointer indicating the latest entry in said
storage means, a hash table means having a plurality of entries, each
entry of said hash table means for storing a pointer for indicating one of
said entries of said storage means, and an offset array means, said offset
array means having a plurality of entries, each entry providing a link
from one of said entries of said storage means to one or more other
entries of said storage means and said step of performing said search in
said storage means further including the steps of:
said hashing function providing a pointer to one of said entries of said
hash table means,
obtaining said pointer stored at said hash table entry pointed to by said
result of said hashing function,
calculating the difference between said storage means pointer and said
pointer read from said hash table means,
storing said difference into said offset array means entry pointed to by
said storage means pointer, and
storing said storage means pointer into said hash table entry pointed to by
said hashing function.
41. The method of claim 40 further including the step of incrementing said
storage means pointer to the next entry in said storage means and
performing one or more elements of said method for hashing said next entry
of said storage means.
42. The method of claim 40 wherein said step of performing said search in
said storage means further includes the step of periodically examining the
pointer stored at each said hash table entry for determining whether the
difference between said pointer of each said entry and said storage means
pointer is greater than a predetermined amount.
43. The method of claim 42 wherein said step further includes the step of:
operative when said difference is greater than said predetermined amount,
replacing said entry in said hash table by an invalid value, thereby
reinitializing said entry.
44. The method of claim 43 further including the step of:
completing said method of encoding by appending to said encoded data an end
of compressed data marker which is detected during decompression of said
variable length encoded data stream.
45. The method of claim 44 further including the step of:
initializing said hash table by replacing all entries of said hash table by
invalid values.
46. The method of claim 26 further including the step of decoding said
variable length encoded data stream, said step including the steps of:
parsing said variable length encoded data stream into separate portions,
each said separate portion starting with one of said tags,
evaluating said tag of each said separate portion for determining whether
said tag is said raw data tag or said tag indicating said encoded matching
data string.
47. The method of claim 46 further including the step:
operative when said tag indicates said encoded matching data string,
interpreting said variable length indicator and said pointer for
generating said matching data string, thereby reconstructing a portion of
the original input data stream.
48. The method of claim 46 further including the step:
operative when said tag indicates said raw data tag, obtaining said first
character of said encoded input data stream, thereby reconstructing a
portion of the original input data stream.
49. A method for converting a variable length encoded input data stream
into an output data character stream in a data decompression system, said
data decompression system comprising a storage means, said storage means
having a plurality of entries, said entries of said storage means for
storing said output data character stream, a storage means pointer, said
storage means pointer indicating a latest entry in said storage means,
said method comprising the steps of:
parsing said variable length encoded input data stream into separate
portions, each said separate portion starting with a tag,
evaluating said tag of each said separate portion for determining whether
said tag is a raw data tag or a tag indicating an encoded matching data
string,
operative when said tag is said raw data tag, parsing a raw data byte from
said separate portion, outputting said raw data byte, placing said raw
data byte in said storage means, and
operative when said tag is said tag indicating an encoded matching data
string, parsing a variable length indicator of the length of said matching
data string and an offset indicating the location within said storage
means of said matching data string, said separate portion being less than
two characters of said output data character stream when said separate
portion represents a matching data string of two characters, outputting
said matching data string at said location within said storage means for
said length, and placing said matching data string in said storage means.
50. An apparatus for converting an input data character stream into a
variable length encoded data stream, said apparatus comprising:
storage means for storing previously processed input data characters for
reference,
means for performing a search in said storage means for a longest data
stream of said previously processed input data characters which matches
said input data stream,
means for encoding said matching data stream by assigning a tag indicating
that said matching data string was found, a variable length indicator of
the length of said matching data string, and a pointer indicating the
location within said storage means of said matching data string, and means
for representing said pointer and said variable length indicator by an
encoding scheme according to a predetermined strategy, said predetermined
strategy ensuring that a matching string of two characters of said input
data stream is compressed to less than said two characters of said input
data stream, and
means for encoding a first character of said input data stream by assigning
a "raw" data tag indicating that no matching data string was found in said
storage means and said first character of said input data stream,
whereby said input data stream is converted into a variable length encoded
data stream.
51. The apparatus of claim 50 further comprising means for limiting said
search to a predefined number of inquiries in said storage means for said
longest matching data string.
52. The apparatus of claim 50 further comprising:
means for discarding input data characters of said input data stream which
have been encoded.
53. The apparatus of claim 50 further comprising means for representing
said pointer either by a long form or a short form depending upon the
location within said storage means of said matching data stream.
54. The apparatus of claim 50 further comprising means for defining said
length indicator of a two character matching data string as the binary
word "00".
55. The apparatus of claim 53 further comprising means for providing a
single digit binary word for indicating whether said pointer is said long
or said short form.
56. The apparatus of clam 55 further comprising means for encoding said
short form as seven digit binary word, so that said pointer may indicate
one hundred and twenty-seven different locations within said storage
means.
57. The apparatus of claim 55 further comprising means for encoding said
long form of said pointer as an eleven digit binary word, so that said
pointer may indicate two thousand and forty-seven different locations
within said storage means.
58. The apparatus of claim 50 or 56 or 57 further comprising means for
encoding said variable length indicator according to the following table:
______________________________________
00 = 2
01 = 3
10 = 4
11 00 = 5
11 01 = 6
11 10 = 7
11 11 0000 = 8
11 11 0001 = 9
11 11 0010 = 10
11 11 0011 = 11
11 11 0100 = 12
11 11 0101 = 13
11 11 0110 = 14
11 11 0111 = 15
11 11 1000 = 16
11 11 1001 = 17
11 11 1010 = 18
11 11 1011 = 19
11 11 1100 = 20
11 11 1101 = 21
11 11 1110 = 22
11 11 1111 0000 = 23
11 11 1111 0001 = 24
11 11 1111 0010 = 25
11 11 1111 1110 = 37
11 11 1111 1111 0000 = 38
11 11 1111 1111 0001 = 39
etc.
______________________________________
59. The apparatus of claim 50 further comprising:
means for defining a first group of words for defining the length of said
matching data string where said matching data string is a length of 2, 3,
and 4 characters,
means for said first group and a second group of words together defining
lengths of said matching data string where said matching data string is a
length of 5, 6, and 7 characters,
means for said first, said second and a third group of words together
defining lengths of said matching data string where said matching data
string is a length of 8 through 22 characters, and
means for said first, said second, said third and a fourth group of words
together defining lengths of said matching data string where said matching
data string is a length of 23 through 37 characters.
60. The apparatus of claim 59 further comprising means for representing
said lengths of 2, 3, and 4 characters by the following table:
______________________________________
00 2 characters
01 3 characters
10 4 characters.
______________________________________
61. The apparatus of claim 59 further comprising means for representing
said lengths of 5, 6, and 7 characters by the following table:
______________________________________
11 00
5 characters
11 01
6 characters
11 10
7 characters.
______________________________________
62. The apparatus of claim 59 further comprising means for representing
said lengths of 8 through 22 characters by the following table:
______________________________________
11 11 0000 8 characters
11 11 0001 9 characters
11 11 0010 10 characters
11 11 0011 11 characters
11 11 0100 12 characters
11 11 0101 13 characters
11 11 0110 14 characters
11 11 0111 15 characters
11 11 1000 16 characters
11 11 1001 17 characters
11 11 1010 18 characters
11 11 1011 19 characters
11 11 1100 20 characters
11 11 1101 21 characters
11 11 1110 22 characters.
______________________________________
63. The apparatus of claim 50 further comprising means for performing a
hashing function.
64. The apparatus of claim 63 wherein said storage means has a plurality of
entries, said entries of said storage means for storing said input data
stream, further comprising:
a storage means pointer, said storage means pointer indicating the latest
entry in said storage means,
a hash table means having a plurality of entries, each entry of said hash
table means for storing a pointer for indicating one of said entries of
said storage means,
an offset array means, said offset array means having a plurality of
entries, each entry providing a link from one of said entries of said
storage means to one or more other entries of said storage means,
said means for performing a hashing function providing a pointer to one of
said entries of said hash table means,
means for obtaining said pointer stored at said hash table entry pointed to
by said result of said hashing function,
means for calculating the difference between said storage means pointer and
said pointer read from said hash table means,
means for storing said difference into said offset array means entry
pointed to by said storage means pointer, and
means for storing said storage means pointer into said hash table entry
pointed to by said hashing function.
65. The apparatus of claim 64 further comprising means for incrementing
said storage means pointer to the next entry in said storage means.
66. The apparatus of claim 64 further comprising means for periodically
examining the pointer stored at each said hash table entry for determining
whether the difference between said pointer of each said entry and said
storage means pointer is greater than a predetermined amount.
67. The apparatus of claim 66 further comprising:
means for replacing said entry in said hash table by an invalid value,
thereby reinitializing said entry.
68. The apparatus of claim 67 further comprising:
means for appending to said encoded data an end of compressed data marker
which is detected during decompression of said variable length encoded
data stream.
69. The apparatus of claim 68 further comprising:
means for initializing said hash table by replacing all entries of said
hash table by invalid values.
70. The apparatus of claim 50 further comprising:
means for decoding said variable length encoded data stream,
means for parsing said variable length encoded data stream into separate
portions, each said separate portion starting with one of said tags, and
means for evaluating said tag of each said separate portion for determining
whether said tag is said raw data tag or said tag indicating said encoded
matching data string.
71. The apparatus of claim 70 further comprising:
means for interpreting said variable length indicator and said pointer for
generating said matching data string, thereby reconstructing a portion of
the original input data stream.
72. The apparatus of claim 70 further comprising:
means for obtaining said first character of said encoded input data stream,
thereby reconstructing a portion of the original input data stream.
73. An apparatus for converting a variable length encoded input data stream
into an output data character stream in a data decompression system, said
data decompression system comprising:
a storage means, said storage means having a plurality of entries, said
entries of said storage means for storing said output data character
stream,
a storage means pointer, said storage means pointer indicating a latest
entry in said storage means,
means for parsing said variable length encoded input data stream into
separate portions, each said separate portion starting with a tag,
means for evaluating said tag of each said separate portion for determining
whether said tag is said a raw data tag or a tag indicating an encoded
matching data string,
means for parsing a raw data byte from said separate portion, means for
outputting said raw data byte, means for placing said raw data byte in
said storage means, and
means for parsing a variable length indicator of the length of said
matching data string and an offset indicating the location within said
storage means of said matching data string, said separate portion being
less than two characters of said output data character stream when said
separate portion represents a matching data string of two characters,
means for outputting said matching data string at said location within
said storage means for said length, and means for placing said matching
data string in said storage means. |
|
|
|
|
Claims  |
|