|
|
|
| United States Patent | 5323155 |
| Link to this page | http://www.wikipatents.com/5323155.html |
| Inventor(s) | Iyer; Balakrishna R. (San Jose, CA);
Meriwether; Teresa A. (Cary, NC);
Sherwin, Jr.; Elton B. (Stamford, CT);
Sinha; Bhaskar (Boxford, MA) |
| Abstract | A method of transmitting compressed data using a Ziv-Lempel
compression/expansion algorithm, using an adaptive Ziv-Lempel (AZL)
dictionary modified to a mature state. The mature state is signaled by a
time to freeze signal sent as a switch-over signal from a transmitting
location to each receiving location. These signals freeze and synchronize
the AZL dictionaries at both locations, and starts a translation of the
frozen AZL dictionary to a static SZL dictionary--at least at the
transmitting location. The SZL dictionary is then used to compress records
being transmitted. An index translation process is generates translation
information to allow the receiving locations to decompress SZL indices
into original characters. The AZL-to-SZL dictionary translation process
re-organizes the frozen AZL to an SZL dictionary. The SZL process is used
until either the end of the inputted sequence, or a time to unfreeze
signal is generated. An SZL to AZL switch-over signal is generated in
response to the time to unfreeze signal, which in turn signals a switch
over back to the AZL process and invokes the saved frozen AZL dictionary
to be used to until mature to on the current input data stream at which
time the AZL is frozen and a switch-over signal is provided and a new SZL
is generated. |
|
|
|
Title Information  |
|
|
|
|
|
Drawing from US Patent 5323155 |
|
|
Semi-static data compression/expansion method |
|
|
|
|
|
| Publication Date |
June 21, 1994 |
|
|
|
|
|
| Filing Date |
December 4, 1992 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Title Information  |
|
|
Claims  |
|
|
Having thus described our invention, what we claim as new and desire to
secure by Letters Patent is:
1. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm, comprising the steps of:
inputting a sequence of records containing characters of data to an
adaptive Ziv-Lemel (AZL) process at a transmitting location for
compressing the data including: generating an AZL dictionary in computer
storage by inserting an entry therein at an assigned index for each
inputted character string not found in the dictionary, detecting indicies
in the dictionary for the inputted character strings in the sequence of
records to represent compressed records; and transmitting the compressed
records to a receiving location;
generating a corresponding AZL dictionary at the receiving location from
received indices of the transmitted compressed records;
freezing the AZL dictionary to maintain its current content at the
transmitting location to provide an SZL (static Ziv-Lempel) dictionary
when the AZL dictionary has been generated to a mature state to enable SZL
processing; and
sending a switch-over signal by the transmitting location to the receiving
location to indicate a synchronization point in the sequence of
transmitted compressed records to enable the receiving location to freeze
the corresponding AZL dictionary with an identical current content for
enabling an SZL process to expand data by statically using the current
content following the synchronization point in the transmitted compressed
records.
2. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm, comprising the step of:
inputting a sequence of records containing characters of data to an
adaptive Ziv-Lempel (AZL) process at a transmitting location for
compressing the data including: generating an AZL dictionary in computer
storage by inserting an entry therein at an assigned index for each
inputted character string not found in the dictionary, and detecting the
indicies in the dictionary and transmitting to a receiving location the
indices assigned to the inputted character strings in the sequence of
records to represent transmitted compressed records;
generating a corresponding AZL dictionary at the receiving location from
received indices of the transmitted compressed records;
freezing the AZL dictionary at the transmitting location to provide an SZL
(static Ziv-Lempel) dictionary when the AZL dictionary has been generated
to a mature state to enable SZL processing (which does not change the SZL
dictionary);
sending a switch-over signal by the transmitting location to the receiving
location to indicate a synchronization point in the sequence of
transmitted compressed records to enable the receiving location to freeze
the corresponding AZL dictionary for enabling an SZL process to expand
data following the synchronization point in the transmitted compressed
records; and
translating the frozen AZL dictionary in to a more processing-efficient
corresponding SZL dictionary at the transmitting location by copying child
character(s) of each extension character into an entry for the extension
character to reduce storage accesses for child entries in the dictionary
to speed up SZL processing for compressing subsequent records.
3. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 2, comprising the
step of:
converting each index detected in the SZL dictionary into a corresponding
index for a same character string in the frozen AZL dictionary, and
transmitting converted indices to the receiving location as the
transmitted compressed records to enable the frozen SZL dictionary to be
used at the receiving location for expanding data in the compressed
record.
4. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 2, comprising the
step of:
storing the indices and the switch-over signal in a transmission buffer at
the transmitting location in preparation for transmitting compressed data
to the receiving location.
5. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 2, comprising the
step of:
generating a switch-over signal as a predetermined value not used as an
index in the frozen AZL dictionary.
6. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 2, comprising the
step of:
completing operation by AZL and SZL processes upon detecting an end of a
sequence of records being inputted for compression.
7. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 2, comprising the
step of:
storing the frozen AZL dictionary in processor storage when a time to
freeze signal is provided.
8. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 2, comprising the
step of:
providing a time to unfreeze signal at the transmitting location by
detecting that greater data compression is available by switching from the
SZL process to the AZL process.
9. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 8, comprising the
step of:
retrieving the stored AZL dictionary from storage at the transmitting
location in response to a time to the unfreeze signal.
10. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 9, comprising the
step of:
generating a switch-over signal at the transmitting location in response to
a time to unfreeze signal.
11. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 10, comprising
the steps of:
buffering the transmitted indices and the switch-over signal in a
transmission buffer for transmission to each receiving location;
transmitting the switch-over signal to each receiving location to
synchronize a switching to the AZL process using the frozen AZL dictionary
at each receiving location with the switch-over at transmitting location;
and
invoking the AZL process at each location.
12. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 11, comprising
the step of:
inserting the switch-over signal into the transmitted compressed records to
synchronize the switch-over operations at both the transmitting and
receiving locations.
13. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 12, comprising
the step of:
counting N number of characters from the beginning of the AZL process to
generate a time to freeze signal when the count of N is reached at the
transmitting location.
14. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 12, comprising
the step of:
counting M number of characters from the beginning of the SZL process to
generate a time to unfreeze signal when the count of M is reached at the
transmitting location.
15. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 10, comprising
the steps of:
counting the number of characters in an inputted record provided to the AZL
process, counting the number of characters in a compressed record
generated from the inputted record, and obtaining a record compression
ratio by dividing the number of characters in the compressed record by the
number of characters in the inputted record; and
generating a time to freeze signal when the record compression ratio is
within a predetermined range of ratios for indicating a switch-over to the
SZL process.
16. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 10, comprising
the steps of:
counting the number of characters in an inputted record provided to the SZL
process, counting the number of characters in a compressed record
generated from the inputted record, and obtaining a record compression
ratio by dividing the number of characters in the compressed record by the
number of characters in the inputted record; and
generating a time to unfreeze signal when the record compression ratio is
outside of a predetermined range of ratios for indicating a switch-over to
the AZL process.
17. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 10, comprising
the steps of:
counting the number of characters in an inputted record provided to the AZL
process, counting the number of characters in a compressed record
generated from the inputted record, and obtaining a record compression
ratio by dividing the number of characters in the compressed record by the
number of characters in the inputted record;
averaging the last R number of record compression ratios generated for the
inputted sequence of records to obtain an average compression ratio; and
generating a time to freeze signal when the average compression ratio is
within a predetermined range of ratios for indicating a switch-over to the
SZL process.
18. A method of transmitting compressed data using a well-known Ziv-Lempel
(ZL) compression/expansion algorithm as defined in claim 10, comprising
the steps of:
counting the number of characters in an inputted record provided to the SZL
process, counting the number of characters in a compressed record
generated from the inputted record, and obtaining a record compression
ratio by dividing the number of characters in the compressed record by the
number of characters in the inputted record;
averaging the last R number of record compression ratios generated for the
inputted sequence of records to obtain an average compression ratio; and
generating a time to unfreeze signal when the average compression ratio is
outside of a predetermined range of ratios for indicating a switch-over to
the AZL process. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
INTRODUCTION
This invention relates to a semi-static manner of using the Ziv-Lempel (ZL)
process of compressing and expanding computer data.
BACKGROUND
The adaptive Ziv-Lempel (AZL) process is presently used by many commercial
computer systems to reduce the storage space required by data files. The
AZL process is disclosed in a number of prior filed patent applications,
such as in U.S. Pat. No. 4,814,746 to Miller and Wegman assigned to the
same assignee as the subject application. It describes and claims a ZL
compression dictionary using an LRU (least recently used) replacement
technique for continuously adapting (modifying) the dictionary to new
input data strings being compressed when the dictionary runs out of unused
entries to accommodate new input data character strings.
The ZL data compression process compares characters in the inputted
sequence to trace through respective entries in the dictionary to locate
the dictionary entries representing the longest matched input strings, and
to output the dictionary indices of these entries as the compressed data
stream. The indices may be buffered and then transmitted in bursts of
indices to maximize the efficiency of transmitting the compressed records
or messages.
The AZL process modifies a dictionary for each separately inputted sequence
of data characters, which may be the inputted characters in a data file,
or in a computer session being transmitted. The AZL process always
continues the construction and modification of its AZL dictionary WHILE
compressing data. An AZL dictionary is always changed by each new
character string in the input stream of data being compressed. When the
AZL dictionary becomes full, one of its valid string entries is deleted
and replaced by a representation of each new character string being
inputted.
The compressed output of the Ziv-Lempel process is represented by a
sequence of dictionary indices representing the character strings detected
in the input stream. Each index is expanded to the characters in the
string it represents by accessing the string entry at that index and using
pointers therefrom to access entries representing the characters in the
string in a reverse order. Because of LRU entry replacement in an AZL
dictionary, the index representation of a particular character string may
change, such as if an entry is aged out of the dictionary and later is put
back in the dictionary at a different index location when the string is
again encountered in the input stream.
If data compressed by the AZL process is being transmitted, its AZL
dictionary is not transmitted. Instead, an identical AZL dictionary (used
for expansion) is constructed from the received compressed data (indices)
at the receiving location in synchronism with the construction of the
identical AZL compression dictionary at the sending location. Synchronized
identity of the AZL dictionary structures is essential at both the sending
and receiving locations, in order to expand the received compressed data
into uncompressed data identical to the original uncompressed data.
Hence, the AZL process "dynamically adapts" its dictionary by continuously
updating it to each newly inputted input string in the sequence. This is
why the AZL dictionary is known as a "dynamic" or "adaptive" dictionary.
The AZL process may be used to perform compression/expansion at the same
location, or at two different locations connected by data transmission.
Only a single dictionary is used when performing compression/expansion at
the same location, but separate identical dictionaries are needed when
compression is done at one location and expansion is done at a different
location.
Accordingly, the AZL process generates its dictionary(s) WHILE the data is
being inputted for compression, whether the AZL process is performing
compression and expansion at the same location, or at two different
locations connected by data transmission.
An AZL dictionary can only handle records sequentially accessed from a data
base, since its structure is totally dependent on the sequential nature of
its inputted records. An AZL process does not expand records randomly
accessed from a data base in a different order than the records used in
the current construction of its AZL dictionary.
Thus, the AZL process is totally dependent on the sequential relationship
of the characters currently being inputted for compression. For example,
if the same set of records are inputted in two different sequences for
compression by the same Ziv-Lempel process, each sequence will generate a
different dictionary, and the same uncompressed record will likely
generate a compressed record containing a different set of indices in the
two sequences. Thus, the AZL process has an "input-record-order
dependency", due to the AZL dictionary(s) being generated DURING the
inputting of uncompressed data for compression.
A computer operates inefficiently with the AZL process when its input
stream is comprised of "randomly-obtained" data, such as
"randomly-accessed" small records or "randomly-determined" small messages
represented in a data base. AZL processing must build a new AZL dictionary
for every sequence of "randomly-obtained" records or input message. This
is because the building an AZL dictionary DURING the inputting of a
"random-obtained" sequence of records (or messages) ties the dictionary to
that particular sequence, and it cannot be used efficiently with any other
"randomly-obtained" sequence of records. This prevents an AZL dictionary
generated for any randomly-obtained data from being used with any other
randomly-obtained data.
The computer-efficiency problem with records randomly-obtained from a data
base is solved by the static Ziv-Lempel (SZL) process described and
claimed in patent application Ser. No. 07/968,631 entitled "Method and
Means Providing Static Dictionary Structures for Compressing Character
Data and Expanding Compressed Data", filed Oct. 19, 1992 and assigned by
the same assignee as the subject application.
The SZL process generates its SZL dictionary(s) BEFORE (and not while) it
is used to compress or expand any record in the data base. The SZL
dictionary is generated from ALL uncompressed records in a data base. The
resulting compressed records may be stored to provide a compressed form of
that data base which occupies only a fraction of the storage of the
uncompressed data base.
The SZL process allows any individual record in the data base (whether
compressed or not) to be randomly accessed and expanded independent of
other records, using the SZL dictionary without change (and no computer
processing is used for any dictionary generation or updating). The
compressed records may be expanded at the same location, or transmitted to
another location where the expansion is done.
By generating an SZL dictionary from an entire uncompressed data base PRIOR
TO using the data base, the same compressed record (same indices) is
obtained regardless of the order in which that record is later obtained.
Hence, it may later be randomly-obtained from the data base without any
affect on its compressed form, unlike in the AZL process.
Accordingly, the SZL process generates an "SZL dictionary" that represents
all of the strings within the records in a data base. (The term "records"
is used in a generic sense to include any recorded sequence of characters,
such as may be found in messages accessed from a data base.)
The invention in application Ser. No. 07/968,631 also discovered that an
SZL dictionary used for compression need not be identical to an SZL
dictionary used for expansion, as long as they use the same indices to
represent the same character strings. That is, the content of SZL
dictionary entries may be different for corresponding entries at the same
index in the separate compression and expansion dictionaries. These paired
SZL compression and expansion dictionaries are herein referred to as
"corresponding" dictionaries, which may or may not be identical.
Increased computer efficiency is obtained by using separate corresponding
SZL compression and expansion dictionaries, rather than identical
dictionaries. Corresponding SZL compression and SZL expansion dictionaries
may be used at the same location, or at different locations connected by
data transmission.
If the expansion process is done at a location different from the
compression process, all SZL corresponding dictionaries needed at the
different locations may be constructed at any location having the
uncompressed data base, and then the SZL dictionaries may be transmitted
to any location wherever needed. Or if an identical copy of the
uncompressed data base exists at plural locations, the corresponding
dictionary(s) need at the location may be constructed there.
In a data transmission environment, the SZL dictionary may be transmitted
to a receiving location after the dictionary is generated from the entire
uncompressed data base at the sending location and before using the SZL
dictionary. An SZL dictionary can be constructed at a receiving location
only if the receiving location has the same uncompressed records and
inputs them to the SZL process in the exact same input sequence as is used
to generate the SZL dictionary at the sending location.
A "compressed-version of the data base" may be generated simultaneously
with generation of the SZL dictionary(s). The compressed version of the
data base may be used to randomly or sequentially obtained any record in
the data base in compressed form.
No SZL dictionary is thereafter constructed during random-accessing
operations of the data base, whether uncompressed or compressed records
are being randomly-obtained at a location which is to transmit the record
in compressed form. That is, the prior-constructed SZL dictionary(s) can
then be used for the compression and/or expansion of records or messages
randomly-obtained at any location.
Furthermore, the SZL process may also be used to compress and expand new or
changed records or messages in the uncompressed data base AFTER the
generation of the SZL dictionary. Any new character string in such new or
changed record is compressed and expanded as containing one or more
existing smaller character strings currently represented in the dictionary
(due to being in the prior version of the data base).
The SZL process has been found to operate efficiently with
randomly-obtained small compressed records, or messages, that need to be
compressed and expanded at the same location, or need to be compressed at
one location and transmitted to another location where the record is
expanded.
Previously, it had been presumed that poor data compression would result
from the Ziv-Lempel process if the AZL process was not used to
continuously adapt its dictionary to its input data. This has not been
found to be the case with SZL processes, as long as the data base does not
change by a large amount. Thus, it has been found that the SZL process
effectively compresses records randomly accessed from a data base as long
as an excessive number of changes have not been made in the data base.
Thus, the SZL process allows an existing SZL dictionary(s) to be used with
any sequence of records or messages randomly obtained in a large data base
without constructing or modifying any new dictionary. That is, no
processing is spent on modifying the SZL dictionary structure while using
the SZL process for compressing and expanding any sequence of records. On
the other hand, the AZL process requires a large amount of computer
processing for modifying its AZL dictionary to adapt it to each sequence
of randomly-obtained records. The result is that the AZL process is not as
efficient as the SZL process, when used with new sequences of
randomly-obtained small records and messages.
While the SZL process is being used, old uncompressed records may be
updated and new uncompressed records may be added in the corresponding
uncompressed data base at a central location containing the data base.
Yet, the existing SZL dictionary(s) and any corresponding dictionary(s) at
this and any other location need not be updated, in order to use these
dictionaries to compress and expand the new and updated records in the
data base, as long as the SZL dictionary(s) is not changed.
Computer operating efficiency is further enhanced by use of a novel
structure for entries in the SZL dictionary disclosed and claimed in
patent application Ser. No. 07/968,631, which obtains a further
significant performance increase for a computer system (over prior art AZL
processes). This novel SZL entry structure enables fewer accesses in the
SZL dictionary than is required in conventional AZL dictionaries--for
character-string-compare determinations within the dictionary entries.
Reducing the comparative number of storage accesses for dictionary entries
in memory (in addition to not having to spend processing on modifying the
dictionary) enables this SZL process to be much faster than the AZL
process (using computers having the same instruction execution rate).
SUMMARY OF THE INVENTION
The subject invention operates with a sequence of transmitted records
(including messages). They may be obtained from a large data base. The
sequence of records may be originated and transmitted by a human. The
records (messages) may be randomly-selected from a distantly-located
uncompressed, or a compressed, data base by commands issued in a sequence
by a human to a computer system, which may then then transmit them to the
human in that sequence. The sequence may have a significant total number
of characters (for example, over 1000 characters), even though any
particular record or message in the sequence may be very small (for
example, 3 characters).
The primary object of this invention is to reduce the amount of computer
processing (over what was previously required using AZL processes) at both
a sending location (retrieving and compressing data) and a receiving
location (expanding the received compressed data), where the transmitted
data is a sequence of records or messages.
Another object of this invention is to significantly speed up the data
transmission of a sequence of uncompressed records, or messages, by not
interrupting the transmission for computer processing eliminated by the
SZL process (which is required by the AZL process).
A further object of this invention is to avoid having to previously
generate an SZL dictionary for an entire data base in order to obtain some
of the advantages of the SZL process.
Still another object of this invention is to use the AZL process to
generate a limited SZL dictionary from only an initial subset of
randomly-obtained records or messages obtained from a data base, and then
be able to use the limited SZL dictionary for controlling the compression
and expansion of subsequent records in a sequence of records being
transmitted. Performance efficiency is increased in freezing an AZL
dictionary and operating it as an SZL dictionary to eliminate the
processing cycles of modifying the dictionary for each new input string.
The eliminated processing cycles increases the system processing
efficiency at least until a significant number of new strings are inputted
and not found in the SZL dictionary, after which the dictionary may be
changed to an AZL dictionary and further updated with new strings.
A further object of this invention is to provide a semi-static Ziv-Lempel
process for use in data transmission by connecting the AZL and SZL
processes with a switch-over control.
The invention starts by using an AZL process to compress an initial part of
an inputted sequence of records or messages until a switch-over time is
detected when the compression operation can be done more efficiently by
the SZL process. At AZL to SZL switch-over time, the AZL process is ended,
and a switch-over is made to the SZL process. An optional AZL-to-SZL
dictionary translation process may be used to re-organize the SZL
dictionary into the form taught in application Ser. No. 07/968,631, and
the frozen form of the AZL dictionary may then be saved. The SZL process
is then used until either the end of the inputted sequence, or an SZL to
AZL switch-over signal is generated to signal a switch-over from the SZL
to the AZL process, in which case the saved AZL dictionary may be
retrieved and the AZL process in again invoked.
Thus, the SZL to AZL switch-over may be invoked whenever the SZL dictionary
is found to need updating to maintain compression efficiency, due to an
excessive amount of new strings occurring in the input stream. Even if an
AZL to SZL dictionary translation was done in the AZL to SZL switch-over,
an SZL to AZL switch-over does not need any dictionary translation process
if the last version of the AZL dictionary has been saved, because last AZL
dictionary represents the same data strings as are represented in the
current SZL dictionary.
Any of several different ways may be used to determine the occurrence of
the time to freeze, and the time to unfreeze the dictionary. One way to
detect the time to freeze and unfreeze is by a character count, or record
count, for the input stream, and to signal it upon detecting a
predetermined count. The time to freeze may be signalled with a different
count than the time to unfreeze. The counting technique is simple and
operable, but is not the most compression-efficient way. A more precise
way, although a more complicated way, is to monitor the compression ratio
for each compressed record generated by the current compression process.
The time to freeze may be determined when the compression ratio falls
within an AZL-to-SZL predetermined range of compression ratios. And, time
to unfreeze may be determined when a subsequent record has a compression
ratio which falls outside of a predetermined SZL-to- | | |