BACKGROUND OF THE INVENTION
This invention relates generally to Electronic Data Interchange ("EDI")
systems and more particularly to a novel EDI translation system.
EDI can be defined as the paperless, computer application to computer
application, inter- and intra-organizational exchange of business
documents, such as purchase orders and invoices, in a structured,
application-processable form. An EDI document can be sent directly to a
business partner's computer over a communication line. EDI provides many
benefits, including: a) speed--documents are sent and received almost
immediately; b) accuracy--documents are received as they were transmitted,
eliminating manual rekeying of data and attendant errors; c) cost
reduction--rapid document turnaround allows more accurate planning of
inventory levels and reduces inventory reorder time; d) increased
productivity--employees are freed from paperwork and available for other
tasks; e) simplified broadcast communications to multiple trading partners
(such as sending a request for proposal); f) directness of
communication--data is routed directly from the person placing an order to
the data processing system of the receiving organization; and g) data
integration--data in documents can be integrated directly with existing
business information and data processing systems.
EDI involves three essential components: a) EDI standards; b)
communications means; and c) an EDI translation system. EDI standards can
be divided into formatting standards, dictionary standards, and
communications enveloping standards. Formatting standards govern: a) what
documents can be communicated; b) what information is to be included; and
c) how the information is to be sequenced and presented.
Dictionary standards specify the meaning of the various elements being
combined by the formatting standard. Communications enveloping standards
define how to group documents together into larger units. Communications
enveloping saves on addressing by grouping a number of messages meant for
the same destination from a specific source. The communications enveloping
standard can also provide password security not present in paper forms of
communication. Any of the standards can be proprietary standards (limited
to one organization and its trading partners) or common EDI standards
(adopted by industry-wide or cross-industry users).
EDI standard documents from and to which trading partner data is converted
have detailed definitions in the pertinent EDI standard. EDI standards are
maintained by various standards maintenance organizations. Examples of
such standards (and maintenance organizations) are the ANSI X12 standard
(developed by the American National Standards Institute's Accredited
Standards Committee's X12 group), the UN-EDIFACT standard (Electronic Data
Interchange for Administration, Commerce, and Transport, an international
standard based on ANSI X12 and the Trade Data Interchange standards used
in Europe), the Uniform Communications Standards ("UCS"), and TDCC
(developed by the Transportation Data Coordinating Committee). The
standards are not invariant--they continue to evolve to meet the changing
needs of information transfer.
Standards may have different terminology. There are, however, similarities
in the meaning associated with the terms. Whether termed a transaction set
(ANSI X12), a standard message (EDIFACT), or a document (UCS), there is an
electronic representation of a paper business document. A unique
identifier code is assigned in the standard for each type of business
document. As an example, in the ANSI X12 standard an invoice is referred
to as X12 document number X12.2, with a transaction set identification
code of 810.
EDI standards are developed using a set of abstractions: a) there is a unit
of data called a data element; b) data elements can be grouped into
compound data elements; c) data elements and/or compound data elements are
grouped into data segments; d) data segments can be grouped into loops;
and e) loops and/or data segments are grouped into a business document.
The abstraction is based on an analogy to a paper document. Paper documents
can be considered to have three distinct areas: the heading area, the
detail area, and the summary area. In many cases the detail area consists
of repeating groups of data elements. For example, in an invoice, the
elements are the items being invoiced and are usually printed as lines in
a columnar list. In the terminology used in the standards, these repeating
groups are loops. Grouping data elements into loops proves unwieldy
because of the number of data elements that must be considered. The
standards therefore group data elements into data segments and compound
data elements.
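The nesting of these abstractions can be sketched as a small set of record types. This is a minimal illustration in Python, not the notation of any standard; the class names, the segment identifiers (HDR, ITM, DSC, SUM), and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class DataElement:
    value: str


@dataclass
class CompoundElement:
    components: List[DataElement]


@dataclass
class Segment:
    segment_id: str
    elements: List[Union[DataElement, CompoundElement]]


@dataclass
class Loop:
    # One iteration of a repeating group, e.g. one invoice line.
    loop_id: str
    members: List[Union[Segment, "Loop"]]


@dataclass
class Document:
    doc_type: str
    body: List[Union[Segment, Loop]]


def count_segments(items: List[Union[Segment, Loop]]) -> int:
    """Count segments, descending into nested loops."""
    total = 0
    for item in items:
        if isinstance(item, Segment):
            total += 1
        else:
            total += count_segments(item.members)
    return total


# Heading area, two detail-area loop iterations, summary area.
invoice = Document("810", [
    Segment("HDR", [DataElement("INV001")]),
    Loop("LINE", [
        Segment("ITM", [DataElement("WIDGET"), DataElement("10")]),
        Segment("DSC", [DataElement("blue")]),
    ]),
    Loop("LINE", [
        Segment("ITM", [DataElement("GADGET"), DataElement("2")]),
    ]),
    Segment("SUM", [DataElement("12")]),
])
```

Each Loop instance here holds one repetition of the group, so an invoice with two line items carries two LINE loops between its heading and summary segments.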
Transaction set standards specify whether data segments are mandatory,
optional, or conditional and indicate whether, how many times, and in what
order a particular data segment can be repeated. The transaction set
standard does not specify the content of individual data segments.
Instead, a segment directory identifies the specific data elements to be
included in each data segment. The segment directory is composed of a
series of data segment diagrams, each of which identifies the data
elements to be included in a data segment, the sequence of the elements,
whether each element is mandatory, optional, or conditional, and the form
of each element in terms of the number of characters and whether the
characters are numeric or alphabetic.
Data segment diagrams include the following components. The data segment
identifier identifies the data segment being specified. The data element
separator is a user-selected character that precedes each constituent data
element and serves as a position marker. The data segment terminator is a
user-selected character used to signify the end of the data segment.
Element diagrams describe individual data elements.
Depending on the standard, element diagrams can define an element's name, a
reference designator, a data dictionary reference number specifying the
location in a data dictionary where information on the data element can be
found, a requirement designator (either mandatory, optional, or
conditional), a type (such as numeric, decimal, or alphanumeric), and a
length (minimum and maximum number of characters). A data element
dictionary gives the content and meaning for each data element.
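A segment diagram and its element diagrams can be read as a small validation table. The following Python sketch splits a raw segment on its separator and terminator and then checks each element against a requirement, type, and length. The names ElementSpec, split_segment, and validate, the "*" and "~" characters, and the sample specifications are assumptions made for illustration; conditional requirements are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ElementSpec:
    name: str
    requirement: str  # "M" mandatory or "O" optional
    kind: str         # "N" numeric or "AN" alphanumeric
    min_len: int
    max_len: int


def split_segment(raw: str, separator: str = "*",
                  terminator: str = "~") -> Tuple[str, List[str]]:
    """Return the segment identifier and the list of element values."""
    if not raw.endswith(terminator):
        raise ValueError("missing segment terminator")
    parts = raw[:-len(terminator)].split(separator)
    return parts[0], parts[1:]


def validate(elements: List[str], specs: List[ElementSpec]) -> List[str]:
    """Check each element position against its specification."""
    errors = []
    for pos, spec in enumerate(specs, start=1):
        value = elements[pos - 1] if pos <= len(elements) else ""
        if value == "":
            if spec.requirement == "M":
                errors.append(f"{pos:02d} {spec.name}: mandatory element missing")
            continue
        if not spec.min_len <= len(value) <= spec.max_len:
            errors.append(
                f"{pos:02d} {spec.name}: length {len(value)} "
                f"outside {spec.min_len}..{spec.max_len}")
        if spec.kind == "N" and not value.isdigit():
            errors.append(f"{pos:02d} {spec.name}: not numeric")
    return errors


# Hypothetical segment diagram for an invoice heading segment.
HEADING_SPECS = [
    ElementSpec("invoice date", "M", "N", 6, 6),
    ElementSpec("invoice number", "M", "AN", 1, 22),
    ElementSpec("purchase order number", "O", "AN", 1, 22),
]

segment_id, values = split_segment("HDR*900115*INV001~")
problems = validate(values, HEADING_SPECS)
```

An empty position between two separators, as in "HDR**INV001~", is how an omitted element keeps its place in the sequence, which is why the separator doubles as a position marker.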
EDI standard documents are electronically packaged or "enveloped" for
transmittal between trading partners. Enveloping can be at several levels.
The first, or innermost, level of enveloping separates one document from
another. This is accomplished by attaching transaction set headers and
transaction set trailers to each transaction set, or document.
At a second level of enveloping, documents can be packaged together into
groups known as functional groups. An example of a functional group is a
purchase order and an invoice, which are often sent together in both the
paper and EDI worlds. Each functional group is packaged with a functional
group header at its beginning and a functional group trailer at its end.
This second level of enveloping is an optional level in most standards.
At a third level of enveloping, all functional groups to be sent to a
single trading partner can be packaged together. This enveloping consists
of an interchange control header and an interchange control trailer
bounding the packaged functional groups and/or documents.
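The three levels of enveloping amount to wrapping lists of segments in successive header and trailer pairs. The sketch below, in Python, borrows the ANSI X12 segment names for the pairs (ST/SE, GS/GE, ISA/IEA) purely as labels; the header and trailer contents are heavily simplified and would not pass as valid X12, and the function names are hypothetical.

```python
from typing import List


def wrap(header: str, body: List[str], trailer: str) -> List[str]:
    """Bound a list of segments with a header and a trailer."""
    return [header, *body, trailer]


def envelope_interchange(groups: List[List[List[str]]]) -> List[str]:
    """Envelope functional groups of documents into one interchange.

    groups is a list of functional groups; each group is a list of
    documents; each document is a list of segment strings.
    """
    interchange_body: List[str] = []
    for group_no, group in enumerate(groups, start=1):
        group_body: List[str] = []
        for doc_no, document in enumerate(group, start=1):
            # First level: one header/trailer pair per document.
            group_body += wrap(f"ST*{doc_no}", document, f"SE*{doc_no}")
        # Second level: one pair per functional group.
        interchange_body += wrap(f"GS*{group_no}", group_body, f"GE*{group_no}")
    # Third level: one pair for everything sent to the trading partner.
    return wrap("ISA", interchange_body, "IEA")


interchange = envelope_interchange([[["DOC-A"], ["DOC-B"]]])
```

De-enveloping is the reverse walk: the outer pair is stripped first, then each group pair, leaving individual documents.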
The second component of EDI is communications means. EDI standard documents
are transmitted electronically between trading partners' computers. The
transmittal can be directly between trading partners, via a direct private
network in which the computers are linked directly. Direct networks become
difficult to maintain with larger numbers of trading partners. The
alternative is to use a third-party, or value-added network (VAN). A VAN
maintains an electronic mailbox for each trading partner that can be
accessed by each other partner (with appropriate security restrictions).
The standard does not specify a communications standard except to the
extent to which it describes the enveloping standard and the way in which
transmissions can be acknowledged. There are de-facto standards such as
IBM 2780/3780 BSC protocols used as file transfer protocols. Virtually
every EDI VAN supports these protocols because of their pervasive use.
The third component of EDI is the EDI translation system, which performs at
least the functions of data communication and document translation.
Document translation is the most significant function in this component,
and is implemented through EDI translation software.
In practice, participants in EDI select and agree to use a proper subset of
the document standard. Such agreements among trading partners are too
numerous to be included in EDI translation software. The EDI translation
software therefore needs to provide a way for the user to express how
to process or generate the EDI document in compliance with the agreement.
One example of the need for compliance with a trading partner agreement is
when one trading partner is to automatically post an EDI document to a
processing system. A purchase order received from a customer could be
booked automatically through the recipient's order entry system. Since the
agreement and the interface requirements to the order entry system are not
defined in the EDI translation software, the user of the EDI translation
software must specify how to perform the translation.
The EDI translation software must provide three functions in conjunction
with giving the user a means of expression for performing the translation.
First, the software must provide mechanisms to navigate and manipulate EDI
documents. Second, the software must be able to produce the records that
are to be interfaced with the order entry system. Third, the system must
provide the user with means to express (using these primitives) a complete
procedure for transforming an EDI document into something that is
compatible with the user's interfacing requirements.
In some cases a user may find it advantageous to automatically print an EDI
document on paper in a format that is familiar to the user. For example,
it may be helpful to print an invoice because the accounts payable system
is implemented manually. As in the discussion above, the translation
software does not include a definition of the format in which the user
expects the output. The user must therefore specify this information.
The development of prior EDI formatting or translation software can be
traced through three generations--translation software of the invention
represents the fourth generation. The first generation of software was
developed in the late 1970s and early 1980s to support a variety of
private formats. The private formats were developed by large corporations
to allow them to exchange business documents with their trading partners.
The formats did not conform to any industry standards. The software used
in these systems resembled subsystems of existing general business
applications.
These first generation systems typically involved the exchange of
private-format data files and associated data processing programs. Trading
partners typically communicated directly with each other over private
networks. Examples of first generation systems include General Motors' and
Ford's automotive "release" systems and retail industry order processing
systems, such as those used by Sears, J.C. Penney, and K Mart.
The second generation was characterized by the introduction of variable
length, hierarchical document standards such as TDCC (used in the
transportation industry) and UCS (used in the grocery industry), which
created a need for a more generalized approach to translating those
standards into computer-processable business forms.
Translators of this generation typically employed fixed data files
transferred to an intermediate process that translated the document into a
form usable by host applications. The translator's primary task was to
convert the records in the data files from variable length format to a
fixed length format that could be processed by traditional batch
applications. These translators were known colloquially as "asterisk
strippers" because they had no capability to manipulate data between
records or to change the placement of data within records.
A major shortcoming of these second generation translators was that a large
amount of additional intermediate processing was required, typically
involving programming tasks to integrate the EDI document into an
application system. Networks and major trading partners tended to overlook
this problem in their efforts to expand trading partner relationships. The
dissatisfaction of end users with the high software maintenance demands of
this post-translation processing requirement led to the development of the
third generation of software.
The third, or current, generation of translation software attempts to
provide higher levels of transform capability so that an EDI document can
be put into a form closer to the input required by the user's integrated
application software. These translators provide for table-driven systems
and "dynamic mapping."
The translation software uses a table structure to perform the translation.
The tables consist of the standard data dictionary and syntax rules for
the data segments and elements of a given EDI transaction set. The
software selects the appropriate table to perform the translation for a
specified EDI transaction set to be generated.
Dynamic mapping allows the user to identify the relationship of elements
within a segment to fields in an application input document and vice
versa. Instead of fixing record lengths, the systems allow the user to put
data elements into different data files in any location. Rather than being
limited to a single fixed length file in a transaction set, the user can
select data from multiple files, in any order within the file, and present
the data to the translator.
The ACS Network Systems EDI 4XX product, available from ACS Network Systems
of Concord, Calif., is typical of third generation software products. The
data communication component of ACS 4XX provides the means to generate and
maintain a communication line directly with a trading partner or to a
third party data network and the means to control the process of sending
and receiving documents to and from trading partners. The translation
component of ACS 4XX translates incoming standard business documents from
an EDI Standard format to a format usable by applications programs and
reverses the process for outgoing data.
These third generation systems suffer from several problems. First, the
dynamic links between fields in application databases and data elements in
EDI documents are unconditional--the field mapping or linking is construed
to be constant in any given application. The systems are therefore unable
to represent the conditional expressions that appear as "notes" in almost
all standards. For example, a segment definition may have a conditional
note that specifies that either the second data element must be present or
both the third and fourth elements must be present. Although some
translators have attempted to comply with these notes through the actual
translator software code, this technique has proven inadequate and
difficult to maintain as standards and documents evolve. Further, the
standards definitions employed by these systems have relied on a data base
schema of the standards definition. This is a relatively inflexible
approach that does not readily accommodate the continual evolution of the
standards.
Second, these systems assume that EDI input will require non-EDI, or
application, output. This prevents the systems from acting as true
translators, capable of communicating any type of input and any type of
output. Similarly, the communication components of such systems assume
that their only interfaces would be with EDI-capable networks. This
precludes straight file transfer capability between computers, and
prevents the systems from acting in a terminal emulation mode to interface
with other computers. Further, the communications interface does not
provide the structure for unattended operation; the systems cannot act as
passive communications systems for receiving calls from outside systems.
There is also no method for allowing an outside application to send the
system data for translation into EDI format.
Third, there can be structure or sequence clash between the EDI document
and the application transaction's structure requirement. The previous
generation systems provide a tool for specifying transformation of data
through the use of a mapping program. The mapping program includes
assumptions about the correspondence of structure and sequence between
source data and target data. The tool therefore cannot be used to
translate data with a structure or sequence clash between the source data
and target data.
EDI documents are defined using a specification that prescribes a sequence
and structure to the document. When the application transaction's
structure and sequence differs from the corresponding EDI document
specification, the user is required to develop a set of programs that deal
with the structural or sequence difference to achieve a seamless interface
to the application.
For example, a retail store may send to a manufacturer purchase orders with
store distribution specifications attributed to each line item. If the
manufacturer requires separate paperwork for each shipping location, the
grouping of data as expressed by the EDI document clashes with the
manufacturer interface requirement. The clash is caused by the different
structure assumed when producing the EDI document (which specifies how to
distribute the order for a specific item to multiple shipping locations)
and the structure assumed by the order entry system (which assumes that an
order document contains items for a specific shipping location).
Third generation systems do not attempt to address this problem. The
general approach is to produce a file that contains the necessary data in
a rigid, hierarchical structure and to force the user to develop a set of
programs using whatever programming language and utilities are available
to change the structure. Some of these programs can become very
complicated.
Fourth, third generation translators generally have limited pattern
matching capability. This reduces the scope of acceptable types of input.
The pattern matching of application transaction files is very limited.
Records are identified using a strictly specified value in a specific
position in the record. Therefore, unless the application transaction file
created from the computer application contains these very specific
value(s) (e.g., H01, H02, D01, D02, etc.), an interface program must be
developed to supply data to the translator.
Fifth, the current systems have a strict one-to-one correspondence between
source and target documents. The systems therefore cannot receive a
document, such as a shipping notice, and generate multiple transactions,
such as a receiving notice, an inspection notice, and an invoice notice.
Sixth, prior systems have limited capacity for evaluating performance
errors. For example, a user who creates a mapping specification with a
table driven or dynamic mapping scheme may find during operation that some
of the data elements are output incorrectly--the value of a data element
for one document may turn up in another document. It can be difficult for
the user to find the source of the error because the systems provide no
debugging mechanism for finding the problem. While a debugger capability
could theoretically be added, it would be onerous to implement because the
structure of the systems (being non-language based) is not susceptible to
a debugging facility.
Finally, the third generation translators limit operations to simple
assignment semantics. There is no provision for performing arithmetic
operations on source data elements to produce a target data element.
Logical operations and string manipulation are usually not
provided.
SUMMARY OF THE INVENTION
The invention overcomes the problems with the prior generation systems with
an EDI translation method that receives data from a source in one format,
executes a script to translate the data into a second format, and
transmits the translated data to a destination. The system has the
ability to transform input data from, and into, virtually any format and
to produce more than one output for each input document. The system
employs a tree data structure and a script of translation instructions to
overcome the hierarchical data structure limitations imposed by the prior
generation systems.
The system can pattern-recognize records that are not explicitly
differentiated. It provides flexibility in using virtually any
communication system to communicate EDI as well as non-EDI documents. The
system employs a model that assumes that both input and output are to be
communicated and is therefore not constrained for use with a particular
processor, but can instead act as a true communication front end.
The system addresses the difficulty that prior systems had in expressing
syntax notes associated with segments in relational terms. This is done by
expressing the notes using language constructs that apply logical operators
to data elements. For example, the semantic expression "(-or 2 (-and 3 4))"
means "either the second element or both the third and fourth elements
must be present."
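How such a note might be evaluated against a segment can be sketched in Python. The parser and evaluator below are illustrative only and are not the e-language implementation; the function names are hypothetical, "-or" is assumed to be inclusive, and an element is assumed to be "present" when its position holds a non-empty value.

```python
import re
from typing import List, Union

Note = Union[int, list]


def parse_note(text: str) -> Note:
    """Parse a parenthesized prefix expression into nested lists."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read() -> Note:
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = []
            while tokens[pos] != ")":
                node.append(read())
            pos += 1  # consume the closing parenthesis
            return node
        return int(token) if token.isdigit() else token

    tree = read()
    if pos != len(tokens):
        raise ValueError("unexpected text after expression")
    return tree


def note_holds(note: Note, elements: List[str]) -> bool:
    """Evaluate a note; integers are 1-based element positions."""
    if isinstance(note, int):
        return note <= len(elements) and elements[note - 1] != ""
    operator, *operands = note
    results = [note_holds(operand, elements) for operand in operands]
    if operator == "-or":
        return any(results)
    if operator == "-and":
        return all(results)
    raise ValueError(f"unknown operator {operator!r}")


rule = parse_note("(-or 2 (-and 3 4))")
```

Because the note is data rather than translator code, a change in the standard only requires editing the expression, not the evaluator.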
The system addresses the inability of prior systems to handle structure
clash between an EDI document and an application's data structure
requirement by providing data and control structures. The control
structures include such basic structures as executing a series of commands
while some predicate condition exists.
The system supports data structures such as multi-dimensional arrays and an
EDI tree data structure. The EDI tree data structure represents an
instance of an EDI document. The tree provides random access operation on
the document allowing the user to specify the sequence in which access to
the EDI document is to be made. The system further provides a set of
access primitives. These two tools allow the user, for example, to read a
document and process it repeatedly, once for each of several shipping
locations.
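The shipping-location case can be illustrated with a short Python sketch. A nested dictionary stands in for the EDI tree, and the document is walked once per shipping location to produce one order per location. The field names, the sample purchase order, and the function name are hypothetical, and the access primitives of the system are not reproduced here.

```python
from typing import Dict, List

# Source structure: each line item carries its own store distribution.
purchase_order = {
    "po_number": "PO123",
    "items": [
        {"sku": "A1", "distribution": [("STORE1", 10), ("STORE2", 5)]},
        {"sku": "B2", "distribution": [("STORE2", 7)]},
    ],
}


def orders_by_location(po: Dict) -> List[Dict]:
    """Produce one order per shipping location from one purchase order."""
    # First pass: collect locations in order of first appearance.
    locations: List[str] = []
    for item in po["items"]:
        for location, _quantity in item["distribution"]:
            if location not in locations:
                locations.append(location)

    # One further pass over the whole document for each location.
    orders = []
    for location in locations:
        lines = [
            (item["sku"], quantity)
            for item in po["items"]
            for ship_to, quantity in item["distribution"]
            if ship_to == location
        ]
        orders.append({"po_number": po["po_number"],
                       "ship_to": location,
                       "lines": lines})
    return orders


orders = orders_by_location(purchase_order)
```

The repeated passes are only possible because the whole document is held in memory and can be revisited in any order, which a single sequential read of the source would not allow.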
The system facilitates support for relational level notation in EDI
document definitions that specify hierarchical nesting, such as EDIFACT
standard documents. It supports implicit as well as explicit notation in
EDI document definitions. The tree access notation enables the EDI
programmer to reference EDI data elements so that a language can support
it as one of its data types. Without such a notation, support for EDI in a
programming language would be difficult. Nesting and repetition of
segments is discussed in ISO Publication 9735, pp. 6-9 (1988).
The system also overcomes the limited ability of prior systems to pattern
match data by providing a pattern matching capability that can read a
fairly large set of patterns that follow an LL(1) grammar (for a
description of LL(1) grammars, see Aho, Sethi, & Ullman, Compilers:
Principles, Techniques, and Tools, § 4.4 (Addison-Wesley 1986)).
The user can formulate a filemap definition and record definitions to
express the expected pattern. The system also allows the user to generate
as many target documents as required from a single source document.
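As an illustration of recognizing records by pattern and by position in a grammar, the Python sketch below classifies each line of an application file with a regular expression and then parses the record sequence with one record of lookahead. The grammar (file -> order*, order -> HEADER DETAIL+ TRAILER), the record layouts, and all names are assumptions for this example; they are not the filemap or record definition syntax of the system.

```python
import re
from typing import Dict, List, Tuple

# Record definitions: a kind is recognized by its shape, not by a code
# stored in a fixed column.
RECORD_PATTERNS = [
    ("HEADER", re.compile(r"^ORDER\s+\S+")),
    ("DETAIL", re.compile(r"^\s+\S+\s+\d+$")),
    ("TRAILER", re.compile(r"^TOTAL\s+\d+$")),
]


def classify(line: str) -> str:
    for kind, pattern in RECORD_PATTERNS:
        if pattern.match(line):
            return kind
    raise ValueError(f"unrecognized record: {line!r}")


def parse_file(lines: List[str]) -> List[Dict]:
    """Parse file -> order*, order -> HEADER DETAIL+ TRAILER."""
    records: List[Tuple[str, str]] = [(classify(line), line) for line in lines]
    pos = 0

    def peek() -> str:
        # The single lookahead symbol that drives every parsing decision.
        return records[pos][0] if pos < len(records) else "EOF"

    def expect(kind: str) -> str:
        nonlocal pos
        if peek() != kind:
            raise ValueError(
                f"expected {kind}, found {peek()} at record {pos + 1}")
        line = records[pos][1]
        pos += 1
        return line

    orders = []
    while peek() == "HEADER":
        header = expect("HEADER")
        details = [expect("DETAIL")]
        while peek() == "DETAIL":
            details.append(expect("DETAIL"))
        expect("TRAILER")
        items = [(d.split()[0], int(d.split()[1])) for d in details]
        orders.append({"order": header.split()[1], "items": items})
    if peek() != "EOF":
        raise ValueError(f"unexpected {peek()} at record {pos + 1}")
    return orders


parsed = parse_file([
    "ORDER 1001", "  WIDGET 5", "  GADGET 2", "TOTAL 7",
    "ORDER 1002", "  BOLT 100", "TOTAL 100",
])
```

Each parsed order could then drive the generation of one or several target documents, since nothing ties the number of outputs to the number of inputs.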
The system also overcomes the failure of the prior systems to provide a
debugging capability. Since the system is language based, a conventional
debugging facility can be readily provided.
Finally, the system performs a broader range of operations than the
prior systems and provides expressions such as those involving logical and
arithmetic operators and string manipulation. Data transformation is
enabled by use of an assignment statement. The assignment statement
accepts an expression on the right hand side as in:
a=b+c
This statement specifies that a is computed from b plus c. The elements a,
b, and c could be elements from the source or target, although a is likely
to be an element of the target and b and c are likely to be elements of
the source. In this example, a "+" operation is used in the right hand
side expression. The system supports many such operations including
"substring" to extract a portion of a string. These operations are termed
language "primitives". The set of language primitives depends on the
nature of the problem. However, by employing a programming language
implementation, the invention allows a user to combine these basic
operations into complex expressions. This expressiveness allows the user
to specify more complex transformations.
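The effect of combining primitives in assignment statements can be shown with a short Python analogue. This is not e-language syntax; the field names and the rule for the "rush" flag are invented for illustration.

```python
from decimal import Decimal
from typing import Dict


def translate_line(source: Dict[str, str]) -> Dict[str, object]:
    """Build target elements from expressions over source elements."""
    target: Dict[str, object] = {}
    # Arithmetic: a target element computed from two source elements.
    target["extended_price"] = (
        Decimal(source["quantity"]) * Decimal(source["unit_price"]))
    # String manipulation: a substring of a source element.
    target["region"] = source["store_code"][:2]
    # Logical operators combining a comparison with a computed value.
    target["rush"] = (source["priority"] == "1"
                      and target["extended_price"] > 100)
    return target


line = translate_line({
    "quantity": "3",
    "unit_price": "49.50",
    "store_code": "NE0421",
    "priority": "1",
})
```

A translator limited to simple assignment could copy quantity and unit_price across but could not produce any of the three target elements computed here.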
The system is organized into four component work centers: a) Communications
Interface (having a communication session as its input work unit); b)
De-enveloping (having an interchange as its input work unit); c)
Translation (having a document as its input work unit); and d) Enveloping
(having an enveloping request as its input work unit).
The communications interface work center uses a script to schedule a
communication session and describe how to break up the contents of the
communication into units of de-enveloping work.
The de-enveloping work center divides a communication interchange into its
component documents. It also performs a routing function, routing
documents to the required destination.
The translation work center manipulates an incoming document into the
format that is expected by another system. It can convert EDI data to a
format that can be printed or used by application programs and can convert
a file created by an application program to a standard EDI format. It is
implemented as an interpreter or compiler that understands translation
primitives and can be used with a script to perform transformations on
many kinds of data. In the illustrated embodiment, each of these work
centers is implemented using a novel EDI programming language, referred
to herein as the e-language.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of data flow through the EDI translation system.
FIG. 2 is a block diagram of a hardware implementation of the invention.
FIG. 3 is a sample communication script.
FIG. 4 is a sample EDI de-enveloping script.
FIGS. 5A, 5B, 5C, 5D and 5E are a sample application de-enveloping script.
FIG. 6 is a graphic representation of the organizational structure in which
files are stored before enveloping.
FIG. 7 shows sample application data input for a translation operation.
FIG. 8 shows sample EDI output corresponding to the input of FIG. 7 as
produced by the script shown in FIG. 13.
FIG. 9 is a document definition for an X12 810 invoice implemented in the
e-language.
FIG. 10 shows two sample segment definitions implemented in the e-language.
FIG. 11 shows two sample element definitions implemented in the e-language.
FIG. 12 shows three sample data type definitions implemented in the
e-language.
FIGS. 13A, 13B and 13C show a sample translation script for translating an
application document to an EDI document.
FIGS. 14A, 14B, 14C, 14D and 14E show a sample translation script for
translating an EDI document to an application document.
DETAILED DESCRIPTION
The system is organized into four component work centers: a) Communications
Interface; b) De-enveloping; c) Translation; and d) Enveloping. The flow
of data through the system is illustrated in FIG. 1. Data flows into the
system through communications interface 1. It then flows to the
de-enveloper 2, to the translator 3, and to the enveloper 4. From the
enveloper, the data flows back to the communications interface and thence
out of the system.
In the illustrated embodiment, a data item flowing through the system can
be considered conceptually as a package with a packing slip. As the data
item is processed by each component of the system, some information is
read from the packing slip, and additional information is written to the
slip. The information that appears on the slip determines how the package
will be processed. The system uses a "routing form" as an abstract
representation of the packing slip. A routing form follows a data item
through the system, and a component can both read from and write to the
routing form. The behavior of a component may depend upon information that
is read from the routing form.
The routing form provides for the following information: a) interchange
sender; b) interchange receiver; c) functional group sender; d) functional
group receiver; e) node (the network or application used); f) facility
(the communication protocol); g) content type (EDI, application, or text);
h) document type (transaction set identifier); i) translation script; j)
functional group; k) error message; and l) functional acknowledgement
flag. The information is placed in the routing form in the following
order: a) in the communications interface--the node, facility, and content
type; and b) in the de-enveloper--document type, senders, receivers,
functional group, functional acknowledgement flag, and translation script.
Error messages are written to the routing form by whichever module detects
the error.
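The routing form and the order in which it is filled can be sketched as a record passed from one component to the next. The Python below is a hedged illustration: the field names follow the list above, but the class, the function signatures, the envelope dictionary, and the script lookup table are assumptions, and error messages are kept as a list so that more than one module can append to it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RoutingForm:
    interchange_sender: Optional[str] = None
    interchange_receiver: Optional[str] = None
    group_sender: Optional[str] = None
    group_receiver: Optional[str] = None
    node: Optional[str] = None
    facility: Optional[str] = None
    content_type: Optional[str] = None
    document_type: Optional[str] = None
    translation_script: Optional[str] = None
    functional_group: Optional[str] = None
    errors: List[str] = field(default_factory=list)
    functional_ack: bool = False


def communications_interface(form: RoutingForm, node: str, facility: str,
                             content_type: str) -> RoutingForm:
    """First stop: record where the data came from and what it is."""
    form.node = node
    form.facility = facility
    form.content_type = content_type
    return form


def de_enveloper(form: RoutingForm, envelope: Dict[str, object],
                 scripts: Dict[Tuple[str, str], str]) -> RoutingForm:
    """Second stop: copy envelope data and choose a translation script."""
    form.interchange_sender = str(envelope["interchange_sender"])
    form.interchange_receiver = str(envelope["interchange_receiver"])
    form.group_sender = str(envelope["group_sender"])
    form.group_receiver = str(envelope["group_receiver"])
    form.functional_group = str(envelope["functional_group"])
    form.document_type = str(envelope["document_type"])
    form.functional_ack = bool(envelope["functional_ack"])
    key = (form.interchange_sender, form.document_type)
    script = scripts.get(key)
    if script is None:
        # Whichever module detects a problem writes it to the form.
        form.errors.append(f"no translation script for {key}")
    else:
        form.translation_script = script
    return form


form = communications_interface(RoutingForm(), "VAN1", "bisync", "EDI")
form = de_enveloper(
    form,
    {"interchange_sender": "ACME", "interchange_receiver": "US",
     "group_sender": "ACME-AP", "group_receiver": "US-AR",
     "functional_group": "IN", "document_type": "810",
     "functional_ack": True},
    {("ACME", "810"): "acme_invoice.e"},
)
```

A later component, such as the translator, would read translation_script from the form to decide how to process the data item.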
The EDI translation system can operate on a variety of hardware. In the
illustrated embodiment, the system operates on a microcomputer having a 20
MHz Intel 80386 processor, 4-8 MB of RAM, one or more serial ports, a 100
MB fixed disk drive, a 2400 baud modem with V.22 bis/MNP protocol, and
having a UNIX operating system. The system may also be operated on a local
area network, with different machines implementing different functions of
the system. For example, one machine can serve as the communications
interface while another provides file storage.
As shown in FIG. 2, the system operates on microcomputer 10, which is
connected to one or more EDI networks 20 and a host 30. The connection 40
between the system and the host is a synchronous connection such as a Bell
208A modem or another line driver. EDI data is transmitted to and from
trading partners via the EDI networks, while application data is
transmitted to and from applications operating on the host. All data
transmission between the system and either the host or the EDI networks is
routed through the communications interface work center, shown at 50.
In an alternate embodiment, the host on which the applications are operated
and the microcomputer on which the EDI system operates are combined, so
that applications are operated on the same machine as the system.
In the illustrated embodiment, each of the EDI system work centers is
implemented in the e-language. Of course, the work centers could be
implemented in other languages, such as Lisp or C++, although with greater
difficulty. The e-language uses arithmetic, boolean, functional, and
control structures similar to those in BASIC, Pascal, and COBOL. In
addition, the e-language contains functions and data structures unique to
EDI processing. These specialized functions include facilities for
converting an EDI document to an EDI data tree and for performing the
reverse operation. The e-language also contains commands to determine the
status of various parts of the EDI system, and to perform low-level
operations on these parts. These functions, and other aspects of the
e-language, are described in the "e" Programmer's Reference Manual,
attached as an Appendix hereto and being a part of the disclosure of this
application.
The e-language allows the user to perform pattern-matching operations. For
example, the e-language can be used to describe the format of an
application file and to select the portions of the file to be used. This
allows the production of a customized EDI document from an application
file.
The e-language is structured in the manner described in the attached
Appendix. The syntax of the e-language borrows some of the command names
and clauses from BASIC and COBOL and general syntax structures from
Pascal.
The system employs the concept of a data stream. A stream is a source of
input or a destination of output. An input/output function generally takes
a stream as one of its arguments. When a program is executed, a default
input stream and a default output stream are created. Stream objects are
bound to one or the other stream. Other streams can be opened during a
program execution that can be bound to variables used in other run-time
calls for input/output purposes.
COMMUNICATIONS INTERFACE
The communications interface has two parts: a) a scheduler; and b) a set of
facilities drivers. The scheduler is a program that references a
user-created table to determine when to communicate with sources of
information (such as a trading partner or an application). The actual
communication is done through a facility driver. All of the mechanics of
transferring files via a given communication protocol are packaged in the
facility driver, and are accessible to the user through a communication
script. The script, which can be written in the e-language, specifies the
procedures to be used to create a de-enveloping work unit.
The communications interface work center defines the concept of a logical
connection to the outside world as a node. The logical implementation of
the connection is a communications facility. Two kinds of nodes are
defined: a) an EDI connection (or EDI network); and b) a non-EDI connection
(or application). Thus, in FIG. 2, EDI network 20 and an application
running on host 30 are each nodes. The node name is used to symbolize the
network or application, binding the name to the specific communication
facility that is used to execute the communication. In the illustrated
embodiment, the system assumes that communication facilities exist in the
form of hardware and software. These facilities are not part of the
system; the system interfaces to the facilities through facility drivers,
which are part of the system. The interface consists of a driver and a
script. A driver is a program capable of communicating with the specific
communication facility. This driver supports a set of primitives that is
supported by the e-language. Communications packages usually have either a
programming interface or a user interface. The driver must be interfaced
to whichever interface the specific communication package has. A script is
a program capable of configuring the communication package. This gives the
system the ability to set up and use the proper script.
The different facilities to which the system could interface are numerous.
It is therefore economical to have interfaces for common facilities such
as Unix Mail, Unix File System, and Remote Job Entry. The operations
provided by the interface depend on the capabilities of the facility, but
as a minimum include send and receive primitives.
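A facility interface with the minimum send and receive primitives can be sketched as an abstract class with one concrete driver per facility. The Python below is an assumption-laden illustration: the class names and method signatures are hypothetical, and the directory-based driver merely stands in for a file system facility.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List


class FacilityDriver(ABC):
    """Minimum primitives every facility interface is assumed to offer."""

    @abstractmethod
    def send(self, name: str, data: bytes) -> None:
        ...

    @abstractmethod
    def receive(self) -> List[bytes]:
        ...


class DirectoryDriver(FacilityDriver):
    """Stand-in for a file system facility: an outbox and an inbox."""

    def __init__(self, outbox: Path, inbox: Path) -> None:
        self.outbox = outbox
        self.inbox = inbox

    def send(self, name: str, data: bytes) -> None:
        # Deliver a file by writing it into the outbox directory.
        (self.outbox / name).write_bytes(data)

    def receive(self) -> List[bytes]:
        # Collect and remove every waiting file, in name order.
        payloads = []
        for path in sorted(self.inbox.iterdir()):
            if path.is_file():
                payloads.append(path.read_bytes())
                path.unlink()
        return payloads


def exchange(driver: FacilityDriver, name: str, data: bytes) -> List[bytes]:
    """Send one file, then pick up whatever is waiting."""
    driver.send(name, data)
    return driver.receive()
```

Code written against FacilityDriver, such as the exchange function, does not change when a driver for a different facility is substituted.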
A facility client is capable of both sending and receiving files. The EDI
translation system can be co