FIELD OF THE INVENTION
The present invention relates generally to scientific data management and
more particularly to an imaging system for electronically managing case
report forms.
BACKGROUND OF THE INVENTION
Before a new drug can be marketed in the United States, clinical research
must be conducted to prove the safety and efficacy of the drug. Typically,
the safety and efficacy of a new drug are determined by statistical
analysis of trial data collected during the clinical research phase of new
drug development. The statistical analysis depends on the accuracy of the
data being analyzed. Typically, trial data is manually recorded by
clinical researchers on case report forms. The data on the case report
forms must then be compiled before statistical analysis can be done.
The process of compiling clinical trial data for statistical analysis is
paper intensive. A typical study may include thousands of case report
forms. The case report forms are distributed among many users for
processing. In most data management systems, each
user performs a specified task and then passes the case report forms to
the next user, who performs another specified task. The case report forms
are passed from one user to another in this manner until processing is
complete. One or more of the users enter the data contained in the case
report forms into a database maintained on a computer.
The paper intensive processes used for scientific data management in the
past were designed to ensure the quality and integrity of the scientific
database so that a reliable analysis could be made. However, these paper
intensive processes have some disadvantages. First, fairly elaborate
tracking systems must be created to track the case report forms as they
are moved from one user to the next. Normally, a separate tracking
database is used to keep track of documents as they are processed. At each
step of the process, the tracking database must be updated to reflect the
current disposition and location of the document. Maintaining these
tracking systems can be time consuming and cumbersome.
Another disadvantage of paper systems is that they are labor intensive. For
example, distribution of the case report forms is accomplished by manually
moving the case report forms from one user to the next. As a result,
support personnel must be hired for handling the case report forms. These
support personnel are not directly involved in the data processing.
Another disadvantage of paper systems is that access to a case report form
is limited to a single user at a time. That is, only one user at a time
possesses the case report form. If another user needs access to the case
report form, the other user would have to wait until processing of that
case report form is complete, or make a special request to remove the case
report form from the processing stream. Removing the case report form from
the processing stream increases the complexity of document tracking.
SUMMARY AND OBJECTS OF THE INVENTION
The invention is an electronic document management system particularly
designed to manage clinical trial data for pharmaceutical companies. The
electronic document management system guarantees the quality and integrity
of the scientific database without the inefficiencies inherent in a paper
system.
The electronic document management system is implemented in a computer
network having a plurality of workflow nodes interconnected by
communications media. Predetermined processing functions are performed at
each workflow node to process the information contained in the case report
forms (CRFs). The case report forms are scanned and converted into
electrical images which can be stored in a data storage medium. Generally,
each page of a case report form will form a separate image. The images can
then be routed through the network to process the case report form. To
enable routing of the images, the images are classified by type. A
separate routing scheme may be defined for each type of image. The routing
scheme defines the sequence of workflow nodes through which each image
must pass before the processing of that image is considered complete.
After scanning, each image is assigned a unique identification number and
is indexed. The index information is used to track the flow of images
within the network during processing. The index information includes a
type code used for document routing. After indexing, the document is
routed through the network according to the routing scheme defined for the
corresponding type. In each case, the routing scheme will include a
data-entry node where data contained in the image is entered into a
database. The database includes a key field for linking each database
record with the image which is the source of the data. As each image is
processed at the data-entry node, the identification number is
automatically entered into a key field of the database record to create a
permanent link between the database record and its source image. Linking
the database record with its source image enables the image to be
subsequently retrieved directly from the database.
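The link between a database record and its source image can be illustrated with a short Python sketch using an in-memory SQLite database. The table names, column names, and sample values below are hypothetical; the source does not specify a schema, and SQLite is used only as a convenient stand-in for the database.

```python
import sqlite3

def build_demo_db():
    """Create hypothetical image and data tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (image_id INTEGER PRIMARY KEY, doc_type TEXT, path TEXT)"
    )
    # image_id is the key field linking each data record to its source image.
    conn.execute(
        "CREATE TABLE trial_data ("
        " record_id INTEGER PRIMARY KEY,"
        " image_id INTEGER NOT NULL REFERENCES images(image_id),"
        " field_name TEXT, field_value TEXT)"
    )
    return conn

def enter_data(conn, image_id, field_name, field_value):
    """Store a data value; the image ID is supplied by the system, not typed."""
    cur = conn.execute(
        "INSERT INTO trial_data (image_id, field_name, field_value) VALUES (?, ?, ?)",
        (image_id, field_name, field_value),
    )
    return cur.lastrowid

def source_image(conn, record_id):
    """Return the path of the image a data record came from, or None."""
    row = conn.execute(
        "SELECT i.path FROM trial_data d JOIN images i ON i.image_id = d.image_id"
        " WHERE d.record_id = ?",
        (record_id,),
    ).fetchone()
    return row[0] if row else None
```

Because the identification number travels with the record, a later query can go from any data value straight back to the scanned page it was keyed from.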
The processing of a document in the scientific data management system is
divided into the following functions: document scanning, document
indexing, comment entry, clinical review, regulatory review, editing, data
entry, and ad hoc retrieval.
Incoming documents are normally scanned as they are received and converted
to electrical format. The scanned documents are assigned to a batch. After
scanning, the batch is routed to document indexing to enable tracking of
the documents. Document indexing is the process of associating identifying
information with each image. Once a document has been indexed, tracking
information is automatically maintained in a tracking database as
documents move through the system. The indexing process eliminates the
need for separate tracking systems.
After indexing, documents requiring clinical review or regulatory review
are routed respectively to a clinical review queue or a regulatory review
queue. All other documents are routed to an editing queue for editing.
When the clinical and regulatory review process is complete, the documents
are then routed to the editing queue.
The actions performed at the editing station include new document
processing prior to release to data entry, and data entry review. New
document processing involves reviewing documents for completeness and
clarity. Annotations are added to the document when necessary for
clarification. Data entry review involves reviewing issues that arise
during data entry.
Edited documents are passed to the data-entry work queue for entry into the
database. If any "hot keys" are generated during data entry, the
associated documents are routed back to the editing work queue for review.
The editor reviews "hot keys" inserted during data entry, and if
necessary, generates data clarification forms (DCFs). All data is
double-keyed by two separate data-entry operators. After the first
data-entry operator has committed his or her entries, the document is
routed to the second data-entry operator unless a "hot key" was generated
during data entry. If a "hot key" is generated during the first stage of
the data entry, the document is routed back to the editing station for
review, and then to the second data-entry operator after the review is
complete. If the second data-entry operator generates "hot keys" during
data entry, the document is again routed back to the editing station for
review. If no "hot keys" were generated during the second stage of the
data entry, the documents are routed to comment entry where comments can
be entered into a comment database.
Based on the foregoing, it is a primary object of the present invention to
provide an electronic document managing system for electronically managing
case report forms without the inefficiencies inherent in a paper system.
Another object of the present invention is to provide an electronic
document managing system wherein paper documents are converted to
electronic images which can be individually routed within the network.
Still another object of the present invention is to provide an electronic
data management system which automatically tracks each image as it is
routed through the network without the need for a separate tracking
database.
Yet another object of the present invention is to provide an electronic
data management system which allows subdivision of documents into a
plurality of discrete images which can be independently routed through the
network.
Another object of the present invention is to provide an electronic data
management system which automatically links each record in the scientific
database with a corresponding image or images.
Still another object of the present invention is to provide an electronic
data management system which allows each image to be viewed simultaneously
by multiple users.
Other objects and advantages of the present invention will become apparent
and obvious from a study of the following description and the accompanying
drawings which are merely illustrative of such invention.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of the network system for implementing the
document management system.
FIG. 2 is a flow diagram illustrating the major processes in the document
management system.
FIG. 3 is a drawing of a display showing the major components of the user
interface.
FIG. 4 is a drawing of a dialog box for selecting a batch during scanning
or indexing.
FIG. 5 is a drawing of the process screen used during the scanning process.
FIG. 6 is a drawing of the process screen used during the indexing process.
FIG. 7 is a drawing of the process screen used during the edit process.
FIG. 8 is a drawing of the screen used for TAGS processing.
FIG. 9 is a drawing of the process screen used during the data-entry
process.
FIG. 10 is a drawing of the process screen used during the comment entry
process.
DETAILED DESCRIPTION OF THE INVENTION
The invention is a computer-implemented, scientific data management system
particularly designed to manage clinical trial data for pharmaceutical
companies. Documents containing clinical trial data are scanned and
converted to an electrical format. After scanning, the documents are
indexed to enable tracking of the documents. The index information may
comprise a combination of system-defined and user-defined index fields.
After indexing, the images are routed in a predefined sequence to system
users. Each image is classified by type. For each image type, a separate
routing scheme can be defined. Thus, each page of a document can be routed
independently from the other pages of the same document. The documents
are edited and data contained therein is entered into a scientific
database. When the data is entered, a link is dynamically established
between the database record and the image which is the source of the data.
Referring now to FIG. 1, a schematic diagram of the computer network system
is shown. The network system is indicated generally by the numeral 10. The
network system 10 includes a database server 12, one or more data storage
units 14, a workflow server 16, a plurality of user stations 18, and one
or more scanning stations 20.
The database server 12 is a mini-computer which serves most database
functions. The database server handles the storage and retrieval of data
in the scientific database management system. All requests for data are
made via the database server.
The data storage unit 14 is a memory device in which data can be stored in
an electrical format. The data storage unit 14 may comprise either
magnetic disks, optical disks, or any other storage medium commonly used
in the computer industry. In the present system, separate image storage
units 15 are used as the primary medium for the storage of images. The
image storage units 15 comprise one or more optical disks which are
preferably isolated in a separate segment on the network using a file
server 22 to handle requests for images. The file server 22 should be the
only node on the network that directly communicates with the image storage
unit 15. The file server 22 may contain a magnetic disk which serves as a
cache or temporary storage medium for frequently requested images.
The workflow server 16 manages the workflow functions for the system. The
workflow server 16 handles document and folder distribution, processes
alerts, updates workflow queues, and routes all images within the network.
The workflow server 16 tracks all images in process and updates the
database with the status and current location of each image.
The user work stations 18 are personal computers where the major data
processing functions are performed. Each work station 18 includes a
display for displaying scanned images and input means such as a computer
keyboard. The scanning station 20 is a specialized work station connected
to a scanner for scanning documents and converting the documents to an
electrical format.
The scientific data management system enables large volumes of documents to
be electronically managed thereby eliminating the need for handling paper
documents. The process is implemented by computer software running on
network resources.
I. Process Fundamentals
Protocols, Case Report Forms (CRF), And CRF Packets
The scientific data management (SDM) system of the present invention is
specifically designed to handle documents and data relating to clinical
testing of pharmaceutical and biomedical products. A protocol is the
highest level entity in the data management system. Generally, a protocol
corresponds to a single clinical trial or study. For example, a clinical
trial of a new drug involves administering the new drug to a human
population and then monitoring the patients and collecting data. The
patients will normally visit an investigator, who is usually a doctor or
other medical professional, on one or more occasions. During each visit,
the investigator examines the patient and records his findings on a
written document called a case report form (CRF). For each visit, a CRF is
completed by the investigator. A different CRF may be used for each visit,
or a single CRF may be used for more than one visit. The data management
system of the present invention is used to manage CRFs associated with a
clinical trial, to process data contained in the CRFs, and to analyze the
data collected.
When a new protocol is started in the data management system, a CRF packet
for the protocol is defined. A CRF packet describes each CRF page which is
expected in connection with a given protocol. That is, a CRF packet
describes a complete set of CRF pages used for a single patient over the
course of a study. As previously indicated, each study may include more
than one kind of CRF page, multiple copies of the same CRF page, or a
combination of the two. The CRF packet describes each page of each CRF
which is expected to be received during the course of the study. The
system uses the CRF packet to validate documents when they are received
and for quality control.
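One way to picture how a CRF packet could be used for validation is the following Python sketch. The packet contents and page identifiers are hypothetical, and the two checks shown (is this page still expected, and which pages are outstanding) are assumptions about what validation and quality control might involve rather than the system's documented behavior.

```python
from collections import Counter

# Hypothetical packet definition: page identifier -> copies expected per patient.
CRF_PACKET = {"demographics": 1, "visit": 3, "adverse_events": 1}

def validate_page(packet, received, page_id):
    """Return True if one more copy of page_id is still expected for this patient.

    `received` is a Counter of pages already accepted for the patient.
    """
    expected = packet.get(page_id, 0)
    return received[page_id] < expected

def missing_pages(packet, received):
    """Map each page not yet fully received to its outstanding count."""
    return {
        page: count - received[page]
        for page, count in packet.items()
        if received[page] < count
    }
```

For a patient with one demographics page and two visit pages on file, a third visit page would be accepted, a second demographics page would be rejected, and `missing_pages` would report one visit page and one adverse events page outstanding.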
User Desktop
Each system user operates at his/her user workstation 18. While each user's
interface within the SDM system may differ, each interface has certain
common components. As shown in FIG. 3, each user interface includes a
document window 24, and a process screen 26. Additionally, dialog boxes 28
are used to prompt the user for information.
A program referred to herein as the document manager manipulates the images
scanned into the SDM system. The document manager groups images into
documents and keeps pages in the proper sequence. The images are displayed
in a window called the document window 24. The document manager allows the
user to zoom, pan, rotate, invert, and tile images.
The process screen 26 is closely coupled to the document window. The
process screen 26 contains all the controls required to perform certain
operations in the processing of CRF pages. Each process screen 26 contains
list boxes, entry boxes, entry fields, buttons, and other controls
required for a specific task.
The process screen is launched from a workflow queue. A workflow queue is a
list of documents requiring further processing. The user selects documents
from the workflow queue. Ordinarily, a group of documents is selected at a
time. After the selection is made, the process screen 26 and document
window 24 are displayed.
CRF Processing, DCFs and Tags
The major processes in a typical protocol include: (1) document scanning;
(2) document indexing; (3) monitoring review; (4) editing; (5) data entry;
and (6) comment entry. These processes and the general flow are
illustrated in FIG. 2.
The initial step in the process is document scanning. Each CRF is scanned
and converted into a digital format. After scanning, each page of the
document is indexed to enable document tracking and routing. The document
is then electronically distributed to system users involved in data
processing. Each page scanned is routed according to predefined routing
schemes based on the index information. Typically, each page will be
routed to a scientific data editor who edits the document and then to a
data entry operator. The data entry operator enters data on the CRF page
into the scientific database. The data is double-keyed. That is, the data
is entered into the database by two different entry operators. The data
entered by each operator is compared and, if the data matches, it is
accepted into the database. If not, the data is rejected until the
discrepancy is resolved.
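The double-key comparison can be sketched in Python as follows. The function name and the representation of an entry as a dictionary of field names to keyed values are assumptions made for illustration only.

```python
def compare_double_keyed(entry_a, entry_b):
    """Compare two operators' entries for one CRF page.

    Returns (accepted, discrepancies). `accepted` is the dict of values to
    commit when every field matches, otherwise None. `discrepancies` maps each
    mismatched field to the pair of values keyed by the two operators.
    """
    fields = set(entry_a) | set(entry_b)
    discrepancies = {
        name: (entry_a.get(name), entry_b.get(name))
        for name in sorted(fields)
        if entry_a.get(name) != entry_b.get(name)
    }
    if discrepancies:
        return None, discrepancies
    return dict(entry_a), {}
```

A field keyed by only one operator is treated as a mismatch, so an omission by either operator holds the page back in the same way as a typing error.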
In some protocols, certain pages of a CRF may need to be reviewed by a
monitor prior to editing and data entry. In such cases, a separate routing
scheme can be designated for those pages only. The pages requiring review
by a monitor are first routed to the monitor. After the review is
complete, the page is sent to data editing and is processed in the normal
manner.
During data entry, certain questions might arise concerning the document.
If the data entry operator is unsure about the data, the document can be
routed back to the scientific data editor for clarification. If the
scientific data editor can resolve the question, the document is sent to
the next user in the workflow scheme. For example, if the data entry A operator
encounters a problem and routes the document back to the scientific data
editor and the editor resolves the problem, the document will then be
routed to data entry B. If the document returns to the scientific data
editor from data entry B, it is routed to comment entry after the problem
is resolved.
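The return path from the scientific data editor described above can be sketched as a small routing function. The node names are hypothetical labels for the stages named in the text, and the sketch covers only the case in which the editor resolves the question.

```python
# Hypothetical node names for the stages named in the text.
NEXT_AFTER_ENTRY = {"data_entry_a": "data_entry_b", "data_entry_b": "comment_entry"}

def next_node(current, question_raised, returned_from=None):
    """Pick the next node for a page.

    current         -- node that has just finished with the page
    question_raised -- True if the data entry operator flagged a question
    returned_from   -- for the editor, the data entry node that sent the page back
    """
    if current in NEXT_AFTER_ENTRY:
        # A flagged page goes back to the editor; otherwise it moves forward.
        return "editor" if question_raised else NEXT_AFTER_ENTRY[current]
    if current == "editor" and returned_from in NEXT_AFTER_ENTRY:
        # Once the editor resolves the question, the page resumes at the step
        # that follows the data entry node it came from.
        return NEXT_AFTER_ENTRY[returned_from]
    raise ValueError(f"no routing rule for {current!r}")
```

The key point is that the editor does not send the page back to the operator who raised the question; the page continues to the step that would have followed that operator.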
In some cases, questions concerning the data cannot be resolved by the
scientific data editor. In such cases, the scientific data editor
generates a TAG. A TAG is a record of problems associated with a
particular CRF page. All TAGs records are stored in a separate TAGs
database. A separate TAGs program is used to generate and process TAGs
records. The TAGs program is used to update or modify entries in the main
database or to generate data clarification forms (DCFs).
When a question concerning the data is encountered that cannot be resolved
by the scientific data editor, a DCF is generated and sent to the
investigator associated with the particular CRF. An entry is also made in
the TAGs database, and the DCF is associated with the TAG. When the DCF is
returned by the investigator, the information is entered into the TAGs
database. The TAGs database is then used to update the data in the main
database.
Workflow Queues
Users of the SDM system are divided into workgroups. The workgroups are
defined by the system administrator. Each workgroup is assigned specific
functions. For example, in the process described herein, the following
groups and functions are used:
______________________________________
GROUP                    FUNCTIONS
______________________________________
Clinical Documents       Document Scanning
                         Document Indexing A
                         Document Indexing B
Scientific Data Editing  Editing
                         Editing Review
Data Entry               Data Entry A
                         Data Entry B
______________________________________
Each of the functions is described below.
Workflow queues are used for distributing work to users and for selecting
workflow items requiring an action. Each major process, such as indexing,
editing, data entry, and comment entry, has an associated workflow queue.
Each workflow queue is assigned to a specific workgroup. Only users in the
assigned workgroup can access items in a given workflow queue.
The queue functions as a container for workflow items requiring action.
Access to the workflow items is made via the workflow queue. The workflow
queue contains a list of all active protocols which have workflow items
requiring further action. Associated with each protocol is a list of
document types requiring processing. Work is initiated from a workflow
queue by first selecting a protocol and then selecting a specific type of
item for processing. Once items are selected from a workflow queue for
processing, the items are marked "In Use." An "In Use" item cannot be
selected for processing by any other user until it is released. The item
may, however, be selected for any "read only" process, such as ad hoc
retrieval, which requires only viewing of the item.
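The "In Use" behavior of a workflow queue can be sketched in Python as below. The class and method names are hypothetical, and the sketch leaves out the protocol and document-type levels of the queue in order to focus on workgroup access and item locking.

```python
class WorkflowQueue:
    """Minimal queue sketch: one workgroup, items locked while "In Use"."""

    def __init__(self, workgroup):
        self.workgroup = workgroup
        self.items = {}  # item_id -> user holding the item, or None if free

    def add(self, item_id):
        self.items[item_id] = None

    def select(self, item_id, user, user_groups):
        """Mark an item "In Use" for processing; refuse if locked or not permitted."""
        if self.workgroup not in user_groups:
            raise PermissionError(f"{user} is not in workgroup {self.workgroup}")
        if self.items[item_id] is not None:
            return False  # already marked "In Use"
        self.items[item_id] = user
        return True

    def release(self, item_id, user):
        """Release an item so that other users can select it."""
        if self.items[item_id] == user:
            self.items[item_id] = None

    def view(self, item_id):
        """Read-only access (e.g. ad hoc retrieval) ignores the "In Use" mark."""
        return item_id in self.items
```

In this sketch, a second user's attempt to select a held item simply returns `False`, while `view` succeeds regardless of the lock, mirroring the distinction between processing and read-only retrieval.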
Workflow Routing
The system allows documents to be split into components (i.e. pages) which
can be individually routed through the data management system. Each item
is classified by document type. Each document type is assigned to a
workflow group. For each workflow group, a separate routing scheme can be
defined which details the sequence of workflow nodes and the actions to be
performed at each node. A node can be a workflow queue or a particular
user workstation. When a required action is performed at a workflow node,
the item is routed to the next node.
The routing scheme for a particular workflow category comprises a
collection of routing paths. A routing path is a link between two workflow
nodes which identifies one node as the source node and the other node as
the destination node. For example, in the path B.fwdarw.C, B is the source
node and C is the destination node. After work is completed at node B, the
item is routed by the workflow server to node C.
Each routing path has an associated "state" and "action" which defines the
routing path context. The "state" of a workflow item indicates its current
disposition while an "action" indicates the activity that is required of
the destination node. After an action is completed on a workflow item at a
given node, the "state" and "action" are updated to reflect the current
disposition of the item and the next required action. By examining the
"state" and "action" associated with each item, its location in the
workflow can be determined.
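A routing path with its associated "state" and "action" can be modeled with two small data classes, as in the following Python sketch. The class names, field names, and the sample state and action values are hypothetical; the source does not list the actual values used.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RoutingPath:
    source: str       # node where work has just been completed
    destination: str  # node that receives the item next
    state: str        # disposition of the item once it travels this path
    action: str       # activity required of the destination node

@dataclass
class WorkflowItem:
    item_id: str
    node: str
    state: str = "new"
    action: str = ""

def route(item, scheme):
    """Move an item along the path whose source is its current node.

    Returns True if the item moved, or False if no path leaves the current
    node, in which case the item is left unchanged.
    """
    for path in scheme:
        if path.source == item.node:
            item.node = path.destination
            item.state, item.action = path.state, path.action
            return True
    return False
```

After each call to `route`, the item's `state` and `action` fields describe where it stands and what the receiving node is expected to do, which is the information the text says can be used to locate an item in the workflow.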
The workflow server uses routing schemes and routing paths to distribute
workflow items among nodes. At each node, the workflow server uses the
current node ID to determine the appropriate routing path. The item is
routed along that path to the next node. For example, suppose a workflow
item is at a given node. The node then releases the item for further
processing. The workflow server routes the item from the current source
node defined by the routing path to the current destination node.
When a terminal node is reached in the routing process, the item is removed
from the workflow. A terminal node is the last node defined in the routing
scheme which is not a source node in any defined routing path. For example,
consider a routing scheme having the paths A→B, B→C, and
C→D. If the routing path B→C were deleted from the routing
scheme, the workflow would stop at node B because there is no path from B
to any other node. This would result in CRF pages being inadvertently
dropped from the workflow.
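The risk of a deleted routing path can be checked mechanically. The following Python sketch represents each routing path as a (source, destination) pair and reports any node at which items would stop before reaching the intended terminal node. The function names are hypothetical, and such a check is offered as an illustration rather than a feature described in the source.

```python
def terminal_nodes(paths):
    """Return nodes that appear as a destination but never as a source."""
    sources = {src for src, _ in paths}
    destinations = {dst for _, dst in paths}
    return destinations - sources

def check_scheme(paths, expected_terminal):
    """List nodes where items would stop short of the intended terminal node."""
    return sorted(terminal_nodes(paths) - {expected_terminal})
```

For the paths A to B, B to C, and C to D, the only terminal node is D. With the path from B to C removed, node B also becomes terminal, and `check_scheme` reports it as a point where pages would be dropped.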
II. Protocol Setup
Before documents can be processed, certain information must be entered by a
system administrator to set up the protocol. To start a new protocol, the
following steps must be performed:
1. Add protocol summary information.
2. Define index and tracking fields for protocol.
3. Define content and structure of CRF packet.
4. Define investigator names, IDs, and patient assignments.
5. Assign users to the protocol and define access rights of users.
6. Define routing schemes for each document type in protocol.
The step of defining the protocol involves entering summary information
about the protocol into the database. The summary information will
typically include a protocol identifier, a description of the protocol,
the start date of the protocol, the end date of the protocol, the priority
level, and priority date. The total number of pages expected to be
received is calculated based on the CRF packet definition and patient
information.
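The source does not give the formula for the expected page count. A minimal sketch, assuming that every enrolled patient completes one full CRF packet, is shown below; the function name and argument shapes are hypothetical.

```python
def expected_page_count(packet, patients_per_investigator):
    """Total pages expected for a protocol.

    packet                    -- mapping of CRF page identifier to copies per patient
    patients_per_investigator -- mapping of investigator ID to number of patients

    Assumes each patient contributes exactly one complete packet.
    """
    pages_per_patient = sum(packet.values())
    total_patients = sum(patients_per_investigator.values())
    return pages_per_patient * total_patients
```

For example, a packet of 5 pages per patient and 22 patients across two investigators gives an expected total of 110 pages.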
As previously described, the data management system internally tracks each
document that is scanned and indexed. The index and tracking information
may be different for each protocol. Since the indexing fields may differ
from one protocol to another, each protocol has its own table within the
database. A system administrator defines protocol specific index and
tracking fields during the protocol definition process. The index fields
and tracking table must be defined for each protocol before any documents
for that protocol can be processed.
In order to define the indexing/tracking table for a new protocol, the
following information is required:
1. the protocol name;
2. the protocol tracking table's name;
3. the table space name;
4. the tracking table size; and
5. the tracking table's fields and their attributes.
The protocol name is entered by the system administrator during protocol
setup. The tracking table name, table-space name, and tracking table size
are internally assigned by the database management system. The field
information is selected by the system administrator. The field information
includes the field name, field label, field data type, field size, a
logical field indicating whether the field is a key field, the field
order, a logical field indicating whether the field is indexed, and a
logical field indicating whether the field contains unique values.
After completion of the field definition process for a protocol's
indexing/tracking table, the database management system automatically
generates the indexing/tracking table and its indexes. After the
indexing/tracking table is built, the system will accept CRFs for this
protocol.
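Generating a tracking table and its indexes from field definitions can be sketched with Python's standard `sqlite3` module. SQLite is only a stand-in for the database management system, and the dictionary keys used for each field (`name`, `type`, `key`, `indexed`, `unique`) are hypothetical simplifications; field label, size, and order are omitted.

```python
import sqlite3

def build_tracking_table(conn, table_name, fields):
    """Create a tracking table and its indexes from field definitions.

    Each field is a dict with keys: name, type, key, indexed, unique.
    Identifiers are assumed to come from a trusted administrator, so they are
    interpolated directly; untrusted input must never be handled this way.
    """
    columns = ['"{}" {}'.format(f["name"], f["type"]) for f in fields]
    key_cols = ['"{}"'.format(f["name"]) for f in fields if f["key"]]
    if key_cols:
        columns.append("PRIMARY KEY ({})".format(", ".join(key_cols)))
    conn.execute('CREATE TABLE "{}" ({})'.format(table_name, ", ".join(columns)))
    # Key fields are covered by the primary key; index the remaining flagged fields.
    for f in fields:
        if f["indexed"] and not f["key"]:
            unique = "UNIQUE " if f["unique"] else ""
            conn.execute(
                'CREATE {}INDEX "ix_{}_{}" ON "{}" ("{}")'.format(
                    unique, table_name, f["name"], table_name, f["name"]
                )
            )
```

Driving table creation from a list of field definitions is what lets each protocol have its own tracking table without any protocol-specific code.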
The indexing fields used in a given protocol will include certain
system-defined fields, and may include one or more user-defined fields.
The system-defined fields are default fields used in every protocol. Some
system-defined fields are modifiable, others are not. The system-defined
fields are shown in Table 1 below:
______________________________________
FIELD NO.  FIELD NAME    MODIFIABLE  EXAMPLE
______________________________________
1          Doc_Type      NO          C
2          Protocol      NO          0123
3          Rcvd Date     NO          05/23/93
4          Version       NO          1
5          Repeat        NO          1
6          Continuation  NO          1
7          Multi-Page    NO          1
8          Investigator