|
Description  |
|
|
BACKGROUND OF THE INVENTION
The invention relates to an interface for coupling data from a realtime
data feed into an application program.
A realtime data feed is any data source that provides the data on a
realtime basis, that is, essentially as the data becomes available or soon
thereafter. For example, the realtime data feed may be a public financial
wire service, such as might be available from the New York Stock Exchange,
Reuters or Knight-Ridder, that provides users with the most recent prices
for stocks and trades. It may be a private financial data source that
distributes financial data within a particular business or institution,
such as a bank. Or, it may be a source of monitoring and control data
generated by a manufacturing process within a company.
Frequently, it is desirable to analyze the data available from such data
feeds to make buy or sell decisions for financial instruments, to decide
other financial and/or management related issues, or to control
manufacturing processes. For this purpose, there are commercially
available applications that are programmed or can be programmed to perform
the required analysis using data gathered from those data feeds. For
example, there are spreadsheet programs, such as 1-2-3.TM. marketed by
Lotus Development Corporation, that provide such data analysis capability.
Often it is also desirable, if not necessary, to perform the analysis of
the data from the data feeds as rapidly as possible, even on a time scale
approaching the speed at which the data changes. To accomplish this,
others have provided interfaces which funnel the data coming from the data
feed directly into the application where it can be analyzed.
SUMMARY OF THE INVENTION
In general, in one aspect, the invention is a realtime engine for
interfacing one or more data feeds with an application where each of the
one or more data feeds delivers realtime data for each member of an
associated group of items that are available through the data feed. The
interface includes means for caching the most recent data values received
from a selected one of the one or more data feeds for at least some
members of the associated group of items for the selected data feed; and
means for sending at least some of the cached data values to the
application in response to a request for updates.
Preferred embodiments include the following features. The realtime engine
includes means for identifying a subset of the associated group of items
for the selected data feed for which the application desires data. The
subset identifies the members of the associated group for which the
caching means caches the most recent data values. Also, the realtime
engine includes means for determining which of the cached data values are
different from data values for corresponding items last sent to the
application. The sending means sends only those cached data values which
are determined to be different. In addition, the realtime engine includes
means for storing copies of the data values last sent to the application.
The determining means compares contents of the storing means with contents
of the caching means to determine which data values are different from
those last sent to the application.
Furthermore, the caching means includes means for recording for each cached
item whether it has been updated by the data feed since data values for
those cached items were last sent to the application. The sending means
sends only those cached data values which are identified by the recording
means as having been updated. The realtime engine includes means for
enabling the application to add members to and delete members from the
identified subset.
One advantage of the invention is that it provides an intelligent interface
between one or more data feeds and one or more instances of an application
that desire data updates from the data feeds. The interface caches the
most recent data values for a set of items identified by the application,
it ignores updates not relating to the identified items, and it sends only
the most recent data values for the identified items when polled by the
application for updates. This relieves the application of the burden of
having to process all updates arriving from a data feed for selected items
and of having to filter out those updates for which it has no interest.
Thus, the application is relieved of these computational burdens, thereby
freeing the application (e.g., a spreadsheet) for other activities such as
recalculation.
Another advantage is that the invention identifies only those items which
have been updated since the last poll and does not send updates to the
application for items which have not been updated since the last poll.
Furthermore, the invention also identifies those items for which the data
value has actually changed since the last poll, and sends updates to the
application only for those items. Thus, the application need not waste
time processing items whose data values have not actually changed.
Yet another advantage of the invention is that it also offers a filtering
capability which screens out all data relating to items that are not of
interest to the application and it sends for items of interest all data
received from the data feed.
Other advantages and features will become apparent from the following
description of the preferred embodiment and from the claims.
DESCRIPTION OF THE PREFERRED EMBODIMENT
FIG. 1 is a block diagram of a realtime system;
FIG. 2 is an overview of the three principle data structures within the
realtime engine shown in FIG. 1;
FIG. 3 shows the global interest list control block data structure;
FIG. 4 shows the data feed control block data structure and related data
structures;
FIG. 5 shows the master interest list record data structure;
FIG. 6 shows a linked list of field block data structures that is
referenced in the master interest list record;
FIG. 7 is a field data structure which is part of the field block data
structure of FIG. 6;
FIG. 8 is a global application information structure;
FIG. 9 is an application information structure;
FIG. 10 shows an interest/filter list information structure which is part
of the application information structure;
FIG. 11 shows several interest list records and their relationship to a
master interest list record;
FIG. 12 shows two interest list field structures and their relationship to
field block data structures within the master interest list;
FIG. 13 shows two filter list records and their relationship to
corresponding master interest list records;
FIG. 14 shows two filter list field structures and their relationship to
field block data structures within the master interest list;
FIG. 15 shows a queue information structure and a queue data structure;
FIG. 16 is a set of reader message buffers;
FIG. 17 is a ring of message buffer structures;
FIG. 18 is a message buffer structure;
FIG. 19 shows the format of an update/refresh message;
FIG. 20 shows the format of a request message;
FIG. 21 is pseudo-code describing the initialization of the RTE process by
the main thread;
FIG. 22 is pseudo-code describing the check for termination of the RTE
process performed by the main thread;
FIG. 23 is pseudo-code describing the termination routine executed by the
main thread when the RTE process is terminated;
FIG. 24 is pseudo-code describing the operation of the RTE private pipe
thread;
FIG. 25 is pseudo-code describing the operation of the feed server pipe
threads;
FIGS. 26a-b depict pseudo-code describing the operation of the updater
thread;
FIG. 27 is pseudo-code describing the registration function available
through the API;
FIG. 28 is pseudo-code describing the termination function available
through the API;
FIG. 29 is pseudo-code describing the rte.sub.-- getstdf function for
getting information about the first alphabetical data feed name;
FIG. 30 is pseudo-code describing the rte.sub.-- getnxtdf function for
getting information about the next alphabetical data feed name;
FIG. 31 is pseudo-code describing the rte.sub.-- getdf function for getting
information about an identified data feed;
FIG. 32 is pseudo-code describing the rte.sub.-- getfstfld function for
getting information about the first alphabetical field name;
FIG. 33 is pseudo-code describing the rte.sub.-- getnextfld function for
getting information about the next alphabetical field name;
FIG. 34 is pseudo-code describing the rte.sub.-- getfld function for
getting information about an identified field;
FIG. 35 is pseudo-code describing the rte.sub.-- openilist function for
opening an application interest list;
FIGS. 36a-c depict pseudo-code describing the rte.sub.-- addilist function
for adding an item to the application interest list;
FIG. 37 is pseudo-code describing the rte.sub.-- delikey function for
deleting an item from the application interest list;
FIG. 38 is pseudo-code describing the rte.sub.-- delifield function for
deleting a field from the application interest list;
FIG. 39 is pseudo-code describing the rte.sub.-- delilist function for
deleting an application interest list;
FIGS. 40a-d depict pseudo-code describing the rte.sub.-- closeilist
function for closing an application interest list;
FIG. 41 is pseudo-code describing the openflist function for opening an
application filter list;
FIGS. 42a-b depict pseudo-code describing the rte.sub.-- addflist function
for adding an item to the application filter list;
FIG. 43 is pseudo-code describing the rte.sub.-- delfkey for deleting a key
from the application filter list;
FIG. 44 is pseudo-code describing the rte.sub.-- delffield function for
deleting a field from the application filter list;
FIG. 45 is pseudo-code describing the rte.sub.-- delflist function for
deleting an application filter list;
FIGS. 46a-d depict pseudo-code describing the rte.sub.-- closeflist
function for closing an application filter list;
FIGS. 47a-b depict pseudo-code describing the rte.sub.-- update function
for polling an application interest list for updates; and
FIG. 48 is pseudo-code describing the rte.sub.-- read function for
retrieving updates from the filter queues.
Structure and Operation
Referring to FIG. 1, a realtime system 10 comprises one or more feed
servers 12(1)-12(n) (referred to generally as feed servers 12), a Realtime
Engine (RTE) 14, a feed configuration file 16, one or more instances of an
application program 18(1)-18(m), such as Lotus 1-2-3 Release 3.0, and an
application programming interface (API) 15, which provides a programming
interface between an application and RTE 14. Each of these components is a
separate process running on a user's PC which, in the described
embodiment, employs an OS/2 operating system.
Feed servers 12 are programs which handle communications between an
external data feed and RTE 14. RTE 14 supports two kinds of feed servers
12, namely, a dumb feed server 12(1) and a smart feed server 12(2). A dumb
feed server processes a one-way broadcast type data feed. It translates
the raw external data into update and refresh messages and feeds these to
RTE 14. A smart feed server supports a request/pass-through data feed
which accepts initial requests from RTE 14 for data items. The smart feed
server implements two functions. Like the dumb feed server, it translates
raw external data feed updates into update and refresh messages and feeds
these to RTE 14. It also handles requests from RTE 14 to the data feed to
add, refresh, or delete items.
A data feed may be any of a number of realtime data sources, i.e., sources
which provide data as the data becomes available or close to the time at
which the data is generated. For example, it may be a public financial
news source, a private financial data source, a source of data and control
information generated by a manufacturing process.
Feed servers 12 must satisfy at least two criteria. First, a feed server
must be able to handle communications from the data feed to which it is
connected, whether that be a device or a network wire. And second, a feed
server must be able to translate data from the format provided by the data
feed into the format required by RTE 14.
Feed configuration file 16 describes the properties of each of feed servers
12 connected to RTE 14. It uses a standard format to tell RTE 14 what feed
servers 12 are included in realtime system 10; the names of communication
pipes 20 that they use to send messages to the feed server; the field data
types and fields for each key; and, where applicable, the number of
concurrently active keys allowed by the data feed.
Note a each smart feed server also requires a second pipe which is used to
send messages to RTE. The name of the second pipe is predefined and is
published by RTE.
Of the components within realtime system 10, RTE 14 must be run first.
Typically, it is a detached process that runs automatically at start-up
time (i.e., when the PC is booted). It can be run manually, however, at
any time prior to running feed servers 12 or any application which makes
requests of RTE 14. RTE 14 requires no screen display 2. However, if a
screen display is available, RTE 14 only uses it when errors occur to
display error messages.
As a rule, feed server(s) 12 are run next after RTE 14. In general, these
programs also run as detached processes. Like RTE 14, feed servers 12 may
be run automatically at start-up time as a convenience to the user or they
may be run manually.
Finally, the user can run up to twelve instances of an application 18. Each
application 18 provides its own interface to RTE 14. In the described
embodiment, the instances of the application are spreadsheet programs.
To obtain real-time data, the user must define a set of 3 ranges for each
group of real-time data desired. The 3 ranges contain: name of the data
feed; a list of the keys (symbols) desired; a list of the fields desired.
Up to ten of these groups can be defined for each worksheet file
associated with the spreadhseet program. Once the worksheet in the
application program is set up properly with the required information to
determine the real-time data of interest, the user can begin updating the
real-time data. This is done via API 15.
When realtime updating is begun, the following actions occur to initialize
RTE 14 for updating the data in the worksheet. API 15 calls RTE 14 and
passes a data structure which identifies all the elements in the realtime
ranges which have been specified. RTE 14 keeps an in-memory database of
all the elements (keys) requested by the user. If an element is new and
unique, it is added to this database. RTE 14 also keeps track of which
elements occur in which ranges so that the master database contains only
unique elements and no duplicates. RTE 14 also keeps track of which
elements are associated with a given instance of application program 18.
If a smart feed server is used, then RTE 14 sends a "request add" message
to the smart feed server for each new element (key) so the feed server
will know that it now must send data for that new element. Finally, before
updating begins, RTE 14 calls back to API 15 to write NA values for all
the cells that are to be updated in real-time.
API 15 and RTE 14 have distinct responsibilities in getting real-time
updates into application program 18. API 15 provides mechanisms enabling
the application to select the data feeds from which it desires data, to
identify the items for which it wants updates and to poll the RTE for
updates of the identified set of items. RTE 14 is the policeman at the
intersection between feed server(s) 12 and API 15. Independently of the
communication between RTE 14 and API 15, RTE 14 maintains continuous
communication with feed server(s) 12 to make sure the most recent data is
being obtained and updated. This occurs even if the user is not running
API 15. In performing this function, RTE 14 serializes the access to each
of feed server(s) 12 up to the maximum of seven. To ensure that both
processes can keep up with the arrival rate of the data, RTE 14 runs at
time-critical priority and usually feed server(s) 12 run at the same
priority. As data elements arrive from a feed server, RTE 14 looks up the
element (key) in its in-memory database. If the data element does not
exist in the in-memory database, then it isn't currently of interest to
the user (i.e., it has not been registered or added by any application)
and it is discarded. If the data element does exist in the in-memory
database, then that data element is updated.
RTE 14 maintains counters and flags to provide four levels of checking to
determine if any given data value has changed since the last update of
that instance of an application 18. Only the most recent, changed values
are written into an instance of application 18 when the application
requests updates. The four levels of checking are:
1. Has a value for a given data feed been updated?
2. Has some field within a given key been updated?
3. Has a given field value been updated?
4. Has the data value for the field actually changed from the previous
value that was written to the worksheet?
If updates have occurred since the last time updates were solicited, RTE 14
calls back to a function supplied by API 15 when the application requests
updates. This function is called once for each cell in the worksheet whose
value has changed and the new value is written to that cell.
When API 15 is terminated (and the user is still in application program
18), then real-time updates no longer occur in the worksheet. However, RTE
14 continues to receive data from feed server(s) 12 and it keeps the
elements which still exist in its in-memory database updated with the most
recent real-time data. This allows the user to resume updating at any time
and quickly obtain current values for all elements which were previously
being updated in real-time.
When an instance of application program 18 is terminated, all the elements
associated with that instance of the application program which are not
being referenced by any other instance of the application program are
marked for deletion from the in-memory database of RTE 14. These elements
will no longer receive updates, and they are deleted. In the case of a
smart feed server, a delete request message will be sent for each element
to be deleted so that the smart feed server will know to stop sending data
for that element.
FIG. 2 is an overview of the data structures used by RTE 14 to carry out
the functions just described. RTE 14 includes a data feed dictionary
structure 100, a master interest list structure 200, an application
structure 400 that includes an application interest list structure 500 and
an application filter list structure 600. Data feed dictionary structure
100 contains data feed control blocks 110 that store information about
each of the data feeds that are connected to RTE 14. Most of the items in
the data feed dictionary are statistics about the operation of the data
feed. Each data feed control block 110, in turn, identifies a field
information block 120 which describes all of the fields associated with
that data feed.
Master interest list structure 200 contains all the records and fields
requested by the user. It is essentially an in-memory database created
from the items specified by the user in each of the applications
(processes) being run. It is a union of all items requested by the user.
Application structure 300 is used to maintain the application information
for RTE 14 and includes an application interest list structure 400 and an
application filter list structure 500, for each instance of application
program 18. Each application interest list structure 400 identifies the
data feeds, the keys and the fields for which the relevant application
wants the most current updates in response to its polling of RTE 14.
Application filter list 500 identifies the data feeds, the keys and the
fields for which the relevant application wants to receive all updates
supplied by the data feed for identified items.
These and the underlying data structures which support them will now be
described in greater detail.
Data Feed Structures
Referring to FIG. 3, RTE 14 maintains one static instance of a global
interest list block 1000, which is the main data structure containing
general statistics and controlling all the data feeds for RTE 14. In
general, it contains a variety of control information, a structure of
global statistics, an array of data feed control blocks and a table for
each process managed by RTE 14.
More specifically, global interest list block 1000 contains the following
information. It includes two flags, namely, an initialization flag 1002
and an engine process running flag 1004, and it includes an engine process
id 1006, which identifies the RTE process. Initialization flag 1002
indicates whether RTE 14 has been initialized and engine process running
flag indicates whether the RTE process is running. If the RTE 14 has been
initialized, flag 1002 is set to TRUE, otherwise it is FALSE. If the RTE
process is running, engine running flag 1004 is set to TRUE, otherwise it
is FALSE. Engine running flag 1004 is used to terminate RTE 14.
Block 1000 includes four message areas, three areas for storing messages
that are to be sent to smart feed servers 12 and one area for storing
messages that RTE 14 sends to itself. The four areas are a request add
message area 1008 for storing request add messages, a request refresh
message area 1010 for storing request only (also referred to as request
refresh) messages, a delete message area 1012 for storing delete messages
and an internal delete message area 1014 for storing internal delete
messages.
Request add messages are used when a user adds new items (keys) to an
interest list. Request refresh messages are used when a user adds new
fields to items already in the interest list. Upon receipt of either a
request add message or a request refresh message, the feed server sends a
refresh message to RTE 14 for all fields of each item/key specified.
Request delete messages are used when a user terminates an instance of the
application program or when the user stops using the items/keys in an
application. Finally, internal delete messages are used to delete items
from the interest lists maintained by RTE 14.
Messages are stored into these areas during that phase of operation in
which the user is adding items to, deleting items from or modifying items
in any interest list used to identify items/keys for which an application
desires update data. The messages stored in these areas are eventually
sent to the appropriate feed server and/or to RTE 14.
Block 1000 includes a global statistics data structure 1016 which is used
to record global statistics about the operation of RTE 14. This area is
used primarily for monitoring the performance of RTE 14 and is not
integral to its operation.
Block 1000 includes a data dictionary 1018 (which corresponds to data
dictionary 100 of FIG. 2). Data dictionary 1018 is an array of seven data
feed control blocks 110(1) through 110(7), one for each data feed that can
be managed by RTE 14. As will be explained in greater detail shortly, the
information stored in data dictionary 1018 describes the information that
is available about the data feeds to which RTE 14 is connected.
After data dictionary 1018, there is a global application information data
structure 1020 which is used for storing information about the instances
of the application programs that are connected to RTE 14.
Next is a sorted array of pointers 1022 to data feed control blocks 110 in
which data feed control blocks 110 are arranged in alphabetical order by
name. The sorted array of pointers is used by cataloging functions
available through API 15 to report the identity of data feeds to an
application.
Finally, block 1000 includes a pointer 1024 to a memory heap 1026, which is
a linked list of memory pages. This area is used for creating the interest
list data structures which identify the keys and fields of interest to the
instances of the application and which form the database of most recent
values for those items. Pages are added to memory hap 1026 as they are
needed.
FIG. 4 identifies the data structures that make up data dictionary 1018
within global interest list control block 1000. Each data feed control
block 110 within data dictionary 1018 contains an area 1030 for storing
statistics about the operation of the data feed. For example, data feed
statistics area 1030 contains five counters: an add message counter
1030(1) for keeping track of the total number of add messages for the data
feed; a field update counter 1030(2) for keeping track of the total number
of field updates for that data feed; a request message counter 1030(3) for
keeping track of the total number of request messages that are waiting to
be sent to that data feed; a delete message counter 1030(4) for keeping
track of the total number of delete messages that are pending for that
data feed; and an internal delete message counter 1030(5) for keeping
track of the total number of internal delete messages that are pending for
that data feed.
Each data feed control block 110 also includes a field 1032 for the name of
the data feed and a field 1034 for a description of the data feed. Both
the data feed name and the description are supplied to RTE 14 by
configuration file 16 (to be described). There is also a field 1036 for
the name of the pipe to the feed server. Field 1036 is only used in the
case of smart feed servers, in which case, the name of the pipe is
supplied by configuration file 16.
Another portion of data feed control block 110 provides a connection into
master interest list structure 200 (see FIG. 2) through a hash index that
identifies a bucket in | | |