WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Method for the dynamic replication of data under distributed system control to control utilization of resources in a multiprocessing, distributed data base system    

Custom CD of patents similar to US4432057 : Method for the dynamic replication of data under distributed system control to control utilization of resources in a multiprocessing, distributed data base system - $19.95
United States Patent4432057   
Link to this pagehttp://www.wikipatents.com/4432057.html
Inventor(s)Daniell; Thomas P. (Palo Alto, CA); Harding, Jr.; Robert C. (Cupertino, CA); Lewis; Neil J. (Oakland, CA); Nauckhoff; Sven H. H. (San Jose, CA)
AbstractA method for dynamic replication of data under distributed system control to control the utilization of resources in a multiprocessing, distributed data base system. Previously, systems providing for data replication at nodes of a multiprocessing, distributed data base system required that a central node maintain control, or that replicated data be synchronized by immediately conforming all copies of an updated data item. By this invention, requests for access to data of a specified currency are permitted and conformation of updated data is selectively deferred by use of a control procedure implemented at each node and utilizing a status and control (SAC) filed at each node which describes that node's view of the status for shared data items at other nodes.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 4432057
Method for the dynamic replication of data under distributed system

     control to control utilization of resources in a multiprocessing,

     distributed data base system - US Patent 4432057 Drawing
Method for the dynamic replication of data under distributed system control to control utilization of resources in a multiprocessing, distributed data base system
Inventor     Daniell; Thomas P. (Palo Alto, CA); Harding, Jr.; Robert C. (Cupertino, CA); Lewis; Neil J. (Oakland, CA); Nauckhoff; Sven H. H. (San Jose, CA)
Owner/Assignee     International Business Machines Corporation (Armonk, NY)
Patent assignment
All assignments
Company News
Publication Date     February 14, 1984
Application Number     06/325,531
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     November 27, 1981
US Classification     707/8 707/201
Int'l Classification     G06F 009/00
Examiner     Zache; Raulfe B.
Assistant Examiner    
Attorney/Law Firm     Beckstrand; Shelley M.
Address
Parent Case    
Priority Data    
USPTO Field of Search     364/300
Patent Tags     dynamic replication data under distributed system control control utilization resources multiprocessing, distributed data base
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
4344134
Barnes
712/16
Aug,1982

[0 after 0 votes]
4007450
Haibt
709/226
Feb,1977

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B

[0 market size comments]
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%

[0 market share comments]
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%

[0 reasonable royalty comments]
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

[0 Guesstimation of Royalty Value Comments]
License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
[0 license availability comments]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
[0 owner/assignee comments]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

[0 competitive advantage comments]
Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

[0 commercial alternatives comments]
 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


We claim:

1. A method for operating a computing apparatus having a plurality of nodes, each node having storage means for storing a plurality of data items and means responsive to a request for accessing a specified data item, and communication means interconnecting selected nodes, the method including the steps of controlling access to the specified data item and controlling the modification of copies of the data item, characterized by the steps of:

distributing data access control to each node of the apparatus; and, responsive to a request having a specified currency, dynamically replicating data items while selectively deferring conformation of the replicated data;

such that the most current data items migrate to their respective affinity nodes.

2. The method of claim 1, wherein said replicating step is further characterized by:

selectively updating a data item at a request processing node in connection with the processing of a request at said request processing node while selectively deferring conformation of copies of the updated data item at other nodes.

3. A method for operating a computing system including a plurality of nodes, with each node having means for storing at least one data item, comprising the steps of:

responsive to a request having a specified currency, dynamically replicating data under distributed system control, while

selectively deferring conformation of the replicated data.

4. A method for operating a computing apparatus to dynamically replicate data under distributed system control comprising the steps of:

storing a data item at each of two nodes;

storing at each of the nodes a dipole half for said date item, the dipole half stored at a first node describing the status of the data item at the second node assumed by said first node;

responsive to a request for access to said data item at said first node, (1) determining from the dipole half stored at said first node if the status at the second node assumed by the first node is in conflict with the request, (2) if in conflict, resolving the conflict by communicating with said second node, and (3) when the conflict is resolved, granting the request for access.

5. A method for operating a computing apparatus including a plurality of nodes, with each node having means for storing at least one data item and means responsive to a request for accessing a specified data item, and communication means interconnecting selected nodes, the computing apparatus being operated in response to a request for accessing a specified data item and controlling the modification of copies of the data item at a plurality of nodes, the method characterized by the steps of:

generating a data request specifying a required currency for a specific data item at the node of the request;

determining if the node of the request views any other node as having for said specific data item a currency status which is in conflict with said required currency;

resolving currency conflicts; and thereafter

granting the request for accessing the specified data item.

6. A method for operating a first node of a multinode computing apparatus, said first node including means for storing at least one data item, means responsive to a request for accessing a specified data item, and a port for communicating data and control signals with respect to at least one other node, said first node being operated in response to a request for accessing a specified data item and controlling the modification of copies of data items at said first node, the method characterized by the steps of:

generating a data request specifying a required currency for a specific data item at said first node;

determining if said first node views any other node as having for said specific data item a currency status which is in conflict with said required currency;

communicating a message containing a data request to other nodes and receiving from each such other node a response resolving any currency conflicts; and thereafter

granting the request for access to the specified data item.

7. The method of claim 6, further characterized by the steps of:

storing or enabling generation of a set of dipole halves at said first node, with a dipole half in said set for each other node with which said first node shares a data item; each such dipole half specifying the status assumed by said first node for the shared data item at the other node.

8. The method of claim 7, further characterized by the step of:

specifying in each such dipole half the currency control, quality control, and responsibility control assumed by said first node for the shared data item at an other node; wherein said currency control describes the currency of the contents of the shared data item at said other node assumed by said first node and is established as one of exclusive, unique clean, shared clean, prior, or not accessible; wherein said quality control defines the relative quality of the shared data item at said first node and at or through said other node; and wherein said responsibility control specifies whether said other node has agreed to always be able to obtain a copy of said shared data item at the currency given by said currency control on behalf of said first node.

9. A method for operating a node of a computing system, said node including means for storing at least one data item and control information for use in controlling access to such data item, and a port for communicating data and control signals with respect to at least one other node, the method characterized by the steps of:

responsive to a request specifying a required currency for a specific data item at said node, determining if said node views any other node as having for said spcific data item a currency status which is in conflict with said required currency; and

for each such currency conflict, generating a message containing a data request to resolve the conflict for communication to the other node.

10. The method of claim 9, further characterized by the steps of:

responsive to a request for access to a data item, generating a request quark specifying the currency control set, quality control and responsibility control required by the said request; and

storing at said node in a status and control file for each data item stored at said node a set of dipole halves, with one dipole half in said set for each other node with which said node shares such data item, each such dipole half specifying the status of the shared data item at the other node assumed by said node.

11. A method for operating a first node of a computing system, said first node including means for storing at least one data item, means responsive to a request for accessing a specified data item, and a port for communicating data and control signals with respect to at least one other node, said first node being operated in response to the request for accessing the specified data item and controlling the modification of copies of data items at said first node, the method characterized by the steps of:

storing or enabling generation of a set of dipole halves at said first node, with a dipole half for each other node with which said node shares a data item;

responsive to a request quark specifying for a given data item the dipole half desired at said first node by a related node and the dipole half at the related node for said first node, executing a quark processing unit comprising the following steps:

determining if any dipole halves stored at said first node for said given data item have a conflicting status;

generating a conflict quark for each dipole half having a conflicting status for communication to the other node having the conflicting status;

waiting for a response to each communicated conflict quark;

modifying dipole halves stored at said first node for other nodes as required to reflect processing of said request quark, and modifying the dipole half at said first node for said related node as required to reflect processing of said request quark; and

selectively generating a response quark or otherwise responding to the originator of said request quark.

12. A method for operating a computing system including a plurality of nodes interconnected by a communication network; each node storing in a data file a plurality of data items which may be unique or copies of data items stored at other nodes; each node, responsive to a request for access to a specified data item by an application, controlling the applications's access to said stored data item; and each node selectively communicating the updated data item to other nodes storing copies of the specified dta item; wherein improvements to the controlling, storing, and communicating steps provide dynamic replication of data under distributed system control and comprise the steps of:

storing a dipole with respect to a data item shared by a node pair, a dipole half being stored in each node of the node pair specifying the status, assumed by this node, of the shared data item at a related node, with the data state of a given data item at this node being defined by the set of dipole halves stored at this node with respect to all related nodes sharing copies of that given data item;

responsive to a request for access to a data item from an application at this node having a copy of the data item at a data state consistent with the request, granting the request without network interaction with related nodes;

responsive to a request for access to a data item from an application at this node having a data state conflict with the request, negotiating a dipole change with each related node with which this node receiving the access request shares a conflicting dipole, and upon resolution of all such conflicts, granting the request; and

responsive to an update request which has been granted, storing the updated data item at this node and only thereafter in connection with an application request communicating from some other node, or in the course of data base conformation processing, communicating the updated item to that other node.
 Description Submit all comments and votes
 


BACKGROUND OF INVENTION

1. Field of the Invention

This invention relates to new and useful improvements in methods of operating general purpose digital computing systems. More specifically, it relates to a method for utilizing data storage and communication resources in a multiprocessing, distributed data base system by dynamically replicating data under distributed system control.

2. Description of the Prior Art

A multiprocessing general purpose computing system typically includes a plurality of nodes interconnected by a communication network. Each node, in such a system, may include a data processor, a data storage device, and communication ports. A data processor may be executing, in a multiprogramming mode, under control of a plurality of operating system components, in which event the data processor may be considered a plurality of nodes. The data storage device stores a data file, the operating system and its information management components, and user application programs.

Data is information abstracted from some aspect of business important to an enterprise. The challenge is to utilize data storage and communication resources of the system so as to give end users access to the data with an availability, performance, and cost commensurate with their business needs. Access to the data must be controlled to ensure the consistency and integrity of the data. Among additional characteristics of data accesses in a distributed data processing environment are geographic and temporal affinity. The basis for distributed data structures is geographic affinity: accesses to a given data item tend to cluster geographically. A basis for the method for dynamic replication of data is temporal affinity: data items which have been accessed recently may be more likely to be accessed in the near future than data items not recently accessed. The node at which accesses for a given data item tend to cluster is called the affinity node; the affinity node for a given data item may not be known ahead of time, and it may vary with time.

Distributed data technology may be categorized according to the attributes of data location, degree of data sharing, degree to which data base management control is provided network-wide, and to the type of data access. Data location may be centralized, partitioned, or replicated. Degree of data sharing may be centralized, decentralized, or distributed. Data base management control may be user provided (distributed data) or system provided (distributed data base). Data access may be by transaction shipping, function shipping, or data shipping.

Historically, a centralized approach has been used for managing data base storage and accesses. In this approach, both data management and application processing are centralized. A single data base manager is used, and the teleprocessing network is used to connect users to the central facility. In a variation on the centralized approach, some of the processing is distributed among nodes in a network, but the data is kept centralized.

The advantages of a centralized data base approach are that (1) data base integrity can be ensured by the single data base manager; (2) all application programs can be written to a single application programming interface: application programs need not be aware of data location since all data is stored in one location; (3) many tools are available to solve the problems of administering data in the centralized environment; and (4) a single system is easier to operate, maintain and control.

Some disadvantages to the centralized approach are: (1) communication costs are high for some enterprises: application performance may be degraded due to communication delays; (2) data availability may be poor due to instability in the teleprocessing network or the central system, which may have to be mitigated by backup systems and redundant communication; and (3) the processing capabilities of a single system have already been reached by some enterprises.

Two approaches for distributing data to the nodes of a distributed data system are (1) partitioning and (2) static replication. In the partitioned data approach there is no primary copy of the data base, whereas there may be in static replication.

A partitioned data base approach divides the data base into distinct partitions, and the partitions are spread among the nodes. A given data item resides at only one node location. Each location has a data base manager which manages the data at its location. A data distribution manager takes a data request from an application program and maps it to a local request, if the data is held locally, or to a remote request if the data is held at another location.

Good data availability and access performance result in a partitioned distributed data base if the data required is held locally. Furthermore, data base integrity is facilitated since each data item is managed by a single data base manager. These results may be achieved if a good partitioning algorithm exists, is known ahead of time, and is stable.

In a partitioned data base, the system must provide a network-wide scope of recovery for programs which change data at more than one location.

Among the disadvantages of a partitioned data base system are (1) reduced availability and performance result if the partitioning algorithm does not match the data access patterns; (2) the application program may have to be aware of the data location, or at least the data partitioning algorithm, and access the data base differently, depending upon data location; (3) changing the data base partitioning algorithm is very difficult because data location is reflected in the application programs, exits, or declarations at each node; (4) existing data relocation and algorithm changes at each node must be synchronized network-wide, and therefore the partitioning algorithm may not be adjusted as needed to maintain optimum performance and availability; and (5) programs which access data items uniformly across a partitioned data base, or which must access the entire data base, will suffer poor performance and availability.

Static replication techniques for distributing data include those with and without a central node. In the former, the central location stores a primary copy of the data base, and each location has a data base manager and a copy of the data base. In typical uses of static replication, the primary data base is copied and sent to each replica location, or node, where the data then becomes available for local processing. Data modifications made at each replica location are collected for later processing against the primary data base. Between periods of application processing, local modifications are sent to the central location and applied against the primary data base. Because this technique for managing replicated data bases does nothing to prevent multiple updates, the occurrence of such must be detected during primary data base update and resolved manually; otherwise, the application must be restricted so that, somehow, multiple updates do not occur. After the primary data base has been made to conform to replica changes, new copies are sent to the replica locations, and the whole process starts over again.

The main advantage of static replication with a primary copy at a central location is high availability and good response time since all data is locally accessible. However, significant disadvantages exist, among them: (1) because the system does not prevent multiple updates, data base integrity is difficult to ensure, severly restricting the data base processing which is feasible for static replicas; (2) the system does not ensure current data for application accesses requiring such; (3) special operational procedures are required for collecting and applying replica modifications to the primary data base, which can be costly and prone to error: typically, primary data base conformation occurs in the middle of the night, and since this is when problems are most likely to be encountered, key personnel must be available; and (4) the data base may not be available during the conformation procedure: providing a large enough window for conformation is not feasible in many applications, the data transmission bandwidth may be unnecessarily large because updates and replicas are transmitted only during the narrow window between periods of operation, and if one or more of the nodes is incapacitated, then conformation may not be possible in the scheduled window.

Many variations have been described in the literature with respect to the basic techniques of static replication described above. The application can be designed so that multiple updates do not occur, or the replicas can be limited to read accesses only. The application program can collect updates itself for later transmission to the primary location, or this information can be gleaned from data base manager logs. Full replicas or only partial replicas can be formed at the replica locations. The entire replica data base or only changes to data held can be transmitted. Replicas can be continually synchronized by sending modifications made by a transaction to the various nodes and receiving acknowledgments as part of transaction termination processing. Such techniques of synchronization may solve the integrity problems of static replication, but lose much of the performance and availability benefits.

U.S. Pat. No. 4,007,450 by Haibt describes a distributed data control system where each node shares certain of its data sets in common with other nodes, there being no primary copy at a central location, but replicas are continually synchronized. Each node is operative to update any shared data set unless one of the other nodes is also seeking to update, in which event the node with the higher priority prevails. Each node stores in its memory the node location of each shared data set and the updating priority each node has with respect to each respective set of shared data. When a data set is updated at a node, all nodes having a replica are sent the update. As above, such a technique solves the integrity problems of static replication, but loses much of the performance and availability benefits.

A method for utilizing data storage and communication resources in a distributed data base system is needed which avoids the disadvantages while maintaining the advantages of the central, partitioned, and static replication techniques: (1) high availability and performance for data which is held at the node where it is accessed; (2) high availability and performance for application programs that can accept data which is possibly not the most current; (3) data location transparency, such that application programmers and end users need not be aware of data location or even of a data partitioning algorithm and the data base appears the same as in the single system, or central, approach; (4) data automatically migrates to the locations where it is accessed without the need for a partitioning algorithm; (5) good performance for programs which access the data base uniformly: a given node can be made to have a copy of every data item in the data base, but it need not be the most current; (6) no special update procedures or windows required, but rather are managed network-wide by the system; (7) data base integrity, with multiple updates prevented and application programs receiving data as current as they require; and (8) the teleprocessing network can be unstable, with the control not requiring that communication with all or any other node be available to access data held locally.

SUMMARY OF THE INVENTION

The invention provides a method for operating a computing system including a plurality of nodes, with each node having means for storing at least one data item, characterized by the steps of accepting a request having a specified currency, and responsive thereto dynamically replicating data under distributed system control while selectively deferring conformation of the replicated data.

This invention further provides a new and useful method for operating a multiprocessing system including a communication network interconnecting a plurality of data processing nodes accessing a distributed data base, the method including the steps of storing unique and replicated data items at a plurality of nodes, and responsive to a request by an application at this node enabling access by the application to the copy of a data item stored at this node, and communicating copies of updated data items to other nodes; wherein the improvement controls utilization of data storage and communication resources to enhance data access performance and ensure network-wide data consistency and integrity where accesses to data may have time varying and a priori unknown geographic and temporal affinity and wherein the communication network may be unstable, by dynamically replicating data under distributed system control, and comprises the steps of:

storing a dipole with respect to a data item shared by a node pair, a dipole half being stored in each node of the node pair specifying the status of the shared data item at the related node of the node pair assumed by this node of the node pair, with the data state of a given data item at this node being defined by the set of dipole halves stored at this node with respect to all related nodes sharing copies of said given data item;

responsive to a request for access to a data item from an application at this node, where this node stores a copy of the data item at a data state consistent with the request, granting the request without network interaction with other nodes; otherwise,

responsive to a request for access to a data item from an application at this node, where this node has a data state inconsistent with the request, negotiating a dipole change with each related node with which this node receiving the access request shares a conflicting dipole, and upon resolving all conflicting dipoles, granting the request; and

responsive to an update request which has been granted, storing the updated data item at this node and only thereafter in connection with application requests communicated from related nodes, or in the course of data base conformation processing with respect to related nodes, communicating the updated data item to related nodes.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagrammatic flowchart showing the data storage, processing, and communication resources to be allocated among competing tasks in a representative configuration of nodes in a distributed data base and multiprocessing/multiprogramming system.

FIG. 2 is a diagrammatic illustration of a typical data base.

FIG. 3 is a diagrammatic representation of the format of a communication message.

FIG. 4 is a diagrammatic illustration of a quark processing unit.

FIGS. 5 and 6 are flowchart illustrations of the method steps required to process a request for access to a data item and a utility program request, respectively.

FIG. 7 shows the manner of combining FIGS. 7A and 7B into one illustration.

FIGS. 7A and 7B provide a more detailed diagrammatic illustration of the formats and field interrelationships associated with the quark processing unit of FIG. 4.

DESCRIPTION OF THE PREFERRED EMBODIMENT

According to the preferred embodiment of the invention, the distributed data control system components are illustrated in FIG. 1, which shows three nodes 10, 12, 14 interconnected by communication links 16, 18 and by links 20, 22 to other nodes in a distributed data base, multiprocessing, multiprogramming system. No one node is a primary node. Every node 10,12,14, . . . , has the capability of storing some number of data items in respective data files: 36, 38, 40. Copies of data items are dynamically created at a node as required to support the processing which occurs at the node. Physically, a node may be a general purpose computer, or central electronic complex, such as an IBM System/360 or 370, described in U.S. Pat. No. 3,400,371 by G. M. Amdahl, et al, and in IBM System/370 Principles of Operation, IBM Publication GA22-7000-6.

As is illustrated by the EMPLOYEE data base of FIG. 2, different applications have different requirements for the currency of the data they access. An application program 26, for example, which is updating the payroll information 30 to compute a raise for a given employee 32 needs to work with the most current value, herein designated CLEAN, of the PAYROLL segment 30. A data item is CLEAN, in the view of application program 26, if its value represents the most recently committed update and if there are no uncommitted updates outstanding: that is, no updates to the values in the employee's record, committed or uncommitted, will be allowed at another node 10, 14. However, another application program 28 which is preparing a report on job statistics by location is probably content with JOBHIST segment 34 values which are possibly not the most current, herein designated PRIOR. PRIOR data allows those transactions which do not require the most recently committed value for a data item to enjoy potentially improved performance and availability.

Thus transactions within applications 24, 26 and 28 can have different currency requirements for the data they access. In generating a request for a data item, application program 24, would specify the key of the record sought and the appropriate currency. In this way, applications 24, 26, 28 can take advantage of the increased performance and availability which can often be achieved by specifying less stringent currency requirements. Typically, multiple transactions with differing currency requirements can process the same file simultaneously, and these transactions can be running at the same or different nodes 10, 12, 14. The specification of currency which may be performed under control of the system programmer and data base administrator facilities, by the application programmer, or by defaults, is concerned with concurrent usage of a data item where at least one user may be updating the data item. The term update is used generically to include insertion, deletion, and modification of data item contents. All updates are "atomic": A data item will never be read or updated while in a partially updated state with respect to a single update request. All updates are either committed or backed out. A committed update is an update that can no longer be backed out, and it will be made available to other transactions anywhere in the network, as will be described hereafter.

Referring once again to FIG. 1, the major components of the distributed data control system will be described. Each node 10, 12, 14 has the capability of storing some number of data items. In the exemplary embodiment described herein, a data item may be a DL/1 data base record, such as is shown in FIG. 2. These data items may be held uniquely at the node or may be replicas of data items held at one or more other nodes. This replica data is stored in data files 36, 38, 40, which may be located in main storage or on separate storage devices. Data which is required to honor an application data base call from, say, application program 26 can be requested from another node 10 and stored in data file 38. Existing data in data file 38 may have to be removed to make room for the new data. Application programs 24, 26, 28 run under the control of transaction managers 42, 44, 46, respectively. Examples of transaction managers 42, 44, 46 include IMS/VS DC, described in IBM publication GH20-1260, and CICS/VS, described in IBM publication GC33-0066.

Data distribution managers 48, 50, 52 ensure data integrity and consistency, and implement the major portion of the method steps of this invention, as will be described in greater detail hereafter. Data distribution manager 48 receives a data call from application 24 via transaction manager 42, ensures that the requested data is held at the required currency in data file 36, and then passes the data call to data base manager 54. Similarly, data base managers 56, 58 are provided at each of the other nodes 12, 14.

Access conflicts between application programs 24, 26, 28 running at different nodes 10, 12, 14 are managed using control information in a format designated dipoles. Each of the two nodes sharing a data item has one-half of the dipole. A node's dipole half describes that node's view of the other node's status for the shared data item. It describes, for example, whether a copy of the data item is available through the other node, the currency (CLEAN or PRIOR) it is assuming for its copy, and whether it is authorized to make updates to the data item. The format of a dipole half (sometimes abbreviated in the figures and text as DIPOLEH) will be described hereafter in connection with FIG. 7.

The set of all dipole halves held at a node defines the status of the data at the node, and is referred to as the data state at the node. Data state information is held in the status and control (SAC) files 60, 62, 64 for nodes 10, 12, and 14, respectively. Data distribution manager (DDM) 48 can determine from SAC file 60 whether or not applications 24 at its node 10 can safely reference or update a data item in data file 36 as well as obtain information about the currency of the data item.

Any item an application program 24 inserts, deletes, or replaces a data item in data file 36 a record of the update is placed in SAC file 60. It is not necessary to inform other locations 12, 14 of the updates as they occur at node 10, and thus, conformation of the copies held at the other node is deferred.

Data base manager 54 receives data calls from transaction manager 42 (which have been intercepted by data distribution manager 48 and processed as hereinafter described), and accesses data file 36 according to the data call and returns the results to the application through elements 42 and 48. Examples of data base managers useful herein are DL/1 DOS/VS, described in IBM publication GH20-1246, and IMS/VS, described in IBM publication GH20-1260. The description of this preferred embodiment will assume a DL/1 environment.

Communication facilities 66, 68, 70 provide internodal communication, and the following, for example, may be used as such communication facilities: ACF/VTAM, described in IBM publication GC38-0282 and IMS/VS DC, supra. The multiple region option (MRO) of CICS/VS may be used when a plurality of nodes 10, 12, 14 physically reside on the same central electronic complex. Processing of a transaction request for a data item at a node is governed by the data state at the node, which as previously noted is defined by the set of dipole halves stored in the node's SAC file, with a dipole half existing for each combination of this node and related node. As each dipole half describes this node's view of the corresponding data item at the related node, the state at this node is an exocentric view of the status of data in the network. Dipole halves are communicated between nodes 10, 12, 14 over links 16, 18 in messages 72 having the format FIG. 3. Referring to FIG. 3, a message 72 from a source node (say, 12) to a destination node (say, 14) includes the following fields: source reflection ID 76, destination reflection ID 78, quark 74 and data 80. If data 80 accompanies a message 72, its presence will be indicated by a field in a request dipole half in quark 74.

Quark 74 describes a unit of work, and is used to communicate and respond to requests for dipole changes and data. It includes the following fields: requesting node name and data file (RN), this node name and data file (TN), data item key (K), request dipole half at TN for RN, dipole half at RN for TN, and response desired indicator (RDI). Processing at source node 12 may be waiting on a response to this message 72. If processing is waiting, then some identification of that processing is sent in source reflection ID. When destination node 14 prepares a response, it returns this ID in the destination reflection ID field 78 of message 72. The distributed data manager 52 at destination node 14 may be waiting on this message 72. That processing is identified by destination reflection ID 78 of this message 72. This information is used at destination node 14 to determine locking requirements and in notifying the waiting DDM 52 when this message 72 has been received and successfully processed at destination node 14.

Referring now to FIG. 4, description will be given of the fundamental quark processing unit 82.

Request quark 84 is analyzed by procedures defined by one or more of the three tables: conflict determination (CONFON) 86; request reflection (RESPON) 88; and response generation (RESPRN) 90.

CONFON 86 determines whether or not there are conflicts for a given request quark 84. The data item key in request quark 84 is used to select those dipole halves which describe the same data item for nodes other than the requesting node. Each dipole half for another node is compared to the desired dipole half for the requesting node. If there is a conflict, then conflict quark 92 is generated to be sent to the other node.

If any conflicts were found, then quark processing unit 82 waits 94 until notified that all conflicts 92 have been resolved before further processing request quark 84.

RESPON 88 determines whether or not any dipole halves for other nodes need to be modified to reflect the processing of quark 84 from the requesting node. For example, if request quark 84 is accompanied by a new data item value, then the dipole halves for all other nodes with a replica of that data item may need to be marked to indicate that those other nodes now have worse data than this node.

RESPRN 90 changes the dipole half for the requesting node to reflect processing of request quark 84 and generates response quark 96.

Referring now to FIG. 5, a description will be given of an overview of the processing of request, conflict, and response quarks for a transaction request (data call) occurring at transaction node 10. The process is initiated by receipt at a transaction node 10 of a data call from an application 24 at node 10 for access to a data item, the data call including the data item key and the currency requirements.

DL/I calls GU, GHU, and ISRT are data calls in the exemplary embodiment. Quark processing is not required for DL/I calls REPL, DLET, GNP, and GHNP calls since DL/I requires that they be immediately preceded by one of the above data calls. (The DL/I calls GN and GHN are not supported since they do not necessarily include a key K.)

In addition, for each data item modified by the application program, the equivalent of a data call called local advice (LADV) is generated by the DDM at the time the modification is committed by the Data Base Manager. Quark processing must be performed for each such LADV.

The data call is converted into a request quark 100, which is processed through quark processing unit 102 executed by data distribution manager 48 to ensure that the requested data is held at the required currency in data file 36 of this transaction node 10. Conflict quarks 104 are generated and processed as required at other nodes (including 12) to resolve conflicting data states, with a conflict quark 104 generated for each conflicting dipole half identified by CONFON 106.

Each conflict quark 104, outgoing from node 10 in message 105, becomes a request quark (such as 108) when received at another node (such as destination node 12). There the second request quark 108 is processed through an instance 110 of the quark processing unit. This may require the generation of still further conflict quarks for communication to still other nodes, as represented by links 112, where the process recurses. When a response represented by links 114 is received for each of the still further conflict quarks, the dipole half at other node 12 for node 10 is modified by RESPRN 134 to remove the original conflict between node 10 and node 12 (detected by CONFON 106), whereupon RESPRN 134 generates response quark 116. Since the response desired indicator (RDI) is set in request quark 108, derived from conflict quark 104, a response quark 116 is created and included in message 118 back to the source node 10.

If destination node 12 for this message 105 determines that an additional interchange is required to complete processing, then it sets the response desired indicator in response quark 116. A (response) message 118 is returned regardless of the indicator in quark 104. When received at the transaction node 10, response quark 116 becomes an instance of request quark 120, with one such instance required for each conflict quark 104 sent out from transaction node 10. Each instance of request quark 120 is processed through an instance 122 of quark processing unit 82, successful completion of which is required for instance 102 to process to completion.

Referring again to FIG. 5, quark processing units 102, 110, 122 include CONFON 106, 124, 126, RESPON 128, 130, 132, and RESPRN 134 and 136, respectively. RESPRN is not included in quark processing unit 102 since the request quark 100 is generated on behalf of a local application; no dipole half need be changed and no response quark need be generated. The first instance 102, 110 of a quark processing unit at a node also includes wait 138, 140. Such waits 138, 140 are not required in processing instances of request quark 120 received in response to conflict quark 104 because it is not expected that CONFON 125 will detect any conflicts with request quark 120; consequently, no conflict quarks will be generated by CONFON 126, and there will be no wait required. (CONFON 126 could, therefore, be eliminated but it is included in each instance of processing unit 122 to facilitate the implementation of the recursive unit 82 (122) described hereafter in connection with FIG. 7.)

The source reflection indentifier field 76 in message 105 is set by transaction node 10 to identify the wait 138. When other node 12 responds, message 118 will include a destination reflection identifier field 78 set to the above source reflection identifier to identify message 118 to wait 138. Lines 135 represent the collection of all (response) messages 118, which when received and processed permit completion of processing through RESPON 128. (As all dipole half changes required at node 10 in resolving conflict quarks 104 have been made by RESPRN 136, RESPON 128 can initiate a response to the data call from application 24.)

CONFON 86 (106, 124, 126), RESPON 88 (128, 130, 132), and RESPRN 90 (134, 136) are implemented, in this embodiment, by tables processed according to procedures to be described hereafter for performing the steps of identifying and resolving conflicting dipoles. CONFON 86 detects dipole conflicts between a request quark 84 received at a node (e.g., 10) and the dipole halves in SAC file 60 at that node. RESPRN 90 processes and updates as required the dipole half stored in SAC file (e.g., 60) of node 10 for a requesting node RN, and RESPON 88 processes and updates as required the dipole halves stored in the SAC file (e.g., 60) of node 10 for other nodes ON than the requesting node RN.

The preferred embodiment assumes that the two halves of a dipole can be changed by RESPRN 134, 136 within a single scope of recovery. CICS/VS OS/VS Version 1 Release 6 is an example of a transaction manager which provides such capability.

Referring now to FIG. 6, a description will be given