WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
System for merging virtual partitions of a distributed database    

Custom CD of patents similar to US4853843 : System for merging virtual partitions of a distributed database - $19.95
United States Patent4853843   
Link to this pagehttp://www.wikipatents.com/4853843.html
Inventor(s)Ecklund; Denise J. (Beaverton, OR)
AbstractAn object-oriented, distributed data base system separates into a plurality of virtual partitions following communication failure between sites accessing the data base. Each partition accesses a separate copy of an initial data base and independently updates groups of data objects included in the data base to add new versions of data objects to the data base. Each virtual partition maintains a copy of all previous versions of data objects and maintains a change list describing all group updates that it executes. Following restoration of communication between sites, each virtual partition merges the data bases maintained by separate partitions to form a consistent merged data base permitting versions of data objects and collections of data objects created by any one of the separate virtual partitions to be identified and accessed in the merged data base.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 4853843
System for merging virtual partitions of a distributed database - US Patent 4853843 Drawing
System for merging virtual partitions of a distributed database
Inventor     Ecklund; Denise J. (Beaverton, OR)
Owner/Assignee     Tektronix, Inc. (Beaverton, OR)
Patent assignment
All assignments
Company News
Publication Date     August 1, 1989
Application Number     07/136,174
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     December 18, 1987
US Classification     707/203 707/202
Int'l Classification     G06F 015/20
Examiner     Chan; Eddie P.
Assistant Examiner     Kulik; Paul
Attorney/Law Firm     Bedell; Daniel J. Hulse; Robert S. ,
Address
Parent Case    
Priority Data    
USPTO Field of Search     364/200 MS File 364/900 MS File 364/200 364/900 364/300
Patent Tags     merging virtual partitions distributed database
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
4769772
Dwyer
707/2
Sep,1988

[0 after 0 votes]
4686620
Ng
707/10
Aug,1987

[0 after 0 votes]
4648036
Gallant
707/203
Mar,1987

[0 after 0 votes]
4635189
Kendall
707/10
Jan,1987

[0 after 0 votes]
4558413
Schmidt
707/203
Dec,1985

[0 after 0 votes]
4511964
Georg
711/202
Apr,1985

[0 after 0 votes]
4432057
Daniell
707/8
Feb,1984

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B

[0 market size comments]
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%

[0 market share comments]
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%

[0 reasonable royalty comments]
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

[0 Guesstimation of Royalty Value Comments]
License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
[0 license availability comments]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
[0 owner/assignee comments]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

[0 competitive advantage comments]
Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

[0 commercial alternatives comments]
 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


I claim:

1. For a data base system providing a plurality of separate virtual partitions, each storing separate instances of an initial data base comprising an initial set of versions of data objects and initial directory data indicating primary and alternate paths of decendency relating successive versions of each data object,

each virtual partition independently executing group updates with respect to a referenced group of data objects by adding descendant data object versions to paths of data objects of the referenced group, each virtual partition maintaining a separate change list describing all group updates that it executes,

a method for execution by a virtual partition for providing a merged data base reflecting changes to the initial data base resulting from group updates described by change lists maintained by said plurality of separate virtual partitions, the method comprising the steps of:

selecting a collection of group updates from among the group updates described by said change lists according to predetermined selection criteria subject to a restriction that when the collection includes a plurality of group updates adding data object versions to any one path of a data object, said plurality of group updates must all be described by a change list maintained by one of said plurality of separate virtual partitions;

executing the collection of group updates on versions of data objects included in said initial data base to produce a resulting set of data object versions;

generating additional data object versions, each additional data object version resulting from group updates described by said change lists other than group updates included in said collection; and

altering said initial directory data so that it indicates that the additional data object versions relate to data object versions of said resulting set by alternate paths of decendency.

2. The method of claim 1 wherein selection of the collection of group updates is subject to an additional restriction that a data object version created by a group update of the collection other than a first version of a data object must be created either by modifying a data object version created by another group update of the collection or by modifying a data object version of the initial set.

3. The method of claim 1 wherein said predetermined criteria comprises maximization of a number of group updates included in the collection, subject to said restriction.

4. The method of claim 1 wherein said predetermined criteria comprises maximization of a number of data object paths altered by group updates included in the collection, subject to said restriction.

5. The method of claim 1 wherein said predetermined criteria comprises maximization of a number of versions created by executing group updates included in the collection, subject to said restriction.

6. The method of claim 1 wherein said predetermined criteria comprises maximization of a weighted combination of a number of paths altered by group updates included in the collection and of a number of group updates included in the collection, subject to said restriction.

7. For a data base system providing a plurality of separate virtual partitions, each storing separate instances of an initial data base comprising an initial set of versions of data objects, each version of a particular data object, other than a first version of the data object, having been created by modifying an existing predecessor version of the data object,

the initial data base further comprising directory data associated with each data object of the initial set of versions of data objects, the directory data identifying paths of decendency for the associated data object, each path of decendency comprising sequentially created versions of the data object, wherein each data object version included in a path except a first data object version of the path directly descends from a last created data object version included in the path, the directory data classifying each data object version included in each path as being one of current and non-current, wherein only one data object version in each path is classified as current,

each virtual partition independently executing group updates by carrying out operations with respect to a referenced group of data objects such that a path of a data object of the group is altered, and each virtual partition maintaining a separate change list describing all group updates that it executes,

a method for execution by a virtual partition for providing a merged data base reflecting changes to the initial data base resulting from all group updates described by change lists maintained by said plurality of separate virtual partitions, the method comprising the steps of:

selecting a collection of group updates from among all group updates described by said change lists according to predetermined selection criteria subject to a restriction that when the collection includes a plurality of group updates altering any one path of a data object, said plurality of group updates must be described by a change list maintained by one of said plurality of separate virtual partitions;

executing the collection of group updates in sequence on said initial set of data objects to produce a resulting set of data objects; and

adding additional data object versions to alternate paths of the resulting set of data objects, each additional data object version resulting from group updates described by said change lists other than group updates included in said collection.

8. The method of claim 7 wherein selection of the collection of group updates is subject to an additional restriction that a successor data object version created by a group update of the collection must be created either by modifying a data object version created by another group update of the collection or by modifying a data object version of the initial set.

9. The method of claim 7 wherein said predetermined criteria comprises maximization of a number of group updates included in the collection, subject to said restriction.

10. The method of claim 7 wherein said predetermined criteria comprises maximization of a number of data object paths altered by group updates included in the collection, subject to said restriction.

11. The method of claim 7 wherein said predetermined criteria comprises maximization of a number of versions created by executing group updates included in the collection, subject to said restriction.

12. The method of claim 7 wherein said predetermined criteria comprises maximization of a weighted combination of a number of paths altered by group updates included in the collection and of a number of group updates included in the collection, subject to said restriction.

13. For a data base system providing a plurality of separate virtual partitions, each storing separate instances of an initial data base comprising an initial set of versions of data objects, each version of a particular data object, other than a first version of the data object, being created by modifying an existing predecessor version of the data object,

the initial data base further comprising an initial set of directory data associated with each data object of the initial set of versions of data objects, the directory data identifying paths of decendency for the associated data object, each path of decendency comprising sequentially created versions of the data object, wherein each data object version of the path except a first data object version of the path directly descends from a last created data object version of the path, the directory data classifying each path as one of principal and alternate, and classifying each ata object version of each path as being one of current and non-current, wherein only one path of each object is classified as principal and only one data object version of each path is classified as current,

the initial data base further comprising an initial set of configuration specifications, each configuration specification referencing a group of said initial set of data object versions, at least one configuration specification providing at least one floating reference to a current data object version of a principal path of a data object, and

each virtual partition independently executing group updates, each group update carrying out at least one of a set of operations with respect to a group of data objects referenced by a configuration specification, the set of operations including an operation that adds a new version of a data object to a path associated with the data object directly descending from a current data object version of the path, an operation that alters classifications of paths, an operation that alters classification of data object versions, and an operation that creates a new alternate path associated with a data object by creating a new version of the data object directly descending from a noncurrent version of the data object, and

each virtual partition maintaining a separate change list describing all group updates that it executes,

a method executed by a particular virtual partition for providing a merged data base reflecting changes to the initial data base resulting from all group updates described by change lists maintained by the separate virtual partitions, the method comprising the steps of:

obtaining said change lists maintained by the separate virtual partitions;

selecting a collection of group updates from among all group updates described by said change lists according to predetermined selection criteria subject to a requirement that the collection includes no two group updates described by differing change lists of the separate virtual partitions which two group updates alter a same data object path;

executing the collection of group updates in sequence on said initial set of versions of data objects to produce a resulting set of data objects; and

adding additional data object versions to alternate paths of the resulting set of data objects, each additional data object version resulting from group updates described by said change lists other than group updates included in said collection.

14. The method of claim 13 wherein selection of the collection of group updates is subject to an additional restriction that a successor data object version created by a group update of the collection must be created either by modifying a data object version created by another group update of the collection or by modifying a data object version of the initial set.

15. The method of claim 13 wherein said predetermined criteria comprises maximization of a number of group updates included in the collection, subject to said requirement.

16. The method of claim 13 wherein said predetermined criteria comprises maximization of a number of data object paths altered by group updates included in the collection, subject to said requirement.

17. The method of claim 13 wherein said predetermined criteria comprises maximization of a number of versions created by executing group updates included in the collection, subject to said requirement.

18. The method of claim 13 wherein said predetermined criteria comprises maximization of a weighted combination of a number of paths altered by group updates included in the collection and of a number of group updates included in the collection, subject to said requirement.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

The present invention relates in general to an optimistic, robust, distributed data base system, and in particular to a system for merging virtual partitions in a distributed data base system.

The purpose of a database system is to reliably store data and to process valid, authorized read and write requests on that data. A database system provides a uniform interface for accessing data by executing transactions. A transaction is a unit of work comprising a sequence of steps required to achieve some goal. Each transaction consists of an inhibition phase; a read, compute, and write phase; and a commit phase. In the initiation phase, the transaction is authenticated and assigned some priority for being executed. In the read, compute, and write phase, the transaction carries out the requested action. The commit phase installs all of the results written by the transaction into the database making them visible to other transactions.

The purpose of a "distributed" database system is to provide global database service to users at a collection of sites. A distributed database system is a coalition of sites that share access to a set of data. Each site maintains directory information about the data so that each site has knowledge of all data stored by the database system. Copies of the data itself are distributed among the sites. The sites are connected by a communication network consisting of some physical communication medium and a communication protocol for controlling communication over that medium. This basic communiction protocol is part of the network services of an operating system.

A distributed database must, under normal operating conditions, maintain internal and mutual consistency of the data stored by the system. Internal consistency is maintained if all database updates performed by committed transactions are reflected in the database, none of the updates performed by the uncommitted transactions are reflected in the database, and no transaction is allowed to read a data value written by another transaction that is not yet committed. A distributed database system maintains mutual consistency of the database if given a data item, all replicated copies of the data item are the same. Mutual consistency is defined to mean, "all copies converge to the same state and would be identical should update activity cease".

A "robust" distributed database system is one that processes transactions and preserves internal and mutual consistency even when system components have failed. There are a number of types of component failures. A "site crash" is a machine failure due to loss of power, operating system deadlock or panic, processor malfunction, or human intervention to restart a site. A "network partition" occurs when two or more groups of sites are running but are unable to communicate. "Media failure" occurs when a storage device fails while reading or writing data, rendering some portion of the stored data unreadable. "Software failure" occurs when the internal consistency or the mutual consistency of the database has been compromised due to an error in the implentation of the database system or due to the occurrence of failure types not managed by the protocol implemented by the software.

Preserving internal consistency in the event of failures requires some mechanism for undoing results written by uncommitted transactions and ensuring the results written by committed transactions are reflected in the database. Preserving mutual consistency in the event of failures in the distributed environment requires that all sites acquire knowledge of new data items and new values for existing data items in as timely a manner as possible.

A "robust" system should provide users with continuous service while maintaining internal consistency and mutual consistency of the database. The traditional approach to building a robust system has been to extend failure free protocols for maintaining internal consistency and mutual consistency. However, degrees of robustness can be gained only at considerable cost in time and storage space. In addition, a system can be robust only in the event of failures it can detect and have anticipated detecting.

A robust distributed data base system is described in the paper "Robustness in Distributed Hypothetical Databases" by D. Ecklund, published 1985 in Proceedings of The Nineteenth Hawaii International Conference on System Sciences, and incorporated herein by reference. The system described in this paper establishes "virtual partitions" after system component failure prevents communication between various sites. Each virtual partition is a collection of sites that have access to a copy of the data base and which can still communicate with each other. The sites in each virtual partition continue to read and write access their copy of the data base even though they cannot communicate those changes to other sites in the system. The described system can support robust optimistic access control because under normal processing conditions the system ignores the possibility of update conflicts until they actually occur. If a failure partitions the network, the system may unknowingly process conflicting updates in separate partitions. In some sense the system ignores these conflicts until the failure is repaired and the partitions are brought back together. When the partitions are merged, the conflicting updates occur but are managed by implicitly deriving alternate versions. The general philosophy is that the results of conflucting updates will be by allowing one result to prevail over another.

However, the system described in the above mentioned paper is unsuitable for merging partitions in a hierarchical data base system wherein groups of data objects are grouped into "configurations" that themselves have version histories. While the system can resolve conflicting updates of low level data objects, it is unable to resolve conflicts between updates of configurations so as to provide a consistent version history of each configuration.

SUMMARY OF THE INVENTION

A data base system provides a plurality of separate virtual partitions, each storing separate instances of an initial data base. The data base comprises an initial set of versions of data objects, each version of a particular data object, other than a first version of the data object, being created by modifying an existing version of the data object. The initial data base also includes an initial set of directory data associated with each data object, the directory data identifying non-overlaping "paths of descendancy" for the associated data object, wherein each path of descendancy comprises sequentially created versions of the data object. Each data object version included in a path, except a first data object version of the path, directly descends from a last created data object version of the path. The directory data classifies each path as one of "principal" and "alternate", and classifies each version of each path as being one of "current" and "non-current". However, one and only one path of each object is classified as "principal" and one and only one version of each path is classified as "current".

The initial data base further includes an initial set of configuration specifications, each configuration specification referencing a group of the initial set of data object versions including floating references to a "current" data object version of a "principal" path of a data object.

Each virtual partition independently executes group updates, each group update carrying out at least one of a set of operations with respect to a group of objects referenced by a configuration specification. A group update may, for example, add a new version of a data object to a path, alter classifications of paths, and/or alter classification of data object versions. Each virtual partition maintains a separate change list describing all group updates that it executes.

In accordance with the invention, each virtual partition may provide a merged data base reflecting changes to the initial data base resulting from all group updates described by change lists maintained by the separate virtual partitions. The merged data base maintains internal and mutual consistency for all objects and configurations.

To form the merged data base, a virtual partition first obtains the change lists maintained by the separate virtual partitions. The partition then selects a collection of group updates from among all group updates described by the change lists according to predetermined selection criteria, for example, to maximize the number of group updates in the collection. However, selection of group updates is subject to a restriction that the collection cannot include group updates described by differing change lists of the separate virtual partitions when the group updates alter the same data object path. The virtual partition then executes the collection of group updates in sequence with respect to the initial set of data objects to produce a resulting set of data objects. Finally, the partition adds additional data object versions to alternate paths of the data objects, each additional data object version resulting from group updates described by change lists other than group updates included in the collection.

It is accordingly an object of the invention to provide a method for merging copies of a distributed data base that have been independently updated by separate virtual partitions so as to preserve internal and mutual consistency of the data base.

The subject matter of the present invention is particularly pointed out and distinctly claimed in the concluding portion of this specification. However, both the organization and method of operation of the invention, together with further advantages and objects thereof, may best be understood by reference to the following description taken in connection with accompanying drawings wherein like reference characters refer to like elements.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a block diagram of a five layer software architecture for a CAE data base system;

FIG. 2 is a block diagram of a CAE database environment with multiple local area networks connected by gateway sites;

FIG. 3 is a block diagram of sites A, B, C, D, and E connected by a local area network and grouped into the federations named Big, Little, and Little;

FIG. 4 is a block diagram of the federations Big, Little, and Little of FIG. 3 with site C enrolling in the federation named Little that site A belongs to;

FIG. 5 is a graphical view of a data object containing four version paths;

FIG. 6A is a block diagram of two clients performing a duplicate checkout of the current version of two data objects,

FIG. 6B is a block diagram of the independent modification of the checked out data by the two clients of FIG. 6A;

FIG. 6C is a block diagram of the results produced by the checkin of the new versions by the two clients of FIG. 6B;

FIG. 7 is a graphical view of steps required in processing a typical write request;

FIG. 8 is a state diagram of illustrating a recovery process;

FIG. 9 is a graphical view of three consecutive temporal versions from one path of a data object;

FIG. 10 is a graphical view of partitioning behavior of four sites;

FIG. 11A is a graphical view of the three identical directory records two sites A and B;

FIG. 11B is a graphical view of the directory records on site B after the divergence recovery;

FIG. 11C is a graphical view of the directory records on site B after the execution of the update request;

FIG. 11D is a graphical view of the directory records on site A after incorporation;

FIG. 12 is a graphical view of a merge of two partitions;

FIG. 13 is a graphical view of a merge of three partitions;

FIG. 14 is a graphical view of partitioning behavior of three sites A, B, and C;

FIG. 15 is a block diagram of the updates performed by two partitions on one object;

FIG. 16 is a graphical view of the object of FIG. 15 following a merge recovery;

FIG. 17 is a graphical view of updates performed by two partitions on one object; and

FIG. 18 is a graphical view of the object of FIG. 17 following a merge recovery.

DESCRIPTION OF THE PREFERRED EMBODIMENT

The present invention is a distributed data base system that has application, for example, in conjunction with a large computer-aided engineering (CAE) project in which many project teams at separate locations must read and write access a common data base. FIG. 1 shows a five layer software architecture for a CAE data base system. Each layer provides services to the layers above.

A user applications layer 10 is a collection of CAE tools such as a graphic editor, a design rule checker, and a simulator. A representation interface layer 12 is concerned with the mapping between internal and external format of the design data stored by the system. A configuration management layer 14 builds a single complex entity called a "configuration" by selecting and combining versions of subcomponents of the entity. At a semantic data model layer 16, an object-oriented data model is used to enforce consistency of the design. A distributed storage server layer 18 provides reliable storage of the design data. Below the storage server layer is a local operating system 20.

Users interact with the layered system by executing a tool program. The tool program asks the database system for access to a copy of the design the user wishes to work on. When a request for data is received, the configuration management layer 14 determines exactly which pieces of data are being requested. The storage server layer 18 locates and provides the requester with a private copy of each data item. The representation interface layer 12 converts the data to the format appropriate to the requesting tool and the user than works with his copy of the data using the operations provided by the tool program. When all user modifications have been completed, the tool requests that the database system store the revised design in the database. The representation interface layer 12 formats the external representation of the revised design into its corresponding internal format. The configuration management layer 14 determines which subcomponents have been modified, and if necessary, modifies the configuration definition. The semantic data model layer 16 certifies the modified design by invoking appropriate consistency checking programs. Once the modified design has been certified it is stored in the database.

FIG. 2 shows a suitable CAE database environment with multiple local area networks 22 connected by gateway sites 24. Each local area network connects a number of machines at sites 26. Project teams of engineers spread over the network access the system through a workstation in their office or through a larger machine.

The communication medium for each local area network (LAN) is an Ethernet 28. ("Ethernet" is a registered trademark of Xerox Corporation.) An Ethernet provides broadcast communication among the sites connected to it. Messages are received in the order they are sent and the non-deliverability of a message is reported to the sender. Each site has a network controller 30 which receives messages broadcast over the Ethernet medium. In order to maximize the number of sites connected to a single Ethernet 28, some controllers are connected to a splitter 32 that passes incoming messages from the Ethernet on to all attached controllers 30. When a site 26 sends a message, the message is sent by the controller 30 through the splitter 32 and out onto the Ethernet 28. Every network controller 30 must be connected to a splitter 32 or an Ethernet tap 34.

Each Ethernet tap 34, each splitter 32, each network controller 30, and each site 26 is a possible failure point in the environment. A "site failure" is a site crash followed, at some time in the future, by a restart of that site. A failed site may also have experienced a media failure due to the site crash. Permanent removal of a site is also possible.

If a controller 30 fails, the site 26 connected to that controller cannot communicate with the remainder of the network. This single site forms one physical partition and the remaining sites form another. The sites that continue to communicate are unable to determine if the failed site has crashed or has been partitioned by a communication failure.

If a splitter 32 or a tap 34 fails, each site connected to the Ethernet 28 by that tap or splitter becomes a single site partition and the remaining sites form another physical partition. It is important to note that a splitter generally has no local loop-back capabilities. Two sites attached to the same splitter can communicate with each other only by sending a message out over the entire Ethernet proper. Thus, if a splitter 32 fails, each site connected to the Ethernet through the splitter forms a single isolated partition.

A site that connects multiple local area networks is called a gateway site 24. If a gateway site 24 fails, two partitions are formed. We call such partitions "probable partitions". An important characteristic of a probable partition is that the maximum set of sites that will be members of the partition is already known.

FEDERATIONS

In order for users to interact with each other and to share their data a global mechanism is provided for organizing users and data in a meaningful way. A "federation" is such a mechanism. A federation is a loosely associated collection of sites, users, and the data they share. Formally, a federation is defined by an ordered triple <O, S, U>, where O is the set of objects in one scheme defined by the data model, S is the set of sites that belong to the federation, and U is the set of users who belong to the federation. Hereinbelow, the term "user" and the term "client" shall be used interchangeably and will refer to the upper layers of the database system which act as clients to the storage system.

If one user at a site is a member of a federation, that site is also a member of that federation. A user may be a member in any number of federations. Membership in a federation is associated with permission to access the data stored in that federation. A user may be an associate member or a participating member. A "participating" member may modify the database by creating or updating entities in the object set. Each entity created is owned by the creating user. Associate members can only read the objects stored by the federation. Sites also hold participant or associate status. If any user at a site is a participating member of a federation, that site is a participating member of that federation.

All sites belonging to a federation maintain fully redundant direction information about the sites, users, and data objects in that federation. Participating sites maintain copies of the data extensions for the set of objects. For reliability, multiple sites may store full copies of the data extensions. Each federation has an associated name that may or may not be unique, and a federation is named when it is defined by one user at one site. There is no meta-federation name server to approve federation names, hence users at other sites may define distinct federations having identical names. By way of example, FIG. 3 shows sites A, B, C, D, and E connected by a local area network 35 and grouped into the federations named Big, Little, and Little.

FEDERATION MANAGEMENT

User initiated actions and environmental system failures may at any moment cause changes in the active structure of a federation. The user initiated operations which affect the set of sites and the set of users in a federation are "define", "enroll", and "secede". A user may define a new federation on the local site, enroll in a federation which exists on another site, or secede from a federation in which the user is currently a member. The execution of each of these operations may be affected by site crashes and network partitions.

"Define" is a user initiated operation which creates a new federation. The initiating user must give a name to the federation being created. This proposed federation name must be a unique federation name on the local site. If the local site already belongs to a federation of the specified name, the new federation cannot be defined and the user is informed. If the federation name is unique, a new scheme <O, S, U> is created. The site set contains the local site, and the user set contains the initiating user. The local site becomes the designed name server site for the new federation. The user and the local site are automatically considered to be participating members in the new federation. The object set is initially empty. Optionally the defining user may specify a default replication factor for the object set of the new federation.

The "enroll+ operation allows a user to join an existing federation. If the site local to that user does not currently belong to this federation, the site is enrolled also. The user initiating the enroll must give the name of the federation which is to be joined, the name of a site known to belong to that federation, and the participation mode the user wishes to enroll under. It is necessary for the upper layers to provide the name of a remote site in an enroll request due to the possibility of duplicate federation names. In such case, the distributed nature of the storage system is seen by the upper layers of the system. FIG. 4 shows the federations Big, Little, and Little of FIG. 3 with site C enrolling in the federation named Little that site A belongs to. A successful execution of the enroll operation will result in the initiating user being added to that federation's set of users. If the local site was not a member of the federation at the time of the enrollment, the local site will be added to that federation's set of sites. Since the local site will not have a directory for federations to which it does not belong, a copy of the directory for this federation is obtained.

An enroll operation may fail for many reasons. It is possible that the local site and the specified site already belong to different federations of the same name. In this case, it is impossible for the user to enroll in the desired federation. The enroll operation will also fail if the specified site can not be reached due to nonexistence of the specified site, or due to a site failure, or a network failure. In such a case, the user may try specifying another site, if another site is known to belong to the desired federation.

The "secede" operation is used when a user wishes to withdraw from a federation. A user may secede from a federation only if he is no longer the owner of any entities of the object set. If all of the local users of a federation secede from that federation the site also secedes from that federation. The seceding site is obliged to retain all redundant copies of object data and sufficient directory information to access these copies until they have migrated or have been archived on a remote dedicated device.

STRUCTURE OF THE STORAGE MODEL

Ecklund and Price proposed a single site storage model for the object set of a federation in "Multiple Version Management of Hypothetical Databases" published 1985 in Proceedings of The Nineteenth Hawaii International Conference on System Sciences, which paper is incorporated herein by reference. The storage model reflects all of the features critical to data management in any team design environment: support for a high degree of data sharing; mutual consistency of sets of data objects; storage and retrieval of temporal versions of data objects; and creation; storage, and retrieval of alternate versions of data objects. Ecklund et al. proposed a distributed hypothetical storage server (DHSS) as an extension of the single site storage model in "Federations: Scheme Management in Locally Distributed Databases", published 1985 in Proceedings of The Nineteenth Hawaii International Conference on System Sciences, which paper is incorporated herein by reference. The present invention is embodied as an improvement to the DHSS, the improved distributed hypothetical storage server hereinafter referred to as "IDHSS".

IDHSS tracks the evolution of a design, storing and maintaining access to temporal and hypothetical versions of each design. The use of hypothetical versions allows a designer to investigate alternate but related designs. Each alternate version is related to either the principal version of a design or to an alternate version of the same design. This means that at the configuration management layer, each entity maps to a design component and at the storage layer that design is represented by IDHSS as a "tree" of instantiations. Such a tree is called a "IDHSS object". The root of the tree is the first instantiation (first version) of the design to be stored in the system. One path in the tree is denoted as the principal path on instantiations along this path correspond to each recorded update of the principal design (i.e., successive temporal versions of the principal design). Every other path in the tree corresponds to an alternate design; the instantiations along such an alternate path are the temporal versions of the alternate design. For each path in the tree there is a distinguished node which corresponds to the "current" version (with respect to time) in that path. FIG. 5 shows a IDHSS object containing four version paths, the principal path 36 and three alternate paths 38. Current versions of each path are labeled with a "C".

The IDHSS data base consists of a collection of IDHSS objects.

OBJECT NAMING

IDHSS supports a name mapping facility to associate an object name with a version of an object. When a IDHSS object is created by a client, that client must provide a unique name to be associated with the object. Each federation has a designed name server site. The function of the name server is to certify, within the federation, the uniqueness of each user-defined object name.

The storage server will map an object name first to a tree of instantiations and then to the particular representative in the tree that is the current version in the principal path. As alternate paths are created, each path is assigned a number called its "implicit alias". The implicit alias numbers are assigned within each tree in the order in which the alternate paths are created. Alternate paths may be named directly by the client or may be referenced by their object name and implicit alias. Non-current versions are referenced by appending a time specification to a valid name. The specified time being the time at which the node instance was the current instance. Names without a time modifier are called "floating references" as the mapping function will map them to the current instance of some path and which instance is current changes over time. Names with time modifiers are "fixed references". This naming scheme allows the client to specify any version in a tree of instantiations.

The following Table I presents the object base name references and the mapping performed by the storage server:

TABLE I ______________________________________ A valid object base name is of the following form: "Entity --Name"