WikiPatents - Community Patent Review
Distributed multiprocess transaction processing system and method
United States Patent 4819159
Link to this page: http://www.wikipatents.com/4819159.html
Inventor(s): Shipley; Dale L. (Los Gatos, CA); Arnett; Joan D. (San Jose, CA); Arnett; William A. (San Jose, CA); Baumel; Steven D. (Sunnyvale, CA); Bhavnani; Anil (Campbell, CA); Chou; Chuenpu J. (Sunnyvale, CA); Nelson; David L. (Santa Clara, CA); Soha; Maty (Cupertino, CA); Yamada; David H. (San Jose, CA)
Abstract: The method and means of fault-tolerant processing includes a plurality of system building blocks, each including a real-time processor and specialized processors and local non-volatile memory that are coupled to communicate internally within each of the system building blocks, which, in turn, communicate with one another over local-area network links, and communicate with the remainder of the system over an I/O bus controlled by an I/O processor. Transaction-based processing is under control of a transaction coordinator which permits all of the transaction operations to complete successfully and then alter stored data for the completed transaction, or not to alter any stored data if a transaction is not completed. The transaction coordinator maintains a record of the distributed file accesses required during processing of a transaction, and prevents other transactions from altering stored data during processing of a transaction.
   














Owner/Assignee     Tolerant Systems, Inc. (San Jose, CA)
Publication Date     April 4, 1989
Application Number     06/902,191
Filing Date     August 29, 1986
US Classification     714/19 714/15
Int'l Classification     G06F 015/00
Examiner     Shaw; Gareth D.
Assistant Examiner     Fairbanks; Jonathan C.
Attorney/Law Firm     Smith; A. C.
USPTO Field of Search     364/200 MS File 364/900 MS File 364/131 364/184 364/186 364/187 371/11 371/12 371/13 371/25 371/67 371/68
Patent Tags     distributed multiprocess transaction processing
   

References

U.S. References

Patent No.     Inventor     US Class     Date
4751702     Beier     714/13     Jun 1988
4628508     Sager     714/13     Dec 1986
4577272     Ballew     714/15     Mar 1986
4480304     Carr     710/200     Oct 1984
4358823     McDonald     714/11     Nov 1982
4245306     Besemer     709/245     Jan 1981
4228496     Katzman     710/100     Oct 1980
4141066     Keiles     700/81     Feb 1979
Claims
 


We claim:

1. A distributed processing system comprising: a plurality of processing units,

a plurality of interprocessor communications links each extending from each processing unit to each other processing unit,

each processing unit including a plurality of interprocessor communications links, a real time processor, an applications program processor, a local memory and at least one I/O processor each connected by a bus internal to the unit and separate from an I/O channel, wherein the real time processor controls access to the internal bus by the remaining portions of the processing unit,

a plurality of I/O channels, one connected to each I/O processor included within each processing unit, and

a plurality of communications processors having at least two ports, each port being connected to an I/O channel associated with a different processing unit, and adapted to connect to a plurality of I/O devices,

a plurality of disk controllers connected to each I/O channel, a plurality of disks each having at least two ports and configured to have a first port connected to a disk controller associated with an I/O channel connected to a first processing unit and a second port connected to a disk controller associated with an I/O channel connected to a second processing unit, and further configured to be addressable by only one of said disk controllers at a time,

each processor being associated with at least a key one of said disks which said processor is capable of addressing and each such key disk being configured to store a log containing transaction information associated with the associated processing unit, and

at least one of said processing units being capable of recognizing the failure of another processing unit, examining the transaction information contained in the log stored on the associated disk, and establishing control of that associated disk.

2. A method for fault tolerant transaction processing comprising the steps of

establishing an association between each of a plurality of processing units and at least one non-volatile storage media having at least two ports, whereby only one processing unit is allowed to address the associated at least one non-volatile storage media so long as the processing unit continues to operate and does not relinquish control of the at least one non-volatile storage media,

storing information concerning status of transactions coordinated by each processing unit in a log on the associated at least one non-volatile storage media,

interconnecting the plurality of processing units through at least one interprocessor communications link,

monitoring signals transmitted from time to time from each processing unit to determine whether any of the processing units has failed,

permitting a second processing unit to establish communications with the at least one non-volatile storage media formerly associated with a failed processing unit and to examine the transaction information for the transactions formerly coordinated by the failed processing unit, and

causing the second processing unit to complete the transactions formerly coordinated by the failed processing unit by committing or aborting those transactions.

3. A method for fault tolerant transaction processing in a processing system having a plurality of processing units, non-volatile memory means for storing a log therein for each processing unit, and interprocessor communications links connecting each of the processing units, the method comprising the steps of

creating a log in the non-volatile memory means for each of the processing units;

establishing a transaction coordinator in one of the processing units for each transaction processed by the system,

recording information developed by the associated transaction coordinator about the associated transaction in the log,

detecting the failure of the processing unit wherein the transaction coordinator resides,

causing a second processing unit to establish communications with the log associated with the failed processing unit,

scanning the log to determine entries which are potentially inconsistent with information stored elsewhere in the system,

interrogating other processing units to identify inconsistencies between the log and information stored elsewhere in the system, and

committing and aborting transactions in accordance with the actual state of those transactions as necessary to cause the log and the information stored elsewhere to be consistent.

4. Fault tolerant transaction processing apparatus comprising:

means coupling each of a plurality of processing units and one non-volatile storage media associated therewith to enable only one processing unit to address the associated at least one non-volatile storage media so long as the processing unit associated therewith continues to operate and does not relinquish control of such at least one non-volatile storage media,

each of said non-volatile storage media storing thereon information concerning status of transactions coordinated by the associated processing unit,

means including an interprocessor communication link interconnecting the plurality of processing units,

means responsive to signals transmitted from each processing unit to determine whether any of the processing units has failed for enabling a second of the plurality of processing units to establish communications with the at least one non-volatile storage media formerly associated with a failed one of the plurality of processing units, and to examine the information stored on such formerly-associated non-volatile storage media concerning status of transactions formerly coordinated by the failed one of the plurality of processing units, and

said second processing unit thereby being enabled to coordinate the transactions formerly coordinated by the failed one of said plurality of processing units for completing or aborting those transactions.

5. Fault tolerant transaction processing apparatus including a plurality of processing units and a plurality of non-volatile memory means and including interprocessor communications links coupling each of the plurality of processors, the apparatus comprising:

means operatively coupling each of the processing units to an associated non-volatile memory means for storing thereon a log for the processing unit associated therewith;

at least one of the plurality of processing units including transaction coordinating means for each processed transaction;

means for recording in the log of a non-volatile memory information developed by the associated transaction coordinating means about the processed transactions;

means for detecting the failure of a processing unit which includes the transaction coordinating means, and for enabling a second one of the plurality of processing units to establish communications with the non-volatile memory means having thereon the log associated with the failed processing unit;

a file system for storing information about the actual states of the processed transactions;

means for scanning the log associated with a processing unit to determine entries of information which are inconsistent with information stored in the file system; and

means for interrogating other processing units to identify inconsistencies between the logs associated therewith and information stored in said file system for enabling a processing unit to commit or abort transactions in accordance with the actual state of those transactions as determined by information in the associated log being consistent with information stored in the file system.

6. A method for coordinating fault tolerant transaction processing within a system having a plurality of processing units, an operating system, and a plurality of transaction coordination logs stored within non-volatile memory means, and including interprocessor communications links connecting each of the processing units, the method comprising the steps of:

creating a transaction manager within the operating system for each transaction coordination log;

establishing a transaction manager for each transaction processed by the system;

recording information developed by the transaction manager about a transaction in a transaction coordination log;

detecting a failure of the processing unit on which a transaction manager is executing during a transaction;

creating a backup transaction manager within the operating system for another processing unit;

establishing communications between the backup transaction manager and the transaction coordination log associated with the failed transaction manager;

scanning the transaction coordination log to determine entries which are potentially inconsistent with information stored elsewhere in the system;

interrogating other processing units to identify inconsistencies between the transaction coordination log and information stored elsewhere in the system, and

committing and aborting transactions in accordance with the actual state of those transactions as necessary to cause the transaction coordination log and the information stored elsewhere to be consistent.
Description
 


FIELD OF THE INVENTION

The present application relates to multiprocessing computer systems, and particularly relates to distributed fault tolerant on-line transaction processing computer systems.

BACKGROUND OF THE INVENTION

Multiprocessing systems have been known for some time. Various types of multiprocessing systems exist, including parallel processing systems and a variety of forms of computing systems designed for on-line transaction processing.

On-line transaction processing is generally contrasted with batch processing and real time processing. Batch processing involves queueing up a plurality of jobs with each job serially begun after completion of the prior job and completed prior to beginning the next job, with virtually no interaction with the user during processing. If access to a data base was required, the data base was loaded and unloaded with the job. The elapsed time between placing the job in the queue and receiving a response could vary widely, but in most instances took more than a few minutes so that a user could not reasonably input the job and wait for a response without doing intervening work. Until the late 1970's most commercial computer system architectures were intended primarily for batch processing. Batch processing systems have found particular application in scientific applications.

Real time processing systems represent a small share of the commercial market, and are used primarily in manufacturing applications where a stimulus or request must be acted on extremely quickly, such as in milliseconds. Typical applications for real time processing systems involve process control for monitoring and controlling highly automated chemical or manufacturing processes.

On-line transaction processing systems, on the other hand, frequently involve large databases and far greater interaction with a plurality of individuals, each typically operating a terminal and each using the system to perform some function, such as updating the database, as part of a larger task and requiring a predictable response within an acceptable time. On-line transaction processing systems typically involve large data bases, large volumes of daily on-line updates, and extensive terminal handling facilities. Frequently in on-line transaction processing systems only the current version of a database will be contained within the system, without paper backup.

Computer system architectures designed specifically for on-line transaction processing were introduced in the late 1970's, although more conventional batch systems are frequently offered in non-batch configurations for use in on-line transaction processing. Over time, on-line transaction processing has come to impose several requirements on the processing system. Those requirements include substantially continuous availability of the system, expandability (usually in a modular form), data integrity even in the event of a component failure, and ease of use.

The requirements for substantially continuous availability of the system and data integrity, taken together, are generally referred to as "fault tolerance". A commercially acceptable on-line transaction processing system must therefore offer, as one of its attributes, fault tolerance. However, the term fault tolerance may still be the subject of confusion since it can apply to both hardware and software, hardware only, or software only; in addition, fault tolerance can mean tolerance to only one component failure, or to multiple component failures. In the current state of the art, fault tolerance is generally taken to mean the ability to survive the failure of a single hardware component, or "single hardware fault tolerance".

It may be readily appreciated that fault tolerance could not be provided in a single processor system, since failure of the processor would equate to failure of the whole system. As a result, fault tolerant systems involve multiple processors. However, not all fault tolerant systems need be suited to on-line transaction processing.

Fault tolerant multiprocessor systems range from so-called "cold", "warm" and "hot" backup systems to distributed, concurrent on-line transaction processing systems such as described in U.S. Pat. No. 4,228,496. Cold, warm and hot backup systems are used primarily with batch processing systems, and involve having a primary computer performing the desired tasks with a second computer at varying stages of utilization. When the primary computer fails, the system operator performs a varying range of steps and transfers the task formerly performed on the failed primary system onto the substantially idle backup system. This form of fault tolerant design was usually prohibitively expensive, offered little protection against data corruption, and presented generally unacceptable delays for on-line use.

Fault tolerant distributed processing systems have included systems using a lock-stepped redundant hardware approach initially developed for military and aerospace applications and currently marketed, in a somewhat modified form, by Stratus Computer, as well as those using a combination of hardware and software to achieve fault tolerance, such as described in the afore-mentioned '496 patent. Another approach using a combination of hardware and software to achieve fault tolerance was formerly marketed by Synapse Computer, and involved providing a single additional processor as a hot backup for all other processors in the multiprocessor system.

The redundant hardware approach suffers from a number of limitations, including particularly difficulties in maintaining the requisite tightly coupled relationship between the various system elements, and limitations in software development and flexibility.

While the system described in U.S. Pat. No. 4,228,496 provided many improvements in the field of distributed fault tolerant computing, that system also suffers from limitations relating to the overhead required for handling of transaction-based operations. With regard to the overhead required for handling transactions, the system described in the '496 patent appears to require continued communications between primary and backup processors to ensure that the status of the transaction at key stages, called checkpoints, is communicated from the primary to the backup processor. This relatively continuous checkpointing imposes an undesirable overhead requirement. Moreover, depending upon the application being run by the system, the overhead requirement can become an extreme burden on the system.

The system described in the '496 patent also suffers from the limitation of requiring applications programs to be compatible with or written for specially developed software. Such specially developed software in many instances requires programmers to learn new programming languages and unnecessarily limits the ease with which applications can be developed for or ported to the system. It has become well recognized that one of the major stumbling blocks to use of more efficient systems for transaction processing has been the cost of rewriting the customer's application programs for use on a fault tolerant transaction processing system, and these costs are greatly magnified when learning of an entirely new language is required.

As a result, there has been a need for a distributed multiprocessing system capable of fault tolerant operation with simplified handling of transaction based operations.

Thus, there has also been a need for a loosely coupled distributed multiprocessing system capable of fault tolerant operation using conventional operating systems.

SUMMARY OF THE INVENTION

The present invention substantially resolves many of the aforementioned limitations of the prior art by providing a distributed, multiprocess on-line transaction processing system which employs multiple concurrent processors communicating by conventional LAN links and based on the UNIX operating system modified for multiprocessor operation.

Fault tolerance is provided by the distributed processing architecture of both the hardware and the software, including multiported disks and related devices, unique and moveable message queues, distributed system rendezvous, extent based file allocation, and kernel based transaction processing, among others which will be more greatly appreciated from the detailed description provided hereinafter.

The hardware architecture of the current system is based on the National 32000 chip set, and utilizes a plurality of system building blocks (SBBs) each comprising a real time processor, a user processor, an I/O processor and a system interconnect board (SIB) and local memory. The specialized processors, the SIB and the memory communicate internally within the SBB by means of an internal mainframe bus.

The SBBs communicate with one another over LAN links such as Ethernet, and communicate with the remainder of the system over the I/O bus controlled by the I/O processor.

The system is transaction based, so that each transaction is treated atomically and requires no unusual management or overhead. Checkpointing is eliminated.

For purposes of the present invention, a transaction is defined as a sequence of operations that execute atomically, such that either all of the operations execute successfully, or none of the operations are permitted to alter the stored data. The atomicity of transactions is ensured by establishing a transaction coordinator, which maintains a record, or log, of the distributed file accesses required during processing of the transaction, combined with file and block level locks to prevent other transactions from altering the data at inappropriate times during processing.

In most cases, the transaction completes, at which time the files or blocks read or updated by the transaction are released to be used by other transactions. If a transaction aborts, the transaction coordinator causes all data files that the transaction changed to revert to the state they were in at the time the transaction began. During processing of a transaction, a consistent view of all required files is maintained; that is, no required file may be changed by any other activity in the system until the transaction has either completed or aborted. In the event the SBB having the transaction coordinator fails, the coordinator migrates to another SBB, restores a consistent view of the data by returning the data to its state prior to the beginning of the last transaction, and notifies the process' signal handler with a SIGABORT signal and code. The process may then restart the process if desired. Restarting of the coordinator occurs automatically. In this manner continuous availability of the system and the data is provided.
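The all-or-nothing behavior described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and method names are hypothetical, only file-level write locks are modeled (the text also locks files that are read, and supports block-level locks), and the log is held in memory rather than on a disk.

```python
_MISSING = object()  # marks "file did not exist before the transaction"


class TransactionAborted(Exception):
    """Raised when a write conflicts with another transaction's lock."""


class TransactionCoordinator:
    """Toy coordinator: keeps a log of before-images and file-level locks."""

    def __init__(self, files):
        self.files = files  # shared store: file name -> contents
        self.locks = {}     # file name -> id of the transaction holding the lock
        self.log = {}       # transaction id -> {file name: before-image}

    def begin(self, txid):
        self.log[txid] = {}

    def write(self, txid, name, value):
        holder = self.locks.setdefault(name, txid)
        if holder != txid:
            raise TransactionAborted(f"{name} is locked by {holder}")
        # Record the before-image only the first time the file is touched.
        self.log[txid].setdefault(name, self.files.get(name, _MISSING))
        self.files[name] = value

    def _release(self, txid):
        for name in [n for n, t in self.locks.items() if t == txid]:
            del self.locks[name]
        del self.log[txid]

    def commit(self, txid):
        # Changes are already in place; releasing the locks publishes them.
        self._release(txid)

    def abort(self, txid):
        # Revert every changed file to its state when the transaction began.
        for name, before in self.log[txid].items():
            if before is _MISSING:
                self.files.pop(name, None)
            else:
                self.files[name] = before
        self._release(txid)
```

In this sketch, aborting a transaction that changed one file and created another restores the first and removes the second, and a competing transaction can write to those files only after the locks are released.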

Because no unusual languages are required, complete rewriting of user programs is not required, providing significantly improved portability of applications to the fault tolerant environment. Improved throughput is provided by, among other things, the use of interprocess communications channels which permit I/O operations to be localized to the SBB associated with the disk owning the data, independently of the location of the requesting SBB, rather than requiring each I/O to be managed from the requesting SBB.
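The localization of I/O to the SBB that owns the disk can be sketched as follows. This is an illustrative model only: the `SBB` and `Cluster` classes are hypothetical, and a direct method call stands in for the interprocess communications channel.

```python
class SBB:
    """Toy system building block that performs I/O only on disks it owns."""

    def __init__(self, name):
        self.name = name
        self.disks = {}   # disk name -> {block number: data}
        self.served = 0   # number of I/O operations performed locally

    def local_read(self, disk, block):
        self.served += 1
        return self.disks[disk][block]


class Cluster:
    """Routes each request to the owning SBB, wherever the requester runs."""

    def __init__(self, sbbs):
        self.sbbs = {s.name: s for s in sbbs}

    def owner_of(self, disk):
        for sbb in self.sbbs.values():
            if disk in sbb.disks:
                return sbb
        raise KeyError(disk)

    def read(self, requester, disk, block):
        # The requester does not drive the disk itself; the request is
        # forwarded to the owner, which performs the I/O and returns the data.
        return self.owner_of(disk).local_read(disk, block)
```

A read issued from one SBB against a disk owned by another is counted against the owner's `served` total, which is the point of the design: the I/O work stays with the SBB attached to the data.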

It is therefore one object of the present invention to provide an improved multiprocessor system.

It is another object of the present invention to provide a concurrent distributed multiprocessing system which is transaction based.

It is yet another object of the present invention to provide a concurrent distributed multiprocessing system which is fault tolerant.

It is still another object of the present invention to provide a multiprocessing system which uses a conventional and readily transportable operating system such as UNIX.

It is a further object of the present invention to provide a distributed, fault tolerant multiprocessing system in which interprocessor communications are managed as portions of a local area network, such as through Ethernet links.

It is a still further object of the present invention to provide a multiprocessor system which can be automatically and dynamically balanced.

It is yet a further object of the present invention to provide a fault tolerant multiprocessor system in which message queues related to a logical data volume can be moved and reopened to provide access to the data in the event of a processing failure.

These and other objects of the present invention can be better appreciated from the following detailed description of the present invention, when taken in light of the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic block diagram showing in functional block diagram form a two processor (SBB0 and SBB1) arrangement of the multiprocessor system according to the present invention;

FIG. 2 shows in schematic block diagram form the elements of a System Building Block, hereinafter referred to as an SBB;

FIG. 3 shows in schematic block diagram form the Real Time Processor Unit (RPU), of the SBB;

FIGS. 4 and 4a, 4b, 4c, 4d show in schematic block diagram form the User Processing Unit (UPU) of the SBB;

FIGS. 5a-5b show in schematic block diagram form the I/O processor (IOP) of the SBB, and the flow diagram for the associated control program, respectively;

FIGS. 6 and 6a, 6b show in schematic block diagram form the System Interconnect Board (SIB) of the SBB;

FIGS. 7a-7b show in schematic block diagram form the main memory of the SBB, and the partitioning of the main memory by the operating system, respectively;

FIGS. 8a, 8b, 8c, 8d show four types of read operations in accordance with the present invention; and

FIGS. 9a, 9b, 9c conceptually depict three types of interprocess communications channels.

DETAILED DESCRIPTION OF THE INVENTION

Referring first to FIG. 1, the general hardware architecture of the system of the present invention can be better appreciated. FIG. 1 illustrates in block diagram form a two processor module embodiment having first and second processor modules, or SBBs, 10 and 20. The SBBs 10 and 20 are interconnected by dual local area network (LAN) links 30 and 40, which may for example be Ethernet links, fiber optics cables, or other suitable communications channels.

Each of the SBBs 10 and 20 communicates through I/O channels or buses 50 and 60, respectively, with various forms of peripheral controllers, including tape controllers 70, disk controllers (DC) 80, and communications interface processors, or communications interfaces, 90. Each I/O channel is terminated with a channel terminator 100. As will be better appreciated from FIG. 2, a single SBB may have either one or two I/O channels. The SBBs 10 and 20 and the various controllers 70, 80 and 90 are each provided with a local power supply (ps).

The tape controllers, which may be single ported, are associated with at least one tape drive 110. Likewise, the disk controllers 80 are each single ported, although the associated disk drives 120, 130, 140 and 150 are preferably dual ported and accessible from two SBBs. Each disk is "owned" by only a single SBB, and that disk may only be addressed by that SBB for so long as the SBB retains ownership. Ownership is maintained by the owner SBB regularly updating a timestamp in a status sector of the disk, which is also regularly checked by the volume manager (associated with the operating system, as discussed hereinafter) of the non-owning SBB. If the owner SBB fails, the volume manager of a non-owning SBB will observe the outdated timestamp and assert ownership of the disk.
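The timestamp-based ownership scheme just described can be sketched in Python as follows. The names (`StatusSector`, `VolumeManager`, `stale_after`) and the staleness threshold are assumptions made for illustration; the text does not specify how old a timestamp must be before it counts as outdated, nor how the status sector is laid out. Time is passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass


@dataclass
class StatusSector:
    """Stand-in for the status sector of a dual-ported disk."""
    owner: str
    timestamp: float


class VolumeManager:
    """Toy volume manager running on one SBB."""

    def __init__(self, sbb_id, stale_after):
        self.sbb_id = sbb_id
        self.stale_after = stale_after  # assumed threshold, in seconds

    def heartbeat(self, sector, now):
        """The owner refreshes the timestamp; a non-owner leaves it alone."""
        if sector.owner == self.sbb_id:
            sector.timestamp = now

    def check(self, sector, now):
        """A non-owner asserts ownership if the timestamp is outdated.

        Returns True when ownership was taken over.
        """
        if sector.owner != self.sbb_id and now - sector.timestamp > self.stale_after:
            sector.owner = self.sbb_id
            sector.timestamp = now
            return True
        return False
```

While the owner keeps calling `heartbeat`, `check` on the other SBB returns False; once the updates stop for longer than `stale_after`, the next `check` transfers ownership. A real system would also have to make the read-and-update of the status sector safe against two SBBs acting at once, which this sketch does not attempt.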

Typically, timestamps are applied to disks periodically and whenever an action occurs which could affect data consistency. As will be understood by those skilled in the art, the disks may be operated in a mirrored configuration, or may be operated without a mirroring backup. Additional disks may be added to the system during operation.

Each of the two communications interfaces 90 shown in FIG. 1 in turn communicates with a System Distribution Board, or SDB 160, to which one or more terminals 170 may be connected. Printers and other similar output devices may be connected to the communications interface 90. Additional communications interfaces 90 may be added to the system when the system is operating by providing updated configuration information to the system and initializing the communications interface.

It will be appreciated that, while FIG. 1 shows a two SBB embodiment, the configuration of FIG. 1 could be expanded to permit communication between a number of SBBs limited only by the characteristics of the LAN links 30 and 40. In such an arrangement, the additional SBBs may be added to the system by simply identifying the additional SBBs to the SIBs in the system, extending the LAN links 30 and 40 to the additional SBBs, and by adding appropriate disk controllers and disks to permit storage of and access to the necessary files. The additional SBBs may be identified to the SIBs prior to system start-up, or may be identified to the SIBs serially while the system is operating. In addition, failed SBBs can be removed from the system and restarted without taking the system down, providing a simple and effective form of modular expandability.

In such an arrangement, the disk controllers will preferably be shared, in at least some instances, between the SBBs 10 and 20 and the additional SBBs to provide balanced loading and better access in the event of failure. In addition, while FIG. 1 describes an implementation using dual Ethernet cables, other forms of local area networking such as fiber optics will also provide suitable, and in some cases improved, performance.

Referring now to FIG. 2, the configuration of the SBBs 10 and 20 can be better appreciated. Each SBB is comprised of a plurality of special purpose processors and related logic which communicate through a Mainframe Bus, or MFB 200. In one embodiment, the MFB 200 is a 64 bit data bus.

The special purpose processors and related logic which comprise the SBB 10 include a Real Time Processor Unit (RPU) 210, a User Processor Unit (UPU) 220, I/O Processors (IOP) 230a and 230b, a System Interconnect Board (SIB) 240, and Memory (MEM) 250. As can be seen from FIG. 2, the RPU 210 provides the primary control for access to the MFB 200, and thus communicates with each other module which forms the SBB by means of one or more lines. Although these lines are shown separately from the MFB 200, in fact each of the lines between the modules which form the SBB is part of the MFB 200, and is shown separately only for purposes of clarity. The lines included in the MFB 200, but not separately shown in FIG. 2, are address lines (0-31), address parity lines (0-2), data lines (0-63), data parity (0-7), data byte valid (0-7), and a data byte valid parity bit. It will also be appreciated, both in FIG. 2 and throughout the remaining FIGS. discussed herein, that numerous detailed control signals have been eliminated to avoid obscuring the present invention in details which will be apparent to those of ordinary skill in the art, given the teachings herein.
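
The line groups enumerated above can be tallied, and the data parity lines illustrated, with a short sketch. The line counts come from the ranges given in the text. The mapping of one parity bit to each data byte, and the use of even parity, are assumptions made for illustration only; the specification does not state the parity sense or the grouping, and the function names are hypothetical.

```python
# Line groups of the MFB 200 not separately shown in FIG. 2, with widths
# taken from the ranges given in the text.
MFB_LINE_GROUPS = {
    "address": 32,                 # address lines 0-31
    "address_parity": 3,           # address parity lines 0-2
    "data": 64,                    # data lines 0-63
    "data_parity": 8,              # data parity 0-7
    "data_byte_valid": 8,          # data byte valid 0-7
    "data_byte_valid_parity": 1,   # single parity bit
}


def total_lines() -> int:
    """Number of lines across the groups listed above."""
    return sum(MFB_LINE_GROUPS.values())


def data_parity_bits(word: int) -> int:
    """Return 8 parity bits for a 64-bit data word.

    Assumption: parity bit i covers data byte i, with even parity
    (the bit is 1 when the byte holds an odd number of ones).
    """
    if not 0 <= word < (1 << 64):
        raise ValueError("word must fit in 64 bits")
    parity = 0
    for lane in range(8):
        byte = (word >> (8 * lane)) & 0xFF
        parity |= (bin(byte).count("1") & 1) << lane
    return parity
```

Under these assumptions the listed groups account for 116 lines, and a receiver would recompute `data_parity_bits` on the incoming word and compare the result against the parity lines to detect a single-bit error within any byte.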

The RPU 210 communicates with each of the other modules in the SBB by means of some lines which go to each of the other modules, and other lines which run uniquely to one of the other modules. Lines which run to each of the other modules include bidirectional ready, abort and busy lines 254a-c, two bidirectional lines 256a-b for sending address and data strobes and write signals, and a reset line 258 extending from the RPU 210 to each of the other modules. The RPU 210 is the first module in the SBB to initialize on system reset, and in turn sets each of the other SBB modules to a known state.

Lines which run uniquely from the RPU 210 to the remaining specialized processor modules, the UPU 220, the IOPs 230a-b and the SIB 240, include interrupt request lines 260a-c from the other modules to the RPU 210, bus request lines 262a-d from the other modules to the RPU 210, and bus grant lines 264a-d from the RPU 210 to the other modules. It should be noted that both IOPs share a single interrupt line 260b, with the operating system left to determine which IOP has actually made the request. Also, each IOP 230 has a unique bus request line 262 and bus grant line 264. The memory 250 communicates with the remainder of the system primarily over the MFB 200 as necessary, but may send an interrupt request directly to the RPU 210 over interrupt request line 266.
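
The sharing of interrupt line 260b by the two IOPs can be sketched as follows. The sketch assumes that each IOP exposes a pending flag which the operating system can read; the specification does not say how the operating system identifies the requesting IOP, so the polling step and both function names are hypothetical.

```python
from typing import Dict, List


def shared_interrupt_line(pending: Dict[str, bool]) -> bool:
    """State of the shared line as seen by the RPU: asserted if any IOP requests service."""
    return any(pending.values())


def find_requesting_iops(pending: Dict[str, bool]) -> List[str]:
    """Operating system step (assumed): poll each IOP to learn which one made the request."""
    return [name for name, flag in sorted(pending.items()) if flag]
```

Because the line itself carries no identity, the second step is needed whenever the line is asserted, and it may report both IOPs if they request service at the same time.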

As will be appreciated from the discussion hereinafter, the UPU 220 maintains most frequent access to the MFB 200, and therefore the RPU 210 includes additional lines to better facilitate the frequent communications which occur between the RPU 210 and the UPU 220. These additional lines include a non maskable interrupt request line 268 from the UPU 220 to the RPU 210, an interrupt request line 270 from the RPU 210 to the UPU 220, a bus clear line 272 from the RPU 210 to the UPU 220, and a reset UPU line 274 from the RPU 210 and separate from the general reset line 258.

As will be better appreciated hereinafter, the purpose of the UPU 220 is, in general, to execute user level software in the SBB, and the UPU includes cache storage for software executing on the UPU. The IOPs 230a-b generally control communication between the peripherals and the SBBs, including interrupts, arbitration and data transfer. Included in such functions is the processing of a channel program built by the RPU 210 in the MEM 250. As shown partially in FIG. 1, the I/O channels or buses 50 and 60 and associated controllers 70, 80 or 90 communicate with the SBB through the IOPs 230a-b. While two IOPs are shown in FIG. 2, the system may be configured to have one or more IOPs.

The SIB 240 controls communications between processors, including providing a dual ported Ethernet or other LAN interface to accommodate the dual Ethernet or other LAN cables 30 and 40. In one embodiment, the size of the memory 250 can range from 4 to 12 megabytes of solid state RAM of a generally conventional design.

Referring now to FIG. 3, the RPU 210 can be better appreciated. A microprocessor (MPU) 300, which may for example be a 32032 such as is commercially available from National Semiconductor (or in some embodiments a 32016 processor available from the same vendor), receives timing control signals from a timing control unit (TCU) 302. The TCU 302 may be a 32201 chip, also available from National.

The MPU 300 provides and receives address and data information on a main bus 304 in a conventional manner. The main bus 304 then communicates address information through an address latch 306, and provides and receives data through a pair of data buffers 308 and 310. The address latch 306 provides address information via a bidirectional address bus 312 to an EPROM array 314, an EEPROM array 316, local RAM 318, and an interrupt control circuit (CIO) 320, as well as one or more asynchronous communications controllers (ASCCs) 322a-b. In addition, the address bus 312 communicates with a real time clock (RTC) address buffer 326 and an MFB address buffer 328.

The data buffer 308 communicates data via a first bidirectional data bus 330 with the EPROM array 314 and the EEPROM array 316; it also communicates data, via a secondary data buffer 332, with the CIO 320 and the ASCCs 322a-b. In addition, the first data bus 330 communicates with an RTC data buffer 336, error status registers 338, a diagnostics data buffer, control registers 342, and an MFB data buffer 344. The data buffer 310 communicates with the RAM 318 via a second data bus 334.

The EPROM array 314, which may vary greatly in size but typically is on the order of 512K bytes, stores the firmware for the RPU. The EEPROM array 316, which typically is on the order of 2k bytes, stores identification for the SBB, system and network, and also stores diagnostic related information. The RAM 318, which is typically parity protected, may be on the order of one megabyte. The CIO 320, which may for example be an 8536 chip as commercially available from Zilog or AMD, receives a plurality of interrupt request inputs 346 in addition to connections to the bus 312 and buffer 332, and sends an interrupt signal to the MPU 300 by means of an interrupt output 348. The ASCCs 322a and 322b may be, for example, an 8531 chip available from Zilog and other manufacturers, or may be any of several other types; but typically will provide an output compatible with a desired communications protocol, such as RS232 or RS422 as examples.

The RTC address buffer 324 communicates with a real time clock (RTC) 350, which in turn communicates on the data side with RTC data buffer 336. The RTC 350 is provided with a battery back-up 352 to ensure continued operation during system outages. Lastly on the address bus side, the MFB Address Buffer 328 communicates bidirectionally with the MFB 200.

On the data bus side, the data buffer receives input from a switch bank, which permits the RPU to be forced to specific states during diagnostics. The status registers 338 receive an error strobe input 356 from error logic, which detects bus timeout, bad parity and bus abort signals from the remainder of the system. The control register 342 asserts control bits to be used on and off board. On board uses of the control bits may include a status display 358, which may for example comprise LEDs, indicator lights, or so on. Off board uses include fault and ready LED arrays. Lastly, the MFB data buffers 344 communicate data with the MFB 200.

In addition to the elements of the RPU previously described, the RPU also includes MFB arbitration logic 360, which may for example be a 68452 chip as available from Motorola; the arbitration logic 360 is otherwise not connected to the remainder of the RPU except to receive bus requests from the RPU and to provide bus grants as appropriate. The arbitration logic 360 receives bus requests from the elements of the SBB capable of making such a request, and also receives a bus busy signal. When appropriate in response to bus requests, the arbitration logic 360 provides a bus granted output on line 364 and a bus clear signal on line 366.
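
The behavior of the arbitration logic 360 described above (bus requests and a bus busy signal in, a bus grant out) can be sketched as a simple fixed-priority arbiter. The priority order shown is purely hypothetical, as are the module labels and the function name; the specification does not state how competing requests are ranked, and the bus clear signal on line 366 is not modeled.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical priority order, highest first. The specification does not
# give the ranking used by the arbitration logic 360.
PRIORITY: Sequence[str] = ("RPU", "UPU", "IOP230a", "IOP230b", "SIB")


def arbitrate(
    requests: Iterable[str],
    bus_busy: bool,
    priority: Sequence[str] = PRIORITY,
) -> Optional[str]:
    """Return the module to be granted the bus, or None if no grant is issued.

    No grant is issued while the bus is busy; otherwise the highest-ranked
    requesting module (under the assumed ordering) receives the grant.
    """
    if bus_busy:
        return None
    pending = set(requests)
    for module in priority:
        if module in pending:
            return module
    return None
```

A caller would evaluate `arbitrate` each time the request lines or the busy signal change, and assert the bus grant line of the returned module.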

Referring now to FIG. 4, the UPU 220 can be better appreciated. The UPU 220 comprises a CPU 400, which may for example be either a National 32032 processor or alternatively a National 32016 processor, together with a memory management unit (MMU) 402, and a floating point arithmetic unit (FPU) 404. The MMU 402 may be a National 32082 device, and the FPU 404 may be a National 32081. A timing control unit 406, also part of the National 32000 chip set, provides timing signals to the CPU 400, MMU 402 and FPU 404, each of which communicates via a bidirectional main bus 408. The main bus 408 also connects the various specialized processors 400-404 to address latches 410 and a bidirectional data buffer 412.

The output of the address latches 410 is provided on a latch address bus 414 to a variety of locations. The latches 410 provide address information to a control decoder 416, a parity enable mux 418, and a bus enable gating logic 420. In addition, the latch address bus 414 communicates data from the address latches 410 to MFB address buffers 422, error latches 424, CIO 426, and address parity generator/checker 428. The MFB address buffers 422 in turn supply address information to the MFB 200.

Further, the bus 414 supplies address data to a TAG RAM 430, a TAG data buffer 432, a TAG hit comparator 434, a TAG parity generator 436, flush counter decode logic 438, a flush counter 440, and a cache data parity RAM 442. Finally, the bus 414 supplies address data to the cache data RAM 444.

The bus enable gating logic 420 provides gating signals on an eight bit gating