WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
System for method for performing a context switch operation in a massively parallel computer system    
United States Patent6047122   
Link to this pagehttp://www.wikipatents.com/6047122.html
Inventor(s)Spiller; Cynthia J. (Wakefield, RI)
AbstractA parallel computer comprises a plurality of processing elements and a control processor all interconnected by a communications network. The control processor and the processing elements process a plurality of user programs, with the control processor and the processing elements processing each user program in parallel. The control processor, while processing each user program, generates user program processing commands and transfers them to the processing elements over the communications network, and each the processing element processes data associated with a particular user program in response to each user program processing command received from the communications network. The control processor periodically further generates a context switch command to enable the processing elements to, in parallel, perform a context switch operation, in which they switch processing of a user program they are currently processing and begin processing of another user program.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 6047122
System for method for performing a context switch operation in a

     massively parallel computer system - US Patent 6047122 Drawing
System for method for performing a context switch operation in a massively parallel computer system
Inventor     Spiller; Cynthia J. (Wakefield, RI)
Owner/Assignee     TM Patents, L.P. (New York, NY)
Patent assignment
All assignments
Publication Date     April 4, 2000
Application Number     08/916,136
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     August 21, 1997
US Classification    
Int'l Classification    
Examiner     Barry; Lance Leonard
Assistant Examiner    
Attorney/Law Firm     Cesari and McKenna, LLP
Address
Parent Case     This is a continuation of application Ser. No. 08/698,396 filed on Aug. 15, 1996 now abandoned, which is a continuation of application Ser. No. 08/249,654 filed on May 26, 1994 now abandoned, which is a continuation of application Ser. No. 07/880,125 filed on May 7, 1992 now abandoned.
Priority Data    
USPTO Field of Search    
Patent Tags     performing context switch operation a massively parallel computer
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5680550
Kuszmaul
712/11
Oct,1997

[0 after 0 votes]
5590283
Hillis
712/29
Dec,1996

[0 after 0 votes]
5530809
Douglas
709/250
Jun,1996

[0 after 0 votes]
5390298
Kuszmaul
712/22
Feb,1995

[0 after 0 votes]
5353412
Douglas
709/243
Oct,1994

[0 after 0 votes]
5265207
Zak
712/15
Nov,1993

[0 after 0 votes]
5212777
Gove
712/229
May,1993

[0 after 0 votes]
5197130
Chen
712/3
Mar,1993

[0 after 0 votes]
5179702
Spix
718/102
Jan,1993

[0 after 0 votes]
5170482
Shu
712/12
Dec,1992

[0 after 0 votes]
5152000
Hillis
712/11
Sep,1992

[0 after 0 votes]
5129077
Hillis
712/13
Jul,1992

[0 after 0 votes]
5123109
Hillis
712/22
Jun,1992

[0 after 0 votes]
5113390
Hayashi
370/352
May,1992

[0 after 0 votes]
4860201
Stolfo
712/11
Aug,1989

[0 after 0 votes]
4847755
Morrison
712/203
Jul,1989

[0 after 0 votes]
4835685
Kun
710/260
May,1989

[0 after 0 votes]
4831518
Yu
714/4
May,1989

[0 after 0 votes]
4709327
Hillis
712/14
Nov,1987

[0 after 0 votes]
4608631
Stiffler
710/113
Aug,1986

[0 after 0 votes]
3573852
Watson
451/11
Apr,1971

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed as new and desired to be secured by Letters Patent of the United States is:

1. A parallel computer comprising a plurality of processing nodes interconnected by a control network and a data router, wherein:

(a) each processing node comprises a data router interface, a control network interface, and a processor, each processor being either a scalar processor or a processing element;

(b) the control network comprises a plurality of network nodes arranged in a fat-tree structure for transferring program commands and status information between the processing nodes, whereby each processing node can transfer program commands and status information to every other processing node through the control network;

(c) the data router comprises a plurality of router nodes arranged in a fat-tree structure for transferring data packets between the processing nodes, whereby each processing node can transfer data packets to every other processing node through the data router;

(d) one of the scalar processors comprises means for periodically issuing a context switch command to initiate a context switch causing the parallel computer to switch from a first user program in a first context to a second user program in a second context;

(e) the data router comprises means for replacing data packets relating to the first context with data packets relating to the second context in each router node in response to the context switch command;

(f) the control network comprises means for replacing program commands and status information relating to the first context with program commands and status information relating to the second context in each network node in response to the context switch command; and

(g) each processing node comprises means for replacing internal data relating to the first context with internal data relating to the second context in response to the context switch command.

2. The parallel computer of claim 1 wherein one of the scalar processors is a control processor, the control processor including:

(a) means for periodically sequencing into a supervisor operating mode to determine whether a time period for processing the first context has expired; and

(b) initiating means for initiating the context switch if the time period has expired.

3. The parallel computer of claim 2 wherein the processing nodes further comprise:

(a) means responsive to the initiating means for emptying the control network, the data router, and the processing nodes of information relating to the first context;

(b) means responsive to the initiating means for loading the control network, the data router, and the processing nodes with information relating to the second context; and

(c) means responsive to the initiating means for storing the information relating to the first context so that the first context may be subsequently restored.

4. The parallel computer of claim 3 wherein:

(a) the initiating means comprises means for issuing a context switch packet from the control processor to the control network, the context switch packet comprising an identification of the second context and a time at which the second context was initiated;

(b) the control network further comprises:

(i) means for directing the context switch packet up the control network fat-tree structure to a root network node; and

(ii) means for propagating the context switch packet down the control network fat-tree structure from the root network node to reach every processing node; and

(c) the processing nodes further comprise

(i) means responsive to the context switch packet for emptying the control network, the data router, and the processing nodes of information relating to the first context;

(ii) means responsive to the context switch packet for loading the control network, the data router, and the processing nodes with information relating to the second context; and

(iii) means responsive to the context switch packet for storing the information relating to the first context so that the first context may be subsequently restored.

5. The parallel computer of claim 3 wherein:

(a) each data packet further comprises a message address that specifies an intended destination processing node;

(b) the initiating means comprises means for issuing an all-fall-down command to the data router whereby the data router directs every data packet relating to the first context directly down the fat-tree structure for storage in a processing node;

(c) the processing nodes further comprise a data router monitoring means for determining when the data router is empty of data packets relating to the first context; and

(d) the initiating means further comprises means, responsive to the data router monitoring means, for transmitting data router packets relating to the second context up the data router fat-tree structure such that the processing nodes can begin processing the second user program in the second context.

6. The parallel computer of claim 1 wherein the context switch command is processed within a partition, the partition comprising:

(a) a control processor;

(b) less than all of the processing nodes;

(c) a logical root of the control network comprising a network node which is a root of a fat-tree sub-structure within the control network fat-tree structure that interconnects every processing node within the partition; and

(d) a logical root of the data router comprising a router node which is a root of a fat-tree sub-structure within the data router fat-tree structure that interconnects every processing node within the partition.

7. In a parallel computer comprising a plurality of processing nodes, one of the processing nodes being a control processor, a control network, the control network being arranged in a first fat-tree structure interconnecting the processing nodes, and a data router, the data router being arranged in a second fat-tree structure interconnecting the processing nodes, a method for performing a context switch from a first context to a second context comprising the steps of:

(a) issuing a context switch command from the control processor to the control network, the data router, and the processing nodes;

(b) emptying the data router of data relating to the first context in response to the context switch command;

(c) loading the data router with data relating to the second context in response to the context switch command;

(d) emptying the control network of program information relating to the first context in response to the context switch command; and

(e) loading the control network with program information relating to the second context in response to the context switch command.

8. The method of claim 7 wherein the step of emptying the data router comprises the steps of:

(a) associating a destination address with each item of data in the first fat-tree structure;

(b) directing each item of data down the first fat-tree structure toward any of the processing nodes;

(c) storing each item of data in one of the processing nodes when the item of data arrives at one of the processing nodes; and

(d) generating a signal to the control processor when the data router contains no further items relating to the first context.

9. The method of claim 7 wherein the step of emptying the control network further comprises the steps of:

(a) buffering program information relating to the first context received from the control network at each processing node;

(b) storing program information to be transmitted over the control network at each processing node;

(c) determining the last item of program information received at each processing node and the last item of program information sent from each processing node so that the state of the control network may be subsequently restored; and

(d) flushing the control network of all program information relating to the first context.

10. The method of claim 7 wherein the step of issuing a context switch command further comprises the step of broadcasting a context switch command packet over the control network.
 Description Submit all comments and votes
 


FIELD OF THE INVENTION

The invention relates generally to the field of digital computer systems, and more particularly to computer systems including a plurality of processors operating generally in parallel.

BACKGROUND OF THE INVENTION

A digital computer system generally comprises three basic elements, namely, a memory element, an input/output element and a processor element. The memory element stores information in addressable storage locations. This information includes data and instructions for processing the data. The processor element fetches information from the memory element, interprets the information as either an instruction or data, processes the data in accordance with the instructions, and returns the processed data to the memory element. The input/output element, under control of the processor element, also communicates with the memory element to transfer information, including instructions and the data to be processed, to the memory, and to obtain processed data from the memory.

Most modern computing systems are considered "von Neumann" machines, since they are generally constructed according to a paradigm attributed to John von Neumann. Von Neumann machines are characterized by having a processing element, a global memory which stores all information in the system, and a program counter that identifies the location in the global memory of the instruction being executed. The processing element executes one instruction at a time, that is, the instruction identified by the program counter. When the instruction is executed, the program counter is advanced to identify the location of the next instruction to be processed. In many modern systems, the program counter is actually advanced before the processor has finished processing the current instruction.

Von Neumann systems are conceptually uncomplicated to design and program, since they do only one operation at a time. A number of advancements have been made to the original von Neumann paradigm to permit the various parts of the system, most notably the various components of the processor, to operate relatively independently and to achieve a significant increase in processing speed. One such advancement is pipelining of the various steps in executing an instruction, including instruction fetch, operation code decode (a typical instruction includes an operation code which identifies the operation to be performed, and in most cases one or more operand specifiers, which identify the location in memory of the operands, or data, to be used in executing the instruction), operand fetch, execution (that is, performing the operation set forth in the operation code on the fetched operands), and storing of processed data, which steps are performed relatively independently by separate hardware in the processor. In a pipelined processor, the processor's instruction fetch hardware may be fetching one instruction while other hardware is decoding the operation code of another instruction, fetching the operands of still another instruction, executing yet another instruction, and storing the processed data of a fifth instruction. Since the five steps are performed sequentially, pipelining does not speed up processing of an individual instruction. However, since the processor begins processing of additional instructions before it has finished processing a current instruction, it can speed up processing of a series of instructions.

A pipelined processor is obviously much more complicated than a simple processor in a von Neumann system, as it requires not only the various circuits to perform each of the operations (in a simple von Neumann processor, many circuits could be used to perform several operations), but also control circuits to coordinate the activities of the various operational circuits. However, the speed-up of the system can be dramatic.

More recently, some processors have been provided with execution hardware which includes multiple functional units each being optimized to perform a certain type of mathematical operation. For example, some processors have separate functional units for performing integer arithmetic and floating point arithmetic, since they are processed very differently. Some processors have separate hardware functional units each of which performs one or only several types of mathematical operations, including addition, multiplication, and division operations, and other operations such as branch control and logical operations, all of which can be operating concurrently. This can be helpful in speeding up certain computations, most particularly those in which several functional units may be used concurrently for performing parts of a single computation.

In von Neumann processors, including those which incorporate pipelining or multiple functional units (or both, since both may be incorporated into a single processor), a single instruction stream operates on a single data stream. That is, each instruction operates on data to enable one calculation at a time. Such processors have been termed "SISD," for "single-instruction/single-data." If a program requires a segment of a program to operate on a number of diverse elements of data to produce a number of calculations, the program causes the processor to loop through that segment for each calculation. In some cases, in which the program segment is short or there are only a few data elements, the time required to perform such a calculation may not be unduly long.

However, for many types of such programs, SISD processors would require a very long time to perform all of the calculations required. Accordingly, processors have been developed which incorporate a large number of processing elements all of which may operate concurrently on the same instruction stream, but with each processing element processing a separate data stream. These processors have been termed "SIMD" processors, for "single-instruction/multipledata." An example of such a system is disclosed in U.S. Pat. No. 4,598,400, issued Jul. 1, 1986, in the name of W. Daniel Hillis, for Method And Apparatus For Routing Message Packets.

SIMD processors are useful in a number of applications, such as image processing, signal processing, artificial intelligence, database operations, and computer simulation of a number of things, such as electronic circuits and fluid dynamics. In image processing, each processing element may be used to perform processing on a pixel (picture element) of the image to enhance the overall image. In signal processing, the processors concurrently perform a number of the calculations required to perform such computations as the "Fast Fourier transform" of the data defining the signal. In artificial intelligence, the processors perform searches on extensive rule bases representing the stored knowledge of the particular application. Similarly, in database operations, the processors perform searches on the data in the database, and may also perform sorting and other operations. In computer simulation of, for example, electronic circuits, each processor may represent one part of the circuit, and the processor's iterative computations indicate the response of the part to signals from other parts of the circuit. Similarly, in simulating fluid dynamics, which can be useful in a number of applications such as weather prediction and airplane design, each processor is associated with one point in space, and the calculations provide information about various factors such as fluid flow, temperature, pressure and so forth.

Typical SIMD systems include a SIMD array, which includes the array of processing elements and a router network, a control processor and an input/output component. The input/output component, under control of the control processor, enables data to be transferred into the array for processing and receives processed data from the array for storage, display, and so forth. The control processor also controls the SIMD array, iteratively broadcasting instructions to the processing elements for execution in parallel. The router network enables the processing elements to communicate the results of a calculation to other processing elements for use in future calculations.

More recently, in massively parallel computing systems, multiple instruction/multiple data (MIMD) systems have been developed in which a plurality of processors each operates in response to its own instruction stream processes data. In a MIMD system, each processor can process data based on its individual programming and the results of previous processing, separately from the other processors, which can be an advantage over SIMD systems in connection with some types of problems. However, a processor often requires the results of processing by other processors, or the processors must synchronize their respective processing statuses, which can be more easily achieved in a SIMD system.

As a result, new computer architectures are being developed, generally known as "S/MIMD" (for "synchronous MIMD"). In an S/MIMD system, a control processor transmits commands to a set of processors, each of which processes data in response to the command. In response to each command, a processor may execute one or more instructions. S/MIMD systems thus maintain a single point of control, but the control is on a command-by-command basis, rather than on an instruction-by-instruction basis. The particular instruction or series of instructions executed by a particular processor in response to a command may depend on the command itself, as well as on results of previous processing by the particular processor, and perhaps on results of previous processing by other processors. In any case, the control processor provides a degree of synchronization for the processors which receive commands therefrom.

SUMMARY OF THE INVENTION

The invention provides a new and improved parallel computer including an arrangement for performing in a parallel manner a context switch operation.

In brief summary, the new arrangement provides in one aspect a parallel computer comprising a plurality of processing elements and a scalar processor all interconnected by a communications network. The communications network further comprises a data router for transferring data between processors and a control network for transferring program commands, status information, and synchronization signals between processors. The scalar processor and the processing elements process a plurality of programs, with the scalar processor and the processing elements processing each program in parallel. The scalar processor, while processing each program, generates commands and transfers them to the processing elements over the communications network, and each of the processing elements processes data associated with a particular program in response to each command received from the communications network. The scalar processor periodically further generates a command to enable the processing elements to, in parallel, switch processing of a program they are currently processing and begin processing of another program.

In another aspect, the new arrangement provides a parallel computer comprising a plurality of processing elements and a control element. Each of the plurality of processing elements processes user programs in response to program commands. Each processing element further comprises a context switch program for enabling it, in response to receipt of a context switch command, to switch from processing of a user program it is then processing to processing of another user program. The control element generates program commands for transfer to the processing elements generally in parallel to enable the processing elements to process the user programs such that all of the processing elements are generally processing the same user program in parallel. The control element in response to selected conditions transmits context switch commands to the processing elements to enable the processing elements to, in parallel, switch from processing of a user program it is then processing to processing of another user program.

BRIEF DESCRIPTION OF THE DRAWINGS

This invention is pointed out with particularity in the appended claims. The above and further advantages of this invention may be better understood by referring to the following description taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a general block diagram of a massively parallel computer system constructed in accordance with the invention;

FIG. 2A and 2B, together with FIG. 2B-1 and 2B-2, are block diagrams useful in understanding the structure and operation of the data router of the computer system of FIG. 1;

FIG. 3 is a diagram depicting the structure of message packets transferred over the data router;

FIG. 4A, together with FIG. 4A-1 through 4A-4, along with FIG. 4B through 4E are block and logic diagrams useful in understanding the structure and operation of the control network of the computer system of FIG. 1;

FIG. 5 is a diagram depicting the structure of message packets transferred over the control network;

FIG. 6 is a general block diagram of a processing element in the computer system depicted in FIG. 1;

FIG. 7A-1 comprises a general block diagram of a data router interface circuit useful in interfacing the processing element depicted in FIG. 6 to the data router of the computer system depicted in FIG. 1, and FIG. 7A-2A and 7A-2B contain definitions of registers in the data router interface;

FIG. 7B-1 comprises a general block diagram of a control network interface circuit useful in interfacing the processing element depicted in FIG. 7A-1 to the control network of the computer system depicted in FIG. 1, and FIG. 7B-2A and 7B-2B contain definitions of registers in the control network interface;

FIG. 8SP-1 through 8SP-12, 8PE-1 through 8PE-12, and 9A through 9K detail the operations performed in connection with a context switch operation.

DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT

I. General Description of Computer System

II. General Description of Communications Networks

A. Data Router

B. Control Network

III. General Description of Processing Element

A. General

B. Data Router Interface

C. Control Network Interface

IV. Operations of System in Connection with a Context Switch Operation

I. General Description of Computer System

The invention provides new and improved facilities for controlling a massively-parallel computing system. Prior to describing an illustrative embodiment of the particular invention, it would be helpful to describe in detail one embodiment of a massively-parallel computing system which makes use of the invention. Further details of the embodiment are disclosed in U.S. patent application Ser. No. 07/592,029, entitled Parallel Computer System, filed Oct. 3, 1990, in the name of David C. Douglas, et al., now abandoned in favor of several continuations-in-part including U.S. patent application Ser. No. 07/746,035 (see below) and U.S. patent application Ser. No. 07/746,038 (see below); U.S. patent application Ser. No. 07/746,035, entitled Massively Parallel Computer Partitionable Through Switchable Fat-Tree Control Network, filed Aug. 16, 1991, in the name of David C. Douglas, et al., now matured into U.S. Pat. No. 5,353,412; and U.S. patent application Ser. No. 07/746,038, entitled Input/Output System For Massively Parallel Computer System, filed Aug. 16, 1991, in the name of David Wells, et al., now matured into U.S. Pat. No. 5,361,363, all of which are assigned to the assignee of the present application.

FIG. 1 is a general block diagram of a massively parallel computer system 10 which makes use of the invention. With reference to FIG. 1, system 10 includes a plurality of processing elements 11(0) through 11(N) (generally identified by reference numeral 11), scalar processors 12(0) through 12(M) (generally identified by reference numeral 12) and input/output processors 13(0) through 13(K) (generally identified by reference numeral 13). Input/output units (not shown), such as, for example, disk and tape storage units, video display devices, printers and so forth may be connected to the input/output processors to supply information, including data and program commands, for processing by the processing elements 11 and scalar processors 12 in the system, and may also receive processed data for storage, display and printing. The scalar processors 12 may also be connected to input/output units including, for example, video display terminals which permit one or more operators to generally control system 10. The system 10 may also include a plurality of spare processing elements 11s(0) through 11s(J) (generally identified by reference numeral 11s) which may be used as described below. The system 10 further includes a control network 14, a data router 15 and a diagnostic network 16. The control network 14 permits one or more scalar processors 12 to broadcast program commands to processing elements 11. The processing elements 11 which receive the commands execute them generally concurrently. The control network 14 also permits processing elements 11 to generate status information which they may supply to the scalar processors 12. The control network 14 is also used by the processing elements 11 to perform selected types of arithmetic operations, termed "scan" and "reduce" operations, as described below. The control network 14 may also be used to provide status and synchronization information among the processing elements 11.

The data router 15 transfers data among the processing elements 11, scalar processors 12 and input/output processors 13. In particular, under control of the scalar processors 12, the input/output processors 13 retrieve data from the input/output units and distribute the data to the respective scalar processors 12 and processing elements 11. During processing, the scalar processors 12 and processing elements 11 can transfer data among themselves over the data router 15. In addition, the processing elements 11 and scalar processors 12 can transfer processed data to the input/output processors 13. Under control of the scalar processors 12, the input/output processors 13 can direct the processed data that they receive from the data router 15 to particular ones of the input/output units for storage, display, printing, or the like. The data router 15 in one particular embodiment is also used to transfer input/output commands from the scalar processors 12 to the input/output processors 13 and input/output status information from the input/output processors 13 to the scalar processors 12.

The diagnostic network 16, under control of a diagnostic processor (not shown in FIG. 1), facilitates testing of other portions of the system 10 to identify, locate and diagnose defects. The diagnostic processor may comprise one or more of the scalar processors 12. In addition, the diagnostic network 16 may be used to establish selected operating conditions in the other portions of the system 10 as described below. One embodiment of the diagnostic network 16 is described in detail in the aforementioned Douglas, et al., and Wells, et al., patent applications and will not be repeated here.

The system 10 is synchronous. All of its elements operate in accordance with a global SYS CLK system clock signal provided by a clock circuit 17.

One particular embodiment of system 10 may include hundreds or many thousands of processing elements 11 operating on a single problem in parallel under control of commands broadcast to them by the scalar processors 12. In that embodiment, the processing elements 11 operate in parallel on the same command on their individual sets of data, thereby forming a parallel computer system.

In addition, the system 10 may be dynamically logically partitioned, by logical partitioning of the control network 14 as described below, into multiple logical subsystems which may concurrently operate on separate problems or separate parts of a single problem. In that case, each partition includes at least one scalar processor 12 and a plurality of processing elements 11, the scalar processor 12 supplying the commands for processing by the processing elements in its partition. The spare processing elements 11s, which except for the positions of their connections to the control network 14 and data router 15 are otherwise similar to processing elements 11, may be used to substitute for failed processing elements 11 in a partition as described below, to augment the number of processing elements in a partition if there are insufficient processing elements 11 to form a partition with a desired number of processing elements 11, or to provide additional processing elements which may themselves be formed into partitions. In the following, unless otherwise stated explicitly, a reference to a processing element 11, in either the singular or plural, will also be taken as a corresponding singular or plural reference to a spare processing element 11s; that is, the processing elements 11 and spare processing elements 11s will be jointly referred to herein generally as processing elements 11.

It should be noted from the following description that the partitioning is only in relation to the control network 14, but not the data router 15. This facilitates transfer of data between processing elements of different partitions if they are, for example, processing different parts of a particular problem, or, more generally, for inter-process communications, if each processing element of the diverse partitions is processing correspondingly diverse, but possibly interacting, processes. This further facilitates transfer of data from processing elements of any partition to the input/output processors 13 to permit storage or display of data, as well as transfer from the input/output processors 13 of stored data to processing elements of any partition.

II. General Description of Communications Networks

A. Data Router 15 Before proceeding to a detailed descripti