| United States Patent | 5475856 |
| Link to this page | http://www.wikipatents.com/5475856.html |
| Inventor(s) | Kogge; Peter M. (Endicott, NY) |
| Abstract | A parallel RISC computer system is provided by a dynamic
multi-mode parallel processor array. One embodiment illustrates a
tightly coupled VLSI implementation whose architecture can be extended
to more widely placed processing elements through the interconnection
network, which couples multiple processors capable of MIMD mode processing
to one another, with broadcast of instructions to selected groups of units
controlled by a controlling processor. The coupling of the processing
element logic enables dynamic mode assignment and dynamic mode switching,
allowing processors operating in SIMD mode to make maximum use of memory and
cycle time. On an instruction-by-instruction basis, modes can
be switched from SIMD to MIMD, and even into SISD mode on the controlling
processor for inherently sequential computation, allowing a programmer or
compiler to build a program for the computer system which uses the optimal
kind of parallelism (SISD, SIMD, MIMD). Furthermore, this execution,
particularly in the SIMD mode, can be set up for running applications at
the limit of memory cycle time. With the ALLNODE switch and alternative
paths, a system can be dynamically achieved in a few cycles for many
processors. Each processing element has memory and MIMD capability; the
processor's instruction register, condition register and program
counter provide common resources which are used in both MIMD and SIMD modes. The
program counter becomes a base register in SIMD mode. |
Dynamic multi-mode parallel processing array |
| Publication Date | December 12, 1995 |
| Filing Date | October 17, 1994 |
| Parent Case | This application is a continuation of application Ser. No. 07/798,788, filed Nov. 27, 1991, now abandoned. |
References  |
U.S. References
| Patent No. | Inventor | U.S. Class | Date |
| 5239629 | Miller | 710/317 | Aug. 1993 |
| 5230079 | Grondalski | 712/18 | Jul. 1993 |
| 5212777 | Gove | 712/229 | May 1993 |
| 5165023 | Gifford | 710/317 | Nov. 1992 |
| 5010477 | Omoda | 712/4 | Apr. 1991 |
| 5008882 | Peterson | | Apr. 1991 |
| 4992933 | Taylor | 712/22 | Feb. 1991 |
| 4916652 | Schwarz | 708/510 | Apr. 1990 |
| 4891787 | Gifford | 712/205 | Jan. 1990 |
| 4873626 | Gifford | 710/120 | Oct. 1989 |
Claims  |
What is claimed is:
1. A dynamic multi-mode parallel processing array, comprising:
a plurality of processors, each processor having a control unit for decoding and executing instructions of an instruction set, a data flow unit and a local memory, each of said control units having an instruction register, a program counter, a
condition code register and a parallel mode bit (PMB), the PMB indicating whether the processor obtains instructions from a controlling processor or from said local memory;
an interconnection path between the instruction registers of the processors;
the instruction set having a plurality of instructions, each instruction having a parallel execution type bit (PET) that is used in conjunction with the PMB by the control unit to determine whether an instruction should be executed, the
instruction set having a switch mode instruction for changing the PMB bit of a processor executing the instruction;
the processors organized into one or more groups, each group having a processor configured as the controlling processor, wherein any processor in the plurality of processors can be dynamically configured as the controlling processor, the
controlling processor enabling the processors of a group to operate in a MIMD or SIMD mode, and to switch modes dynamically; and
when a group of processors are operating in SIMD mode the controlling processor provides instructions to the instruction registers of the other processors in the group, each instruction provided via the interconnection path when the controlling
processor fetches the instruction;
wherein a controlling processor of a group of processors fetches instructions, and wherein other processors in the group latch into their instruction registers each instruction as the controlling processor fetches it, such that at the end of the
instruction fetch all other processors have in their instruction register the fetched instruction; and
wherein the instruction set includes instructions which modify a processor's program counter to function as a base register, including one or more of the following instructions:
a "Jump" instruction which when executed in a processor operating in SIMD mode sets the processor's program counter to a value provided with the jump instruction; and
a "Load Immediate" instruction which when executed in a processor operating in SIMD mode loads a register with the contents of the processor's local memory at the address specified by the processor's program counter, and then increments the
program counter.
2. A dynamic multi-mode parallel processing array, comprising:
a plurality of processors, each processor having a control unit for decoding and executing instructions of an instruction set, a data flow unit and a local memory, each of said control units having an instruction register, a program counter, a
condition code register and a parallel mode bit (PMB), the PMB indicating whether the processor obtains instructions from a controlling processor or from said local memory;
an interconnection path between the instruction registers of the processors;
the instruction set having a plurality of instructions, each instruction having a parallel execution type bit (PET) that is used in conjunction with the PMB by the control unit to determine whether an instruction should be executed, the
instruction set having a switch mode instruction for changing the PMB bit of a processor executing the instruction;
the processors organized into one or more groups, each group having a processor configured as the controlling processor, wherein any processor in the plurality of processors can be dynamically configured as the controlling processor, the
controlling processor enabling the processors of a group to operate in a MIMD or SIMD mode, and to switch modes dynamically; and
when a group of processors are operating in SIMD mode the controlling processor provides instructions to the instruction registers of the other processors in the group, each instruction provided via the interconnection path when the controlling
processor fetches the instruction;
wherein a group of processors operating in SIMD mode switch to MIMD mode when the controlling processor provides each processor in the group with the switch mode instruction whereby each processor begins fetching instructions from its local
memory; and
wherein the processors can communicate with each other using a load instruction to send and a store instruction to receive, said instructions containing an address which is used as a processor address on the interconnection path of the processor
to communicate with, wherein the processor stalls until the communication takes place.
3. A dynamic multi-mode parallel processing array, comprising:
a plurality of processors, each processor having a control unit for decoding and executing instructions of an instruction set, a data flow unit and a local memory, each of said control units having an instruction register, a program counter, a
condition code register and a parallel mode bit (PMB), the PMB indicating whether the processor obtains instructions from a controlling processor or from said local memory;
an interconnection path between the instruction registers of the processors;
the instruction set having a plurality of instructions, each instruction having a parallel execution type bit (PET) that is used in conjunction with the PMB by the control unit to determine whether an instruction should be executed, the
instruction set having a switch mode instruction for changing the PMB bit of a processor executing the instruction;
the processors organized into one or more groups, each group having a processor configured as the controlling processor, wherein any processor in the plurality of processors can be dynamically configured as the controlling processor, the
controlling processor enabling the processors of a group to operate in a MIMD or SIMD mode, and to switch modes dynamically; and
when a group of processors are operating in SIMD mode the controlling processor provides instructions to the instruction registers of the other processors in the group, each instruction provided via the interconnection path when the controlling
processor fetches the instruction;
wherein a group of processors operating in SIMD mode switch to MIMD mode when the controlling processor provides each processor in the group with the switch mode instruction whereby each processor begins fetching instructions from its local
memory; and
wherein a switch mode instruction executed by a first processor causes the first processor to stall and use an address provided by the switch mode instruction as a key back to the controlling processor, whereupon the controlling processor can execute a
switch mode instruction with an "address" which matches the key, causing the first processor to leave the stall and resume tracking the controlling processor's instructions.
4. A dynamic multi-mode parallel processing array, comprising:
a plurality of processors, each processor having a control unit able to decode and execute instructions of an instruction set, a data flow unit and a local memory, each control unit having an instruction register, a program counter, a condition
code register and a parallel mode bit (PMB), the PMB indicating whether the processor obtains instructions from a controlling processor or from local memory;
an interconnection path between the instruction registers of the processors;
the instruction set having a plurality of instructions, each instruction having a parallel execution type bit (PET) that is used in conjunction with the PMB by the control unit to determine whether an instruction should be executed, the
instruction set having a switch mode instruction for changing the PMB bit of a processor executing the instruction;
the processors organized into one or more groups, each group having a processor configured as the controlling processor, wherein any processor in the plurality of processors can be dynamically configured as the controlling processor, the
controlling processor enabling the processors of a group to operate in a MIMD or SIMD mode, and to switch modes dynamically; and
when a group of processors are operating in SIMD mode the controlling processor provides instructions to the instruction registers of the other processors in the group, each instruction provided via the interconnection path when the controlling
processor fetches the instruction;
wherein a group of processors operating in SIMD mode switch to MIMD mode when the controlling processor provides each processor in the group with the switch mode instruction whereby each processor begins fetching instructions from its local
memory; and
wherein a variable subset of processors operating in MIMD mode execute switch mode instructions that cause the variable subset of processors to operate in SIMD mode while those processors that are not part of the variable subset continue
operating in MIMD mode.
5. The array according to claim 4 wherein the variable subset of processors return to MIMD mode when the controlling processor provides each processor in the variable subset with the switch mode instruction.
6. A dynamic multi-mode parallel processing array, comprising:
a plurality of processors, each processor having a control unit for decoding and executing instructions of an instruction set, a data flow unit and a local memory, each of said control units having an instruction register, a program counter, a
condition code register and a parallel mode bit (PMB), the PMB indicating whether the processor obtains instructions from a controlling processor or from said local memory;
an interconnection path between the instruction registers of the processors;
the instruction set having a plurality of instructions, each instruction having a parallel execution type bit (PET) that is used in conjunction with the PMB by the control unit to determine whether an instruction should be executed, the
instruction set having a switch mode instruction for changing the PMB bit of a processor executing the instruction;
the processors organized into one or more groups, each group having a processor configured as the controlling processor, wherein any processor in the plurality of processors can be dynamically configured as the controlling processor, the
controlling processor enabling the processors of a group to operate in a MIMD or SIMD mode, and to switch modes dynamically; and
when a group of processors are operating in SIMD mode the controlling processor provides instructions to the instruction registers of the other processors in the group, each instruction provided via the interconnection path when the controlling
processor fetches the instruction; and
wherein the instruction set provides instructions where all memory references for data are performed via LOAD and STORE instructions, and where addressing for data accesses is a base plus displacement, and where addition and index register
updates are applied after a memory operation has begun, as a post address update; and wherein all instructions that perform computational operations are register to register, and said processors execute instructions in one or more execution cycles
without need of memory references.
7. A dynamic multi-mode parallel processing array, comprising:
a plurality of processors, each processor having a control unit for decoding and executing instructions of an instruction set, a data flow unit and a local memory, each of said control units having an instruction register, a program counter, a
condition code register and a parallel mode bit (PMB), the PMB indicating whether the processor obtains instructions from a controlling processor or from said local memory;
an interconnection path between the instruction registers of the processors;
the instruction set having a plurality of instructions, each instruction having a parallel execution type bit (PET) that is used in conjunction with the PMB by the control unit to determine whether an instruction should be executed, the
instruction set having a switch mode instruction for changing the PMB bit of a processor executing the instruction;
the processors organized into one or more groups, each group having a processor configured as the controlling processor, wherein any processor in the plurality of processors can be dynamically configured as the controlling processor, the
controlling processor enabling the processors of a group to operate in a MIMD or SIMD mode, and to switch modes dynamically; and
when a group of processors are operating in SIMD mode the controlling processor provides instructions to the instruction registers of the other processors in the group, each instruction provided via the interconnection path when the controlling
processor fetches the instruction;
wherein a controlling processor of a group of processors fetches instructions, and wherein other processors in the group latch into their instruction registers each instruction as the controlling processor fetches it, such that at the end of the
instruction fetch all other processors have in their instruction register the fetched instruction; and
wherein after latching the fetched instruction each processor looks at its PMB and the PET obtained from the fetched instruction to determine whether to execute the fetched instruction, executing the fetched instruction accordingly.
8. The array according to claim 7 wherein while the processors are decoding and executing the fetched instruction the controlling processor begins fetching a next instruction.
9. The array according to claim 8 wherein, as the controlling processor of a group of processors fetches another instruction, all processors in SIMD mode capture the memory reference in their instruction registers.
10. A dynamic multi-mode parallel processing array, comprising:
a plurality of processors, each processor having a control unit for decoding and executing instructions of an instruction set, a data flow unit and a local memory, each of said control units having an instruction register, a program counter, a
condition code register and a parallel mode bit (PMB), the PMB indicating whether the processor obtains instructions from a controlling processor or from said local memory;
an interconnection path between the instruction registers of the processors;
the instruction set having a plurality of instructions, each instruction having a parallel execution type bit (PET) that is used in conjunction with the PMB by the control unit to determine whether an instruction should be executed, the
instruction set having a switch mode instruction for changing the PMB bit of a processor executing the instruction;
the processors organized into one or more groups, each group having a processor configured as the controlling processor, wherein any processor in the plurality of processors can be dynamically configured as the controlling processor, the
controlling processor enabling the processors of a group to operate in a MIMD or SIMD mode, and to switch modes dynamically on an instruction-by-instruction basis; and
when a group of processors are operating in SIMD mode the controlling processor provides instructions to the instruction registers of the other processors in the group, each instruction provided via the interconnection path when the controlling
processor fetches the instruction; and
wherein the PMB indicates SIMD and MIMD modes and the PET indicates local and array operations of the associated processor.
11. The array according to claim 10 wherein when a group of processors are operating in SIMD mode, the program counters provide a base register function in the processors receiving instructions from the controlling processor.
12. The array according to claim 10 wherein when a group of processors are operating in SIMD mode, the condition code registers provide a local enable function in the processors receiving instructions from the controlling processor.
13. The array according to claim 10 wherein there is provided a plurality of groups, with each group dynamically switching between modes of operation.
14. The array according to claim 10 wherein the interconnection path is a dynamic switching connection network.
15. The array according to claim 10 wherein the interconnection path provides a broadcast path between the instruction registers of each processor in a group of processors executing in SIMD mode.
16. The array according to claim 10 wherein the interconnection path is a multi-stage interconnection network that provides a broadcast path to the instruction registers of a group of processors operating in a SIMD mode.
17. The array according to claim 10 wherein a switch mode instruction provides a value to load into the program counter when executed.
18. The array according to claim 10 wherein the switch mode instruction does not change the program counter when executed.
19. The array according to claim 10 wherein a group of processors operating in SIMD mode switch to MIMD mode when the controlling processor provides each processor in the group with the switch mode instruction whereby each processor begins
fetching instructions from its local memory.
20. The array according to claim 10 wherein the interconnection path is utilized to pass broadcast messages and to interconnect a variety of subsets of processors within a group of processors, such that the processors may implement on a
selective basis both SIMD and MIMD operations, some processors running SIMD and others running MIMD.
21. The array according to claim 10 wherein is included for processor interconnection a dynamic switching multi-stage network, wherein without blocking via alternative path selection processors can be set up as part of an interconnected system
permitting selection of various groups of processors on the network which have desirable resources, and use them to run programs which can take advantage of the SIMD and MIMD needs of the application.
22. The array according to claim 10 wherein the controlling processor is dynamically rotated through a group of processors so that each processor is configured as the controlling processor for one or more instructions.
23. The array according to claim 10 wherein the instruction registers of processors are interconnected via an asynchronous network enabling a controlling processor to pass instructions through the interconnection network to other processor
instruction registers coupled via said network.
24. The array according to claim 10 wherein each processor is provided with a port through which passes the switch mode instruction.
25. The array according to claim 10 wherein the controlling processor can switch to SISD mode permitting a programmer or compiler to build a program for the array which uses any one or all modes of parallelism.
26. The array according to claim 10 wherein the interconnection path is a dynamic multi-stage two sided switching network enabling point to point coupling of processors without blocking.
27. The array according to claim 26 wherein the multi-stage interconnection network provides broadcast paths to the instruction registers of a group of processors operating in SIMD mode.
28. The array according to claim 10 wherein a multi-stage interconnection network provides an alternative path between the instruction registers of a group of processors operating in SIMD mode.
29. The array according to claim 28 wherein the multi-stage interconnection network is a dynamic multi-stage two sided switching network enabling point to point coupling of processors without blocking.
30. The array according to claim 10 wherein a controlling processor of a group of processors fetches instructions, and wherein other processors in the group latch into their instruction registers each instruction as the controlling processor
fetches it, such that at the end of the instruction fetch all other processors have in their instruction register the fetched instruction.
31. The array according to claim 30 wherein execution of SIMD mode instructions proceeds with the program counter used as a base register.
32. The array according to claim 30 wherein the instruction set includes instructions which when executed in a processor operating in SIMD mode use the program counter as a base register, the instructions including a load immediate instruction
and a store immediate instruction.
33. The array according to claim 10 wherein each processor is a reduced instruction set computer (RISC).
34. The array according to claim 33 wherein the interconnection path is provided by an ALLNODE interconnection network.
35. The array according to claim 10 including:
a first group of processors operating in SIMD mode having a first controlling processor providing instructions for said first group over the interconnection path; and
a second group operating in SIMD mode having a second controlling processor providing instructions for the second group over the interconnection path.
36. The array according to claim 35 wherein the first controlling processor provides the switch mode instruction to the first group of processors causing the first group of processors to enter MIMD mode.
37. The array according to claim 36 wherein the second controlling processor provides the switch mode instruction to the second group of processors causing the second group of processors to enter MIMD mode. |
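As an illustration only, the latch-then-decide behavior recited in claims 7 and 10 can be sketched in Python. The claims name the PMB and PET bits but do not give their encodings or the exact decision rule, so the values `PMB_SIMD`, `PET_ARRAY`, the `(mnemonic, PET)` instruction format, and the rule in `should_execute` are assumptions made for this sketch, not the patented logic.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed bit encodings; the claims name the bits but not their values.
PMB_MIMD, PMB_SIMD = 0, 1      # parallel mode bit
PET_LOCAL, PET_ARRAY = 0, 1    # parallel execution type bit

Instruction = Tuple[str, int]  # (mnemonic, PET bit): a hypothetical format


@dataclass
class Processor:
    name: str
    pmb: int
    ir: Optional[Instruction] = None                    # instruction register
    executed: List[str] = field(default_factory=list)   # trace of executed mnemonics

    def should_execute(self, is_controller: bool) -> bool:
        """Assumed rule: the controlling processor executes everything it
        fetches; other processors execute only array-type instructions,
        and only while their PMB says SIMD."""
        if self.ir is None:
            return False
        if is_controller:
            return True
        return self.pmb == PMB_SIMD and self.ir[1] == PET_ARRAY


def run_simd(controller: Processor, group: List[Processor],
             program: List[Instruction]) -> None:
    """The controller fetches each instruction; SIMD-mode members latch it."""
    for instruction in program:
        controller.ir = instruction
        for pe in group:
            if pe.pmb == PMB_SIMD:   # MIMD-mode members ignore the broadcast
                pe.ir = instruction
        for pe in [controller] + group:
            if pe.should_execute(is_controller=pe is controller):
                pe.executed.append(pe.ir[0])
```

In this sketch a member whose PMB is set to MIMD never latches the broadcast, which mirrors the claim language that the PMB indicates where a processor obtains its instructions.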
Description  |
FIELD OF THE INVENTIONS
The field of the inventions described is computer systems, and the inventions relate particularly to computer systems which can dynamically implement multiple modes of processing, utilizing an array of processors to execute programs in parallel
within the array of processing elements.
CROSS REFERENCE TO RELATED APPLICATIONS
The present application is related to:
"Broadcast/Switching Apparatus For Executing Broadcast/Multi-Cast" by H. T. Olnowich et al U.S. Ser. No. 07/748,316, filed Aug. 21, 1991 (IBM Docket EN991030A)
"Multi-Sender/Switching Apparatus For Status Reporting Over Unbuffered Asynchronous Multi-Stage Networks" by H. W. Olnowich et al U.S. Ser. No. 07/748,302, filed Aug. 21, 1991 (IBM Docket EN991030B)
"Sync-Net--A Barrier Synchronization Apparatus For Multi-Stage Networks" by P. L. Childs et al U.S. Ser. No. 07/748,303, filed Aug. 21, 1991 (IBM Docket EN991049)
"GVT-Net--A Global Virtual Time Calculation Apparatus For Multi-Stage Networks" by P. L. Childs et al U.S. Ser. No. 07/748,295, filed Aug. 21, 1991 (IBM Docket EN991050) issued Oct. 5, 1993, as U.S. Pat. No. 5,250,943 and
in addition, those filed concurrently herewith as related applications:
"Priority Broadcast And Multi-Cast For Unbuffered Multi-Stage Networks" by H. T. Olnowich et al U.S. Ser. No. 07/799,262, filed Nov. 27, 1991 (IBM Docket EN991016B)
"Dual Priority Switching Apparatus for Simplex Networks" by H. T. Olnowich, P. Kogge et al U.S. Ser. No. 07/800,652, filed Nov. 27, 1991 (IBM Docket EN991016A)
"Multi-Function Network" by H. T. Olnowich et al U.S. Ser. No. 07/799,497, filed Nov. 27, 1991 (IBM Docket EN991017)
"Multi-Media Serial Line Switching Adapter for Parallel Networks and Heterogeneous and Homologous Computer System", H. T. Olnowich, U.S. Ser. No. 07/799,602, filed Nov. 27, 1991 (IBM Docket EN991119)
These co-pending applications and the present application are owned by one and the same assignee, namely, International Business Machines Corporation of Armonk, N.Y.
The descriptions set forth in these co-pending applications are hereby incorporated into the present application by this reference.
BACKGROUND OF THE INVENTIONS
Computer systems which can use an array of processors which can execute programs in parallel have been developed.
VLSI technology can now place multiple processors (each with its own memory) on either a single chip or multiple chips in very close proximity. Such parallel arrays of processors can be configured into Single Instruction Stream Multiple
Data Stream (SIMD), Multiple Instruction Stream Multiple Data Stream (MIMD), or Single Instruction Stream Single Data Stream (SISD) configurations, but to date multiple types of modes have not been employed in one machine, and no machine
has provided a form where the mode can be changed dynamically and efficiently during program execution.
Some of the work appears to have been based in part on my book entitled "The Architecture of Pipelined Computers", published by Hemisphere Publishing Corporation in 1981 under ISBN 0-89116-494-4. This work has an historical perspective which still
is useful after a decade of progress in the field, see pages 11-20.
There continue to be developments in the field of different mode oriented machines. For example for the SIMD mode, recently U.S. Pat. No. 4,992,933 entitled SIMD ARRAY PROCESSOR WITH GLOBAL INSTRUCTION CONTROL AND REPROGRAMMABLE INSTRUCTION
DECODER issued on Feb. 12, 1991 to James L. Taylor with respect to an array processor which provided a multi-dimensional array of processing elements and which provided a mechanism where the processing elements may be simultaneously updated in a SIMD
fashion in response to a global load instruction which forces an interrupt to all processing elements.
Most advanced machines today are MIMD. U.S. Pat. No. 4,916,652 issued to Schwarz and Vassiliadis on Apr. 10, 1990 and entitled DYNAMIC MULTIPLE INSTRUCTION STREAM MULTIPLE DATA MULTIPLE PIPELINE APPARATUS FOR FLOATING POINT SINGLE INSTRUCTION
STREAM SINGLE DATA ARCHITECTURES addresses implementing a MIMD machine via multiple functional pipelines, and interleaving the different instruction streams into these pipelines. This patent contemplated switching the machine from MIMD to SISD for a
short period of time to handle some complex instruction for floating point operation.
There are others which have interrupted the MIMD mode of a machine. U.S. Pat. No. 4,873,626 issued Oct. 10, 1989 and U.S. Pat. No. 4,891,787 issued Jan. 2, 1990, both to David K. Gifford, describe a Parallel MIMD processing system with a
processor array having a SIMD/MIMD instruction processing system. These two patents define a single CPU that is an overall controller to multiple groups of processors (PEs) and memories, where each group has an interconnection path of some sort. A
parallel bus interconnects the master CPU to the groups. This machine has shown that all PEs can run independent program code in MIMD fashion. As in the other patent above, there is the capability of interrupting the processing of the PEs which
are controlled by the single Master Control Processor.
Currently, most SIMD processors (e.g. the Connection Machine CM-2) are either stand alone units, or operate as a front end or back end of an MIMD mainframe. Each processor is established to perform a specific function, and schemes like those
which require interrupts require a substantial amount of overhead to perform limited specialized mode operations. However, most computer algorithms or programs may have some strong match to efficient parallel execution in one of several modes (SIMD or
MIMD). Furthermore, virtually all algorithms would benefit from a machine architecture that permitted different modes of execution for different parts of a problem. The existing proposals have not adequately addressed this need.
SUMMARY OF THE INVENTION
The inventions which are described herein enable a machine architecture that permits different modes of execution for different parts of a problem. In addition, the machines use the same set of system resources to enable multimode applications. The
computer system which I have described is a multi-processor computer system having multiple groups of processors (processing elements) and memories, where processors are intercoupled via an interconnection path, and operating means for controlling the
execution of instructions by the processors of the system. In accordance with the preferred embodiment the processors can be configured to execute instructions in the SIMD and/or MIMD mode dynamically. This change of mode can be on an instruction by
instruction basis. The processors can be physically identical and yet perform multi-mode functions.
In my preferred embodiment each computer processing element will have at a minimum an instruction register, a program counter, and a condition code register. With the described architecture, I have provided controls which enable these resources
of common processing elements which are necessary in MIMD mode operations, to be used and useful in SIMD. The dynamic switching aspect of my invention utilizes the instruction register of a computer processing element to directly control instruction
processing utilizing other common elements as dual purpose resources in the SIMD mode of operation.
Accordingly, I have provided that the program counter is assigned a base register function.
I have also provided that the condition code register is assigned a local enable function.
I have also provided that the instruction register is utilized to pipeline SIMD instructions.
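By way of illustration only, the three dual-purpose assignments above might be modeled as in the following sketch. It is not part of the disclosure: the class name, field names, memory size, and the choice to return None for a locally disabled element are all hypothetical, and the timing of address updates is ignored for simplicity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProcessingElement:
    """Hypothetical model of one PE's common resources (all names are assumptions)."""
    program_counter: int = 0        # MIMD: instruction address; SIMD: base register
    condition_code: bool = True     # MIMD: condition code; SIMD: local enable
    instruction_register: int = 0   # MIMD: locally fetched word; SIMD: broadcast word
    simd_mode: bool = False
    memory: List[int] = field(default_factory=lambda: [0] * 16)

    def simd_load(self, displacement: int) -> Optional[int]:
        """Read local memory using the program counter as a base register.

        Assumption: a PE whose local enable (condition code) is off
        simply does not take part in the broadcast operation.
        """
        if not self.simd_mode:
            raise RuntimeError("simd_load is only meaningful in SIMD mode")
        if not self.condition_code:
            return None
        return self.memory[self.program_counter + displacement]
```

In this sketch no register is added for SIMD operation; the same three fields are reinterpreted depending on the mode.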
Each of the processors has a value in the instruction register which is utilized for dynamically indicating the mode of operation set for execution of the current instruction. An instruction in the instruction set can be broadcast to selected
processors of the system to dynamically switch the selected set of processors of the system to a desired mode of operation.
In my preferred embodiment there is provided a path between the instruction registers of each processing element of a group of processors executing a stream of instructions, enabling SIMD operations in a plurality of processors of the group by
broadcast over the broadcast path so provided.
In an alternative embodiment an interconnection network can perform the broadcast functions between processors. In this embodiment, which can be combined with a direct broadcast path between processors, a multi-stage interconnection network
provides alternative paths to the instruction registers. This alternative interconnection network is a dynamic multi-stage two sided switching network enabling point to point coupling of processors without blocking.
In addition, there can be several inter-dynamic groups of processing modes operating on the same computer system dynamically. No known system allows such a configuration.
These and other improvements, illustrating all the architectural approaches, are set forth in the following detailed description. For a better understanding of the inventions, together with advantages and features, reference may be made to the
co-pending applications for other developments referenced above. However, specifically as to the improvements, advantages and features described herein, reference will be made in the description which follows and to the below-described drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1(a) illustrates the SIMD mode organization of the common resource system.
FIG. 1(b) illustrates the MIMD mode organization of the common resource system.
FIG. 2 shows by way of illustration an assumed timing of the operational resources of the system.
FIG. 3 illustrates new possible features of the system which includes a direct signal coupling across processing elements for broadcast in accordance with the preferred embodiment.
FIG. 4 illustrates an alternative embodiment of the system of FIG. 3 with broadcast being handled by the interconnection network utilizing the preferred Multi-Stage ALLNODE interconnection network providing alternative paths to the instruction
registers.
FIG. 5 illustrates SIMD timing in accordance with the preferred embodiment.
(Note: For convenience of illustration, FIGURES may be separated in parts and as a convention we place the top of the FIGURE as the first sheet, with subsequent
sheets proceeding down and across when viewing the FIGURE, in the event that multiple sheets are used.)
Our detailed description follows as parts explaining our preferred embodiments of our inventions provided by way of example.
DETAILED DESCRIPTION OF THE INVENTIONS
In accordance with my inventions, the computer system and its architecture which will be described enable a machine processing element to execute with the same resources different parts of a problem in an appropriate mode, for example the SIMD
mode of FIG. 1(a) or the MIMD mode of FIG. 1(b). In addition, SISD falls within the meaning of FIG. 1(a), as will be described. In my preferred embodiment, the machines use the same set of system resources to enable multi-mode applications but these
resources are dynamically reconfigured. The computer system which I have described is a multi-processor computer system having multiple groups of processors 1 . . . N. The processing elements have (in the preferred RISC configuration) a control unit, a
data flow unit, and a memory. The processors are intercoupled via an interconnection path 101, which can be any of the common forms of interconnection networks, such as a crossbar, or a circuit switching network, a binary hypercube or other connection
network. In my preferred embodiment, as well as in my alternative embodiment, the best connection network for the purpose of the described architecture and accordingly, the one I prefer, is one of those based upon the ALLNODE switch described in the
U.S. Patent Application entitled "Broadcast/Switching Apparatus For Executing Broadcast/Multi-Cast" by H. T. Olnowich et al U.S. Ser. No. 07/748,316, filed Aug. 21, 1991 (IBM Docket EN991030A), and which is a multi-stage network functioning as a
parallel connection medium suitable for connecting RS/6000 and other processors together asynchronously, allowing processor nodes to be linked to each other with instructions sent at the same time or over-lapped in any manner. The network is implemented
in VLSI and provides a dynamic non-blocking via alternative path regular, equidistant within the chip, multistage two sided chip, with ports for connection to coupled processors, either within a chip, on a board, or over a communication path. The system
can provide a dual priority scheme for simplex networks as described in the above referenced application of which I am an inventor. The related applications include those mentioned above which are incorporated herein by reference. This switch enables
some of the features of my invention.
The computer processor will have those resources which are applicable for MIMD processing, including a program counter 103 which may be assigned base register function in the SIMD mode, a condition code register, which can be assigned a local
enable function, and an instruction register 105 which is utilized in the SIMD mode to pipeline SIMD instructions.
The architecture which I describe provides operating means for controlling the execution of instructions by the processors of the system. In accordance with the preferred embodiment the processors can be configured to execute instructions in the
SIMD and/or MIMD mode dynamically. This change of mode can be on an instruction by instruction basis. The processors can be physically identical and yet perform multi-mode functions.
Since, in my preferred embodiments the computer processing elements will have an instruction register 105, a program counter 103, and a condition code register, the computer system will have the controls which enable these resources of common
processing elements which are necessary in MIMD mode operations, to be used and useful in SIMD.
For the purpose of this discussion, we assume, as is my preferred embodiment, an individual processor will have the features summarized in FIG. 2. These features are characteristic of many possible RISC architectures, and they can be found in
the RS/6000 RISC processors sold by International Business Machines Corporation, which I prefer to use. However, within the scope of the invention, the individual processing elements will have:
1. An instruction set where all memory references for data are performed via LOAD and STORE instructions, as is common in RISC-like instruction set architectures;
2. Addressing for such data accesses will be base plus displacement, but addition and index register updates should be applied after a memory operation has begun, as a post address update;
3. All instructions that perform computational operations, such as adds or subtracts, are register to register, and can execute in one or more execution cycles without need of memory references (which cycle time is illustrated in this disclosure
as being an assumed one cycle time for the purpose of simplicity of exposition);
4. All instructions will fit in exactly one memory word, except for immediate instructions which take two (the second being data);
5. All instructions should fit in one machine cycle, with the address of the memory operand selectable from a machine register at the start of the cycle, and the register to receive results in a read capable of latching it in at the end of the
cycle.
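One possible reading of features 2 and 5 is that the memory address is taken directly from a register at the start of the cycle, and the base-plus-displacement addition is written back to that register afterward. The following sketch illustrates that reading only; the function name, the dictionary register file, and the list used as memory are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List


def load_post_update(registers: Dict[str, int],
                     memory: List[int],
                     base_reg: str,
                     displacement: int) -> int:
    """LOAD with a post address update (illustrative interpretation only)."""
    # The address is available from the register at the start of the cycle,
    # so no addition sits in front of the memory access.
    address = registers[base_reg]
    value = memory[address]
    # The displacement is applied after the memory operation has begun,
    # leaving the register ready for the next access.
    registers[base_reg] = address + displacement
    return value
```

Stepping through consecutive words of an array then needs only repeated calls with the same base register and a displacement of one.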
All of these features can be found in a typical RISC machine today, and they can be also implemented in more complex architectures with some of the advances which are currently being developed. The point here is that while I prefer the
simplified RISC machine which is described in detail, there is nothing in this description of features that could not be modified to work with other more conventional architectures.
The architectural extensions.
The architectural extensions which enable the computer system to dynamically switch at the instruction by instruction level between SIMD (as an example) and MIMD modes will be described for two possible modes.
I assume a processor array configured generally in accordance with FIG. 3. The instruction word to enter each processor's instruction register 105 (IR) may come either from one broadcast over a global bus 107 (or in the alternative embodiment
over the network 101) from the processor which is labelled as controlling (PE#1), or from the processor's own memory.
In accordance with my invention, each processor's instruction register 105 recognizes a new processing mode bit (PMB) 109 which indicates whether this processor is in the SIMD or MIMD mode. This bit controls where new instructions originate.
Also, the format of each instruction includes a separate parallel execution type bit (PET) which has two values, a value indicating "local" operation or "array" operation.
Finally, the system is provided with a "switch mode" instruction in the instruction set that flips the processing mode bit of the processors which execute switched mode instructions which follow the "switch mode" instruction in the instruction
stream.
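The three extensions above can be pictured with the following minimal sketch. It is an illustration under stated assumptions rather than the disclosed logic: the enum names, the numeric bit encodings, and the string labels for the instruction source are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    """Processing mode bit (PMB); the 0/1 encoding is an assumption."""
    SIMD = 0
    MIMD = 1


class PET(Enum):
    """Parallel execution type bit carried in each instruction (encoding assumed)."""
    LOCAL = 0
    ARRAY = 1


def next_instruction_source(pmb: Mode) -> str:
    """The PMB controls where the next instruction word originates."""
    # SIMD: latch the word broadcast by the controlling processor.
    # MIMD: fetch from the processor's own memory.
    return "broadcast" if pmb is Mode.SIMD else "local_memory"


def switch_mode(pmb: Mode) -> Mode:
    """Effect of a "switch mode" instruction: flip the processing mode bit."""
    return Mode.MIMD if pmb is Mode.SIMD else Mode.SIMD
```

Because the mode is a single bit examined per instruction, a change of mode costs no more than executing one instruction in this model.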
These architectural extensions can enable the dynamic switching between modes instruction by instruction, and enable the machines of the computer system to operate in the SIMD and/or MIMD mode as befits the needs of an algorithm which needs to be
executed by the computer system.
Example of SIMD Mode operations
Each of the processors has a value in the instruction register which is utilized for dynamically indicating the mode of operation set for execution of the current instruction. An instruction in the instruction set can be broadcast to selected
processors of the system to dynamically switch the selected set of processors of the system to a desired mode of operation. This example illustrates the SIMD operation.
At power up all but the processor labeled PE#1 have their processing mode bit (PMB) set to SIMD. Processor #1 is set to MIMD and during the configuration cannot change from this MIMD mode.
In this mode, processor #1 functions as the controller for the configured system and fetches instructions. All other processors latch into their instruction registers a copy of each instruction as processor #1
fetches it. Thus at the end of the instruction fetch all processors have in their IR the same instruction for execution.
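The power-up state and the broadcast fetch just described can be sketched as a small simulation. This is a hypothetical model for illustration only: the dictionary representation of a processor, the function names, and the string instruction words are assumptions, and the controller is taken to be the first entry of the list.

```python
from typing import Dict, List, Optional


def power_up(num_processors: int) -> List[Dict[str, Optional[str]]]:
    """Power-up state: PE#1 (index 0) is MIMD, every other PE is SIMD."""
    return [{"pmb": "MIMD" if i == 0 else "SIMD", "ir": None}
            for i in range(num_processors)]


def broadcast_fetch(pes: List[Dict[str, Optional[str]]],
                    controller_memory: List[str],
                    pc: int) -> str:
    """PE#1 fetches one word; every PE in SIMD mode latches a copy of it."""
    word = controller_memory[pc]
    pes[0]["ir"] = word              # the controller's own fetch
    for pe in pes[1:]:
        if pe["pmb"] == "SIMD":      # only SIMD-mode PEs listen to the broadcast
            pe["ir"] = word
    return word
```

After a single call on a freshly powered-up array, every instruction register in this model holds the same word, matching the state described at the end of the instruction fetch.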
Now at the beginning of the next machine cycle each processor looks at both its PMB and the instruction's parallel execution type bit (PET) (as found in the IR). For processor #1 (which is in the MIMD mode), a PET of "local" causes the processor
to decode and execute the instruction as a normal instruction. The next machine cycle is thus devoted to the instruction execution in processor #1, with its memory available for a load and store. It is possible that the machine design will permit
prefetching and the instruction does not need memory, in which case this cycle may be a fetch of the next instruction. In the SIMD mode all other processors will see the "local" PET and ignore the instruction. They will g | | |