WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Data processing apparatus and method employing collision detection and prediction    
United States Patent4760519   
Link to this pagehttp://www.wikipatents.com/4760519.html
Inventor(s)Papworth; David B. (Framingham, MA); Ardini, Jr.; Joseph L. (Needham, MA)
AbstractA data processing system for processing a sequence of program instructions has a pipeline structure including an instruction pipeline and an execution pipeline. Each pipeline has a plurality of serially operating stages. The stages read from and modify memory at various stages of instruction processing. Collisions between data read from a register in the instruction pipeline phase of operation in response to a first instruction and write data written into the register during the execution phase of operation in response to an earlier instruction can be detected and predicted. In response thereto, the new data can be substituted directly for the modified data in the pipeline itself to provide continued valid operation. In addition, the apparatus and method provide for altering the flow of the instructions through the pipeline in order to accommodate newly generated data and to avoid invalid operation.



 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Inventor     Papworth; David B. (Framingham, MA); Ardini, Jr.; Joseph L. (Needham, MA)
Owner/Assignee     Prime Computer, Inc. (Natick, MA)
Patent assignment
All assignments
Publication Date     July 26, 1988
Application Number     06/908,927
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     September 15, 1986
US Classification     712/217 712/218 712/238 712/240
Int'l Classification     G06F 009/38 G06F 011/28 G06F 009/42 G06F 013/00
Examiner     Harkcom; Gary V.
Assistant Examiner     Lynt; C. H.
Attorney/Law Firm     Lahive & Cockfield
Address
Parent Case     This is a continuation of application Ser. No. 579,197, filed Feb. 10, 1984, now abandoned.
Priority Data     Jul 11, 1983[WO]PCT/US83/01052
USPTO Field of Search     364/200 MS File 364/900 MS File
Patent Tags     data processing employing collision detection and prediction
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
4471433
Matsumoto
712/238
Sep,1984

[0 after 0 votes]
4467414
Akagi
711/119
Aug,1984

[0 after 0 votes]
4402042
Guttag
712/207
Aug,1983

[0 after 0 votes]
4373180
Linde
712/239
Feb,1983

[0 after 0 votes]
4370711
Smith
712/240
Jan,1983

[0 after 0 votes]
4365311
Fukunaga
713/500
Dec,1982

[0 after 0 votes]
4316244
Grondalski
711/168
Feb,1982

[0 after 0 votes]
4305124
Marro
712/42
Dec,1981

[0 after 0 votes]
4287561
Liptay
712/217
Sep,1981

[0 after 0 votes]
4251859
Momose
712/219
Feb,1981

[0 after 0 votes]
4217640
Porter
714/34
Aug,1980

[0 after 0 votes]
4208716
Porter
711/3
Jun,1980

[0 after 0 votes]
4200927
Hughes
712/235
Apr,1980

[0 after 0 votes]
4187539
Eaton
712/211
Feb,1980

[0 after 0 votes]
4176394
Kaminski
712/227
Nov,1979

[0 after 0 votes]
4124893
Joyce
712/230
Nov,1978

[0 after 0 votes]
4075688
Lynch, Jr.
711/200
Feb,1978

[0 after 0 votes]
4025771
Lynch, Jr.
708/521
May,1977

[0 after 0 votes]
3940741
Horikoshi
712/238
Feb,1976

[0 after 0 votes]
3840861
Amdahl
713/501
Oct,1974

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed is:

1. A data processing system for processing a sequence of program instructions, said system comprising

an instruction pipeline having a plurality of serially operating instruction stages for reading instructions from storage and for forming therefrom plural address data for use during execution of said instructions,

an execution pipeline connected to said instruction pipeline and having a plurality of serially operating execution stages for receiving said address data and for employing said address data formed by said instruction pipeline for referencing stored data to be employed for executing said instructions,

a pipeline control unit connected to said instruction pipeline and said execution pipeline for operating said instruction pipeline and said execution pipeline,

means coupled to said instruction pipeline and said execution pipeline for predicting collisions between read data read from a register in the instruction pipeline phase of operation in response to a sequentially later program instruction and write data written in registers during the execution phase of operation in response to a sequentially earlier program instruction, wherein the execution phase can include a plurality of execution cycles each of which modifies values in zero, one, or more registers, and wherein said first instruction requires one of said modified values to continue valid operation, said prediction means comprising

means for predicting a data destination address for said sequentially earlier program instruction,

means for comparing the predicted destination address with an actual data read address of said sequentially later program instruction and for generating a predicted collision signal when the two addresses match, and

said pipeline control unit having means connected with said comparing means and responsive to said predicted collision signal for inserting a delay between said sequentially later program instruction and said sequentially earlier program instruction for forming the write data address before the read address data is used to access data.

2. The data processing system of claim 1 wherein said data destination address predicting means comprises

means for examining selected bits of an instruction for deriving said predicted destination address for the instruction.

3. The apparatus of claim 2 further wherein said microcode instructions provide for storing any register update on a final multi-instruction microinstruction in a selected storage register.

4. A data processing system for processing a sequence of program instructions comprising

an instruction pipeline having a plurality of serially operating instruction stages for reading instructions from storage and for forming therefrom plural address data for use during execution of said instructions,

an execution pipeline connected to the instruction pipeline having a plurality of serially operating execution stages for receiving said address data and for employing said address data formed by said instruction pipeline for referencing stored data for use during execution of said instructions,

a pipeline control unit connected to the instruction pipeline and the execution pipeline for operating said instruction pipeline and said execution pipeline,

detecting means coupled to said instruction pipeline and to said execution pipeline for detecting collisions between data read from a register in the instruction pipeline phase of operation in response to a first instruction and write data written in registers during the execution phase of operation in response to an earlier instruction, wherein said execution phase can include a plurality of execution cycles each of which modifies values in zero, one, or more registers and wherein said first instruction requires one of said modified values to continue valid operation,

said detecting means comprising,

means for storing said modified values generated during the execution phase and write register addresses associated therewith,

means for comparing the associated write register address of each modified value with a read register address read by the instruction pipeline, and

means connected with the comparing means and responsive to a match between said compared register addresses for directing the modified value being written at said register address to replace in at least one of said instruction and execution pipelines data previously designated for use by said first instruction during said instruction phase of operation.

5. The data processing system of claim 4 wherein said directing means further comprises

a selector coupled to said comparing means and responsive to the comparison of said addresses for selecting either of the data read during the instruction phase or said modified value, and

a storage register coupled to said execution pipeline for storing the last written modified value generated during said execution phase of operation.

6. The data processing system of claim 5 further comprising

a microcode storage element, and

means coupled to said storage element for reading microinstructions from said storage element for controlling operation of said detecting means.

7. The data processing system of claim 4 wherein said directing means further comprises

means for detecting collisions between a portion of the modified register and a corresponding portion of the read data, and

said directing means replaces only said portion of the read data with the portion of the modified data value.

8. The data processing system of claim 7 wherein said register portion is one-half of the read data.

9. The data processing system of claim 4 further wherein said detecting means comprises

means for generating a pipeline collision signal in response to a detected collision, and

means for generating an execution stage signal indicating that generation of the modified data has not been completed, and

wherein said pipeline control unit comprises

means coupled to the detecting means and responsive to said pipeline collision signal and said execution stage signal for inhibiting operation of all pipeline stages other than those required to generate said modified data until said modified data has been generated.

10. The data processing system of claim 4 wherein said directing means includes

means coupled to a plurality of pipeline stages for directing said modified data to said plurality of pipeline stages, each said stage then using said modified data in place of said originally read data.

11. In a method of operating a data processing system for processing a sequence of program instructions, the data processing system having an instruction pipeline having a plurality of serially operating instruction stages for reading instructions from storage and for forming therefrom plural address data for use during execution of the instructions, and an execution pipeline connected to the instruction pipeline and having a plurality of serially operating execution stages for receiving the address data and for employing the address data formed by the instruction pipeline for referencing stored data to be employed for executing the instructions, the improvement comprising

a collision detection method comprising the steps of

detecting collisions between read data read from a register during the instruction pipeline phase of operation in response to a sequentially later instruction and write data written in registers during the execution phase of operation in response to a sequentially earlier instruction, wherein the execution phase can include a plurality of execution cycles each of which modifies values in zero, one, or more registers, and wherein the first instruction requires one of the modified values to continue valid operation,

said detection step further comprising the steps of

storing the modifed values generated during the execution phase and write register addresses associated therewith,

comparing the associated write register address of each modified value with a read register address read by the instruction pipeline, and

responding to a match between said compared addresses for selecting the modified value being written into the matching register address to replace in at least one of said instruction and execution pipelines data previously designated for use by said first instruction during the instruction phase of operation.

12. The collision detection method of claim 11 further comprising the steps of

generating a pipeline collision signal in response to a detected collision,

generating an execution stage signal indicating that generation of the modified data has not been completed, and

inhibiting operation of all pipeline stages other than those required to generate said modified data until said modified data generation has been completed.

13. The collision detection method of claim 11 further comprising the step of

directing the modified data to a plurality of pipeline stages, each stage using the modified data in place cf the originally read data during the instruction phase of operation.

14. In a method for operating a data processing system for processing a sequence of program instructions, the data processing system having an instruction pipeline having a plurality of serially operating instruction stages for reading instructions from storage, and for forming therefrom plural address data for use during execution of the instructions, and an execution pipeline, connected to the instruction pipeline, and having a plurality of serially operating execution stages for receiving the address data and for employing the address data formed by the instruction pipeline for referencing stored data to be employed for execution said instructions, the improvement comprising

a collision prediction method comprising the steps of

predicting the data destination address written by the sequentially earlier of two instructions which may potentially collide,

comparing the predicted destination address with an actual data read address of a sequentially later occurring instruction, and

delaying processing of said later instruction for a time for forming the write data address of said sequentially earlier instruction, and

comparing said formed write data address with data read address thereby determining whether a collision between said two instructions.

15. The collision prediction method of claim 14 further comprising the step of

examining selected bits of an instruction for deriving said predicted destination address for the instruction.

16. The method of claim 15 further comprising the step of

storing any register update on a final multi-instruction microinstruction in a selected storage register.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

The present invention relates to the field of digital computers and, in particular, to apparatus and methods for processing instructions in high speed data processing systems.

Data processing systems generally include a central processor, an associated storage system (or main memory), and peripheral devices and associated interfaces. Typically, the main memory consists of relatively low cost, high-capacity digital storage devices. The peripheral devices may be, for example, non-volatile semi-permanent storage media, such as magnetic disks and magnetic tape drives. In order to carry out tasks, the central processor of such systems executes a succession of instructions which operate on data. The succession of instructions and the data those instructions reference are referred to as a program.

In operation of such systems, programs are initially brought to an intermediate storage area, usually in the main memory. The central processor may then interface directly to the main memory to execute the stored program. However, this procedure places limitations on performance due principally to the relatively long times required in accessing that main memory. To overcome these limitations a high speed (i.e. relatively fast access) storage system, in some cases called a cache, is used for holding currently used portions of programs within the central processor itself. The cache interfaces with main memory through memory control hardware which handles program transfers between the central processor, main memory and the peripheral device interfaces.

One form of computer, typically a "mainframe" computer has been developed in the prior art to concurrently hardware process a succession of instructions in a so-called "pipeline" processor. In such pipeline processors each instruction is executed in part at each of a succession of stages. After the instruction has been processed at each of the stages, the execution is complete. With this configuration, as an instruction is passed from one stage to the next, that instruction is replaced by the next instruction in the program. Thus, the stages together form a "pipeline" which, at any given time, is executing, in part, a succession of instructions. Such instruction pipelines for processing a plurality of instructions in parallel are found in several mainframe computers. These processors consist of single pipelines of varying length and employ hardwired logic for all data manipulation. The large quantity of control logic in such machines makes them extremely fast, but also very expensive.

Another form of computer system, typically a "minicomputer," incorporates microcode control of instruction execution. Generally, under microcode control, each instruction is fully executed before execution of the next instruction begins. Microcode-controlled execution does not provide as high performance (principally in terms of speed) as hardwired control, but the microcode control does permit significant cost advantages compared to hardwired systems. As a result, microcode control of instruction execution has been employed in many cost-sensitive machines. Microcode reduces the total quantity of hardware in the processor and also allows much more flexibility in terms of adapting to changes which may be required during system operation. Unfortunately, the conventional pipeline techniques for instruction execution are not compatible with the multiple steps which must be performed to execute some instructions in a microcode-controlled environment.

Accordingly, it is an object of the present invention to provide an improved computer system.

Another object is to provide performance characteristics heretofore associated only with mainframes while maintaining a cost profile consistent with the minicomputers.

It is yet another object to provide a computer system incorporating pipelined instruction processing and microcode-controlled instruction execution.

SUMMARY OF THE INVENTION

The invention relates to a data processing system and pipeline control method for processing a sequence of program instructions in a computer. The data processing system has an instruction pipeline having a plurality of serially operating instruction stages for reading instructions from storage and for forming therefrom plural address data to be employed during execution of the program instructions. The data processing system further has an execution pipeline having a plurality of serially operating execution stages for receiving the address data and for employing that data, formed by the instruction pipeline for referencing stored data to be employed for executing the program instructions.

The data processing system further features circuitry for detecting collisions between pipeline stages and for halting operation of one or more of the pipeline stages to provide a separation between the colliding instructions. Similarly, where a latter stage of the pipeline alters the data to be used by an earlier stage, an appropriate "flushing" of the pipeline can be accomplished should a previously employed address be incorrect, or the instruction pipeline can be halted until the data is properly available from the execution pipeline (registered by-pass).

In another aspect, the data processing system relates to an instruction pipeline and an execution pipeline each having a plurality of serially operating instruction stages and a pipeline control unit for operating the instruction and execution pipelines. In this aspect of the invention there is featured circuitry for detecting collisions between read data from a register in the instruction pipeline phase of operation, read in response to a first program instruction, and write data written in registers during the execution phase of operation in response to an earlier instruction. During the execution phase, a plurality of execution cycles, during each of which a register can be modified, occur, and the first instruction will require one of the modified values to continue valid operation. The detection circuitry features storage for the modified values generated during the execution phase and storage for the write register addresses associated therewith. Circuitry is provided for comparing the associated write register address for each modified value with the read register address employed by the instruction pipeline. When a match is found, directing circuitry replaces the modified value, to be written at the matched address for the data previously designated to be used during the associated instruction phase of operation.

In selected embodiments of this aspect of the invention, only a portion of the originally read data may be modified and replaced. In other aspects of the invention, operation of all pipeline stages, except those required to generate modified data, can take place so that the modified data can be generated prior to a continuation of operation along the instruction, and portions of the execution, pipelines.

The invention also features circuitry for detecting potential collisions between data read from a register in the instruction pipeline phase of operation and data which may be written at a later time. According to this aspect of the invention, the instructions themselves are examined and potential collisions predicted. In response to the prediction of a potential conflict, the pipeline control unit provides delay between the potentially colliding instructions so that if a collision takes place, it can be handled in due course by the system structure described above.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects of this invention, the various features thereof, as well as the invention itself, may be more fully understood from the following description, when read together with the accompanying drawings in which:

FIG. 1 shows, in block diagram form, an exemplary computer system embodying the present invention.

FIG. 1A depicts, in block diagram form, the instruction processor, including the two three-stage pipelines, showing overlap and flow between stages, and the pipeline control unit, of the central processor of the system of FIG. 1;

FIG. 2 depicts the five hardware units that form the instruction processor of FIG. 2, showing major data paths for the processing of instructions;

FIG. 3 shows, in block diagram form, the pipeline control unit of FIG. 2;

FIG. 3A shows, in block diagram form, the decode logic for the pipeline control unit of FIG. 4;

FIG. 4 shows, in detailed block diagram form, the pipelines of FIG. 1;

FIG. 5 depicts the flow of instructions through the two pipelines, with examples of alteration to normal processing flow; FIG. 6 illustrates the clock generation of the ID stage of the IP pipelines of FIG. 1A;

FIG. 7 depicts a block diagram of the Shared Program Cache of FIG. 1A;

FIG. 8 depicts a block diagram of the Instruction Pre-Processor of FIG. 1A;

FIG. 9 depicts a block diagram of the Micro-Control Store of FIG. 1A; and

FIG. 10 depicts a combined block diagram of the two Execution units of FIG. 1A;

FIG. 11 shows, in block diagram form, the branch cache of the system of FIG. 4; and

FIG. 12 shows, in block diagram form, the register bypass network of the Instruction Pre-Processor of FIG. 8.

DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 shows a computer system embodying the present invention. The system includes a central processor, main memory, peripheral interface and exemplary peripheral devices.

This system of FIG. 1 processes computer data instructions in the central processor which includes instruction pre-processing hardware, local program storage, micro-control store, and execution hardware. The central processor includes two independent pipelines; the Instruction Pipeline (IP) and the Execution Pipeline (EP). In the preferred form, each pipeline is three stages in length (where the processing time associated with each stage is nominally the same), with the last stage of the IP being overlapped with the first stage of the EP. With this configuration, an instruction requires a minimum of five stage times for completion. All control for advancing instructions through all required stages originates from a Pipeline Control Unit (PCU) in the central processor. The PCU controls the stages to be clocked dynamically, based on pipeline status information gathered from all stages.

This form of the invention processes instructions defined in the System Architecture Reference Guide, 2d Ed. (PRC3060-182) Revision 18,2, published by Prime Computer, Inc., Natick, Mass., and supports the machine architecture, which includes a plurality of addressing modes, defined in the Reference Guide. In keeping with this architecture, words are 16 bits in length, and double words are 32 bits in length. This form of the invention is optimized to perform address formations including BR+X+D, BR+GRH+D and RP+X+D, where BR (Base Register) is a 32-bit starting address pointer, X (Index) is a 16-bit register, GRH (high side of General Register) is a 16-bit quantity, D (the displacement) is contained explicitly in the instruction and may be either 9 or 16 bits, and RP is the current value of the program counter.

PRINCIPLES OF PIPELINE OPERATION

Pipeline Stage

FIG. 1A shows, in functional block diagram form, two three-stage pipelines, an Instruction Pipeline (IP) and an Execution Pipeline (EP), together with the pipeline control unit (PCU) in the central processor. The Instruction Pipeline includes an Instruction Fetch (IF) stage 2, an Instruction Decode (ID) stage 3, and an Address Generation (AG) stage 4. The Execution Pipeline (EP) includes a Control Formation (CF) stage 5, an Operand Execute (OE) stage 6, and an Execute Store (ES) stage 7. The PCU 1 is depicted in detailed block diagram form in FIGS. 3 and 3A and the IF, ID, AG, CF, OE and ES stages are depicted in detailed block diagram form in FIG. 4.

FIG. 2 shows an embodiment of the IP, EP and PCU of FIG. 1A in terms of five hardware units: Instruction Pre-Processor (IPP) 8, Shared Program Cache (SPC) 9, Execution-1 board (EX1) 10, Execution-2 board (EX2) 11, and Mirco-Control Store (MCS) 12. The hardware units of FIG. 2 are representative of groupings of the various elements of the IP and EP of FIG. 4. The respective hardware units are shown in detailed form in FIGS. 7-10. In alternative embodiments, other groupings of the various elements of the IP and EP may be used.

Briefly, in the illustrated grouping of FIG. 2, the Shared Program Cache 9 contains local storage and provides instructions by way of bus 13 to the Instruction Pre-Processor 8, and provides memory operands by way of bus 14 to the Execution-1 board 10. The IPP 8 supplies memory operand addresses by way of bus 15 to the SPC 9, register operands and immediate data by way of bus 17 to EX1 10, and control decode addresses by way of bus 19 to the Micro-Control Store 12. EXl 10 operates on memory operands received by way of bus 14 from the SPC 9 and register file operands received by way of bus 16 from the Execution-2 board 11, and transfers partial results by way of bus 18 to EX2 11 for post-processing and storage. EX2 11 also performs multiplication operations. The MCS 12 provides microprogrammed algorithmic control for the four blocks 8-11, while the PCU 1 provides pipeline stage manipulation for all blocks 8-12.

The pipeline stage operations are completed within the various hardware units 8-12 as follows:

IF (Instruction Fetch): A Look-ahead program counter on SPC 9 is loaded into a local general address register; instruction(s) are accessed from a high speed local memory (cache).

ID (Instruction Decode): Instruction data is transferred from SPC to IPP 8; IPP 8 decodes instructions, forming micro-control store entry point information for MCS 12, and accessing registers for address generation in IPP 8.

AG (Address Generation): IPP 8 forms instruction operand address and transfers value to SPC 9 address register.

CF (Control Formation): MCS 12 accesses local control store word and distributes control information to all boards.

OE (Operand Execute): SPC 9 accesses memory data operands in cache; EX1 10 receives memory data operands from SPC 9, register operands from IPP 8, and begins arithmetic operations.

ES (Execute Store): EX1 10 and EX2 11 complete arithmetic operation and store results.

The Address Generation and Control Formation stages are overlapped in time within the data system. The IP and EP operate synchronously under the supervision of the pipeline control unit (PCU) 1, which interfaces to each stage with two enable lines (ENCxxl and ENCxx2) that provide two distinct clock phases within each stage, as indicated in FIG. 1A. The notation "xx" refers to a respective one of the reference designations IF, ID, AG, CF, OE and ES. The six ENCxx2 lines denote the respective stage operations are complete and the data (or control) processed in those stages are ready for passing to the next stage.

Clocking of Pipeline Stages

Timing and clocking in the dual pipelines (IP and EP) are synchronized by two signals--the master clock MCLK and the enable-end-of-phase signal ENEOP. ENEOP is produced by the Pipeline Control Unit 1 and notifies all boards of the proper time to examine the stage clock enable signal lines (ENCxx1 and ENCxx2) in order to produce phase 1 and phase 2 stage clocks from the master clock MCLK. (See FIG. 6). Pipeline stages always consist of two phases. Phase 1 lasts for exactly two MCLK pulses while phase 2 can last for an arbitrary number of MCLK pulses, as described below, depending on the conditions present in both the IP and the EP.

An example of how MCLK and ENEOP and the stage clock enables interact on each board to form the clocks which define the stage boundaries is shown in FIG. 6 for the Instruction Decode stage 2. Register 22 generates clock signals when enabled by ENEOP. When ENCID1 is present the clock CID1 is generated; when ENCID2 is present, the clock CID2 is generated.

PIPELINE CONTROL UNIT

The Pipeline Control Unit 1 shown in FIGS. 3 and 3A controls the flow of instructions through the dual pipelines (IP and EP) by generating the enable signals for all clocks which define stage boundaries and relative overlap of the IP and EP. The PCU 1 includes stage clock enable decode logic 23 and the Pipeline State Register (PSR) 24. PCU 1 receives as inputs:

1. Instruction information and exception and register conditions from the IPP 8

2. Exception and cache conditions from the SPC 9

3. Microcode specified timing conditions related to the length of stage OE and the overlap of stage OE and CF from the MCS 12

4. Exception conditions from EX1 10 and EX2 11.

The PCU 1 has complete control of all stage boundaries. With that control:

1. The PCU 1 can hold the IP while cycling multi-microcode through the EP.

2. The PCU 1 can alter the flow of instructions based on control information provided by microcode.

3. The PCU 1 can extend all stages if extra time is required for a particular stage to finish its operation.

4. The PCU 1 can alter the relative overlap of stages OE and CF of the EP in order to allow different types of microcode sequencing (as described below in conjunction with EX1,2).

5. The PCU 1 can flush out instructions in the IP and recycle the IP to load new instructions upon detecting incorrect flow (such as an incorrect flow prediction provided by Branch Cache 34).

6. The PCU 1 can idle the EP with no-operation (NOP) cycles, while cycling the IP, for example, when IRP 27,33 in the SPC 9 is reloaded after an incorrect program flow sequence.

7. The PCU 1 can suspend all pipeline operations during non-overlappable operations such as "cache miss" access to main memory.

8. The PCU 1 can introduce separation between sequential instructions in the IP under certain conditions, such as "collisions" between instructions.

9. The PCU 1 can keep an instruction held in the IF stage upon detecting an instruction-related exception, and then allow the other instructions currently in the pipeline to complete processing so that the exception can be processed in the correct order.

The Pipeline Control Unit (PCU) 1 which controls the clocking of the stages in the IP and EP is shown in detail in FIG. 3A. Condition signals received from the IPP 8, SPC 9, MSC 12, EX1 10, and EX2 11 hardware units are utilized to produce enable signals for clocks in the IF 2, ID 3, AG 4, CF 5, OE 6, and ES 7 stages of the dual pipelines (IP and EP). There are two major elements in PCU 1 which produce the clock enable signals ENCxx1,2: the pipeline state register (PSR) 24 (including state registers 180,182,184,186,188,190) and the stage clock enable decode logic 23 (including conbinatorial logic blocks (181,183,185,187,189,191). The state registers 180,182,184,186,188,190 indicate that the respective pipeline stages are ready to be enabled. if there are no conditions received by the PCU 1 which should inhibit the stage from proceeding. When the stages are in operation, the state registers 180,182,184,186,188,190 provide a timing r