Bubble compression in a pipelined central processing unit (CPU) of a computer system is provided. A bubble represents a stage in the pipeline that cannot perform any useful work due to the lack of data from an earlier pipeline stage. When a particular pipeline stage has stalled, the CPU instructions that have already passed through the stage continue to move ahead and leave behind vacant stages or bubbles. If a bubble is introduced into a pipeline and the pipeline subsequently stalls, the disclosed CPU takes advantage of this stalled condition to compress the previously introduced bubble.
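The bubble compression described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the disclosed CPU's logic: the pipeline is a list whose index 0 is the earliest stage, `None` marks a bubble, and the names `step`, `incoming`, and `stalled` are hypothetical. Stages downstream of the stall keep moving, the stalled stage holds, and upstream instructions advance only into a vacant stage.

```python
from typing import List, Optional, Tuple

Stage = Optional[str]  # None marks a bubble (vacant stage)


def step(stages: List[Stage], incoming: Stage = None,
         stalled: Optional[int] = None) -> Tuple[List[Stage], Stage, bool]:
    """Advance the pipeline by one cycle.

    stages[0] is the earliest stage and stages[-1] the last one.
    Returns (new_stages, retired_instruction, incoming_accepted).
    """
    n = len(stages)
    advances = [False] * n
    # Work from the last stage backwards: a stage may advance if it is not
    # the stalled stage and the slot ahead of it is vacant or being vacated.
    for i in reversed(range(n)):
        if i == stalled:
            advances[i] = False
        elif i == n - 1:
            advances[i] = True
        else:
            advances[i] = stages[i + 1] is None or advances[i + 1]

    new: List[Stage] = [None] * n
    for i, instr in enumerate(stages):
        if instr is None:
            continue  # a bubble carries nothing forward
        if not advances[i]:
            new[i] = instr        # held in place
        elif i + 1 < n:
            new[i + 1] = instr    # moves one stage ahead

    retired = stages[-1] if advances[-1] else None
    accepted = stages[0] is None or advances[0]
    if accepted:
        new[0] = incoming
    return new, retired, accepted
```

For example, `step(["i5", "i4", None, "i3", "i2"], incoming="i6", stalled=3)` closes the bubble at stage 2 while stage 3 is held. The stall itself still leaves a new vacant slot downstream of the stalled stage.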
A pipelined computer system employs a queue stage to receive the output of one pipeline stage when a stall occurs in the next stage or downstream of the next stage. This avoids stalling earlier stages of the pipeline. Subsequently, the pipeline advances through the queue stage until a bubble occurs. When a bubble generated upstream enters the queue stage, a multiplexer switches the input of the next stage to receive the output of the one stage directly rather than the output of the queue stage, and the content of the queue is overwritten. By this mechanism, the delays inherent in processing branches can be reduced.
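The queue-and-multiplexer behavior described above might be sketched as follows. This is an illustrative model only: the class `QueueStage`, the single-entry queue depth, and the `cycle` signature are assumptions, and `None` stands for a bubble.

```python
from typing import Optional, Tuple


class QueueStage:
    """One-entry queue between an upstream stage A and a downstream stage B."""

    def __init__(self) -> None:
        self.queue: Optional[str] = None

    @property
    def mux_selects_queue(self) -> bool:
        # The multiplexer feeds B from the queue only while the queue is occupied.
        return self.queue is not None

    def cycle(self, a_out: Optional[str],
              b_stalled: bool) -> Tuple[Optional[str], bool]:
        """Return (input delivered to B, whether A must stall this cycle)."""
        if b_stalled:
            if self.queue is None:
                self.queue = a_out          # absorb A's output; A keeps running
                return None, False
            return None, a_out is not None  # queue already full (assumed depth 1)
        if self.mux_selects_queue:
            b_in = self.queue
            self.queue = a_out              # a bubble (None) here empties the queue
            return b_in, False
        return a_out, False                 # bypass: B reads A's output directly
```

In this sketch, once a bubble enters the queue the multiplexer falls back to the direct path, so the extra cycle of latency added by the queue disappears and the bubble never reaches stage B.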
A method and apparatus are provided for executing a condensed instruction stream by a processor. The method includes receiving an instruction that contains an instruction identifier and multiple instruction synonyms, generating at least one full-width instruction for each instruction synonym, and executing the generated full-width instructions by the processor.
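A minimal sketch of the expansion step follows. The encoding is entirely hypothetical and chosen only for illustration: a 32-bit word with a 4-bit identifier (`CONDENSED_ID`) followed by four 7-bit synonyms, a lookup table `SYNONYM_TABLE` mapping each synonym to one or more full-width instructions, and a synonym value of 0 treated as padding.

```python
from typing import Dict, List

CONDENSED_ID = 0xC  # hypothetical identifier marking a condensed instruction

# Hypothetical table: each synonym expands to at least one full-width instruction.
SYNONYM_TABLE: Dict[int, List[str]] = {
    0x01: ["add r1, r2, r3"],
    0x02: ["load r4, [r5]"],
    0x03: ["cmp r1, r4", "bne loop"],
}


def expand(word: int) -> List[str]:
    """Expand one condensed 32-bit word into full-width instructions."""
    ident = (word >> 28) & 0xF
    if ident != CONDENSED_ID:
        raise ValueError("not a condensed instruction")
    full_width: List[str] = []
    for shift in (21, 14, 7, 0):          # four 7-bit synonym fields
        synonym = (word >> shift) & 0x7F
        if synonym == 0:                  # assumed padding value
            continue
        full_width.extend(SYNONYM_TABLE[synonym])
    return full_width


def pack(synonyms: List[int]) -> int:
    """Build a condensed word from up to four synonyms (helper for the example)."""
    word = CONDENSED_ID << 28
    for shift, synonym in zip((21, 14, 7, 0), synonyms):
        word |= (synonym & 0x7F) << shift
    return word
```

Calling `expand(pack([1, 2, 3]))` yields four full-width instructions from a single condensed word, which the processor would then execute in order.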
A method and system are disclosed for processing instructions within a data processing system including a processor having a plurality of execution units. According to the method of the present invention, a number of instructions stored within a memory within the data processing system are retrieved from memory. A selected instruction among the number of instructions is decoded to determine if the selected instruction would be noneffective if executed by the processor. In a preferred embodiment of the present invention, noneffective instructions include instructions with invalid opcodes and instructions that would not change the value of any data register within the processor. In response to determining that the selected instruction would be noneffective if executed by the processor, the selected instruction is recoded into a specified instruction format prior to dispatching the selected instruction to one of the number of execution units. Detecting noneffective instructions prior to dispatch reduces the decode logic required within the dispatcher and enhances processor performance.
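The recoding step described above could look roughly like the following sketch. The instruction model, the opcode set `VALID_OPCODES`, the `NOP` format, and the two no-change patterns are all assumptions made for illustration. The sketch also ignores side effects such as condition flags, which a real decoder would have to consider.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    opcode: str
    dest: Optional[str] = None
    srcs: Tuple[str, ...] = ()


VALID_OPCODES = {"add", "or", "mov"}   # hypothetical instruction set
NOP = Instr("nop")                     # assumed "specified instruction format"


def is_noneffective(instr: Instr) -> bool:
    """True for invalid opcodes and for forms that leave every register unchanged."""
    if instr.opcode not in VALID_OPCODES:
        return True
    if instr.opcode == "mov":          # mov rX, rX
        return instr.srcs == (instr.dest,)
    if instr.opcode == "or":           # or rX, rX, rX
        return instr.srcs == (instr.dest, instr.dest)
    return False


def predecode(instrs: List[Instr]) -> List[Instr]:
    """Recode noneffective instructions before they reach the dispatcher."""
    return [NOP if is_noneffective(i) else i for i in instrs]
```

With this arrangement the dispatcher only needs to recognize the single `NOP` format instead of every noneffective encoding.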
Each of a plurality of devices or agents connected to a computer system bus is provided with a mechanism for unilaterally and dynamically limiting the depth of a pipeline of the bus. Each agent includes a state machine which indicates whether the bus is in a throttled state, a stalled state or a free state. When in a free state, an agent having control of the bus may transmit any number of bus transactions and the depth of the pipeline may therefore increase. In the throttled state, the agent may transmit only a single bus transaction. From the throttled state, the state machine always transitions either to the stalled state or to the free state. In the stalled state, no agents may transmit transactions onto the bus and the depth of the pipeline therefore cannot increase and instead may decrease with time as previously issued transactions are drained from the bus. Wired-OR logic is employed for allowing an agent to transmit a state transition signal to all other agents on the bus, thereby changing the state of the various state machines. Only a single state transition signal is required to completely control the state of the state machines. By employing wired-OR logic, any particular agent is capable of switching the state machines into a stalled state to prevent new bus transactions from being issued to the bus. In this manner, each agent is capable of unilaterally restricting or limiting the depth of the pipeline. Hardware or software is provided within each agent to control the state machine in a manner such that all state machines remain synchronized with each indicating the same state at substantially the same time.
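The three-state machine and the wired-OR signal can be sketched as below. The abstract does not spell out every transition, so two points are assumptions: the stalled state moves to the throttled state when the signal is released, and an asserted signal always leads to the stalled state. The names `BusState`, `next_state`, `transaction_limit`, and `simulate` are hypothetical.

```python
from enum import Enum
from typing import Iterable, List, Optional


class BusState(Enum):
    FREE = "free"
    THROTTLED = "throttled"
    STALLED = "stalled"


def wired_or(drivers: Iterable[bool]) -> bool:
    """The shared line is asserted if any agent drives it."""
    return any(drivers)


def next_state(state: BusState, stall_signal: bool) -> BusState:
    if stall_signal:
        return BusState.STALLED        # any agent can force a stall
    if state is BusState.STALLED:
        return BusState.THROTTLED      # assumption: leave a stall via throttled
    return BusState.FREE               # free stays free; throttled becomes free


def transaction_limit(state: BusState) -> Optional[int]:
    """New transactions allowed in this state (None means unlimited)."""
    return {BusState.FREE: None, BusState.THROTTLED: 1, BusState.STALLED: 0}[state]


def simulate(n_agents: int, schedule: List[List[bool]]) -> List[List[BusState]]:
    """Each agent runs its own state machine on the same wired-OR signal."""
    states = [BusState.FREE] * n_agents
    history = []
    for drivers in schedule:
        signal = wired_or(drivers)
        states = [next_state(s, signal) for s in states]
        history.append(states)
    return history
```

Because every agent samples the same wired-OR line and applies the same transition function, the per-agent state machines stay in lockstep in this model, which mirrors the synchronization requirement stated in the abstract.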
In a typical operating system, one-third of a program consists of branch instructions. The performance of a processor running a typical operating system therefore depends greatly on whether instructions before and after a branch instruction can be executed in parallel. To provide a high-performance processor with parallel processing, a structure is provided with a plurality of operating units and a plurality of registers, in which a set of registers is specified by the same address. The selection sequence of the registers is stored in a plurality of selection sequence storages. Whether the contents of a register are determined is indicated by the information stored in a plurality of determination identification storages. A register selector specifies a register according to the contents of the selection sequence storages and is also used to update those contents. A determination identifier rewrites the contents of the determination identification storages when the contents of a register prove to be a correct result.