A microprocessor configured to dynamically switch the length of its floating point load pipeline from one stage to more than one stage is disclosed. The microprocessor may perform normal loads and detect denormal loads in a single clock cycle. The microprocessor temporarily stores each scheduled floating point instruction in a reissue buffer for at least one clock cycle. When a denormal load instruction is detected, the microprocessor is configured to add one or more stages to the floating point load pipeline to allow the denormal value to complete the conversion to an internal format. The longer pipeline is then used for all loads that follow the denormal load until there is an idle clock cycle or an abort occurs. At that point, the pipeline reverts to its original, shorter state. In addition, the microprocessor may be configured to cancel instructions that were scheduled on the assumption that the denormal load would take only one clock cycle to complete. Each canceled instruction is then "replayed" during a later clock cycle from the reissue buffer. A method for performing denormal loads and a computer system are also disclosed.
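The pipeline-length switching described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: it assumes 32-bit IEEE 754 single-precision load data, models an idle cycle as `None`, omits aborts and the reissue buffer, and uses the hypothetical names `is_denormal_f32`, `load_latencies`, and `extra_stages`, none of which come from the disclosure.

```python
def is_denormal_f32(bits: int) -> bool:
    """True when a 32-bit pattern has a zero exponent and a nonzero fraction.

    Assumes IEEE 754 single precision; the abstract does not name a format.
    """
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return exponent == 0 and fraction != 0


def load_latencies(loads, extra_stages=1):
    """Return the pipeline length (in stages) seen by each load.

    `loads` is a list of 32-bit patterns, with None marking an idle cycle.
    `extra_stages` is a hypothetical parameter for the added stage count.
    """
    stages = 1  # short pipeline: normal loads complete in one stage
    latencies = []
    for bits in loads:
        if bits is None:
            # An idle cycle lets the pipeline revert to its short form.
            stages = 1
            latencies.append(None)
            continue
        if is_denormal_f32(bits):
            # A denormal load lengthens the pipeline for itself
            # and for every load that follows it.
            stages = 1 + extra_stages
        latencies.append(stages)
    return latencies
```

For example, with `0x3F800000` (1.0) as a normal value and `0x00000001` as a denormal value, the sequence normal, denormal, normal, idle, normal yields `[1, 2, 2, None, 1]`: the normal load issued right after the denormal one still sees the longer pipeline.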
A deep-pipeline system substantially reduces the overhead of setup delays and pipeline delays by dynamically controlling access to a plurality of configuration register sets by both a host central processing unit (CPU) and the stages of the pipelines. A master configuration register set is loaded with configuration parameters by the host CPU in response to an index count provided by a setup-index counter. A plurality of other counters are employed to track timing events in the system. In one embodiment, a run-index counter provides a run-index count to the first stage of the pipeline that is propagated along the stages, enabling configuration register sets to transfer configuration parameters to the stages of the pipeline when required to enable processing of a task. In an alternative embodiment, a plurality of D flip-flops sequentially propagates a state for successive registers, so that the setup-index counter is not required.
A floating point unit comprising: 1) an execution pipeline comprising a plurality of execution stages for executing floating point operations in a series of sequential steps; and 2) a try-again reservation station for storing a plurality of instructions to be loaded into the execution pipeline. Detection of a denormal result in the execution pipeline causes the execution pipeline to store the denormal result in a register array associated with the floating point unit and causes the execution pipeline to store a denormal result instruction in the try-again reservation station. The try-again reservation station subsequently re-loads the denormal result instruction into the execution pipeline and the denormal result instruction retrieves the denormal result from the register array for additional processing.
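The try-again flow described above can be sketched as a simple software model. This is an illustration under assumptions, not the disclosed hardware: it uses double-precision multiplies as the only operation, a dictionary as the register array, a queue as the try-again reservation station, and leaves the "additional processing" as a placeholder because the abstract does not specify it. The names `is_denormal` and `run` are hypothetical.

```python
import sys
from collections import deque

MIN_NORMAL = sys.float_info.min  # smallest positive normal double


def is_denormal(x: float) -> bool:
    """True for nonzero values smaller in magnitude than the smallest normal."""
    return x != 0.0 and abs(x) < MIN_NORMAL


def run(instructions):
    """Execute (dest, a, b) multiplies and defer any denormal results.

    Returns the register array and a log of events.
    """
    register_array = {}   # stands in for the register array
    try_again = deque()   # stands in for the try-again reservation station
    log = []
    for dest, a, b in instructions:
        result = a * b
        register_array[dest] = result
        if is_denormal(result):
            # Park a follow-up "denormal result instruction" for later.
            try_again.append(dest)
            log.append(("deferred", dest))
        else:
            log.append(("done", dest))
    while try_again:
        dest = try_again.popleft()
        value = register_array[dest]  # re-issued instruction fetches the result
        # Placeholder: the additional processing is not specified in the source.
        log.append(("reissued", dest, value))
    return register_array, log
```

In this model, multiplying `sys.float_info.min` by `0.5` produces a denormal result, so that instruction is logged as deferred and then reissued once the rest of the stream has completed.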
A method for providing a 16-bit floating point data representation where the 16-bit floating point data representation may be operated upon by a microprocessor's native floating point instruction set. The method contemplates the use of a variety of techniques for converting the 16-bit floating point number into a representative native floating point value. Thereafter, the native microprocessor floating point instruction set may perform operations upon the converted data. Upon completion, the native floating point data representation may be converted back into the 16-bit floating point value.
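The convert, operate, convert-back cycle described above can be sketched as follows. The abstract does not state the bit layout of the 16-bit format, so this example assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits). The function names `half_bits_to_float` and `float_to_half_bits` are hypothetical, and the conversion back relies on the `'e'` format of Python's standard `struct` module rather than on any technique from the source.

```python
import struct


def half_bits_to_float(h: int) -> float:
    """Decode a 16-bit pattern (assumed IEEE 754 binary16) to a native float."""
    sign = -1.0 if (h >> 15) & 1 else 1.0
    exponent = (h >> 10) & 0x1F
    fraction = h & 0x3FF
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, fixed scale of 2^-24.
        return sign * fraction * 2.0 ** -24
    if exponent == 0x1F:
        # All-ones exponent encodes infinity (fraction 0) or NaN.
        return sign * float("inf") if fraction == 0 else float("nan")
    # Normal number: implicit leading 1 and exponent bias of 15.
    return sign * (1 + fraction / 1024.0) * 2.0 ** (exponent - 15)


def float_to_half_bits(x: float) -> int:
    """Encode a native float as a 16-bit pattern using struct's 'e' format."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


# Convert two 16-bit values, add them natively, and convert the sum back.
a = half_bits_to_float(0x3C00)  # 1.0
b = half_bits_to_float(0x4000)  # 2.0
total_bits = float_to_half_bits(a + b)
```

Here `total_bits` is `0x4200`, the binary16 encoding of 3.0. Results that are not exactly representable in 16 bits would be rounded on the way back, which is a loss of precision that any such scheme has to accept.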
A variable speed floating point unit comprising: 1) an execution pipeline comprising a plurality of execution stages capable of executing floating point operations in a series of sequential steps; and 2) a clock controller capable of receiving an input clock signal and generating a variable speed output clock signal capable of clocking the execution pipeline. The clock controller adjusts a speed of the variable speed output clock signal according to a level of queued opcodes waiting to be executed in the execution pipeline.
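The clock controller's policy can be illustrated with a minimal sketch. The abstract says only that the clock speed follows the level of queued opcodes; the direction (a fuller queue runs faster), the 16-entry queue, the thresholds, and the divider values below are all assumptions chosen for illustration, and `select_clock_divider` is a hypothetical name.

```python
def select_clock_divider(queued_opcodes: int, queue_capacity: int = 16) -> int:
    """Return the divider applied to the input clock (1 means full speed).

    Thresholds and divider values are illustrative assumptions only.
    """
    if not 0 <= queued_opcodes <= queue_capacity:
        raise ValueError("queued_opcodes must lie within the queue capacity")
    fill = queued_opcodes / queue_capacity
    if fill >= 0.75:
        return 1  # queue nearly full: run the pipeline at full speed
    if fill >= 0.25:
        return 2  # moderate backlog: half speed
    return 4      # little or no work queued: quarter speed
```

With the default capacity, 12 or more queued opcodes select full speed, 4 to 11 select half speed, and fewer than 4 select quarter speed. A real controller would likely add hysteresis so the clock does not oscillate around a threshold, but that is outside what the abstract describes.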
A method and apparatus for retaining flag values when an associated data value dies. A first storage circuit includes a free list for storing physical register names (PRNs) and indications of whether the physical register associated with each PRN was assigned to store both a logical register result and flag results of a first instruction, where a subsequent instruction overwrites the logical register result but not the flags. A second storage circuit stores PRNs separate from the free list. The first and second storage circuits output first and second PRNs to a selection circuit. If the first indication (associated with the first PRN) is in a first state, the selection circuit may provide the first PRN to a mapper for assignment to a logical register. If the first indication is in a second state, the second PRN may be provided to the mapper.
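The selection step can be sketched in a few lines. The abstract does not say which state of the indication means what, so this example assumes that the "second state" marks a physical register whose flag results are still live and must not be reassigned. `FreeListEntry`, `flags_live`, and `select_prn` are hypothetical names.

```python
from typing import NamedTuple


class FreeListEntry(NamedTuple):
    prn: int          # physical register name taken from the free list
    flags_live: bool  # assumed meaning of the indication: flags still needed


def select_prn(first: FreeListEntry, second_prn: int) -> int:
    """Choose the PRN handed to the mapper.

    If the free-list register still holds live flags, fall back to the PRN
    supplied by the second storage circuit; otherwise use the free-list PRN.
    """
    return second_prn if first.flags_live else first.prn
```

For instance, `select_prn(FreeListEntry(7, False), 12)` returns 7, while `select_prn(FreeListEntry(7, True), 12)` returns 12, leaving register 7 untouched so its flag values survive.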