In order to overcome the problem that conditionally executed instructions are executed as no-operation instructions if their condition is not fulfilled, leading to poor utilization efficiency of the hardware and lowering the effective performance, the processor decodes a number of instructions that is greater than the number of provided computing units and judges their execution conditions with an instruction issue control portion before the execution stage, Instructions for which the condition is false are invalidated, and subsequent valid instructions are assigned so that the computing units (hardware) is used efficiently. A compiler performs scheduling such that the number of instructions whose execution condition is true does not exceed the upper limit of the degree of parallelism of the hardware. The number of instructions arranged in parallel at each cycle may exceed the degree of parallelism of the hardware.
A scheduling algorithm is provided for selecting the placement of instructions with internal slack into a schedule of instructions within a loop. The algorithm achieves this by pinning nodes with internal slack to corresponding nodes on the critical path of the code that have similar properties in terms of the data dependency graph, such as earliest time and latest time. The effect is that nodes with internal slack are more often optimally placed in the schedule, reducing the need for rotating registers or register copy instructions. The benefit of the present invention can primarily be seen when performing instruction scheduling or software pipelining on loop code, but can also apply to other forms of instruction scheduling when greater control of placement of nodes with internal slack is desired.