|
Claims  |
|
|
I claim:
1. An hierarchically-scalable cellular array comprising:
a plurality of cells;
a first routing resource for interconnecting each cell to its adjacent
neighboring cells, said first routing resource including a first plurality
of lines having a length n, where n is a predetermined distance;
a second routing resource including a second plurality of lines having a
length N.times.n where N>1, said second routing resource connecting said
cells into blocks of cells and interconnecting said blocks of cells; and
a third routing resource including a third plurality of lines having a
length N.sup.m .times.n where n.gtoreq.2 and where N.sup.m
.times.n.ltoreq.T, the total length of said cellular array, said third
routing resource interconnecting groups of said blocks of cells.
2. The cellular array of claim 1 wherein n is approximately one cell width.
3. The cellular array of claim 1 wherein N is 4.
4. The cellular array of claim 3 wherein m is 2.
5. The cellular array of claim 3 wherein m is 3.
6. The cellular array of claim 1 further including a fourth routing
resource including a fourth plurality of lines having a length N.sup.r
.times.n where r.gtoreq.3, and r=m+1.
7. A programmable logic structure comprising:
an array of cells arranged in rows and columns;
a plurality of switches, said plurality of switches partitioning said array
into a first plurality of cell blocks,
wherein each cell includes:
four input lines for providing four input signals to said cell, each of
said input lines coupled to an adjacent cell or switch; and
four output lines for providing four output signals from said cell, each of
said output lines coupled to an adjacent cell or switch,
wherein said array further includes four intermediate, directed input lines
for providing an additional four input signals to said cell, wherein each
intermediate input line provides signals to the cells in either a row or a
column of one of said blocks.
8. The logic structure of claim 7 wherein said array further includes a
plurality of flyover lines, each flyover line coupled between two of said
plurality of switches, wherein said plurality of flyover lines determines
a second plurality of cell blocks.
9. In an integrated circuit having a two-dimensional array of logic cells,
a switch comprising:
a set of multiplexers, each of which receives a plurality of input signals
and provides an output signal;
for each multiplexer, multiplexer control means for causing said
multiplexer to select one of said input signals to provide said output
signal, said input signals being taken from the following wires:
a wire extending to the west from said switch and having a length
approximately equal to a first multiple of said length of one of said
cells;
a wire extending to the east from said switch and having a length
approximately equal to said first multiple of said length of one of said
cells;
a wire extending to the west from said switch and having a length
approximately equal to a second multiple of said length of one of said
cells; and
a wire extending to the east from said switch and having a length
approximately equal to said second multiple of said length of one of said
cells;
said output signals being placed on the following wires:
a wire extending to the west from said switch and having a length
approximately equal to a first multiple of said length of one of said
cells;
a wire extending to the east from said switch and having a length
approximately equal to said first multiple of said length of one of said
cells;
a wire extending to the west from said switch and having a length
approximately equal to a second multiple of said length of one of said
cells; and
a wire extending to the east from said switch and having a length
approximately equal to said second multiple of said length of one of said
cells.
10. The switch of claim 9 wherein said first multiple is 1 and second
multiple is 4.
11. The switch of claim 9 wherein said first multiple is 4 and second
multiple is 16.
12. The switch of claim 9 wherein said first multiple is 16 and second
multiple is 64.
13. The switch of claim 9 wherein said multiplexers further receive input
signals taken from the following wires:
a wire extending to the west from said switch and having a length
approximately equal to a third multiple of said length of one of said
cells; and
a wire extending to the east from said switch and having a length
approximately equal to said third multiple of said length of one of said
cells.
14. The switch of claim 13 wherein said first multiple is 1, said second
multiple is 4, and said third multiple is
15. The switch of claim 13 wherein said first multiple is 4, said second
multiple is 16, and said third multiple is 64.
16. A programmable integrated circuit comprising:
a two-dimensional array of logic cells, each cell comprising means for
generating as an output signal a selected logic function of a plurality of
logic cell input signals;
a plurality of switches, said switches positioned so as to group said cells
into cell blocks;
each cell including:
four short input wires for providing four of said input signals, said four
short input wires carrying output signals from the cells or the switches
which comprise nearest neighbors on four sides of said cell; four medium
input wires for providing
four of said input signals, said four medium input wires extending in four
compass directions between switches which define a cell block in which
said cell is positioned, wherein a plurality of said medium input wires
are simultaneously accessable by said cell; and
four short output wires for providing said output signal, said four short
output wires carrying output signals to the cells or the switches which
comprise nearest neighbors on four sides of said cell.
17. The programmable integrated circuit of claim 16 wherein each of said
medium input wires carries a signal from a switch in one of the four
compass directions to said cell and to other cells positioned between two
of said switches making up said cell block.
18. The programmable integrated circuit of claim 17 wherein said medium
wires are approximately four times the length of said short wires.
19. A programmable logic structure comprising:
an array of cells;
a plurality of switches arranged to group said cells into cell blocks such
that for each cell block at least one west switch is positioned to the
west of that block, and at least one east switch is positioned to the east
of that block;
a neighbor wire connecting from an output of said west switch to an input
of the nearest cell east of said west switch;
a flyover wire connecting from an output of said west switch to an input of
said east switch;
a neighbor wire connecting from an output of said east switch to an input
of the nearest cell west of said east switch; and
a flyover wire connecting from an output of said east switch to an input of
said west switch.
20. A programmable logic structure of claim 19 wherein said plurality of
switches is further arranged such that for each cell block at least one
north switch is positioned to the north of that block, and at least one
south switch is positioned to the south of that block, wherein said
programmable logic structure further comprises:
a neighbor wire connecting from an output of said north switch to an input
of the nearest cell south of said north switch;
a south flyover wire connecting from an output of said north switch to an
input of said south switch;
a short wire connecting from an output of said south switch to an input of
the nearest cell north of said south switch; and
a north flyover wire connecting from an output of said south switch to an
input of said north switch.
21. A programmable logic structure as in claim 20 in which each of said
cells comprises a plurality of smaller cells, each of said smaller cells
being defined by:
a plurality of switches, said switches arranged to group said smaller cells
into smaller cell blocks such that for each smaller cell block at least
one west switch is positioned to the west of that block, at least one east
switch is positioned to the east of that block, at least one north switch
is positioned to the north of that block, and at least one south switch is
positioned to the south of that block;
said smaller cells being connected to said switches by:
a neighbor wire connecting an output of said west switch to an input of the
nearest smaller cell east of said west switch;
a neighbor wire connecting an output of the nearest smaller cell east of
said west switch to an input of said west switch;
an east flyover wire connecting from an output of said west switch to an
input of said east switch;
a neighbor wire connecting from an output of said east switch to an input
of the nearest smaller cell west of said east switch; and
a neighbor wire connecting an output of the nearest smaller cell west of
said east switch to an input of said east switch;
a west flyover wire connecting from an output of said east switch to an
input of said west switch;
a neighbor wire connecting from an output of said north switch to an input
of the nearest smaller cell south of said north switch;
a neighbor wire connecting an output of the nearest smaller cell south of
said north switch to an input of said north switch;
a south flyover wire connecting from an output of said north switch to an
input of said south switch;
a neighbor wire connecting from an output of said south switch to an input
of the nearest smaller cell north of said south switch; and
a neighbor wire connecting an output of the nearest smaller cell north of
said south switch to an input of said south switch;
a north flyover wire connecting from an output of said south switch to an
input of said north switch.
22. An input/output structure for a programmable logic device having a
plurality of logic cells comprising:
a plurality of pads, each pad connected to an external pin of said
programmable logic device;
a plurality of I/O buffers, each I/O buffer connected to one of said
plurality of pads, each I/O buffer including a switch for connecting said
I/O buffer to at least one of said plurality of logic cells, wherein said
switch forms part of a hierarchial interconnect structure receiving input
signals both from wires from neighboring cells and wires from other
switches and providing output signals to both wires in neighboring cells
and wires leading to other switches; and
means for controlling said switch.
23. A method for routing signals in a programmable logic structure
comprising the steps of
arranging an array of cells in rows and columns;
partitioning said array into a plurality of cell blocks with a plurality of
switches;
coupling each input line associated with each cell to an adjacent cell or
switch;
coupling each output line associated with each cell to said adjacent cell
or switch;
providing an additional four directed, intermediate input lines for each
cell, wherein each directed intermediate input line provides signals to
the cells in either a row or a column of one of said plurality of cell
blocks.
24. The method of claim 23 further including the step of providing a
plurality of flyover lines, each flyover line coupling two of said
plurality of switches.
25. The method of claim 24 further including the step of providing a signal
from a first cell to a second cell, wherein said providing includes
transferring said signal via at least one intermediate input line.
26. The method of claim 25 wherein said step of providing further includes
transferring said signal via at least one flyover line.
27. The method of claim 23 further including the steps of:
sending a signal via a first path including a plurality of switching units,
said first path paralleling a direct, second path;
using logic gates to detect whether said signal travels the full length of
said first path; and
if said signal travels said full length, then placing said signal on said
second path.
28. A routing device in a programmable logic device comprising:
a plurality of cells providing a first path, wherein each cell includes:
means for receiving a plurality of input signals and providing an output
signal;
a plurality of memory bits associated with said means for receiving; and
a first logic gate for providing a trigger signal determined by the state
of said plurality of memory bits;
a second path not including cells, said second path paralleling said first
path;
a second logic gate coupled to the output terminals of the first logic
gates to detect whether said signal travels the full length of said first
path; and
means for determining whether a signal is provided on said first path or on
said second path, said means for determining controlled by said second
logic gate. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
FIELD OF THE INVENTION
The present invention relates to a configurable cellular array of
dynamically configurable logic elements, such arrays being generally known
as Field Programmable Gate Arrays (FPGAs).
BACKGROUND OF THE INVENTION
Reprogrammable FPGAs have been available commercially for several years.
The best known commercial family of FPGAs are those from Xilinx Inc. One
class of these devices uses Static Random Access Memory (SRAM) to hold
control bits which control their configurations. Most FPGA devices replace
traditional mask programmed Applications Specific Integrated Circuit
(ASIC) parts which have a fixed configuration. The configuration of the
FPGA is static and is loaded from a non-volatile memory when power is
applied to the system. Nearly all commercially available FPGAs have a
stream-based interface to the control store. (The control store contains
the set of bits which determine what functions the FPGA will implement.)
In a stream-based interface to the control store, a sequence of data is
applied to a port in the FPGA to provide a complete configuration for the
whole device or for a fixed (normally large) sub-section of the FPGA. This
stream-based interface, when combined with an address counter which is
implemented on the FPGA itself, provides an efficient method of loading
the complete device configuration from adjacent EPROM or other
non-volatile memory on power up without any additional overhead circuits.
A stream based interface with an address counter is a suitable programing
interface for an FPGA which is used as a replacement for a standard ASIC.
Some FPGAs can be partly or totally reconfigured using one of a set of
static configurations stored at different addresses in an EPROM, and can
trigger the reconfiguration from within the design implemented on the
FPGA.
Published International Application WO 90/11648, corresponding to U.S. Pat.
No. 5,243,238, discloses an architecture hereafter referred to as CAL I,
which has been implemented in an Algotronix product designated CAL 1024.
CAL I is different from other commercially available FPGAs in that its
control store appears as a standard SRAM to the systems designer, and can
be accessed using address bus, data bus, chip enable, chip select and
read/write signals. Addressing the control store as an SRAM supports a
user program running on the host processor mapping the FPGA control store
(configuration memory) into the memory or address space of the host
processor so that the processor can configure the FPGA to implement a
user-defined circuit. This arrangement, which is implemented in the CAL
1024 FPGA, allows the user to partition an application between the
processor and the FPGA with appropriate sections being implemented by each
device. The control store interface provides an important input/output
(I/O) channel between the FPGA and the processor, although the I/O can
also take place using more traditional techniques via, for example, a
shared data memory area. This latter type of FPGA provides a passive
control store interface because an external agent is required to initiate
configuration or reconfiguration of the device, as required.
Experience with the CAL I architecture and trends within the electronics
industry have made this second passive form of control store interface
increasingly attractive for many applications. Microprocessors or
microcontrollers are now pervasive components of computer systems and most
board level systems contain one. The major benefit of the stream based
"active" FPGA programming approach is that no overhead circuits are
required to initiate reconfiguration. In systems where a microprocessor or
microcontroller is present, the "passive" RAM emulating FPGA interface is
preferable for several reasons:
(1) the FPGA configuration can be stored in the microprocessor's program
and data memory (reducing the number of parts by removing the need for a
separate memory chip),
(2) the existing data and address buses on the board can be used to control
the FPGA (saving printed circuit board area by removing dedicated wires
between the configuration EPROM and the FPGA);
(3) the FPGA control store can be written to and read from by the
microprocessor, and thereby used as an I/O channel between the FPGA and
the microprocessor, thereby potentially saving additional wiring between
the FPGA and the processor buses and freeing the FPGA programmable I/O
pins for communication with external devices, and
(4) the intelligence of the microprocessor can be used to support
compression schemes for the configuration data and other techniques, which
allows more flexibility in reprogramming the FPGA.
In addition, the difference in cost between an "active" FPGA with an
associated EPROM holding its configuration and a passive FPGA with an
active microcontroller chip containing an EPROM and a simple processor is
minimal. The easy reprogrammability makes the passive FPGA attractive,
even if the microcontroller has no other function apart from reprogramming
the FPGA.
Another trend within the Electronics Industry has been the provision of
"support chips" for microprocessors which provide an interface between I/O
devices and a particular microprocessor. Examples of these devices include
Universal Asynchronous Receiver Transmitters (UARTs) for low bandwidth
serial I/O, Programmable Peripheral Interfaces (PPIs) for low bandwidth
parallel I/O and various specialised chips for higher bandwidth
connections to networks and disk drives. These support chips appear to the
processor as locations in the I/O or memory address space to and from
which data are transferred. Some support chips can interrupt the processor
via interrupt lines or take over the bus for Direct Memory Access (DMA)
operations. In many ways a passive FPGA chip can be viewed as a successor
to a support chip, providing an interface to the processor via its control
store on the one hand, and an interface to the external world via a large
number of flexible I/O lines on the other, for example 128 programmable
I/O lines on the Algotronix CAL 1024 device.
A passive FPGA chip has a number of advantages. For example, it is
cost-effective to provide a single FPGA with a library of configurations
instead of providing a number of support chips. In addition, providing a
single FPGA for several functions reduces the number of devices in the
processor manufacturer's catalogue. Also, reconfigurable FPGAs can support
changeable I/O functions, such as when a single external connector can be
used as either a serial or a parallel port. With a passive RAM control
interface, the FPGA is able to support other functions as well.
Each time an FPGA is reconfigured to implement a different set of
functions, the microprocessor must access the configuration memory. One
reconfiguration typically requires many control store accesses, one access
for each word of configuration memory to be changed. Several important
classes of reconfiguration have been identified.
(1) Application swapping occurs when one application terminates and a
completely different application wishes to make use of the FPGA. In this
case the FPGA chip is completely reconfigured, usually from a static
configuration.
(2) Task swapping occurs when the application must configure relatively
large sections of the FPGA to implement a new phase in the computation.
For example, a sorting application might first sort small batches of data
completely using configuration A and then merge those sorts into a
completely sorted stream of data using configuration B. In this case, the
application has knowledge of both configurations and need only change
those resources which are different in configuration B. At a later point,
configuration A may itself be restored.
(3) Data dependent reconfiguration occurs when the configurations of some
cells are computed dynamically based on input data by the application
program, rather than being loaded from a static configuration file. Often
a static configuration is first loaded, then a relatively small sub-set of
cells are reconfigured dynamically (that is, reconfigured while the chip
is operating). An important example of this class of reconfiguration is
where an operand (such as a constant multiplier or a search string) is
folded directly into the logic used to implement the multiply or sort unit
rather than being stored in a register. This technique is advantageous as
it frequently results in smaller and faster operation units.
(4) Access to gate outputs occurs for debugging. The outputs of all the
logic cells on the CAL I FPGA are mapped to bits of the control store.
Debugging programs are available which read back this information on the
display or design layout to show the logic levels on internal wires.
(5) Access to gate outputs for I/O is similar to the previous access to
gate outputs for debugging. But in this particular case only a small
fraction of the logic nodes, namely those which correspond to input and
output registers, will be accessed repeatedly. The ability to rapidly
assemble a word representing input to or the result of a computation from
several bits at different locations in the control store is critical to
the effectiveness of this technique.
It is desirable to reduce the number of accesses required and hence the
time to wholly or partially reconfigure the device. Several systems other
than CAL I have been proposed which allow direct access to internal
signals in an FPGA or an FPGA-like device, for example, as disclosed in
Cellular Logic-in-Memory Arrays, William H. Kautz, IEEE Transactions on
Computers Vol C18 No. 8, August 1969; A Logic in Memory Computer, Harold
S. Stone, IEEE Transactions on Computers, Vol C19 No. 1, January 1970 and
Xilinx U.S. Pat. No. 4,758,985 Microprocessor Oriented Configurable Logic
Element, although all these proposals suffered from major drawbacks and
were not made available commercially.
It is also desirable to improve the means of accessing state information in
designs implemented on FPGAs so that an external processor can perform
word-wide read or write operations on the registers of the user's design
with a single access to the control store. Thus the control store
interface allows high bandwidth communication between the processor and
the FPGA. It is also desirable to provide mechanisms for synchronising
computations between the FPGA and the processor and to provide a mechanism
for extending design configuration files to support dynamic
reconfiguration while allowing use of conventional tools for static
designs to create FPGA configurations.
The architecture of the CAL 1024 was based on 1.5 micrometer technology
available in 1989. One problem with the CAL I architecture in which cells
are connected only to their nearest neighbours was that cells in the
middle of the array became less useful with increasing array size as the
distance and hence delay to the edge of the chip increased. This problem
became more serious as improvements in processing technology meant that
the number of cells implementable per chip increased from 1024 to about
16,384. This resulted in a scalability problem because of increased
delays, and reduced the performance below the desired criteria. Thus,
although scalability of chips using the CAL I architecture can be
achieved, it is at the expense of performance. The limited number of cells
available on a single chip with 1.5 .mu.m technology meant that it was
desirable to ensure scalability over chip boundaries so that large designs
typical of many computational applications could be realised using
multiple chips. The limitations of the then processing technology also
made it essential to optimise the architecture for silicon area and
sometimes this optimisation was at the expense of speed. The original
Algotronix CAL 1024 chips were designed to bring out peripheral array
signals to pads on the edges of the cellular array so that they could be
cascaded into larger cellular arrays on a printed circuit board. Packaging
technology has not evolved as rapidly as chip technology and limitations
on the number of package I/O pins make it uneconomic to produce fully
cascadable versions of the higher cell density chips.
The CAL I architecture suffered from a number of other disadvantages. For
example, in order to access a cell in the existing CAL I FPGA, five to six
processor instructions are needed to calculate the address of the cell;
this again takes time and slows operation. With the existing CAL I cell
array the routing architecture used meant that with increased number of
cells per chip, routing via intermediate cells added considerably to the
delays involved. In addition, in the CAL 1024 device, global signals are
coupled to all the cells in the array so that the cells can be signalled
simultaneously. It logically follows that at high clock frequencies,
global signals could consume high power.
SUMMARY OF THE INVENTION
An object of the present invention is to provide an improved field
programmable gate array which obviates or mitigates at least one of the
aforementioned disadvantages.
A further object of the present invention is to reduce the number of
control store accesses required and the time to wholly or partially
reconfigure the device from one configuration to another.
A further object of the invention is to enable an external processor to
perform word-wide read or write operations on registers of a user's design
with a single access to the control store.
A yet further object of the present invention is to provide a mechanism for
extending design configuration files to support dynamic reconfiguration
while allowing the use of conventional tools for static designs to create
FPGA configurations.
A further object of the invention is to provide mechanisms for the
synchronisation of computations between the FPGA and an external
processor.
A yet further object of the invention is to provide a novel routing
architecture which can be scaled up to operate on arrays having different
numbers of cells to reduce delays involved in routing between cells in
large cell arrays.
Array of Cells with Hierarchical Routing Structure
In accordance with the present invention, a 2-dimensional field
programmable gate array (FPGA) of cells arranged in rows and columns is
provided. Each cell in the array has at least one input and one output
connection at least one bit wide to each of its neighbouring cells. Each
cell also has a programmable routing unit and a programmable function unit
to permit intercellular connections to be made. The programmable function
unit can select one of a plurality of functions of several input signals
for generating a function unit output signal. The routing unit of a cell
directs inputs of the cell to function unit inputs, and also directs
inputs of the cell and the function unit output to neighbouring cells.
Groups of cells in the array are arranged so as to be connected to
additional conductors of a length equal to a predetermined number of
cells. Cells in the array are coupled to the additional conductors via
switches. Typically, four such conductors are provided for each cell, two
conductors arranged in one direction in the array and two conductors
arranged in the orthogonal direction in the array. Each pair of conductors
is arranged such that one conductor in the pair carries signals in one
direction and the other conductor carries signals the opposite direction.
This novel architecture is referred to hereafter as the CAL II
architecture, or simply as CAL II.
A predetermined block of cells, for example a 4.times.4 block of cells,
has the additional conductors of at least cell length 4 (four cells long).
These blocks are arranged into repeating units to form an array of these
cells whereby 16 of such 4.times.4 blocks of cells result in a unit of 16
cells.times.16 cells with each 4.times.4 block having the additional
conductors, the longer conductors hereinafter referred to as flyovers,
associated with each row or column of 4 cells. The 16.times.16 block of
cells may itself have additional flyover conductors.
In larger cellular arrays, the structure of the hierarchical routing can be
extended to any number of levels, a third level using conductors of length
64, and a fourth level using conductors of length 256 and so on.
This arrangement permits scaling of the array with the advantage that the
scaling is logarithmic in terms of distance, thereby significantly
reducing delay between cells. Specifically, a signal travels from an
origin cell to its closest associated switch located at a cell block
boundary, then along appropriate flyovers to the destination cell. Thus,
this structure creates a hierarchical cellular array with a variety of
routing resources whose lengths scale, in one embodiment by a factor of 4
each time, built on top of a simple array of neighbour connections only.
The principal advantage of providing different levels of routing resources
is that it allows the number of conductor segments required to route from
cell to cell on the array to be minimised. For example, if a path is
provided between two points in the array using neighbour interconnect
only, the number of routing segments would be equal to the number of cells
between the two points, whereas with the hierarchical interconnect, the
number of segments increases with the logarithm of the distance between
the cells.
Single-Source Directed Wiring
In one embodiment of a pre-programmable cellular array with hierarchical
routing, all the wires in the array are directed and have a single source.
Thus, 3-state drivers are not used. In one embodiment, all the connections
in the array are completely symmetrical so that if the array is rotated
the structure remains unchanged. Single source wiring has the advantage of
being simpler to implement than multiple-source wires. Multiple-source
wires, while allowing additional flexibility, require a considerable area
overhead, and produce contention when different drivers attempt to place
opposite values on the same wire. Contention dissipates heavy power, which
can result in device failure. Contention is prevented by the present
invention, in which a wire is driven by a single multiplexer output. The
symmetry feature simplifies the CAD software to map user designs onto the
array and allows hierarchical blocks of those designs to be rotated or
reflected to provide better utilisation of the array area.
Preferably, the switches providing connections between the flyovers and the
cells are static RAM controlled multiplexers. Conveniently, the switches
at 4-cell and 16-cell boundaries permit direct neighbour connections as
well as additional connections via the longer flyover conductors.
Automatic Routing Optimization, Portability
The hierarchical routing resources in the improved FPGA can be used in two
principal ways. Firstly, the user can design using simpler neighbour
programming models ignoring the availability of the longer connections. In
such a case, the software will automatically detect nets in the layout
which may benefit from new routing resources, and take advantage of these
resources to speed up performance of the design. Secondly, the user can
design using improved programming models and make explicit assignments to
the extra routing resources. In this case, extra density is achieved by
assigning different nets to various levels of interconnect at the same
point in the cellular array. For example, a length-16 wire could carry a
signal over a sub-unit (for example, several 4.times.4 blocks) without
interfering with the local interconnect in that sub-unit. When flyovers
are used to bypass a block of cells, blocks of the user design might have
to be placed in the FPGA on these 4-cell or 16-cell boundaries. Automatic
addition of flyover routing is easier to use and is independent of the
number of levels of routing provided by a given FPGA chip. Using software
to add the flyovers provides design portability between different chips,
and using improved programming models which use flyovers to bypass a
block, or manually assigning flyover resources as appropriate, allows more
efficient use of the resources provided.
Use of longer routing resources may be achieved using low level CAD
software as described above or using hardware in the chip itself to
automatically route signals to longer wires where possible. This provides
more device portability and allows special "fast" versions of existing
chips to be made with additional longer wires without requiring any
changes to the existing design. This "dynamic" selection of longer routing
wires simplifies the CAD software, allowing it to run faster. Dynamic
selection of longer wires is particularly attractive for applications
which involve dynamically reprogramming FPGA chips.
According to another aspect of the invention the speed of propagation of
signals through an FPGA is improved by automatically mapping onto flyovers
those signals capable of being speeded up using circuitry fabricated on
the FPGA. The method comprising the steps of:
detecting control store bit patterns which correspond to routing a signal
straight through a cell, detecting when a group of cells beneath a flyover
all route the signal in the direction of the flyover by using the 4-input
gate provided for that flyover direction, and taking as input the output
of the 4-input gate of the appropriate neighbour multiplexer,
feeding an output from one of the 4 input gates to switches at both ends of
the flyover, whereby the signal is carried automatically by the flyover as
well as by neighbour routing, and the faster signal on the flyover is
selected by the switch at the end of the flyover.
The method is scalable and can be applied to a group of 4 length-4 flyovers
under a length-16 flyover when this group all route a signal in the
direction of the length-16 flyover. This is done using a 4-input gate
which takes as inputs the outputs of the 4-input gates used for receiving
signals from the neighbour cells.
The type of gate used depends on the control bits being detected. For
example, a NOR gate is used for detecting bits 0,0 in an East to West
direction in a West routing multiplexer. Alternatively, to detect a bit
pattern of 1,1, a NAND gate and associated logic circuitry are used.
Block Correspondence Allows Easy Reconfiguration
An important feature of the present invention, is that a rectangular area
of cells specified as a hierarchical block of the user's design (for
example, a 4.times.4 block or a 16.times.16 block of cells) corresponds
directly, i.e. by a straight-forward physical relationship, with one or
more rectangular areas in the configuration memory (control store) of the
CAL II FPGA device representing instances of that block. This means that a
block of the user's design can be dynamically replaced by another block of
the same size, for example, a register can be replaced by a counter of
equal size. Thus, in accordance with the present invention, the host
processor must reconfigure only the corresponding area of the control
store RAM. The binary data for both blocks can be pre-calculated from the
user's design, and the actual replacement can be done very rapidly using a
block transfer operation, as is well known in the art. During dynamic
reconfiguration, registers can be initialised either to a default
associated with a block definition or to restore the previous state of the
unit whose configuration is being restored or to a convenient value
decided by the application program performing the reconfiguration.
Wildcard Feature
According to a further aspect of the present invention an FPGA is | | |