WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Hierarchically connectable configurable cellular array    
United States Patent5469003   
Link to this pagehttp://www.wikipatents.com/5469003.html
Inventor(s)Kean; Thomas A. (Edinburgh, GB6)
AbstractAn field programmable gate array (FPGA) of cells arranged in rows and columns is interconnected by a hierarchical routing structure. Switches separate the cells into blocks and into blocks of blocks with routing lines interconnecting the switches to form the hierarchy. Also, select units for allowing memory bits to be addressed both individually and in large and arbitrary groups are disclosed. Further a control store for configuring the FPGA is addressed as an SRAM and can be dynamically reconfigured during operation.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5469003
Hierarchically connectable configurable cellular array - US Patent 5469003 Drawing
Hierarchically connectable configurable cellular array
Inventor     Kean; Thomas A. (Edinburgh, GB6)
Owner/Assignee     Xilinx, Inc. (San Jose, CA)
Patent assignment
All assignments
Publication Date     November 21, 1995
Application Number     08/148,793
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     November 5, 1993
US Classification     326/39 326/38
Int'l Classification     H03K 019/177
Examiner     Hudspeth; David R.
Assistant Examiner    
Attorney/Law Firm     Young; Edel M.
Address
Parent Case    
Priority Data     Nov 05, 1992[GB]9223226
USPTO Field of Search     307/443 307/465 307/465.1 307/468 307/475 307/303.2 326/38 326/39
Patent Tags     hierarchically connectable configurable cellular array
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5338984
Sutherland
326/41
Aug,1994

[0 after 0 votes]
5243238
Kean
326/41
Sep,1993

[0 after 0 votes]
5187393
El Gamal
326/41
Feb,1993

[0 after 0 votes]
5144166
Camarota
326/41
Sep,1992

[0 after 0 votes]
5073729
Greene

Dec,1991

[0 after 0 votes]
5047983
Iwai
365/200
Sep,1991

[0 after 0 votes]
5015883
Waller
326/50
May,1991

[0 after 0 votes]
5003200
Sakamoto
326/38
Mar,1991

[0 after 0 votes]
4973956
Lin
340/2.2
Nov,1990

[0 after 0 votes]
4930107
Chan
365/185.22
May,1990

[0 after 0 votes]
4758985
Carter
365/94
Jul,1988

[0 after 0 votes]
4706216
Carter
365/94
Nov,1987

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


I claim:

1. An hierarchically-scalable cellular array comprising:

a plurality of cells;

a first routing resource for interconnecting each cell to its adjacent neighboring cells, said first routing resource including a first plurality of lines having a length n, where n is a predetermined distance;

a second routing resource including a second plurality of lines having a length N.times.n where N>1, said second routing resource connecting said cells into blocks of cells and interconnecting said blocks of cells; and

a third routing resource including a third plurality of lines having a length N.sup.m .times.n where n.gtoreq.2 and where N.sup.m .times.n.ltoreq.T, the total length of said cellular array, said third routing resource interconnecting groups of said blocks of cells.

2. The cellular array of claim 1 wherein n is approximately one cell width.

3. The cellular array of claim 1 wherein N is 4.

4. The cellular array of claim 3 wherein m is 2.

5. The cellular array of claim 3 wherein m is 3.

6. The cellular array of claim 1 further including a fourth routing resource including a fourth plurality of lines having a length N.sup.r .times.n where r.gtoreq.3, and r=m+1.

7. A programmable logic structure comprising:

an array of cells arranged in rows and columns;

a plurality of switches, said plurality of switches partitioning said array into a first plurality of cell blocks,

wherein each cell includes:

four input lines for providing four input signals to said cell, each of said input lines coupled to an adjacent cell or switch; and

four output lines for providing four output signals from said cell, each of said output lines coupled to an adjacent cell or switch,

wherein said array further includes four intermediate, directed input lines for providing an additional four input signals to said cell, wherein each intermediate input line provides signals to the cells in either a row or a column of one of said blocks.

8. The logic structure of claim 7 wherein said array further includes a plurality of flyover lines, each flyover line coupled between two of said plurality of switches, wherein said plurality of flyover lines determines a second plurality of cell blocks.

9. In an integrated circuit having a two-dimensional array of logic cells, a switch comprising:

a set of multiplexers, each of which receives a plurality of input signals and provides an output signal;

for each multiplexer, multiplexer control means for causing said multiplexer to select one of said input signals to provide said output signal, said input signals being taken from the following wires:

a wire extending to the west from said switch and having a length approximately equal to a first multiple of said length of one of said cells;

a wire extending to the east from said switch and having a length approximately equal to said first multiple of said length of one of said cells;

a wire extending to the west from said switch and having a length approximately equal to a second multiple of said length of one of said cells; and

a wire extending to the east from said switch and having a length approximately equal to said second multiple of said length of one of said cells;

said output signals being placed on the following wires:

a wire extending to the west from said switch and having a length approximately equal to a first multiple of said length of one of said cells;

a wire extending to the east from said switch and having a length approximately equal to said first multiple of said length of one of said cells;

a wire extending to the west from said switch and having a length approximately equal to a second multiple of said length of one of said cells; and

a wire extending to the east from said switch and having a length approximately equal to said second multiple of said length of one of said cells.

10. The switch of claim 9 wherein said first multiple is 1 and second multiple is 4.

11. The switch of claim 9 wherein said first multiple is 4 and second multiple is 16.

12. The switch of claim 9 wherein said first multiple is 16 and second multiple is 64.

13. The switch of claim 9 wherein said multiplexers further receive input signals taken from the following wires:

a wire extending to the west from said switch and having a length approximately equal to a third multiple of said length of one of said cells; and

a wire extending to the east from said switch and having a length approximately equal to said third multiple of said length of one of said cells.

14. The switch of claim 13 wherein said first multiple is 1, said second multiple is 4, and said third multiple is

15. The switch of claim 13 wherein said first multiple is 4, said second multiple is 16, and said third multiple is 64.

16. A programmable integrated circuit comprising:

a two-dimensional array of logic cells, each cell comprising means for generating as an output signal a selected logic function of a plurality of logic cell input signals;

a plurality of switches, said switches positioned so as to group said cells into cell blocks;

each cell including:

four short input wires for providing four of said input signals, said four short input wires carrying output signals from the cells or the switches which comprise nearest neighbors on four sides of said cell; four medium input wires for providing

four of said input signals, said four medium input wires extending in four compass directions between switches which define a cell block in which said cell is positioned, wherein a plurality of said medium input wires are simultaneously accessable by said cell; and

four short output wires for providing said output signal, said four short output wires carrying output signals to the cells or the switches which comprise nearest neighbors on four sides of said cell.

17. The programmable integrated circuit of claim 16 wherein each of said medium input wires carries a signal from a switch in one of the four compass directions to said cell and to other cells positioned between two of said switches making up said cell block.

18. The programmable integrated circuit of claim 17 wherein said medium wires are approximately four times the length of said short wires.

19. A programmable logic structure comprising:

an array of cells;

a plurality of switches arranged to group said cells into cell blocks such that for each cell block at least one west switch is positioned to the west of that block, and at least one east switch is positioned to the east of that block;

a neighbor wire connecting from an output of said west switch to an input of the nearest cell east of said west switch;

a flyover wire connecting from an output of said west switch to an input of said east switch;

a neighbor wire connecting from an output of said east switch to an input of the nearest cell west of said east switch; and

a flyover wire connecting from an output of said east switch to an input of said west switch.

20. A programmable logic structure of claim 19 wherein said plurality of switches is further arranged such that for each cell block at least one north switch is positioned to the north of that block, and at least one south switch is positioned to the south of that block, wherein said programmable logic structure further comprises:

a neighbor wire connecting from an output of said north switch to an input of the nearest cell south of said north switch;

a south flyover wire connecting from an output of said north switch to an input of said south switch;

a short wire connecting from an output of said south switch to an input of the nearest cell north of said south switch; and

a north flyover wire connecting from an output of said south switch to an input of said north switch.

21. A programmable logic structure as in claim 20 in which each of said cells comprises a plurality of smaller cells, each of said smaller cells being defined by:

a plurality of switches, said switches arranged to group said smaller cells into smaller cell blocks such that for each smaller cell block at least one west switch is positioned to the west of that block, at least one east switch is positioned to the east of that block, at least one north switch is positioned to the north of that block, and at least one south switch is positioned to the south of that block;

said smaller cells being connected to said switches by:

a neighbor wire connecting an output of said west switch to an input of the nearest smaller cell east of said west switch;

a neighbor wire connecting an output of the nearest smaller cell east of said west switch to an input of said west switch;

an east flyover wire connecting from an output of said west switch to an input of said east switch;

a neighbor wire connecting from an output of said east switch to an input of the nearest smaller cell west of said east switch; and

a neighbor wire connecting an output of the nearest smaller cell west of said east switch to an input of said east switch;

a west flyover wire connecting from an output of said east switch to an input of said west switch;

a neighbor wire connecting from an output of said north switch to an input of the nearest smaller cell south of said north switch;

a neighbor wire connecting an output of the nearest smaller cell south of said north switch to an input of said north switch;

a south flyover wire connecting from an output of said north switch to an input of said south switch;

a neighbor wire connecting from an output of said south switch to an input of the nearest smaller cell north of said south switch; and

a neighbor wire connecting an output of the nearest smaller cell north of said south switch to an input of said south switch;

a north flyover wire connecting from an output of said south switch to an input of said north switch.

22. An input/output structure for a programmable logic device having a plurality of logic cells comprising:

a plurality of pads, each pad connected to an external pin of said programmable logic device;

a plurality of I/O buffers, each I/O buffer connected to one of said plurality of pads, each I/O buffer including a switch for connecting said I/O buffer to at least one of said plurality of logic cells, wherein said switch forms part of a hierarchial interconnect structure receiving input signals both from wires from neighboring cells and wires from other switches and providing output signals to both wires in neighboring cells and wires leading to other switches; and

means for controlling said switch.

23. A method for routing signals in a programmable logic structure comprising the steps of

arranging an array of cells in rows and columns;

partitioning said array into a plurality of cell blocks with a plurality of switches;

coupling each input line associated with each cell to an adjacent cell or switch;

coupling each output line associated with each cell to said adjacent cell or switch;

providing an additional four directed, intermediate input lines for each cell, wherein each directed intermediate input line provides signals to the cells in either a row or a column of one of said plurality of cell blocks.

24. The method of claim 23 further including the step of providing a plurality of flyover lines, each flyover line coupling two of said plurality of switches.

25. The method of claim 24 further including the step of providing a signal from a first cell to a second cell, wherein said providing includes transferring said signal via at least one intermediate input line.

26. The method of claim 25 wherein said step of providing further includes transferring said signal via at least one flyover line.

27. The method of claim 23 further including the steps of:

sending a signal via a first path including a plurality of switching units, said first path paralleling a direct, second path;

using logic gates to detect whether said signal travels the full length of said first path; and

if said signal travels said full length, then placing said signal on said second path.

28. A routing device in a programmable logic device comprising:

a plurality of cells providing a first path, wherein each cell includes:

means for receiving a plurality of input signals and providing an output signal;

a plurality of memory bits associated with said means for receiving; and

a first logic gate for providing a trigger signal determined by the state of said plurality of memory bits;

a second path not including cells, said second path paralleling said first path;

a second logic gate coupled to the output terminals of the first logic gates to detect whether said signal travels the full length of said first path; and

means for determining whether a signal is provided on said first path or on said second path, said means for determining controlled by said second logic gate.
 Description Submit all comments and votes
 


FIELD OF THE INVENTION

The present invention relates to a configurable cellular array of dynamically configurable logic elements, such arrays being generally known as Field Programmable Gate Arrays (FPGAs).

BACKGROUND OF THE INVENTION

Reprogrammable FPGAs have been available commercially for several years. The best known commercial family of FPGAs are those from Xilinx Inc. One class of these devices uses Static Random Access Memory (SRAM) to hold control bits which control their configurations. Most FPGA devices replace traditional mask programmed Applications Specific Integrated Circuit (ASIC) parts which have a fixed configuration. The configuration of the FPGA is static and is loaded from a non-volatile memory when power is applied to the system. Nearly all commercially available FPGAs have a stream-based interface to the control store. (The control store contains the set of bits which determine what functions the FPGA will implement.) In a stream-based interface to the control store, a sequence of data is applied to a port in the FPGA to provide a complete configuration for the whole device or for a fixed (normally large) sub-section of the FPGA. This stream-based interface, when combined with an address counter which is implemented on the FPGA itself, provides an efficient method of loading the complete device configuration from adjacent EPROM or other non-volatile memory on power up without any additional overhead circuits. A stream based interface with an address counter is a suitable programing interface for an FPGA which is used as a replacement for a standard ASIC. Some FPGAs can be partly or totally reconfigured using one of a set of static configurations stored at different addresses in an EPROM, and can trigger the reconfiguration from within the design implemented on the FPGA.

Published International Application WO 90/11648, corresponding to U.S. Pat. No. 5,243,238, discloses an architecture hereafter referred to as CAL I, which has been implemented in an Algotronix product designated CAL 1024. CAL I is different from other commercially available FPGAs in that its control store appears as a standard SRAM to the systems designer, and can be accessed using address bus, data bus, chip enable, chip select and read/write signals. Addressing the control store as an SRAM supports a user program running on the host processor mapping the FPGA control store (configuration memory) into the memory or address space of the host processor so that the processor can configure the FPGA to implement a user-defined circuit. This arrangement, which is implemented in the CAL 1024 FPGA, allows the user to partition an application between the processor and the FPGA with appropriate sections being implemented by each device. The control store interface provides an important input/output (I/O) channel between the FPGA and the processor, although the I/O can also take place using more traditional techniques via, for example, a shared data memory area. This latter type of FPGA provides a passive control store interface because an external agent is required to initiate configuration or reconfiguration of the device, as required.

Experience with the CAL I architecture and trends within the electronics industry have made this second passive form of control store interface increasingly attractive for many applications. Microprocessors or microcontrollers are now pervasive components of computer systems and most board level systems contain one. The major benefit of the stream based "active" FPGA programming approach is that no overhead circuits are required to initiate reconfiguration. In systems where a microprocessor or microcontroller is present, the "passive" RAM emulating FPGA interface is preferable for several reasons:

(1) the FPGA configuration can be stored in the microprocessor's program and data memory (reducing the number of parts by removing the need for a separate memory chip),

(2) the existing data and address buses on the board can be used to control the FPGA (saving printed circuit board area by removing dedicated wires between the configuration EPROM and the FPGA);

(3) the FPGA control store can be written to and read from by the microprocessor, and thereby used as an I/O channel between the FPGA and the microprocessor, thereby potentially saving additional wiring between the FPGA and the processor buses and freeing the FPGA programmable I/O pins for communication with external devices, and

(4) the intelligence of the microprocessor can be used to support compression schemes for the configuration data and other techniques, which allows more flexibility in reprogramming the FPGA.

In addition, the difference in cost between an "active" FPGA with an associated EPROM holding its configuration and a passive FPGA with an active microcontroller chip containing an EPROM and a simple processor is minimal. The easy reprogrammability makes the passive FPGA attractive, even if the microcontroller has no other function apart from reprogramming the FPGA.

Another trend within the Electronics Industry has been the provision of "support chips" for microprocessors which provide an interface between I/O devices and a particular microprocessor. Examples of these devices include Universal Asynchronous Receiver Transmitters (UARTs) for low bandwidth serial I/O, Programmable Peripheral Interfaces (PPIs) for low bandwidth parallel I/O and various specialised chips for higher bandwidth connections to networks and disk drives. These support chips appear to the processor as locations in the I/O or memory address space to and from which data are transferred. Some support chips can interrupt the processor via interrupt lines or take over the bus for Direct Memory Access (DMA) operations. In many ways a passive FPGA chip can be viewed as a successor to a support chip, providing an interface to the processor via its control store on the one hand, and an interface to the external world via a large number of flexible I/O lines on the other, for example 128 programmable I/O lines on the Algotronix CAL 1024 device.

A passive FPGA chip has a number of advantages. For example, it is cost-effective to provide a single FPGA with a library of configurations instead of providing a number of support chips. In addition, providing a single FPGA for several functions reduces the number of devices in the processor manufacturer's catalogue. Also, reconfigurable FPGAs can support changeable I/O functions, such as when a single external connector can be used as either a serial or a parallel port. With a passive RAM control interface, the FPGA is able to support other functions as well.

Each time an FPGA is reconfigured to implement a different set of functions, the microprocessor must access the configuration memory. One reconfiguration typically requires many control store accesses, one access for each word of configuration memory to be changed. Several important classes of reconfiguration have been identified.

(1) Application swapping occurs when one application terminates and a completely different application wishes to make use of the FPGA. In this case the FPGA chip is completely reconfigured, usually from a static configuration.

(2) Task swapping occurs when the application must configure relatively large sections of the FPGA to implement a new phase in the computation. For example, a sorting application might first sort small batches of data completely using configuration A and then merge those sorts into a completely sorted stream of data using configuration B. In this case, the application has knowledge of both configurations and need only change those resources which are different in configuration B. At a later point, configuration A may itself be restored.

(3) Data dependent reconfiguration occurs when the configurations of some cells are computed dynamically based on input data by the application program, rather than being loaded from a static configuration file. Often a static configuration is first loaded, then a relatively small sub-set of cells are reconfigured dynamically (that is, reconfigured while the chip is operating). An important example of this class of reconfiguration is where an operand (such as a constant multiplier or a search string) is folded directly into the logic used to implement the multiply or sort unit rather than being stored in a register. This technique is advantageous as it frequently results in smaller and faster operation units.

(4) Access to gate outputs occurs for debugging. The outputs of all the logic cells on the CAL I FPGA are mapped to bits of the control store. Debugging programs are available which read back this information on the display or design layout to show the logic levels on internal wires.

(5) Access to gate outputs for I/O is similar to the previous access to gate outputs for debugging. But in this particular case only a small fraction of the logic nodes, namely those which correspond to input and output registers, will be accessed repeatedly. The ability to rapidly assemble a word representing input to or the result of a computation from several bits at different locations in the control store is critical to the effectiveness of this technique.

It is desirable to reduce the number of accesses required and hence the time to wholly or partially reconfigure the device. Several systems other than CAL I have been proposed which allow direct access to internal signals in an FPGA or an FPGA-like device, for example, as disclosed in Cellular Logic-in-Memory Arrays, William H. Kautz, IEEE Transactions on Computers Vol C18 No. 8, August 1969; A Logic in Memory Computer, Harold S. Stone, IEEE Transactions on Computers, Vol C19 No. 1, January 1970 and Xilinx U.S. Pat. No. 4,758,985 Microprocessor Oriented Configurable Logic Element, although all these proposals suffered from major drawbacks and were not made available commercially.

It is also desirable to improve the means of accessing state information in designs implemented on FPGAs so that an external processor can perform word-wide read or write operations on the registers of the user's design with a single access to the control store. Thus the control store interface allows high bandwidth communication between the processor and the FPGA. It is also desirable to provide mechanisms for synchronising computations between the FPGA and the processor and to provide a mechanism for extending design configuration files to support dynamic reconfiguration while allowing use of conventional tools for static designs to create FPGA configurations.

The architecture of the CAL 1024 was based on 1.5 micrometer technology available in 1989. One problem with the CAL I architecture in which cells are connected only to their nearest neighbours was that cells in the middle of the array became less useful with increasing array size as the distance and hence delay to the edge of the chip increased. This problem became more serious as improvements in processing technology meant that the number of cells implementable per chip increased from 1024 to about 16,384. This resulted in a scalability problem because of increased delays, and reduced the performance below the desired criteria. Thus, although scalability of chips using the CAL I architecture can be achieved, it is at the expense of performance. The limited number of cells available on a single chip with 1.5 .mu.m technology meant that it was desirable to ensure scalability over chip boundaries so that large designs typical of many computational applications could be realised using multiple chips. The limitations of the then processing technology also made it essential to optimise the architecture for silicon area and sometimes this optimisation was at the expense of speed. The original Algotronix CAL 1024 chips were designed to bring out peripheral array signals to pads on the edges of the cellular array so that they could be cascaded into larger cellular arrays on a printed circuit board. Packaging technology has not evolved as rapidly as chip technology and limitations on the number of package I/O pins make it uneconomic to produce fully cascadable versions of the higher cell density chips.

The CAL I architecture suffered from a number of other disadvantages. For example, in order to access a cell in the existing CAL I FPGA, five to six processor instructions are needed to calculate the address of the cell; this again takes time and slows operation. With the existing CAL I cell array the routing architecture used meant that with increased number of cells per chip, routing via intermediate cells added considerably to the delays involved. In addition, in the CAL 1024 device, global signals are coupled to all the cells in the array so that the cells can be signalled simultaneously. It logically follows that at high clock frequencies, global signals could consume high power.

SUMMARY OF THE INVENTION

An object of the present invention is to provide an improved field programmable gate array which obviates or mitigates at least one of the aforementioned disadvantages.

A further object of the present invention is to reduce the number of control store accesses required and the time to wholly or partially reconfigure the device from one configuration to another.

A further object of the invention is to enable an external processor to perform word-wide read or write operations on registers of a user's design with a single access to the control store.

A yet further object of the present invention is to provide a mechanism for extending design configuration files to support dynamic reconfiguration while allowing the use of conventional tools for static designs to create FPGA configurations.

A further object of the invention is to provide mechanisms for the synchronisation of computations between the FPGA and an external processor.

A yet further object of the invention is to provide a novel routing architecture which can be scaled up to operate on arrays having different numbers of cells to reduce delays involved in routing between cells in large cell arrays.

Array of Cells with Hierarchical Routing Structure

In accordance with the present invention, a 2-dimensional field programmable gate array (FPGA) of cells arranged in rows and columns is provided. Each cell in the array has at least one input and one output connection at least one bit wide to each of its neighbouring cells. Each cell also has a programmable routing unit and a programmable function unit to permit intercellular connections to be made. The programmable function unit can select one of a plurality of functions of several input signals for generating a function unit output signal. The routing unit of a cell directs inputs of the cell to function unit inputs, and also directs inputs of the cell and the function unit output to neighbouring cells. Groups of cells in the array are arranged so as to be connected to additional conductors of a length equal to a predetermined number of cells. Cells in the array are coupled to the additional conductors via switches. Typically, four such conductors are provided for each cell, two conductors arranged in one direction in the array and two conductors arranged in the orthogonal direction in the array. Each pair of conductors is arranged such that one conductor in the pair carries signals in one direction and the other conductor carries signals the opposite direction. This novel architecture is referred to hereafter as the CAL II architecture, or simply as CAL II.

A predetermined block of cells, for example a 4.times.4 block of cells, has the additional conductors of at least cell length 4 (four cells long). These blocks are arranged into repeating units to form an array of these cells whereby 16 of such 4.times.4 blocks of cells result in a unit of 16 cells.times.16 cells with each 4.times.4 block having the additional conductors, the longer conductors hereinafter referred to as flyovers, associated with each row or column of 4 cells. The 16.times.16 block of cells may itself have additional flyover conductors.

In larger cellular arrays, the structure of the hierarchical routing can be extended to any number of levels, a third level using conductors of length 64, and a fourth level using conductors of length 256 and so on.

This arrangement permits scaling of the array with the advantage that the scaling is logarithmic in terms of distance, thereby significantly reducing delay between cells. Specifically, a signal travels from an origin cell to its closest associated switch located at a cell block boundary, then along appropriate flyovers to the destination cell. Thus, this structure creates a hierarchical cellular array with a variety of routing resources whose lengths scale, in one embodiment by a factor of 4 each time, built on top of a simple array of neighbour connections only.

The principal advantage of providing different levels of routing resources is that it allows the number of conductor segments required to route from cell to cell on the array to be minimised. For example, if a path is provided between two points in the array using neighbour interconnect only, the number of routing segments would be equal to the number of cells between the two points, whereas with the hierarchical interconnect, the number of segments increases with the logarithm of the distance between the cells.

Single-Source Directed Wiring

In one embodiment of a pre-programmable cellular array with hierarchical routing, all the wires in the array are directed and have a single source. Thus, 3-state drivers are not used. In one embodiment, all the connections in the array are completely symmetrical so that if the array is rotated the structure remains unchanged. Single source wiring has the advantage of being simpler to implement than multiple-source wires. Multiple-source wires, while allowing additional flexibility, require a considerable area overhead, and produce contention when different drivers attempt to place opposite values on the same wire. Contention dissipates heavy power, which can result in device failure. Contention is prevented by the present invention, in which a wire is driven by a single multiplexer output. The symmetry feature simplifies the CAD software to map user designs onto the array and allows hierarchical blocks of those designs to be rotated or reflected to provide better utilisation of the array area.

Preferably, the switches providing connections between the flyovers and the cells are static RAM controlled multiplexers. Conveniently, the switches at 4-cell and 16-cell boundaries permit direct neighbour connections as well as additional connections via the longer flyover conductors.

Automatic Routing Optimization, Portability

The hierarchical routing resources in the improved FPGA can be used in two principal ways. Firstly, the user can design using simpler neighbour programming models ignoring the availability of the longer connections. In such a case, the software will automatically detect nets in the layout which may benefit from new routing resources, and take advantage of these resources to speed up performance of the design. Secondly, the user can design using improved programming models and make explicit assignments to the extra routing resources. In this case, extra density is achieved by assigning different nets to various levels of interconnect at the same point in the cellular array. For example, a length-16 wire could carry a signal over a sub-unit (for example, several 4.times.4 blocks) without interfering with the local interconnect in that sub-unit. When flyovers are used to bypass a block of cells, blocks of the user design might have to be placed in the FPGA on these 4-cell or 16-cell boundaries. Automatic addition of flyover routing is easier to use and is independent of the number of levels of routing provided by a given FPGA chip. Using software to add the flyovers provides design portability between different chips, and using improved programming models which use flyovers to bypass a block, or manually assigning flyover resources as appropriate, allows more efficient use of the resources provided.

Use of longer routing resources may be achieved using low level CAD software as described above or using hardware in the chip itself to automatically route signals to longer wires where possible. This provides more device portability and allows special "fast" versions of existing chips to be made with additional longer wires without requiring any changes to the existing design. This "dynamic" selection of longer routing wires simplifies the CAD software, allowing it to run faster. Dynamic selection of longer wires is particularly attractive for applications which involve dynamically reprogramming FPGA chips.

According to another aspect of the invention the speed of propagation of signals through an FPGA is improved by automatically mapping onto flyovers those signals capable of being speeded up using circuitry fabricated on the FPGA. The method comprising the steps of:

detecting control store bit patterns which correspond to routing a signal straight through a cell, detecting when a group of cells beneath a flyover all route the signal in the direction of the flyover by using the 4-input gate provided for that flyover direction, and taking as input the output of the 4-input gate of the appropriate neighbour multiplexer,

feeding an output from one of the 4 input gates to switches at both ends of the flyover, whereby the signal is carried automatically by the flyover as well as by neighbour routing, and the faster signal on the flyover is selected by the switch at the end of the flyover.

The method is scalable and can be applied to a group of 4 length-4 flyovers under a length-16 flyover when this group all route a signal in the direction of the length-16 flyover. This is done using a 4-input gate which takes as inputs the outputs of the 4-input gates used for receiving signals from the neighbour cells.

The type of gate used depends on the control bits being detected. For example, a NOR gate is used for detecting bits 0,0 in an East to West direction in a West routing multiplexer. Alternatively, to detect a bit pattern of 1,1, a NAND gate and associated logic circuitry are used.

Block Correspondence Allows Easy Reconfiguration

An important feature of the present invention, is that a rectangular area of cells specified as a hierarchical block of the user's design (for example, a 4.times.4 block or a 16.times.16 block of cells) corresponds directly, i.e. by a straight-forward physical relationship, with one or more rectangular areas in the configuration memory (control store) of the CAL II FPGA device representing instances of that block. This means that a block of the user's design can be dynamically replaced by another block of the same size, for example, a register can be replaced by a counter of equal size. Thus, in accordance with the present invention, the host processor must reconfigure only the corresponding area of the control store RAM. The binary data for both blocks can be pre-calculated from the user's design, and the actual replacement can be done very rapidly using a block transfer operation, as is well known in the art. During dynamic reconfiguration, registers can be initialised either to a default associated with a block definition or to restore the previous state of the unit whose configuration is being restored or to a convenient value decided by the application program performing the reconfiguration.

Wildcard Feature

According to a further aspect of the present invention an FPGA is