WikiPatents - Community Patent Review
Method for fusing images and apparatus therefor    
United States Patent 5488674
Link to this page: http://www.wikipatents.com/5488674.html
Inventor(s): Burt; Peter J. (Mercer County, NJ); van der Wal; Gooitzen S. (Mercer, NJ); Kolczynski; Raymond J. (Mercer, NJ); Hingorani; Rajesh (Mercer, NJ)
Abstract: A method for fusing two or more source images to form a composite image with extended information content which may be color augmented and apparatus for forming the composite image from the source images is disclosed. Each source image is decomposed into a number of source images of varying resolution. The decomposed source images are analyzed using directionally sensitive operators to generate a set of oriented basis functions characteristic of the information content of the original images. The oriented basis functions for the composite image are then selected from those of the different source images and the inverse of the decomposition performed to construct the composite image. Color augmentation provides information as to the relative weighting of the contribution of each source to the composite image.
Owner/Assignee: David Sarnoff Research Center, Inc. (Princeton, NJ)
Publication Date: January 30, 1996
Application Number: 08/059,610
Filing Date: May 12, 1993
US Classification: 382/284 345/639 348/584 348/598 382/162
Int'l Classification: H04N 009/76
Examiner: Boudreau; Leo H.
Assistant Examiner: Kelley; Chris
Attorney/Law Firm: Burke; W. J.
Parent Case: This application is a continuation-in-part of original application Ser. No. 07/884098, filed May 15, 1992, now U.S. Pat. No. 5,325,449.
USPTO Field of Search: 382/22 382/41 382/17 382/54 382/162 382/284 382/276 348/239 348/389 348/454 348/584 348/598 395/125 395/134
References

U.S. References

Patent No.   Inventor       US Class      Date
5259040      Hanna          382/107       Nov 1993
5210799      Rao            382/103       May 1993
5140416      Tinkler        348/33        Aug 1992
4849746      Dubner         345/685       Jul 1989
4703514      van der Wal    382/302       Oct 1987
4692806      Anderson       375/240.08    Sep 1987
4661986      Adelson        382/154       Apr 1987
4639768      Ueno           348/584       Jan 1987
Claims
 


We claim:

1. In apparatus for forming a composite image from at least first and second monochrome source images comprising first means for decomposing each of the source images into a set of a plurality of component patterns and deriving a monochrome composite image in response to a weighted average of amplitudes of the component patterns of the source images; the combination therewith of:

second means for assigning amplitudes of a first of opponent colors to first component patterns of a source image in accordance with their normalized weight contribution to said weighted average of the amplitudes of the component patterns of the source images; and

third means for assigning amplitudes of a second of said opponent colors to second component patterns of a source image in accordance with their normalized weight contributions to said weighted average of the amplitudes of the component patterns of the source images; and

fourth means responsive to outputs from said first, second and third means for augmenting said monochrome composite image with said first and second opponent colors thereby to provide a color composite image.

2. The apparatus of claim 1 wherein said color composite image is an NTSC video image in which said fourth means derives the Y luminance channel thereof from the output of said first means, the I chrominance channel from the output of said second means, and the Q chrominance channel from the output of said third means.

3. The apparatus of claim 1 wherein said first means includes a first means for deriving a first set of given different resolution component patterns of said first source image, a second means for deriving a corresponding second set of said given different resolution component patterns of said second source image, and a first reconstruction means responsive to a weighted average of each corresponding pair of component patterns of said first and second sets having the same resolution for deriving said monochrome composite image; and wherein:

said second means includes a second reconstruction means responsive to the normalized weights of a first selected group of said given different resolution component patterns of said first and/or second source images for deriving a first weighting image for said first of said opponent colors; and

said third means includes a third reconstruction means responsive to the normalized weights of a second selected group of said given different resolution component patterns of said first and/or second source images for deriving a second weighting image for said second of said opponent colors.

4. The apparatus of claim 3 wherein said first and second means are Laplacian means and said first reconstruction means is an RE or FSD Laplacian pyramid; and wherein

each of said second and third reconstruction means is a Gaussian reconstruction pyramid deriving respective normalized output amplitudes therefrom.

5. The apparatus of claim 4 wherein said first and second means are gradient pyramids and said first reconstruction means is a gradient pyramid; and wherein

each of said second and third reconstruction means is a gradient reconstruction means deriving respective normalized output amplitudes therefrom.

6. A method for forming a composite image from N source images where N is greater than one comprising the steps of:

(a) decomposing each source image I.sub.n into a plurality L of sets of component patterns P.sub.n (m, l), where n is indicative of the N.sup.th source image and m is the number of patterns in each of the L sets;

(b) computing a match measure M.sub.n1,n2 (m, l) for the source images, where n1 is the first and n2 is the second of each pair of patterns being matched;

(c) computing a saliency measure S.sub.n (m, l) for a component pattern P.sub.n (m, l);

(d) selecting component patterns from the component pattern sets P.sub.n (m, l) using the match measures M.sub.n1,n2 (m, l) and the saliency measures S.sub.n (m, l) to form a set of component patterns P.sub.c (m, l) for the composite image, where c is indicative of the formed component patterns of the set being composite component patterns; and

(e) constructing the composite image from the set of component patterns P.sub.c (m, l).

7. The method of claim 6 wherein

said component patterns P.sub.n (m, l) into which said plurality L of sets of each source image I.sub.n are decomposed by step (a) are oriented component patterns;

whereby oriented component patterns are selected by step (d) to form the set of component patterns P.sub.c (m, l) from which the composite image is constructed by step (e).
Description
 


The invention relates to an improved method for fusing two or more source images to form a composite image with extended information content and apparatus for forming the composite image from the source images.

BACKGROUND OF THE INVENTION

Image fusion is a process that combines two or more source images to form a single composite image with extended information content. Typically images from different sensors, such as infra-red and visible cameras, computer aided tomography (CAT) and magnetic resonance imaging (MRI) systems, are combined to form the composite image. Multiple images of a given scene taken with different types of sensors, such as visible and infra-red cameras, or images taken with a given type of sensor and scene but under different imaging conditions, such as with different scene illumination or camera focus, may be combined. Image fusion is successful to the extent that: (1) the composite image retains all useful information from the source images, (2) the composite image does not contain any artifacts generated by the fusion process, and (3) the composite image looks natural, so that it can be readily interpreted through normal visual perception by humans or machines. What counts as useful information is determined by the user of the composite image, and this in turn determines which features of the different source images are selected for inclusion in the composite image.

The most direct approach to fusion, known in the art, is to align the source images, then sum, or average, across images at each pixel position. This and other pixel-based approaches often yield unsatisfactory results since individual source features appear in the composite with reduced contrast or appear jumbled as in a photographic double exposure.
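
The loss of contrast under pixel averaging can be seen in a minimal sketch. The function name `average_fusion` and the 1x4 sample images are hypothetical and chosen only for illustration; the images are assumed to be aligned and equal in size.

```python
def average_fusion(images):
    """Fuse aligned, equally sized images by averaging at each pixel position."""
    count = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / count for c in range(cols)]
            for r in range(rows)]

# Hypothetical 1x4 images: a bright feature appears in only one source.
visible = [[0.0, 1.0, 0.0, 0.0]]
infrared = [[0.0, 0.0, 0.0, 0.0]]
fused = average_fusion([visible, infrared])
```

The feature that has amplitude 1.0 in the first source survives in the composite at only half that amplitude, which is the reduced contrast described above.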

Known pattern selective image fusion tries to overcome these deficiencies by identifying salient features in the source images and preserving these features in the composite at full contrast. Each source image is first decomposed into a set of primitive pattern elements. A set of pattern elements for the composite image is then assembled by selecting salient patterns from the primitive pattern elements of the source images. Finally, the composite image is constructed from its set of primitive pattern elements.

Burt in Multiresolution Image Processing And Analysis, V. 16, pages 20-51, 1981 (hereinafter "BURT") and Anderson et al in U.S. Pat. No. 4,692,806, incorporated herein by reference for its teachings on image decomposition techniques, have disclosed an image decomposition technique in which an original comparatively high-resolution image comprised of a first number of pixels is processed to derive a wide field-of-view, low resolution image comprised of a second number of pixels smaller than the first number. The process for decomposing the image to produce lower resolution images is typically performed using a plurality of low-pass filters of differing bandwidth having a Gaussian roll-off. U.S. Pat. No. 4,703,514, incorporated herein by reference, has disclosed a means for implementing the pyramid process for the analysis of images.

The Laplacian pyramid approach to image fusion is perhaps the best known pattern-selective method. BURT first disclosed the use of image fusion techniques based on the Laplacian pyramid for binocular fusion in human vision. U.S. Pat. No. 4,661,986 disclosed the use of the Laplacian technique for the construction of an image with an extended depth of field from a set of images taken with a fixed camera but with different focal settings. A. Toet in Machine Vision and Applications, V. 3, pages 1-11 (1990) has disclosed a modified Laplacian pyramid that has been used to combine visible and IR images for surveillance applications. More recently M. Pavel et al in Proceedings of the AIAA Conference on Computing in Aerospace, V. 8, Baltimore, October 1991 have disclosed a Laplacian pyramid for combining a camera image with graphically generated imagery as an aid to aircraft landing. Burt et al in ACM Trans. on Graphics, V. 2, pages 217-236 (1983) and in the Proceedings of SPIE, V. 575, pages 173-181 (1985) have developed related Laplacian pyramid techniques to merge images into mosaics for a variety of applications.

In effect, a Laplacian transform is used to decompose each source image into regular arrays of Gaussian-like basis functions of many sizes. These patterns are sometimes referred to as basis functions of the pyramid transform, or as wavelets. The multiresolution pyramid of source images permits coarse features to be analyzed at low resolution and fine features to be analyzed at high resolution. Each sample value of a pyramid represents the amplitude associated with a corresponding basis function. In the Laplacian pyramid approach to fusion cited above, the combination process selects the most prominent of these patterns from the source images for inclusion in the fused image. The source pyramids are combined through selection on a sample by sample basis to form a composite pyramid. Current practice is to use a "choose max rule" in this selection; that is, at each sample location in the source image pyramids, the source image sample with the largest value is copied to become the corresponding sample in the composite pyramid. If at a given sample location there are other source image samples that have nearly the same value as the sample with the largest value, these may be averaged to obtain the corresponding sample of the composite pyramid. Finally, the composite image is recovered from the composite pyramid through an inverse Laplacian transform. By way of example, in the approach disclosed in U.S. Pat. No. 4,661,986, the respective source image samples with the largest value, which are copied at each pyramid level, correspond to samples of that one of the source images which is more in focus.

In the case of the Laplacian transform, the component patterns take the form of circularly symmetric Gaussian-like intensity functions. Component patterns of a given scale tend to have large amplitude where there are distinctive features in the image of about that scale. Most image patterns can be described as being made up of edge-like primitives. The edges in turn are represented within the pyramid by collections of component patterns.

While the Laplacian pyramid technique has been found to provide good results, sometimes visible artifacts are introduced into the composite image. These may occur, for example, along extended contours in the scene due to the fact that such higher level patterns are represented in the Laplacian pyramid rather indirectly. An intensity edge is represented in the Laplacian pyramid by Gaussian patterns at all scales with positive values on the lighter side of the edge, negative values on the darker, and zero at the location of the edge itself. If not all of these primitives survive the selection process, the contour is not completely rendered in the composite. An additional shortcoming is due to the fact that the Gaussian-like component patterns have non-zero mean values. Errors in the selection process lead to changes in the average image intensity within local regions of a scene. These artifacts are particularly noticeable when sequences of composite or fused images are displayed. The selection process is intrinsically binary: the basis function from one or the other source image is chosen. If the magnitudes of the basis functions vary, for example because of noise in the image or sensor motion, the selection process may alternately select the basis functions from different source images. This leads to unduly perceptible artifacts such as flicker and crawlers.

Further, while the prior art may employ color in the derivation of the fused composite image itself, there is no way in the prior art of retaining the identity of those source images that contributed to particular displayed information in a fused composite image. For example, in a surveillance application, an observer may want to know if the source of a bright feature he sees in the composite image comes from an IR camera source image, so represents a hot object, or comes from a visible camera source, so represents a light colored, or intensely illuminated object.

Thus there is a need for improved methods of image fusion (in addition to the prior-art methods of either averaging or "choose max rule" selection, and the use of color) which overcome these shortcomings in the prior art and provide better image quality and/or saliency for the user in a composite image formed by the image fusion process, particularly when sequences of composite images are displayed.

SUMMARY OF THE INVENTION

A method of the invention for forming a composite image from N source images, where N is greater than one, comprises the steps of decomposing each source image I.sub.n, n=1 to N, into a plurality L of sets of oriented component patterns P.sub.n (m, l); computing a saliency measure S.sub.n (m, l) for each component pattern P.sub.n (m, l); selecting component patterns from the component pattern sets P.sub.n (m, l) using the saliency measures S.sub.n (m, l) to form a set of oriented component patterns P.sub.c (m, l) for the composite image; and constructing the composite image I.sub.c from the set of oriented component patterns P.sub.c (m, l).

The invention is also an apparatus for forming a composite image from a plurality of source images comprising means for decomposing each source image into a plurality of sets of oriented component patterns; means for computing a saliency measure for each component pattern; means for selecting component patterns from the component pattern sets using the saliency measures to form a set of oriented component patterns of the composite image; and means for constructing the composite image from the set of oriented component patterns.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a flow chart showing a prior art method for pattern-based image fusion.

FIG. 2 diagrammatically illustrates a method for forming the Gaussian and Laplacian pyramids.

FIG. 3 diagrammatically illustrates a method for reconstructing the original image from the Laplacian pyramid.

FIG. 4 diagrammatically illustrates a method for pattern-based image fusion of the invention.

FIG. 4(a) diagrammatically illustrates a method for pattern-based image fusion using both saliency and match.

FIG. 5 illustrates the implementation of the method of the invention in real-time digital image processing hardware.

FIG. 6 is a schematic circuit diagram of the circuits P5 and P6.

FIG. 7 is a schematic circuit diagram of the circuit P7.

FIGS. 8 (a), (c), (d) and (e) are schematic diagrams of different circuits implementing the weighting function.

FIG. 8 (b) is a graphical illustration of a particular weighting function.

FIG. 9 is a timing diagram of when the various images and pyramid levels may be computed in a system with I/O frame stores, and assuming interlace I/O.

FIG. 10(a) is a photograph of a source image from a standard visible light camera.

FIG. 10(b) is a photograph of a source image from an infrared camera.

FIG. 10(c) is a photograph of the fused image obtained using the method of the invention.

FIG. 11 is a block diagram of an illustrative embodiment that converts two separate monochromatic source images into a fused composite colored image.

FIG. 12 is a block diagram diagrammatically illustrating an example of the fusion process shown in FIG. 11.

FIG. 13 is a block diagram of an exemplary pyramid circuit.

DETAILED DESCRIPTION

A flow chart for a prior art pattern-based image fusion is shown in FIG. 1. The source images are assumed to be aligned prior to undertaking the fusion steps. The fusion method comprises the steps of transforming each source image I.sub.n into a feature-based representation where each image I.sub.n is decomposed into a set of component patterns P.sub.n (m), where n=1, 2, . . . , N, the number of source images, and m=1, 2, . . . , M the number of patterns in the set for the n.sup.th source image. Features from the source images are combined to form a set of component patterns P.sub.c (m) representing the composite image assembled from patterns in the source image pattern sets. The composite image I.sub.c is then constructed from its component patterns P.sub.c (m).

The Laplacian pyramid method for image fusion can be described in this framework. Performing the Laplacian transform serves to decompose each source image into a set of approximately circularly symmetric Gaussian-like component patterns. The pyramid is a regular decomposition into a fixed set of components. This set consists of patterns at different scales, represented by the pyramid levels, and different positions in the image, represented by the sample positions within the pyramid levels. Let L.sub.n (i, j, k) be the Laplacian value at location (i, j) in pyramid level k for image n. This value represents the amplitude of a corresponding component pattern P.sub.n (i, j, k) which is a Gaussian-like function.

A flow chart for the generation of the Gaussian and Laplacian pyramids of a source image is shown in FIG. 2. The Gaussian G(0) is the source image. The Gaussian G(0) is then filtered by F1, a low pass filter having a Gaussian rolloff, and subsampled by F2, to remove alternate pixels in each row and alternate rows, to form the first level Gaussian G(1). The lower level Gaussians G(n) are formed successively in the same way. The Laplacian L(n) corresponding to the Gaussian at each level of the pyramid is formed by restoring the subsampled data to the next lowest level of the Gaussian pyramid (by inserting zero-valued samples between the given samples, F2', then applying an interpolation filter F1) and subtracting the result from the Gaussian of the given level. The Laplacian formed in this way is known as the Reduce-Expand (RE) Laplacian. Alternatively, the Laplacian can be formed without subsampling and reinterpolation, as shown by the dotted line in FIG. 2. This is called a filter-subtract-decimate (FSD) Laplacian. In FIG. 3 a method for reconstructing an image from the Laplacian pyramid is illustrated. In this method the Laplacians are interpolated and summed to reproduce the original image (i.e. the inverse RE Laplacian pyramid transform).
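
The RE Laplacian construction and its inverse can be sketched as follows. This is a one-dimensional analogue rather than the two-dimensional image case, and several details are assumptions not taken from the text: the 5-tap kernel, the clamped border handling, the gain of 2 in the expand step, and all function names.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # assumed 5-tap low-pass filter

def blur(signal, gain=1.0):
    """Low-pass filter a 1-D signal, clamping indices at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for t, w in enumerate(KERNEL):
            j = min(max(i + t - 2, 0), n - 1)
            acc += w * signal[j]
        out.append(gain * acc)
    return out

def reduce_level(g):
    """Filter (F1), then keep every other sample (F2)."""
    return blur(g)[::2]

def expand_level(g, size):
    """Insert zero-valued samples (F2'), then interpolate with the filter (F1)."""
    up = [0.0] * size
    for i, v in enumerate(g):
        up[2 * i] = v
    return blur(up, gain=2.0)  # gain 2 compensates for the inserted zeros

def build_laplacian_pyramid(signal, levels):
    """Return the RE Laplacian levels and the coarsest Gaussian level."""
    gaussians = [list(signal)]
    for _ in range(levels):
        gaussians.append(reduce_level(gaussians[-1]))
    laplacians = []
    for k in range(levels):
        expanded = expand_level(gaussians[k + 1], len(gaussians[k]))
        laplacians.append([a - b for a, b in zip(gaussians[k], expanded)])
    return laplacians, gaussians[-1]

def reconstruct(laplacians, top):
    """Inverse RE transform: expand and add, from coarsest to finest level."""
    signal = top
    for lap in reversed(laplacians):
        expanded = expand_level(signal, len(lap))
        signal = [a + b for a, b in zip(lap, expanded)]
    return signal

source = [3.0, 7.0, 1.0, 8.0, 2.0, 9.0, 4.0, 6.0, 5.0]
laplacians, top = build_laplacian_pyramid(source, levels=2)
restored = reconstruct(laplacians, top)
```

Because each Laplacian level stores exactly what the expand step fails to recover, the reconstruction reproduces the source up to floating-point rounding, whatever filter is chosen.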

The step of combining component patterns, FIG. 1, uses the choose max rule; that is, the pyramid constructed for the composite image is formed on a sample by sample basis from the source image Laplacian values:

$$L_c(i,j,k) = \max\,[\,L_1(i,j,k),\ L_2(i,j,k),\ \ldots,\ L_N(i,j,k)\,]$$

where the function max [] takes the value of that one of its arguments that has the maximum absolute value. The composite image I.sub.c is recovered from its Laplacian pyramid representation P.sub.c through an inverse pyramid transform such as that disclosed by BURT and in U.S. Pat. No. 4,692,806.
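
A minimal sketch of the choose max rule follows. The nested-list pyramid layout (a list of levels, each a 2-D list of samples) and the function names are assumptions made for illustration.

```python
def choose_max(values):
    """Return the argument with the largest absolute value (its sign is kept)."""
    return max(values, key=abs)

def fuse_choose_max(pyramids):
    """Combine N source pyramids sample by sample.

    Each pyramid is a list of levels; each level is a 2-D list of samples.
    """
    composite = []
    for levels in zip(*pyramids):          # same level k from every source
        rows = []
        for source_rows in zip(*levels):   # same row i from every source
            rows.append([choose_max(samples) for samples in zip(*source_rows)])
        composite.append(rows)
    return composite

# Hypothetical two-level pyramids for two source images.
pyr_1 = [[[0.5, -2.0], [0.0, 1.0]], [[0.3]]]
pyr_2 = [[[-1.0, 1.5], [0.2, -0.4]], [[-0.6]]]
fused = fuse_choose_max([pyr_1, pyr_2])
```

Note that the comparison uses the absolute value while the selected sample retains its sign, so a strongly negative Laplacian value wins over a weaker positive one.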

A method of the invention for forming a composite image from a plurality of source images, as shown in FIG. 4, comprises the steps of transforming the source images into a feature-based representation by decomposing each source image I.sub.n into a set of component patterns P.sub.n (m) using a plurality of oriented functions; computing a saliency measure for each component pattern; combining the salient features from the source images by assembling patterns from the source image pattern sets P.sub.n (m) guided by the saliency measures S.sub.n (m) associated with the various source images; and constructing the composite image I.sub.c through an inverse transform from its component patterns P.sub.c (m). A saliency estimation process is applied individually to each set of component patterns P.sub.n (m) to determine a saliency measure S.sub.n (m) for each pattern. In general, saliency can be based directly on image data, I.sub.n, and/or on the component pattern representation P.sub.n (m) and/or it can take into account information from other sources. The saliency measures may relate to perceptual distinctiveness of features in the source images, or to other criteria specific to the application for which fusion is being performed (e.g., targets of interest in surveillance).

The invention is a pattern-selective method of image fusion based upon the use of oriented functions (component patterns) to represent the image and, preferably, an oriented pyramid approach that overcomes the shortcomings in the prior art and provides significantly enhanced performance. Each source image is, preferably, decomposed into a plurality of images I of different resolution (the pyramid of images), and each of these images is then decomposed into a plurality of sets of oriented component patterns. The oriented component patterns are, preferably, edge-like pattern elements of many scales and orientations obtained using the oriented pyramid. The use of the oriented pyramid improves the retention of edge-like source image patterns in the composite image. A pyramid is used that has component patterns with zero (or near zero) mean value. This ensures that artifacts due to spurious inclusion or exclusion of component patterns are not unduly visible. Component patterns are, preferably, combined through a weighted average rather than a simple selection process. The most prominent of these patterns are selected for inclusion in the composite image at each scale and orientation. A local saliency analysis, where saliency may be based on the local edge energy (or other task-specific measure) in the source images, is performed on each source image to determine the weights used in component combination. Selection is based on the saliency measures S.sub.n (m). The fused image I.sub.c is recovered from P.sub.c through an inverse pyramid transform.

This approach overcomes artifacts that have been observed in pixel-based fusion and in pattern-selective fusion within a Laplacian pyramid. Weights are obtained as a nonlinear sigmoid function of the saliency measures. Image fusion using the gradient pyramid has been found to provide excellent results even where image fusion based on the Laplacian pyramid introduces artifacts.

An alternative method of the invention computes a match measure M.sub.n1, n2 (m, l) between each pair of images represented by their component patterns, P.sub.n1 (m, l) and P.sub.n2 (m, l). These match measures are used in addition to the saliency measures S.sub.n (m, l) in forming the set of component patterns P.sub.c (m, l) of the composite image. This method may be used as well when the source images are decomposed into Laplacian component patterns that are not oriented (L=1).

Several known oriented image transforms satisfy the requirement that the component patterns be oriented and have zero mean. The gradient pyramid has basis functions of many sizes but, unlike the Laplacian pyramid, these are oriented and have zero mean. The gradient pyramid's set of component patterns P.sub.n (m) can be represented as P.sub.n (i, j, k, l) where k indicates the pyramid level (or scale), l indicates the orientation, and i, j the index position in the k, l array. The gradient pyramid value D.sub.n (i, j, k, l) is the amplitude associated with the pattern P.sub.n (i, j, k, l). It can be shown that the gradient pyramid represents images in terms of gradient-of-Gaussian basis functions of many scales and orientations. One such basis function is associated with each sample in the pyramid. When these are scaled in amplitude by the sample value, and summed, the original image is recovered exactly. Scaling and summation are implicit in the inverse pyramid transform. It is to be understood that oriented operators other than the gradient can be used, including higher derivative operators, and that the operator can be applied to image features other than amplitude.

An alternative way of analyzing images is to use wavelet image representations. Wavelet image representations, as disclosed for example by Rioul et al in the IEEE Signal Processing Magazine, October, 1991, pages 14-38, are oriented spatial functions, linear combinations of which can be used to define an image. In the case of a wavelet representation, there are at least two sets of wavelets for different orientations. Typically there are three sets of wavelet basis functions: a set of horizontally oriented functions, a set of vertically oriented functions, and a set of linear combination functions derived from wavelets having right and left diagonal orientation. Once the sets of oriented basis functions which define the source images are obtained, a set of oriented basis functions for the composite is selected in the same way as for the basis functions generated using the gradient operators and the composite image is then reconstructed from them.

The gradient pyramid for image I is obtained by applying gradient operators to each level of its Gaussian pyramid G(n) as described in Appendix 1. Four such gradients are used for the horizontal, vertical, and orthogonal diagonal directions in the images, respectively. The four gradients are then fused using a selection criterion such as saliency to select the components to be used to form the gradient pyramid representation of the composite image. To reconstruct the composite image from its gradient pyramid representation, the gradient operators are applied a second time to form four oriented second derivative pyramids. These are summed at each level of the pyramid to form a standard Laplacian pyramid from which the composite image is reconstructed through the usual expand and add inverse Laplacian pyramid transform.
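
The idea of applying four oriented gradient operators to one Gaussian pyramid level can be sketched as below. The actual operators are those of Appendix 1, which is not reproduced here; the simple first differences, the offsets, the zero border values, and the names in this sketch are illustrative assumptions only.

```python
def oriented_gradients(g):
    """Return four first-difference images of a 2-D list g, one per direction.

    The (row, col) offsets are illustrative stand-ins for the operators
    defined in Appendix 1 of the patent.
    """
    offsets = {
        "horizontal": (0, 1),
        "vertical": (1, 0),
        "diagonal_down": (1, 1),
        "diagonal_up": (-1, 1),
    }
    rows, cols = len(g), len(g[0])
    result = {}
    for name, (dr, dc) in offsets.items():
        out = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < rows and 0 <= c2 < cols:
                    out[r][c] = g[r2][c2] - g[r][c]  # difference along the offset
        result[name] = out
    return result

# Hypothetical level: intensity increases from left to right only.
ramp = [[float(c) for c in range(4)] for _ in range(3)]
grads = oriented_gradients(ramp)
```

For this ramp the horizontal difference responds while the vertical difference is zero everywhere, which illustrates how each orientation isolates structure along one direction.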

A pattern is salient if it carries information that is useful in interpreting the image. In general saliency will depend on the purpose for constructing the composite image and any measure of saliency will be task dependent. However, saliency generally increases with the amplitude of the elementary pattern. Let S.sub.n (i, j, k, l) be the saliency value corresponding to P.sub.n (i, j, k, l). A saliency measure that increases with the prominence of a component pattern can be indicated by its amplitude

S.sub.n (i,j,k,l)=.vertline.D.sub.n (i,j,k,l).vertline..

Here D.sub.n (i, j, k, l) is the amplitude associated with the pattern P.sub.n (i, j, k, l) at position (i, j) of gradient pyramid level k and orientation l. Alternatively, it can be indicated by the prominence of that component and other components within a local neighborhood. This neighborhood is indicated by a weighting function w(i',j'):

S.sub.n (i,j,k,l)=[.SIGMA..sub.i',j' w(i',j')D.sub.n (i-i',j-j',k,l).sup.2 ].sup.(1/2)

Typically the neighborhood used is the 3.times.3 array of nearest components to the particular component of interest or the 3.times.3 array of picture elements surrounding the picture element of interest, depending upon the way the components are indexed. For example, a 3.times.3 array w(i',j') can be set equal to: ##EQU1##
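The two saliency measures above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the weights in W3 are a hypothetical 3.times.3 binomial kernel chosen for the example (the actual array is the one given in EQU1), the function names are invented here, and samples beyond the image border are assumed to be zero.

```python
import math

# Hypothetical 3x3 neighborhood weights (they sum to 1). This is an assumed
# stand-in; the patent's own w(i', j') is defined by its equation EQU1.
W3 = [[1 / 16, 2 / 16, 1 / 16],
      [2 / 16, 4 / 16, 2 / 16],
      [1 / 16, 2 / 16, 1 / 16]]


def amplitude_saliency(d):
    """S = |D| at every sample of one oriented pyramid level."""
    return [[abs(v) for v in row] for row in d]


def local_rms_saliency(d, w=W3):
    """S = sqrt(sum over (i', j') of w(i', j') * D(i - i', j - j')**2).

    Samples outside the array are treated as zero (an assumption).
    """
    rows, cols = len(d), len(d[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i - di, j - dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        acc += w[di + 1][dj + 1] * d[ii][jj] ** 2
            out[i][j] = math.sqrt(acc)
    return out
```

The local measure spreads the influence of an isolated strong component over its neighbors, so a lone sample of amplitude 4 yields a saliency of 2 at its own position and 1 at a diagonal neighbor under these assumed weights.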

Another alternative measure bases salience on the occurrence of specific patterns, such as targets in the image. For example, S may be related to correlation of the source image with a filter matched to the target pattern at each sample position.

The gradient pyramid for the composite image I.sub.c is obtained by selecting components from the source pyramid basis functions P.sub.n for each set of oriented functions. Selection is repeated at each sample position based on the saliency measure. The selection rule commonly used in current practice is "choose max", that is, select the source image sample that has the greatest amplitude. However, a "soft switch" is preferable to strict selection; that is, when selection is between two component patterns that have quite different saliency, then the one with the larger saliency is chosen, but when selection is between components that have comparable saliency, then the composite sample value is taken to be the weighted average of the source samples.

The combination process is then one in which the amplitude of the combined pattern element is computed as a weighted average of the amplitudes of the source pattern elements for each orientation l.

D.sub.c (i,j,k,l)={.SIGMA..sub.n W.sub.n (i,j,k,l)D.sub.n (i,j,k,l)}/{.SIGMA..sub.n W.sub.n (i,j,k,l)}
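As a rough illustration of the combination rule, the following Python sketch computes the composite amplitude at a single sample position from the source amplitudes D.sub.n and a set of weights W.sub.n. The function names are hypothetical. The "choose max" rule is shown as the special case in which the most salient source receives weight one and all others receive weight zero; the soft-switch weights themselves are not reproduced here and are simply passed in.

```python
def choose_max_weights(saliencies):
    """One-hot weights: 1 for the most salient source, 0 for the others."""
    best = max(range(len(saliencies)), key=lambda n: saliencies[n])
    return [1.0 if n == best else 0.0 for n in range(len(saliencies))]


def combine(amplitudes, weights):
    """D_c = (sum_n W_n * D_n) / (sum_n W_n) at one sample position."""
    total = sum(weights)
    if total == 0:
        # No source contributes; returning zero is an assumption of this sketch.
        return 0.0
    return sum(w * d for w, d in zip(weights, amplitudes)) / total


# Strict selection: the source with amplitude -5 has the larger |D|.
d_strict = combine([-5.0, 3.0], choose_max_weights([5.0, 3.0]))

# Comparable saliency: equal weights give the plain average.
d_soft = combine([2.0, 4.0], [0.5, 0.5])
```

Because the weights are normalized by their sum, they need not add to one before being passed to combine.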

The weights used in this average are based on relative saliency measures over the source image. Weights are defined such that image components with higher saliency get disproportionately higher weight. As an example, let A be the total saliency at a given position

A(i,j,k,l)=.SIGMA..sub.n=1.sup.N S.sub.n (i,j,k,l)

where N is the number of source images.

For appropriately selected constants a and b, 0<a<b<1, let ##EQU2## where

T.sub.n ={S.sub.n (i,j,k,l)/A(i,j,k,l)}

is the normalized saliency at the (i, j) position, l.sup.th orientation of the k.sup.th pyramid level for the n.sup.th source image.

This sigmoid-like function accentuates the difference between weights of elements that have nearly average saliency while fixing the weight for a given element at near zero or near one if its saliency is significantly below or above average, respectively.

The final step in forming the composite image I.sub.c is its reconstruction from its gradient pyramid representation P.sub.c. The details of the computation of the inverse gradient pyramid transform are given in Appendix 1.

An alternative method of the invention for forming a composite image from a plurality of source images is shown in FIG. 4(a). In this case fusion is shown for two source images, but the method can be generalized to more than two source images. In this method a match measure, M.sub.12 (i, j, k, l), is computed between source images within a local neighborhood, w(i', j'). Typically this neighborhood weighting function is the same as that used in computing the salience measures S.sub.n (i, j, k, l). The match measure can be based, for example, on a local correlation, C.sub.12 (i, j, k, l):

C.sub.12 (i,j,k,l)=.SIGMA..sub.i',j' {w(i',j')D.sub.1 (i-i',j-j',k,l).times.D.sub.2 (i-i',j-j',k,l)}

A match measure normalized between -1 and +1 is given by

M.sub.12 (i,j,k,l)=2C.sub.12 (i,j,k,l)/{S.sub.1 (i,j,k,l)+S.sub.2 (i,j,k,l)}

The composite image pattern elements are again formed as a weighted average. For the case of two source images:

D.sub.c (i,j,k,l)=w.sub.1 (i,j,k,l)D.sub.1 (i,j,k,l)+w.sub.2 (i,j,k,l)D.sub.2 (i,j,k,l)

In the present implementation the weights w.sub.1 and w.sub.2 are based both on the match and saliency measures. Suppose for example, that

S.sub.1 (i,j,k,l)>S.sub.2 (i,j,k,l)

for a given pattern element. If M.sub.12 (i,j,k,l)<a, then w.sub.1 =1 and w.sub.2 =0. Else if M.sub.12 (i,j,k,l)>a then

w.sub.1 =1/2+1/2[(1-M.sub.12)/(1-a)]

and

w.sub.2 =1-w.sub.1

Here "a" is a parameter of the fusion process that can be set between -1 and +1. If S.sub.1 (i,j,k,l)<S.sub.2 (i,j,k,l) in the above example then the values assigned to w.sub.1 and w.sub.2 are interchanged. This alternative implementation of the invention can be used with non-oriented component patterns, such as those of the Laplacian pyramid, as well as with oriented patterns, such as those of the gradient pyramid.
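The match-and-saliency weighting described above can be sketched in Python as follows. Several points are assumptions made only for this illustration: the 3.times.3 weights W3 are a hypothetical binomial kernel, samples beyond the border are treated as zero, the default a=0.75 is arbitrary, ties in saliency and the case M.sub.12 =a (which the text leaves open) are resolved as noted in the comments, and the denominator of the match measure uses the local energy (the sum of weighted squares, without the square root), since that choice keeps M.sub.12 between -1 and +1.

```python
# Hypothetical 3x3 neighborhood weights, assumed for this sketch only.
W3 = [[1 / 16, 2 / 16, 1 / 16],
      [2 / 16, 4 / 16, 2 / 16],
      [1 / 16, 2 / 16, 1 / 16]]


def local_sum(x, y, w, i, j):
    """sum over (i', j') of w(i', j') * x(i-i', j-j') * y(i-i', j-j')."""
    rows, cols = len(x), len(x[0])
    acc = 0.0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            ii, jj = i - di, j - dj
            if 0 <= ii < rows and 0 <= jj < cols:  # zero beyond the border
                acc += w[di + 1][dj + 1] * x[ii][jj] * y[ii][jj]
    return acc


def match_measure(d1, d2, w, i, j):
    """M12 = 2 * C12 / (E1 + E2), with E_n the local energy of source n."""
    c12 = local_sum(d1, d2, w, i, j)
    energy = local_sum(d1, d1, w, i, j) + local_sum(d2, d2, w, i, j)
    return 2.0 * c12 / energy if energy > 0 else 0.0  # 0.0 is an assumption


def match_weights(s1, s2, m, a=0.75):
    """Return (w1, w2) for one sample; requires -1 < a < 1."""
    if m < a:
        w_large = 1.0  # poor match: select the more salient source outright
    else:
        w_large = 0.5 + 0.5 * (1.0 - m) / (1.0 - a)  # also covers m == a
    w_small = 1.0 - w_large
    # Ties (s1 == s2) give the larger weight to source 2; an arbitrary choice.
    return (w_large, w_small) if s1 > s2 else (w_small, w_large)
```

With these definitions, two identical neighborhoods give a match of +1 and equal weights of 1/2, while neighborhoods of opposite sign give a match of -1 and strict selection of the more salient source.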

The invention is also apparatus for forming a composite image from a plurality of source images comprising means for transforming the source images into a feature-based representation by decomposing each source image I.sub.n into a set of component patterns P.sub.n (m) using a plurality of oriented functions; means for computing a saliency measure for each component pattern; means for forming the component patterns P.sub.c (m) of the composite image by assembling patterns from the source image pattern sets P.sub.n (m) guided by the saliency measures S.sub.n (m) associated with the various source images; and means for constructing the composite image through an inverse transform from its component patterns P.sub.c (m).

Apparatus for implementing the method of the invention is shown in FIGS. 5-8. The apparatus is shown in terms of two source images but it is understood that any number of source images can be used with appropriate modification of the apparatus.

The frame stores FS1 and FS2, if necessary, are used to convert input source images generated in an interlaced format to a progressive scan format for subsequent processing and to adjust timing. A television camera output is typically in interlaced format.

The combination of pyramid circuit P1 and frame store FS3 is used to compute the k-level Gaussian pyramid representation G.sub.a (k) of the input source image I.sub.a and the combination of circuit P2 and frame store FS4 is used to compute the k-level Gaussian pyramid representation G.sub.b (k) of the input source image I.sub.b. The circuits P1 and P2 provide the low pass filter with a Gaussian rolloff and the pixel subsampling (removal/decimation of alternate pixels in each row and of alternate rows of the filtered image). The next operation on each level of the Gaussian pyramids G(k) is a filter (1+w') which is performed by circuit P3 and circuit P4 to form G.sub.a.sup.f (k) and G.sub.b.sup.f (k), respectively. The purpose of this pre-filter P3 and post-filter P8 is to adjust overall filter characteristics to provide an exact correspondence between intermediate results in the gradient pyramid transform and the Laplacian transform. Alternatively, this filter may be applied at other points in the sequence of transform steps. Other similar filters can be used to obtain approximate results. w' is a three by three binomial filter: ##EQU3## And the filter P3 has the form: ##EQU4##
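The filter-and-subsample operation performed by circuits P1 and P2 can be illustrated in software. The sketch below is only an approximation under stated assumptions: it uses a hypothetical separable 5-tap binomial kernel as the Gaussian-like low pass filter (the text does not specify the taps), replicates edge samples at the border, and keeps every second pixel of every second row. The function names are invented for the example, and the (1+w') pre-filter is not included.

```python
# Hypothetical 5-tap binomial low-pass kernel (sums to 1); an assumption.
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def _filter_rows(img, k=KERNEL):
    """Convolve each row with k, replicating the edge samples."""
    r = len(k) // 2
    cols = len(img[0])
    return [[sum(k[t + r] * row[min(max(j + t, 0), cols - 1)]
                 for t in range(-r, r + 1))
             for j in range(cols)]
            for row in img]


def _transpose(img):
    return [list(col) for col in zip(*img)]


def reduce_level(img):
    """Low-pass filter in both directions, then drop alternate rows and pixels."""
    blurred = _transpose(_filter_rows(_transpose(_filter_rows(img))))
    return [row[::2] for row in blurred[::2]]


def gaussian_pyramid(img, levels):
    """Return [G(0), G(1), ..., G(levels - 1)], with G(0) the input image."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid
```

For an 8-by-8 input and three levels, the sketch yields arrays of 8-by-8, 4-by-4, and 2-by-2 samples; a constant image stays constant at every level because the kernel sums to one.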

Next, each of the filtered Gaussian pyramids G.sub.a.sup.f (k) and G.sub.b.sup.f (k) is filtered with four oriented gradient filters representing the horizontal d.sub.h, vertical d.sub.v, right diagonal d.sub.rd, and left diagonal d.sub.ld filters respectively. ##EQU5## These operations are performed by circuits P5 and P6, producing the eight oriented gradient pyramids D.sub.a (k, h), D.sub.a (k, rd), D.sub.a (k, v), D.sub.a (k, ld), D.sub.b (k, h), D.sub.b (k, rd), D.sub.b (k, v), D.sub.b (k, ld). It is to be understood that while the gradient operators shown here use only the two nearest neighbor samples in the particular direction, a larger number of neighbors can be used in the gradient calculation.

In FIG. 6, circuits P5 and P6 comprise four subtractors 61, 62, 63 and 64. The input signal is connected directly to an input of subtractor 61 and through a single pixel delay 65 to the second input of subtractor 61. The output of subtractor 61 is d.sub.h. The input signal is connected directly to an input of subtractor 62 and through a single line delay 66 to the second input of subtractor 62. The output of subtractor 62 is d.sub.v. The input signal is connected through pixel delay 65 to an input of subtractor 63 and through line delay 66 to the second input of subtractor 63. The output of subtractor 63 is d.sub.rd. The input signal is connected directly to an input of subtractor 64 and through line delay 66 and pixel delay 65 to the second input of subtractor 64. The output of subtractor 64 is d.sub.ld. P5 and P6 can be implemented using a commercial Field Programmable Gate Array circuit (FPGA) such as XC3042 manufactured by Xilinx, Inc., San Jose, Calif. 95124.
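The delay-and-subtract structure of FIG. 6 can be mimicked in software as a sketch. The following Python function is illustrative only and rests on assumptions: each subtractor is taken to compute its first input minus its second input, samples before the start of a row or before the first line are taken to be zero, and any scale factors that the oriented filters may carry are omitted. A pixel delay corresponds to index j-1 and a line delay to index i-1.

```python
def oriented_gradients(img):
    """Nearest-neighbor differences for the h, v, rd and ld orientations."""
    rows, cols = len(img), len(img[0])

    def at(i, j):
        # Delayed samples that precede the image are assumed to be zero.
        return img[i][j] if i >= 0 and j >= 0 else 0.0

    def build(fn):
        return [[fn(i, j) for j in range(cols)] for i in range(rows)]

    return {
        # Subtractor 61: direct input minus pixel-delayed input.
        "h": build(lambda i, j: at(i, j) - at(i, j - 1)),
        # Subtractor 62: direct input minus line-delayed input.
        "v": build(lambda i, j: at(i, j) - at(i - 1, j)),
        # Subtractor 63: pixel-delayed input minus line-delayed input.
        "rd": build(lambda i, j: at(i, j - 1) - at(i - 1, j)),
        # Subtractor 64: direct input minus line- and pixel-delayed input.
        "ld": build(lambda i, j: at(i, j) - at(i - 1, j - 1)),
    }
```

Applied to each level of a filtered Gaussian pyramid, this yields four oriented arrays per level, matching the four gradient pyramids produced for each source image.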

The fusion function combines two or more images into a composite image as shown schematically in FIG. 8(a). Here the fusion function is computed on the four oriented gradient pyramids of the source images. It can also be applied to the Laplacian pyramid directly, but with less effectiveness.

The functional dependence of W.sub.n on the total salience A for source image I.sub.n.sup.a is shown in FIG. 8(b) for the case of two input images. The functions: ##EQU6## can be implemented with a single look-up-table (LUT) of size 64K.times.8 if the input and output images are 8 bits as illustrated in FIG. 8(c).
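The table size follows from the addressing: two 8-bit operands concatenated form a 16-bit address, giving 2.sup.16 =64K entries of 8 bits each. The Python sketch below shows that addressing scheme only. The tabulated function example_fn is a hypothetical placeholder and is not the function of EQU6; the names build_lut and lookup are likewise invented for the illustration.

```python
def build_lut(fn):
    """Tabulate fn(a, b) for all 8-bit a and b; address = (a << 8) | b."""
    return bytes(fn(a, b) & 0xFF for a in range(256) for b in range(256))


def lookup(lut, a, b):
    """Read the 8-bit result for 8-bit operands a and b."""
    return lut[(a << 8) | b]


def example_fn(a, b):
    """Hypothetical placeholder: the share of a in a + b, scaled to 0..255."""
    return 0 if a + b == 0 else round(255 * a / (a + b))


lut = build_lut(example_fn)
```

Once the table is built, any two-operand function of 8-bit values costs a single memory read per sample, which is what makes the approach attractive in hardware.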

As examples, saliency may be based on absolute sample value or on a local root mean square average where

S.sub.n (i,j,k,l)=[.SIGMA..sub.i',j' w(i',j')D.sub.n (i-i',j-j',k,l).sup.2 ].sup.(1/2).

In FIG. 8(e) an implementation of the local average method as shown in FIG. 4(a) is illustrated. A match measure, M.sub.12 (i,j,k,l), is computed between source images D.sub.1 (i,j,k,l) and D.sub.2 (i,j,k,l) within a local neighborhood, w(i',j'). Typically this neighborhood weighting function is the same as that used in computing the salience measures S.sub.n (i,j,k,l). The composite image pattern elements are again formed as a weighted average. For the case of two source images:

D.sub.c (i,j,k,l)=w.sub.1 (i,j,k,l)D.sub.1 (i,j,k,l)+w.sub.2 (i,j,k,l)D.sub.2 (i,j,k,l)

The local correlation, C.sub.12 (i,j,k,l) is

C.sub.12 (i,j,k,l)=.SIGMA..sub.i',j' {w(i',j')D.sub.1 (i-i',j-j',k,l).times.D.sub.2 (i-i',j-j',k,l)}

and the match measure is:

M.sub.12 (i,j,k,l)=2C.sub.12 (i,j,k,l)/{S.sub.1 (i,j,k,l)+S.sub.2 (i,j,k,l)}

The appropriate weighting function is then selected from a lookup table in the IF function. The weights are preferably selected as follows.

If S.sub.1 (i,j,k,l)>S.sub.2 (i,j,k,l) for a given pattern element and if M.sub.12 (i,j,k,l)<a, then w.sub.1 =1 and w.sub.2 =0. Else if M.sub.12 (i,j,k,l)>a then

w.sub.1 =1/2+1/2[(1-M.sub.12)/(1-a)]

and

w.sub.2 =1-w.sub.1

Here "a" is a parameter of the fusion process that can be set between -1 and +1. If S.sub.1 (i,j,k,l)<S.sub.2 (i,j,k,l) in the above example then the values assigned to w.sub.1 and w.sub.2 are interchanged.

Subsequently, a weighted sum of the oriented gradient pyramids is computed in each of the orientations separately, resulting in the four composite oriented gradient pyramids D.sub.