Description
The invention relates to an improved method for fusing two or more source
images to form a composite image with extended information content and
apparatus for forming the composite image from the source images.
BACKGROUND OF THE INVENTION
Image fusion is a process that combines two or more source images to form a
single composite image with extended information content. Typically images
from different sensors, such as infra-red and visible cameras, computer
aided tomography (CAT) and magnetic resonance imaging (MRI) systems, are
combined to form the composite image. Multiple images of a given scene
taken with different types of sensors, such as visible and infra-red
cameras, or images taken with a given type of sensor and scene but under
different imaging conditions, such as with different scene illumination or
camera focus, may be combined. Image fusion is successful to the extent
that: (1) the composite image retains all useful information from the
source images, (2) the composite image does not contain any artifacts
generated by the fusion process, and (3) the composite image looks
natural, so that it can be readily interpreted through normal visual
perception by humans or machines. What counts as useful information is
determined by the user of the composite image, and this in turn determines
which features of the different source images are selected for inclusion
in the composite image.
The most direct approach to fusion, known in the art, is to align the
source images, then sum, or average, across images at each pixel position.
This and other pixel-based approaches often yield unsatisfactory results
since individual source features appear in the composite with reduced
contrast or appear jumbled as in a photographic double exposure.
Known pattern selective image fusion tries to overcome these deficiencies
by identifying salient features in the source images and preserving these
features in the composite at full contrast. Each source image is first
decomposed into a set of primitive pattern elements. A set of pattern
elements for the composite image is then assembled by selecting salient
patterns from the primitive pattern elements of the source images.
Finally, the composite image is constructed from its set of primitive
pattern elements.
Burt in Multiresolution Image Processing And Analysis, V. 16, pages 20-51,
1981 (hereinafter "BURT") and Anderson et al in U.S. Pat. No. 4,692,806,
incorporated herein by reference for its teachings on image decomposition
techniques, have disclosed an image decomposition technique in which an
original comparatively high-resolution image comprised of a first number
of pixels is processed to derive a wide field-of-view, low resolution
image comprised of a second number of pixels smaller than the first
number. The process for decomposing the image to produce lower resolution
images is typically performed using a plurality of low-pass filters of
differing bandwidth having a Gaussian roll-off. U.S. Pat. No. 4,703,514,
incorporated herein by reference, has disclosed a means for implementing
the pyramid process for the analysis of images.
The Laplacian pyramid approach to image fusion is perhaps the best known
pattern-selective method. BURT first disclosed the use of image fusion
techniques based on the Laplacian pyramid for binocular fusion in human
vision. U.S. Pat. No. 4,661,986 disclosed the use of the Laplacian
technique for the construction of an image with an extended depth of field
from a set of images taken with a fixed camera but with different focal
settings. A. Toet in Machine Vision and Applications, V. 3, pages 1-11
(1990) has disclosed a modified Laplacian pyramid that has been used to
combine visible and IR images for surveillance applications. More recently
M. Pavel et al in Proceedings of the AIAA Conference on Computing in
Aerospace, V. 8, Baltimore, October 1991 have disclosed a Laplacian
pyramid for combining a camera image with graphically generated imagery as
an aid to aircraft landing. Burt et al in ACM Trans. on Graphics, V. 2,
pages 217-236 (1983) and in the Proceedings of SPIE, V. 575, pages 173-181
(1985) have developed related Laplacian pyramid techniques to merge images
into mosaics for a variety of applications.
In effect, a Laplacian transform is used to decompose each source image
into regular arrays of Gaussian-like basis functions of many sizes. These
patterns are sometimes referred to as basis functions of the pyramid
transform, or as wavelets. The multiresolution pyramid of source images
permits coarse features to be analyzed at low resolution and fine features
to be analyzed at high resolution. Each sample value of a pyramid
represents the amplitude associated with a corresponding basis function.
In the Laplacian pyramid approach to fusion cited above, the combination
process selects the most prominent of these patterns from the source
images for inclusion in the fused image. The source pyramids are combined
through selection on a sample by sample basis to form a composite pyramid.
Current practice is to use a "choose max rule" in this selection; that is,
at each sample location in the pyramid source image, the source image
sample with the largest value is copied to become the corresponding sample
in the composite pyramid. If, at a given sample location, there are other
source image samples that have nearly the same value as the sample with the
largest value, these may be averaged to obtain the corresponding sample
of the composite pyramid. Finally, the composite image is recovered from
the composite pyramid through an inverse Laplacian transform. By way of
example, in the approach disclosed in U.S. Pat. No. 4,661,986, the
respective source image samples with the largest value, which are copied
at each pyramid level, correspond to samples of that one of the source
images which is more in focus.
In the case of the Laplacian transform, the component patterns take the
form of circularly symmetric Gaussian-like intensity functions. Component
patterns of a given scale tend to have large amplitude where there are
distinctive features in the image of about that scale. Most image patterns
can be described as being made up of edge-like primitives. The edges in
turn are represented within the pyramid by collections of component
patterns.
While the Laplacian pyramid technique has been found to provide good
results, sometimes visible artifacts are introduced into the composite
image. These may occur, for example, along extended contours in the scene
due to the fact that such higher level patterns are represented in the
Laplacian pyramid rather indirectly. An intensity edge is represented in
the Laplacian pyramid by Gaussian patterns at all scales with positive
values on the lighter side of the edge, negative values on the darker, and
zero at the location of the edge itself. If not all of these primitives
survive the selection process, the contour is not completely rendered in
the composite. An additional shortcoming is due to the fact that the
Gaussian-like component patterns have non-zero mean values. Errors in the
selection process lead to changes in the average image intensity within
local regions of a scene. These artifacts are particularly noticeable when
sequences of composite or fused images are displayed. The selection
process is intrinsically binary: the basis function from one or the other
source image is chosen. If the magnitudes of the basis functions vary, for
example because of noise in the image or sensor motion, the selection
process may alternately select the basis functions from different source
images. This leads to unduly perceptible artifacts such as flicker and
crawlers.
Further, while the prior art may employ color in the derivation of the
fused composite image itself, there is no way in the prior art of
retaining the identity of those source images that contributed to
particular displayed information in a fused composite image. For example,
in a surveillance application, an observer may want to know if the source
of a bright feature he sees in the composite image comes from an IR camera
source image, so represents a hot object, or comes from a visible camera
source, so represents a light colored, or intensely illuminated object.
Thus there is a need for improved methods of image fusion (in addition to
the prior-art methods of either averaging or "choose max rule" selection,
and the use of color) which overcome these shortcomings in the prior art
and provide better image quality and/or saliency for the user in a
composite image formed by the image fusion process, particularly when
sequences of composite images are displayed.
SUMMARY OF THE INVENTION
A method of the invention for forming a composite image from N source
images, where N is greater than one, comprises the steps of decomposing
each source image I.sub.n, n=1 to N, into a plurality L of sets of
oriented component patterns P.sub.n (m, l); computing a saliency measure
S.sub.n (m, l) for each component pattern P.sub.n (m, l); selecting
component patterns from the component pattern sets P.sub.n (m, l) using
the saliency measures S.sub.n (m, l) to form a set of oriented component
patterns P.sub.c (m, l) for the composite image; and constructing the
composite image I.sub.c from the set of oriented component patterns
P.sub.c (m, l).
The invention is also an apparatus for forming a composite image from a
plurality of source images comprising means for decomposing each source
image into a plurality of sets of oriented component patterns; means for
computing a saliency measure for each component pattern; means for
selecting component patterns from the component pattern sets using the
saliency measures to form a set of oriented component patterns of the
composite image; and means for constructing the composite image from the
set of oriented component patterns.
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 is a flow chart showing a prior art method for pattern-based image
fusion.
FIG. 2 diagrammatically illustrates a method for forming the Gaussian and
Laplacian pyramids.
FIG. 3 diagrammatically illustrates a method for reconstructing the
original image from the Laplacian pyramid.
FIG. 4 diagrammatically illustrates a method for pattern-based image fusion
of the invention.
FIG. 4(a) diagrammatically illustrates a method for pattern-based image
fusion using both saliency and match.
FIG. 5 illustrates the implementation of the method of the invention in
real-time digital image processing hardware.
FIG. 6 is a schematic circuit diagram of the circuits P5 and P6.
FIG. 7 is a schematic circuit diagram of the circuit P7.
FIGS. 8 (a), (c), (d) and (e) are schematic diagrams of different circuits
implementing the weighting function.
FIG. 8 (b) is a graphical illustration of a particular weighting function.
FIG. 9 is a timing diagram of when the various images and pyramid levels
may be computed in a system with I/O frame stores, and assuming interlace
I/O.
FIG. 10(a) is a photograph of a source image from a standard visible light
camera.
FIG. 10(b) is a photograph of a source image from an infrared camera.
FIG. 10(c) is a photograph of the fused image obtained using the method of
the invention.
FIG. 11 is a block diagram of an illustrative embodiment that converts two
separate monochromatic source images into a fused composite colored image.
FIG. 12 is a block diagram diagrammatically illustrating an example of the
fusion process shown in FIG. 11.
FIG. 13 is a block diagram of an exemplary pyramid circuit.
DETAILED DESCRIPTION
A flow chart for a prior art pattern-based image fusion is shown in FIG. 1.
The source images are assumed to be aligned prior to undertaking the
fusion steps. The fusion method comprises the steps of transforming each
source image I.sub.n into a feature-based representation where each image
I.sub.n is decomposed into a set of component patterns P.sub.n (m), where
n=1, 2, . . . , N, the number of source images, and m=1, 2, . . . , M the
number of patterns in the set for the n.sup.th source image. Features from
the source images are combined to form a set of component patterns P.sub.c
(m) representing the composite image assembled from patterns in the source
image pattern sets. The composite image I.sub.c is then constructed from
its component patterns P.sub.c (m).
The Laplacian pyramid method for image fusion can be described in this
framework. Performing the Laplacian transform serves to decompose each
source image into a set of approximately circularly symmetric
Gaussian-like component patterns. The pyramid is a regular decomposition
into a fixed set of components. This set consists of patterns at different
scales, represented by the pyramid levels, and different positions in the
image, represented by the sample positions within the pyramid levels. Let
L.sub.n (i, j, k) be the Laplacian value at location (i, j) in pyramid
level k for image n. This value represents the amplitude of a
corresponding component pattern P.sub.n (i, j, k) which is a Gaussian-like
function.
A flow chart for the generation of the Gaussian and Laplacian pyramids of a
source image is shown in FIG. 2. The Gaussian G(0) is the source image.
The Gaussian G(0) is then filtered by F1, a low pass filter having a
Gaussian rolloff, and subsampled by F2, to remove alternate pixels in each
row and alternate rows, to form the first level Gaussian G(1). The lower
level Gaussians G(n) are formed successively in the same way. The
Laplacian L(n) corresponding to the Gaussian at each level of the pyramid
is formed by restoring the subsampled data to the next lowest level of the
Gaussian pyramid (by inserting zero-valued samples between the given
samples, F2', then applying an interpolation filter F1) and subtracting from
the Gaussian of the given level. The Laplacian formed in this way is known
as the Reduce-Expand (RE) Laplacian. Alternatively, the Laplacian can be
formed without subsampling and reinterpolation as shown by the dotted line
in FIG. 2. This is called a filter-subtract-decimate (FSD) Laplacian. In FIG.
3 a method for reconstructing an image from the Laplacian pyramid is
illustrated. In this method the Laplacians are interpolated and summed to
reproduce the original image (i.e. the inverse RE Laplacian pyramid
transform).
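The Reduce-Expand construction and its inverse described above can be sketched in a few lines. This is a minimal illustration, not the circuit of the figures: the 5-tap binomial kernel, the edge-replicating border handling, and the function names `blur`, `reduce_level`, `expand_level`, `laplacian_pyramid` and `reconstruct` are all assumptions made for the example.

```python
import numpy as np

# Assumed 5-tap binomial kernel. The text only calls for a low-pass filter
# with a Gaussian rolloff (F1), so this particular choice is illustrative.
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def blur(image):
    """Separable low-pass filter (F1) with edge-replicated borders."""
    h, w = image.shape
    padded = np.pad(image, 2, mode="edge")
    rows = sum(KERNEL[t] * padded[:, t:t + w] for t in range(5))
    return sum(KERNEL[t] * rows[t:t + h, :] for t in range(5))


def reduce_level(image):
    """Filter with F1, then drop alternate rows and columns (F2)."""
    return blur(image)[::2, ::2]


def expand_level(image, shape):
    """Insert zero-valued samples (F2'), then interpolate with F1."""
    up = np.zeros(shape)
    up[::2, ::2] = image
    return 4.0 * blur(up)  # gain of 4 compensates for the inserted zeros


def laplacian_pyramid(image, levels):
    """Return RE Laplacian levels L(0)..L(levels-1) plus the last Gaussian."""
    pyramid = []
    current = np.asarray(image, dtype=float)
    for _ in range(levels):
        smaller = reduce_level(current)
        pyramid.append(current - expand_level(smaller, current.shape))
        current = smaller
    pyramid.append(current)  # low-pass residual G(levels)
    return pyramid


def reconstruct(pyramid):
    """Inverse RE transform: expand and add, coarsest level first."""
    current = pyramid[-1]
    for lap in reversed(pyramid[:-1]):
        current = lap + expand_level(current, lap.shape)
    return current
```

Because each Laplacian level stores exactly what the expand step fails to restore, `reconstruct(laplacian_pyramid(image, levels))` returns the input up to floating-point rounding, whatever low-pass kernel is assumed.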
The step of combining component patterns, FIG. 1, uses the choose max rule;
that is, the pyramid constructed for the composite image is formed on a
sample by sample basis from the source image Laplacian values:
L.sub.c (i,j,k)=max [L.sub.1 (i,j,k), L.sub.2 (i,j,k), . . . , L.sub.N
(i,j,k)]
where the function max [] takes the value of that one of its arguments that
has the maximum absolute value. The composite image I.sub.c is recovered
from its Laplacian pyramid representation P.sub.c through an inverse
pyramid transform such as that disclosed by BURT and in U.S. Pat. No.
4,692,806.
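The choose max rule above can be illustrated for a single pyramid level as follows. This is a sketch under the assumption that the N source levels are same-shaped arrays; the function name `choose_max` is hypothetical.

```python
import numpy as np


def choose_max(levels):
    """Pick, per sample, the source value with the largest absolute value.

    `levels` is a sequence of N same-shaped arrays, one Laplacian level
    per source image.
    """
    stack = np.stack([np.asarray(level, dtype=float) for level in levels])
    winner = np.argmax(np.abs(stack), axis=0)  # index n of the winning source
    return np.take_along_axis(stack, winner[None], axis=0)[0]
```

Note that `np.argmax` resolves exact ties in favor of the first source, whereas the background discussion mentions averaging samples of nearly equal value; that refinement is not shown here.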
A method of the invention for forming a composite image from a plurality of
source images, as shown in FIG. 4, comprises the steps of transforming the
source images into a feature-based representation by decomposing each
source image I.sub.n into a set of component patterns P.sub.n (m) using a
plurality of oriented functions; computing a saliency measure for each
component pattern; combining the salient features from the source images
by assembling patterns from the source image pattern sets P.sub.n (m)
guided by the saliency measures S.sub.n (m) associated with the various
source images; and constructing the composite image I.sub.c through an
inverse transform from its component patterns P.sub.c (m). A saliency
estimation process is applied individually to each set of component
patterns P.sub.n (m) to determine a saliency measure S.sub.n (m) for each
pattern. In general, saliency can be based directly on image data,
I.sub.n, and/or on the component pattern representation P.sub.n (m) and/or
it can take into account information from other sources. The saliency
measures may relate to perceptual distinctiveness of features in the
source images, or to other criteria specific to the application for which
fusion is being performed (e.g., targets of interest in surveillance).
The invention is a pattern selective method of image fusion based upon the
use of oriented functions (component patterns) to represent the image and,
preferably, an oriented pyramid approach that overcomes the shortcomings
in the prior art and provides significantly enhanced performance. Each
source image is, preferably, decomposed into a plurality of images I of
different resolution (the pyramid of images), and each of these images is
then decomposed into a plurality of sets of oriented component patterns. The
oriented component patterns are, preferably, edge-like pattern elements of
many scales and orientations obtained using the oriented pyramid. The use of the
oriented pyramid improves the retention of edge-like source image patterns
in the composite image. A pyramid is used that has component patterns with
zero (or near zero) mean value. This ensures that artifacts due to
spurious inclusion or exclusion of component patterns are not unduly
visible. Component patterns are, preferably, combined through a weighted
average rather than a simple selection process. The most prominent of
these patterns are selected for inclusion in the composite image at each
scale and orientation. A local saliency analysis, where saliency may be
based on the local edge energy (or other task-specific measure) in the
source images, is performed on each source image to determine the weights
used in component combination. Selection is based on the saliency measures
S.sub.n (m). The fused image I.sub.c is recovered from P.sub.c through an
inverse pyramid transform.
This approach overcomes artifacts that have been observed in pixel-based
fusion and in pattern-selective fusion within a Laplacian pyramid. Weights
are obtained as a nonlinear sigmoid function of the saliency measures.
Image fusion using the gradient pyramid has been found to provide
excellent results even where image fusion based on the Laplacian pyramid
introduces artifacts.
An alternative method of the invention computes a match measure M.sub.n1,
n2 (m, l) between each pair of images represented by their component
patterns, P.sub.n1 (m, l) and P.sub.n2 (m, l). These match measures are
used in addition to the saliency measures S.sub.n (m, l) in forming the
set of component patterns P.sub.c (m, l) of the composite image. This
method may be used as well when the source images are decomposed into
Laplacian component patterns that are not oriented (L=1).
Several known oriented image transforms satisfy the requirement that the
component patterns be oriented and have zero mean. The gradient pyramid
has basis functions of many sizes but, unlike the Laplacian pyramid, these
are oriented and have zero mean. The gradient pyramid's set of component
patterns P.sub.n (m) can be represented as P.sub.n (i, j, k, l) where k
indicates the pyramid level (or scale), l indicates the orientation, and
i, j the index position in the k, l array. The gradient pyramid value
D.sub.n (i, j, k, l) is the amplitude associated with the pattern P.sub.n
(i, j, k, l). It can be shown that the gradient pyramid represents images
in terms of gradient-of-Gaussian basis functions of many scales and
orientations. One such basis function is associated with each sample in
the pyramid. When these are scaled in amplitude by the sample value, and
summed, the original image is recovered exactly. Scaling and summation are
implicit in the inverse pyramid transform. It is to be understood that
oriented operators other than the gradient can be used, including higher
derivative operators, and that the operator can be applied to image
features other than amplitude.
An alternative way of analyzing images is to use wavelet image
representations. Wavelet image representations, as disclosed for example
by Rioul et al in the IEEE Signal Processing Magazine, October, 1991,
pages 14-38, are oriented spatial functions, linear combinations of which
can be used to define an image. In the case of a wavelet representation,
there are at least two sets of wavelets for different orientations.
Typically there are three sets of wavelet basis functions: a set of
horizontally oriented functions, a set of vertically oriented functions,
and a set of linear combination functions derived from wavelets having
right and left diagonal orientation. Once the sets of oriented basis
functions which define the
source images are obtained, a set of oriented basis functions for the
composite is selected in the same way as for the basis functions generated
using the gradient operators and the composite image is then reconstructed
from them.
The gradient pyramid for image I is obtained by applying gradient operators
to each level of its Gaussian pyramid G(n) as described in Appendix 1.
Four such gradients are used for the horizontal, vertical, and orthogonal
diagonal directions in the images, respectively. The four gradients are
then fused using a selection criterion such as saliency to select the
components to be used to form the gradient pyramid representation of the
composite image. To reconstruct the composite image from its gradient
pyramid representation, the gradient operators are applied a second time
to form four oriented second derivative pyramids. These are summed at each
level of the pyramid to form a standard Laplacian pyramid from which the
composite image is reconstructed through the usual expand and add inverse
Laplacian pyramid transform.
A pattern is salient if it carries information that is useful in
interpreting the image. In general saliency will depend on the purpose for
constructing the composite image and any measure of saliency will be task
dependent. However, saliency generally increases with the amplitude of the
elementary pattern. Let S.sub.n (i, j, k, l) be the saliency value
corresponding to P.sub.n (i, j, k, l). A saliency measure that increases
with the prominence of a component pattern can be indicated by its
amplitude
S.sub.n (i,j,k,l)=.vertline.D.sub.n (i,j,k,l).vertline..
Here D.sub.n (i, j, k, l) is the amplitude associated with the pattern
P.sub.n (i, j, k, l) at position (i, j) of gradient pyramid level k and
orientation l. Alternatively, it can be indicated by the prominence of
that component and other components within a local neighborhood. This
neighborhood is indicated by a weighting function w(i',j'):
S.sub.n (i,j,k,l)=[.SIGMA..sub.i',j' w(i',j')D.sub.n (i-i',j-j',k,l).sup.2
].sup.(1/2)
Typically the neighborhood used is the
3.times.3 array of nearest components to the particular component of
interest or the 3.times.3 array of picture elements surrounding the
picture element of interest, depending upon the way the components are
indexed. For example, a 3.times.3 array w(i',j') can be set equal to:
##EQU1##
Another alternative measure bases salience on the occurrence of specific
patterns, such as targets in the image. For example, S may be related to
correlation of the source image with a filter matched to the target
pattern at each sample position.
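The amplitude and local root-mean-square saliency measures above can be sketched as follows. The 3.times.3 weights in `WINDOW` are a hypothetical choice that sums to one, since the weights used in the text are not reproduced here; zero padding at the borders and the function names are likewise assumptions.

```python
import numpy as np

# Hypothetical 3x3 neighborhood weights w(i', j') that sum to one.
WINDOW = np.array([[1.0, 2.0, 1.0],
                   [2.0, 4.0, 2.0],
                   [1.0, 2.0, 1.0]]) / 16.0


def saliency_amplitude(d):
    """S = |D|: saliency as the magnitude of each sample."""
    return np.abs(np.asarray(d, dtype=float))


def local_sum(values, window=WINDOW):
    """Weighted sum over each sample's 3x3 neighborhood (zero padded).

    The window is symmetric, so correlation and convolution coincide.
    """
    values = np.asarray(values, dtype=float)
    h, w = values.shape
    padded = np.pad(values, 1, mode="constant")
    total = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            total += window[di, dj] * padded[di:di + h, dj:dj + w]
    return total


def saliency_rms(d, window=WINDOW):
    """S = sqrt(sum over (i', j') of w(i', j') * D(i - i', j - j')^2)."""
    return np.sqrt(local_sum(np.square(np.asarray(d, dtype=float)), window))
```

With a normalized window, `saliency_rms` equals `saliency_amplitude` wherever the pattern is locally constant, and it spreads the influence of an isolated strong sample over its neighbors.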
The gradient pyramid for the composite image I.sub.c is obtained by
selecting components from the source pyramid basis functions P.sub.n for
each set of oriented functions. Selection is repeated at each sample
position based on the saliency measure. The selection rule commonly used
in current practice is "choose max", that is, select that source image
sample that has the greatest amplitude. However a "soft switch" is
preferable to strict selection; that is, when selection is between two
component patterns that have quite different saliency, then the one with
the larger saliency is chosen, but when selection is between components
that have comparable saliency, then the composite sample value is taken to
be the weighted average of the source samples.
The combination process is then one in which the amplitude of the combined
pattern element is computed as a weighted average of the amplitudes of the
source pattern elements for each orientation l.
D.sub.c (i,j,k,l)={.SIGMA..sub.n W.sub.n (i,j,k,l)D.sub.n (i,j,k,l)}/{.SIGMA..sub.n
W.sub.n (i,j,k,l)}
The weights used in this average are based on relative saliency measures
over the source image. Weights are defined such that image components with
higher saliency get disproportionately higher weight. As an example, let A
be the total saliency at a given position
A(i,j,k,l)=.SIGMA..sub.n S.sub.n (i,j,k,l)
where the sum is taken over n=1 to N, and N is the number of source images.
For appropriately selected constants a and b, 0<a<b<1, let
##EQU2##
where
T.sub.n ={S.sub.n (i,j,k,l)/A(i,j,k,l)}
is the normalized saliency at the (i, j) position, l.sup.th orientation of
the k.sup.th pyramid level for the n.sup.th source image.
This sigmoid like function accentuates the difference between weights of
elements that have nearly average saliency while fixing the weights for a
given element at near zero or near one if its salience is significantly
below or above average, respectively.
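The weighted-average combination can be sketched as below. The exact weighting function is not reproduced in this text, so the clamped linear ramp in `soft_weights` is a hypothetical stand-in chosen only to show the behavior described: weights near zero or one for saliency well below or above the thresholds, and a graded transition in between. The default values of `a` and `b`, the fallback to a plain average, and the function names are assumptions.

```python
import numpy as np


def soft_weights(saliencies, a=0.3, b=0.7):
    """Map normalized saliency T_n = S_n / A to a weight in [0, 1].

    The clamped ramp is a hypothetical stand-in for the weighting function
    of the text; only the constraint 0 < a < b < 1 is taken from it.
    """
    s = np.stack([np.asarray(x, dtype=float) for x in saliencies])
    total = s.sum(axis=0)                              # A(i, j, k, l)
    safe_total = np.where(total > 0, total, 1.0)
    t = np.where(total > 0, s / safe_total, 1.0 / len(s))
    return np.clip((t - a) / (b - a), 0.0, 1.0)


def combine(patterns, saliencies, a=0.3, b=0.7):
    """D_c = sum_n(W_n * D_n) / sum_n(W_n), computed sample by sample."""
    d = np.stack([np.asarray(x, dtype=float) for x in patterns])
    w = soft_weights(saliencies, a, b)
    w_sum = w.sum(axis=0)
    # Assumed fallback: use a plain average wherever every weight is zero.
    w = np.where(w_sum > 0, w, 1.0)
    return (w * d).sum(axis=0) / w.sum(axis=0)
```

For two sources, a sample whose normalized saliency is 0.9 is copied unchanged, while two samples of equal saliency are averaged, which is the "soft switch" behavior described earlier.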
The final step in forming the composite image I.sub.c is its reconstruction
from its gradient pyramid representation P.sub.c. The details of the
computation of the inverse gradient pyramid transform are given in
Appendix 1.
An alternative method of the invention for forming a composite image from a
plurality of source images is shown in FIG. 4(a). In this case fusion is
shown for two source images, but the method can be generalized to more
than two source images. In this method a match measure, M.sub.12 (i, j, k,
l), is computed between source images within a local neighborhood, w(i',
j'). Typically this neighborhood weighting function is the same as that
used in computing the salience measures S.sub.n (i, j, k, l). The match
measure can be based, for example, on a local correlation, C.sub.12 (i, j,
k, l):
C.sub.12 (i,j,k,l)=.SIGMA..sub.i',j' {w(i',j')D.sub.1
(i-i',j-j',k,l).times.D.sub.2 (i-i',j-j',k,l)}
A match measure normalized between -1 and +1 is given by
M.sub.12 (i,j,k,l)=2C.sub.12 (i,j,k,l)/{S.sub.1 (i,j,k,l)+S.sub.2 (i,j,k,l)}
The composite image pattern elements are again formed as a weighted
average. For the case of two source images:
D.sub.c (i,j,k,l)=w.sub.1 (i,j,k,l)D.sub.1 (i,j,k,l)+w.sub.2 (i,j,k,l)D.sub.2
(i,j,k,l)
In the present implementation the weights w.sub.1 and w.sub.2 are based
both on the match and saliency measures. Suppose for example, that
S.sub.1 (i,j,k,l)>S.sub.2 (i,j,k,l)
for a given pattern element. If M.sub.12 (i,j,k,l)<a, then w.sub.1 =1 and
w.sub.2 =0. Else, if M.sub.12 (i,j,k,l)>a, then
w.sub.1 =1/2+1/2[(1-M.sub.12)/(1-a)]
and
w.sub.2 =1-w.sub.1
Here "a" is a parameter of the fusion process that can be set between -1
and +1. If S.sub.1 (i,j,k,l)<S.sub.2 (i,j,k,l) in the above example then
the values assigned to w.sub.1 and w.sub.2 are interchanged. This
alternative implementation of the invention can be used with non-oriented
component patterns, such as those of the Laplacian pyramid, as well as
with oriented patterns, such as those of the gradient pyramid.
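The match-and-saliency rule for two source images can be sketched as follows. Several choices here are assumptions rather than part of the text: the 3.times.3 weights in `WINDOW`, the zero padding, the default threshold `a=0.75`, and in particular the use of local energies (the squares of the root-mean-square saliency) in the denominator of the match measure, which is what keeps the computed match within -1 and +1.

```python
import numpy as np

# Hypothetical 3x3 neighborhood weights w(i', j'), normalized to sum to one.
WINDOW = np.array([[1.0, 2.0, 1.0],
                   [2.0, 4.0, 2.0],
                   [1.0, 2.0, 1.0]]) / 16.0


def local_sum(values):
    """Weighted 3x3 neighborhood sum with zero padding at the borders."""
    h, w = values.shape
    padded = np.pad(values, 1, mode="constant")
    return sum(WINDOW[di, dj] * padded[di:di + h, dj:dj + w]
               for di in range(3) for dj in range(3))


def fuse_with_match(d1, d2, a=0.75):
    """Fuse two same-shaped pattern arrays using saliency and match."""
    d1 = np.asarray(d1, dtype=float)
    d2 = np.asarray(d2, dtype=float)
    e1 = local_sum(d1 * d1)        # local energy of source 1 (assumed form)
    e2 = local_sum(d2 * d2)        # local energy of source 2 (assumed form)
    c12 = local_sum(d1 * d2)       # local correlation C_12
    denom = e1 + e2
    safe = np.where(denom > 0, denom, 1.0)
    match = np.where(denom > 0, 2.0 * c12 / safe, 1.0)
    # Weight of the more salient source: 1 below the threshold, then a
    # linear fall to 1/2 as the match approaches +1.
    w_large = np.where(match < a, 1.0, 0.5 + 0.5 * (1.0 - match) / (1.0 - a))
    w1 = np.where(e1 >= e2, w_large, 1.0 - w_large)
    return w1 * d1 + (1.0 - w1) * d2
```

Where the two sources agree locally the result tends toward their average, and where they differ the more salient source is selected outright, so the rule interpolates between averaging and selection.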
The invention is also apparatus for forming a composite image from a
plurality of source images comprising means for transforming the source
images into a feature-based representation by decomposing each source
image I.sub.n into a set of component patterns P.sub.n (m) using a
plurality of oriented functions; means for computing a saliency measure
for each component pattern; means for forming the component patterns
P.sub.c (m) of the composite image by assembling patterns from the source
image pattern sets P.sub.n (m) guided by the saliency measures S.sub.n (m)
associated with the various source images; and means for constructing the
composite image through an inverse transform from its component patterns
P.sub.c (m).
Apparatus for implementing the method of the invention is shown in FIGS.
5-8. The apparatus is shown in terms of two source images but it is
understood that any number of source images can be used with appropriate
modification of the apparatus.
The frame stores FS1 and FS2, if necessary, are used to convert input
source images generated in an interlaced format to a progressive scan
format for subsequent processing and to adjust timing. A television camera
output is typically in interlaced format.
The combination of pyramid circuit P1 and frame store FS3 is used to
compute the k-level Gaussian pyramid representation G.sub.a (k) of the
input source image I.sub.a and the combination of circuit P2 and frame
store FS4 is used to compute the k-level Gaussian pyramid representation
G.sub.b (k) of the input source image I.sub.b. The circuits P1 and P2
provide the low pass filter with a Gaussian rolloff and the pixel
subsampling (removal/decimation of alternate pixels in each row and
alternate rows of the filtered image). The next operation on each level of the
Gaussian pyramids G(k) is a filter (1+w') which is performed by circuit P3
and circuit P4 to form G.sub.a.sup.f (k) and G.sub.b.sup.f (k),
respectively. The purpose of the pre-filter P3 and the post-filter P8 is to
adjust overall filter characteristics to provide an exact correspondence
between intermediate results in the gradient pyramid transform and the
Laplacian transform. Alternatively, this filter may be applied at other
points in the sequence of transform steps. Other similar filters can be
used to obtain approximate results. w' is a three by three binomial
filter:
##EQU3##
And the filter P3 has the form:
##EQU4##
Next, each of the filtered Gaussian pyramids G.sub.a.sup.f (k) and
G.sub.b.sup.f (k) is filtered with four oriented gradient filters
representing the horizontal d.sub.h, vertical d.sub.v, right diagonal
d.sub.rd, and left diagonal d.sub.ld filters respectively.
##EQU5##
These operations are performed by circuits P5 and P6, producing the eight
oriented gradient pyramids D.sub.a (k, h), D.sub.a (k, rd), D.sub.a (k,
v), D.sub.a (k, ld), D.sub.b (k, h), D.sub.b (k, rd), D.sub.b (k, v),
D.sub.b (k, ld). It is to be understood that while the gradient operators
shown here use only the two nearest neighbor samples in the particular
direction, a larger number of neighbors can be used in the gradient
calculation.
In FIG. 6, circuits P5 and P6 comprise four subtractors 61, 62, 63 and 64.
The input signal is connected directly to an input of subtractor 61 and
through a single pixel delay 65 to the second input of subtractor 61. The
output of subtractor 61 is d.sub.h. The input signal is connected directly
to an input of subtractor 62 and through a single line delay 66 to the
second input of subtractor 62. The output of subtractor 62 is d.sub.v. The
input signal is connected through pixel delay 65 to an input of subtractor
63 and through line delay 66 to the second input of subtractor 63. The
output of subtractor 63 is d.sub.rd. The input signal is connected
directly to an input of subtractor 64 and through line delay 66 and pixel
delay 65 to the second input of subtractor 64. The output of subtractor 64
is d.sub.ld. P5 and P6 can be implemented using a commercial Field
Programmable Gate Array circuit (FPGA) such as XC3042 manufactured by
Xilinx, Inc., San Jose, Calif. 95124.
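The delay-and-subtract wiring described for FIG. 6 can be mimicked in software as a rough sketch. The sign of each difference (which subtractor input is negated), the zero fill at the raster borders, and the names `delayed` and `oriented_gradients` are assumptions; only the pairing of direct, pixel-delayed and line-delayed signals follows the description above.

```python
import numpy as np


def delayed(image, rows=0, cols=0):
    """Delay a raster by whole lines (rows) and pixels (cols), zero filled."""
    out = np.zeros_like(image)
    h, w = image.shape
    out[rows:, cols:] = image[:h - rows, :w - cols]
    return out


def oriented_gradients(level):
    """Four difference images for one pyramid level.

    Sign conventions are assumed; the text does not state them.
    """
    x = np.asarray(level, dtype=float)
    pixel = delayed(x, cols=1)           # single pixel delay 65
    line = delayed(x, rows=1)            # single line delay 66
    both = delayed(x, rows=1, cols=1)    # line delay 66 then pixel delay 65
    return {
        "h": x - pixel,      # subtractor 61
        "v": x - line,       # subtractor 62
        "rd": pixel - line,  # subtractor 63
        "ld": x - both,      # subtractor 64
    }
```

Each output uses only the two nearest samples along its direction, matching the remark that larger neighborhoods could be used instead.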
The fusion function combines two or more images into a composite image as
shown schematically in FIG. 8(a). Here the fusion function is computed on
the four oriented gradient pyramids of the source images. It can also be
applied to the Laplacian pyramid directly, but with less effectiveness.
The functional dependence of W.sub.n on the total salience A for source
image I.sub.n.sup.a is shown in FIG. 8(b) for the case of two input
images. The functions:
##EQU6##
can be implemented with a single look-up-table (LUT) of size 64K.times.8
if the input and output images are 8 bits as illustrated in FIG. 8(c).
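The look-up-table arrangement can be illustrated in software: two 8-bit inputs form a 16-bit address into a table of 65,536 8-bit entries. The tabulated function `example_weight` is purely hypothetical (normalized saliency scaled to 8 bits) and is not the function of the text; `build_lut` and `apply_lut` are likewise illustrative names.

```python
import numpy as np


def build_lut(func):
    """Tabulate an 8-bit function of two 8-bit inputs (65536 x 8 bits)."""
    a, b = np.meshgrid(np.arange(256), np.arange(256), indexing="ij")
    values = func(a.astype(float), b.astype(float))
    return np.clip(np.rint(values), 0, 255).astype(np.uint8).ravel()


def apply_lut(lut, image_a, image_b):
    """Look up each sample pair; the table address is (a << 8) | b."""
    address = (image_a.astype(np.uint16) << 8) | image_b.astype(np.uint16)
    return lut[address]


def example_weight(s_a, s_b):
    """Hypothetical table contents: 255 * s_a / (s_a + s_b)."""
    total = s_a + s_b
    ratio = np.divide(s_a, total, out=np.full_like(s_a, 0.5), where=total > 0)
    return 255.0 * ratio
```

Because the table is indexed by both inputs at once, any function of two 8-bit values can be substituted without changing the lookup step.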
As examples, saliency may be based on absolute sample value or on a local
root mean square average where
S.sub.n (i,j,k,l)=[.SIGMA..sub.i',j' w(i',j')D.sub.n (i-i',j-j',k,l).sup.2
].sup.(1/2).
In FIG. 8(e) an implementation of the local average method as shown in FIG.
4(a) is illustrated. A match measure, M.sub.12 (i,j,k,l), is computed
between source images D.sub.1 (i,j,k,l) and D.sub.2 (i,j,k,l) within a
local neighborhood, w(i',j'). Typically this neighborhood weighting
function is the same as that used in computing the salience measures
S.sub.n (i,j,k,l). The composite image pattern elements are again formed
as a weighted average. For the case of two source images:
D.sub.c (i,j,k,l)=w.sub.1 (i,j,k,l)D.sub.1 (i,j,k,l)+w.sub.2 (i,j,k,l)D.sub.2
(i,j,k,l)
The local correlation, C.sub.12 (i,j,k,l), is
C.sub.12 (i,j,k,l)=.SIGMA..sub.i',j' {w(i',j')D.sub.1
(i-i',j-j',k,l).times.D.sub.2 (i-i',j-j',k,l)}
and the match measure is:
M.sub.12 (i,j,k,l)=2C.sub.12 (i,j,k,l)/{S.sub.1 (i,j,k,l)+S.sub.2 (i,j,k,l)}
The appropriate weighting function is then selected from a lookup table in
the IF function. The weights are preferably selected as follows.
If S.sub.1 (i,j,k,l)>S.sub.2 (i,j,k,l) for a given pattern element and if
M.sub.12 (i,j,k,l)<a, then w.sub.1 =1 and w.sub.2 =0. Else, if M.sub.12
(i,j,k,l)>a, then
w.sub.1 =1/2+1/2[(1-M.sub.12)/(1-a)]
and
w.sub.2 =1-w.sub.1
Here "a" is a parameter of the fusion process that can be set between -1
and +1. If S.sub.1 (i,j,k,l)<S.sub.2 (i,j,k,l) in the above example then
the values assigned to w.sub.1 and w.sub.2 are interchanged.
Subsequently, a weighted sum of the oriented gradient pyramids is computed
in each of the orientations separately, resulting in the four composite
oriented gradient pyramids D.sub.