Claims
What is claimed is:
1. A method of processing a plurality of images to generate a
three-dimensional mosaic of a scene comprising the steps of:
providing a plurality of images of the scene; and
registering said images along a non-planar parametric surface to construct
said three-dimensional mosaic containing an image mosaic of registered
images and a shape mosaic, where said image mosaic represents a panoramic
view of the scene and said shape mosaic represents a three-dimensional
geometry of the scene.
2. The method of claim 1 wherein said registering step further comprises
the steps of:
registering each image in said plurality of images along said non-planar
parametric surface to produce registered images;
determining, in response to said registered images, translation parameters
and a parametric motion field useful in aligning the images along the
non-planar parametric surface; and
generating a parallax field representing parallax of objects within the
scene.
3. The method of claim 2 further comprising the step of converting said
plurality of images into a plurality of multi-resolutional pyramids, where
each image pyramid contains a plurality of levels.
4. The method of claim 3 wherein said registering and determining steps are
iterated over each of said levels within said multi-resolutional pyramids
until said plurality of images are registered to a predefined degree of
accuracy.
5. The method of claim 4 wherein said predefined degree of accuracy is a
sum of the squares difference measure integrated over selected regions
within each of said levels of said multi-resolutional pyramids.
6. The method of claim 1 wherein said shape mosaic contains a parallax
motion field.
7. The method of claim 1 wherein said image mosaic and said shape mosaic
are multi-resolutional pyramids.
8. The method of claim 1 further comprising the steps of:
converting said image mosaic and said shape mosaic into multi-resolutional
pyramids;
converting a new image into a multi-resolutional pyramid; and
determining pose parameters for relating the new image with the image
mosaic and the shape mosaic, where the pose parameters contain translation
parameters, a planar motion field, and a parallax motion field for the new
image.
9. The method of claim 8 further comprising the step of generating an
updated image mosaic and an updated shape mosaic, each containing the new
image and the pose parameters.
10. The method of claim 8 further comprising the steps of:
providing an existing three-dimensional mosaic;
determining pose parameters for a new image with respect to said existing
three-dimensional mosaic;
warping said existing three-dimensional mosaic to image coordinates of said
new image to create a synthetic image, where said synthetic image
represents a view of the three-dimensional mosaic from the coordinates of
the new image; and
merging said synthetic image into said new image to produce a new
three-dimensional mosaic that is a combination of said new image and said
existing three-dimensional mosaic.
11. The method of claim 10 further comprising the steps of:
providing a next image that sequentially follows said new image;
detecting changes between said new image, said existing three-dimensional
mosaic, and said next image, where said changes represent motion within
the scene without detecting parallax due to viewpoint change as said
motion.
12. The method of claim 1 further comprising the steps of:
detecting points within said three-dimensional mosaic that are occluded
within the scene by objects in the scene; and
image processing the detected occluded points such that said occluded
points do not produce artifacts in said three-dimensional mosaic.
13. The method of claim 1 further comprising the step of:
estimating a height of points within said three-dimensional mosaic relative
to said parametric surface, where said height of said points form a height
map that represents the height of object points within said scene.
14. The method of claim 1 further comprising the steps of:
providing a plurality of three-dimensional mosaics representing a scene
from different viewpoints, where a three-dimensional mosaic has been
generated at each viewpoint;
warping said plurality of three-dimensional mosaics to a reference
coordinate system;
merging said plurality of three-dimensional mosaics to form a composite
three-dimensional mosaic;
providing coordinates for a new viewpoint of said scene;
determining parameters to relate said new viewpoint coordinates to said
composite three-dimensional mosaic; and
warping said composite three-dimensional mosaic to said viewpoint
coordinates to create a synthetic image, where said synthetic image
represents a new view of the composite three-dimensional mosaic taken from
the new viewpoint.
15. The method of claim 1 further comprising the steps of:
providing a plurality of three-dimensional mosaics representing a scene
from different viewpoints, where a three-dimensional mosaic has been
generated at each viewpoint;
providing coordinates for a new viewpoint of said scene;
determining parameters to relate said new viewpoint coordinates to a
plurality of the three-dimensional mosaics;
warping said plurality of three-dimensional mosaics to said viewpoint
coordinates to create a synthetic image, where said synthetic image
represents a new view of the three-dimensional mosaic taken from the new
viewpoint; and
merging said plurality of three-dimensional mosaics to form said synthetic
image.
16. The method of claim 1 wherein said registering step further comprises
the steps of:
performing a plane-then-parallax process including the steps of registering
each image in said plurality of images along a parametric surface to
produce initially registered images; determining, in response to said
initially registered images, initial translation parameters and an initial
parametric motion field useful in initially aligning the images along the
parametric surface; and generating an initial parallax field representing
parallax of objects within the scene; and
simultaneously registering, using said initial translation parameters,
initial parametric motion field and initial parallax field, said images in
said plurality of images along said parametric surface to produce final
registered images, determining, in response to said final registered
images, final translation parameters and a final parametric motion field
useful in aligning the images along the parametric surface, and generating
a final parallax field representing parallax of objects within the scene.
17. The method of claim 16 further comprising the step of converting said
plurality of images into a plurality of multi-resolutional pyramids, where
each multi-resolutional pyramid contains a plurality of levels.
18. The method of claim 17 wherein said registering, determining and
simultaneously registering steps are iterated over each of said levels
within said multi-resolutional pyramids until said plurality of images are
registered to a predefined degree of accuracy.
19. The method of claim 18 wherein said predefined degree of accuracy is a
sum of the squares difference measure integrated over selected regions
within each of said levels of said multi-resolutional pyramids.
20. The method of claim 16 wherein said image mosaic and said shape mosaic
are multi-resolutional pyramids.
21. A method of processing a plurality of images to generate a
three-dimensional mosaic of a scene comprising the steps of:
providing a plurality of images of the scene;
simultaneously registering said images in said plurality of images along a
parametric surface to produce registered images, determining, in response
to said registered images, translation parameters and a parametric motion
field useful in aligning the images along the parametric surface, and
generating a parallax field representing parallax of objects not lying
within said parametric surface.
22. The method of claim 21 further comprising the step of converting said
plurality of images into a plurality of multi-resolutional pyramids, where
each multi-resolutional pyramid contains a plurality of levels.
23. The method of claim 22 wherein said registering and determining steps
are iterated over each of said levels within said multi-resolutional
pyramids until said plurality of images are registered to a predefined
degree of accuracy.
24. The method of claim 23 wherein said predefined degree of accuracy is a
sum of the squares difference measure integrated over selected regions
within each of said levels of said image pyramids.
25. The method of claim 21 further comprising the steps of:
converting said image mosaic and said shape mosaic into multi-resolutional
pyramids;
converting a new image into a multi-resolutional pyramid; and
determining pose parameters for relating the new image with the image
mosaic and the shape mosaic, where the pose parameters contain translation
parameters, a planar motion field, and a parallax motion field for the new
image.
26. The method of claim 25 further comprising the step of generating an
updated image mosaic and an updated shape mosaic, each containing the new
image and the pose parameters.
27. The method of claim 25 further comprising the steps of:
providing an existing three-dimensional mosaic;
determining pose parameters for a new image with respect to said existing
three-dimensional mosaic;
warping said existing three-dimensional mosaic to image coordinates of said
new image to create a synthetic image, where said synthetic image
represents a view of the three-dimensional mosaic from the coordinates of
the new image; and
merging said synthetic image into said new image to produce a new
three-dimensional mosaic that is a combination of said new image and said
existing three-dimensional mosaic.
28. The method of claim 27 further comprising the steps of:
providing a next image that sequentially follows said new image;
detecting changes between said new image, said existing three-dimensional
mosaic, and said next image, where said changes represent motion within
the scene without detecting parallax due to viewpoint change as said
motion.
29. The method of claim 21 further comprising the steps of:
detecting points within said three-dimensional mosaic that are occluded
within the scene by objects in the scene; and
image processing the detected occluded points such that said occluded
points do not produce artifacts in said three-dimensional mosaic.
30. The method of claim 21 further comprising the step of:
estimating a height of points within said three-dimensional mosaic relative
to said parametric surface, where said height of said points form a height
map that represents the height of object points within said scene.
31. The method of claim 21 further comprising the steps of:
providing a plurality of three-dimensional mosaics representing a scene
from different viewpoints, where a three-dimensional mosaic has been
generated at each viewpoint;
warping said plurality of three-dimensional mosaics to a reference
coordinate system;
merging said plurality of three-dimensional mosaics to form a composite
three-dimensional mosaic;
providing coordinates for a new viewpoint of said scene;
determining parameters to relate said new viewpoint coordinates to said
composite three-dimensional mosaic; and
warping said composite three-dimensional mosaic to said viewpoint
coordinates to create a synthetic image, where said synthetic image
represents a new view of the composite three-dimensional mosaic taken from
the new viewpoint.
32. The method of claim 21 further comprising the steps of:
providing a plurality of three-dimensional mosaics representing a scene
from different viewpoints, where a three-dimensional mosaic has been
generated at each viewpoint;
providing coordinates for a new viewpoint of said scene;
determining parameters to relate said new viewpoint coordinates to a
plurality of the three-dimensional mosaics;
warping said plurality of three-dimensional mosaics to said viewpoint
coordinates to create a synthetic image, where said synthetic image
represents a new view of the three-dimensional mosaic taken from the new
viewpoint; and
merging said plurality of three-dimensional mosaics to form said synthetic
image.
33. The method of claim 21 wherein said registering step further comprises
the steps of:
performing a plane-then-parallax process including the steps of registering
each image in said plurality of images along a parametric surface to
produce initially registered images; determining, in response to said
initially registered images, initial translation parameters and an initial
parametric motion field useful in initially aligning the images along the
parametric surface; and generating an initial parallax field representing
parallax of objects within the scene; and
simultaneously registering, using said initial translation parameters,
initial parametric motion field and initial parallax field, said images in
said plurality of images along said parametric surface to produce final
registered images, determining, in response to said final registered
images, final translation parameters and a final parametric motion field
useful in aligning the images along the parametric surface, and generating
a final parallax field representing parallax of objects within the scene.
34. The method of claim 33 further comprising the step of converting said
plurality of images into a plurality of multi-resolutional pyramids, where
each multi-resolutional pyramid contains a plurality of levels.
35. The method of claim 34 wherein said registering, determining and
simultaneously registering steps are iterated over each of said levels
within said multi-resolutional pyramids until said plurality of images are
registered to a predefined degree of accuracy.
36. The method of claim 35 wherein said predefined degree of accuracy is a
sum of the squares difference measure integrated over selected regions
within each of said levels of said multi-resolutional pyramids.
37. The method of claim 34 wherein said image mosaic and said shape mosaic
are multi-resolutional pyramids.
38. A method of processing a plurality of images to generate a
three-dimensional mosaic of a scene comprising the steps of:
providing a plurality of images of the scene;
registering each image in said plurality of images along a non-planar
parametric surface to produce registered images; and
determining, in response to said registered images, translation parameters
and a parametric motion field useful in aligning the images along the
non-planar parametric surface; and
generating a parallax field representing parallax of objects within the
scene;
constructing, in response to said translation parameters, parametric motion
field, and said parallax field, said three-dimensional mosaic containing
an image mosaic and a shape mosaic, where said image mosaic represents a
panoramic view of the scene and said shape mosaic represents a
three-dimensional geometry of the scene.
39. The method of claim 38 further comprising the step of converting said
plurality of images into a plurality of multi-resolutional pyramids, where
each multi-resolutional pyramid contains a plurality of levels.
40. The method of claim 39 wherein said registering and determining steps
are iterated over each of said levels within said multi-resolutional
pyramids until said plurality of images are registered to a predefined
degree of accuracy.
41. The method of claim 40 wherein said predefined degree of accuracy is a
sum of the squares difference measure integrated over selected regions
within each of said levels of said multi-resolutional pyramids.
42. The method of claim 41 wherein said shape mosaic contains a parametric
motion field and a parallax motion field.
43. The method of claim 38 wherein said image mosaic and said shape mosaic
are multi-resolutional pyramids.
44. A method of processing a plurality of images to generate a
three-dimensional mosaic of a scene comprising the steps of:
providing a plurality of images of the scene;
simultaneously registering said images in said plurality of images along a
parametric surface to produce registered images, determining, in response
to said registered images, translation parameters and a parametric motion
field useful in aligning the images along the parametric surface, and
generating a parallax field representing parallax of objects within the
scene; and
constructing, in response to said translation parameters, parametric motion
field, and said parallax field, said three-dimensional mosaic containing
an image mosaic and a shape mosaic, where said image mosaic represents a
panoramic view of the scene and said shape mosaic represents a
three-dimensional geometry of the scene.
45. The method of claim 44 further comprising the step of converting said
plurality of images into a plurality of multi-resolutional pyramids, where
each multi-resolutional pyramid contains a plurality of levels.
46. The method of claim 45 wherein said registering and determining steps
are iterated over each of said levels within said multi-resolutional
pyramids until said plurality of images are registered to a predefined
degree of accuracy.
47. The method of claim 46 wherein said predefined degree of accuracy is a
sum of the squares difference measure integrated over selected regions
within each of said levels of said multi-resolutional pyramids.
48. The method of claim 44 wherein said image mosaic and said shape mosaic
are multi-resolutional pyramids.
49. An image processing system for generating a three-dimensional mosaic
of a scene from a plurality of images of the
scene, comprising:
means for storing said plurality of images;
a registration processor, connected to said storing means, for registering
said images along a non-planar parametric surface to construct said
three-dimensional mosaic containing an image mosaic and a shape mosaic,
where said image mosaic represents a panoramic view of the scene and said
shape mosaic represents a three-dimensional geometry of the scene.
50. The system of claim 49 wherein said registration processor further
comprises:
a plane-then-parallax registration processor for aligning said images along
said non-planar parametric surface that extends through the plurality of
images to produce translation parameters and a parametric motion field
used to align the images within the image mosaic and then for determining
a parallax field representing objects within the scene.
51. An image processing system for generating a three-dimensional mosaic of
a scene from a plurality of images of the scene, comprising:
means for storing said plurality of images;
a plane-and-parallax registration processor for simultaneously aligning
said images along a parametric surface that extends through the plurality
of images to produce translation parameters and a parametric motion field
used to align the images within the image mosaic and for determining a
parallax field representing objects within the scene.
52. An image processing system for generating a three-dimensional mosaic of
a scene from a plurality of images of the scene, comprising:
means for storing said plurality of images;
a plane-then-parallax registration processor for aligning said images along
a parametric surface that extends through the plurality of images to
produce initial translation parameters and an initial parametric motion
field used to align the images within the image mosaic and then for
determining an initial parallax field representing objects within the
scene that do not lie in the parametric surface; and
a plane-and-parallax registration processor, connected to an output of said
plane-then-parallax registration processor, for simultaneously aligning
said images along said parametric surface to produce final translation
parameters and a final parametric motion field used to align the images
within the image mosaic and for determining a final parallax field
representing objects within the scene that do not lie in the parametric
surface.
53. The system of claim 52 further comprising a three-dimensional mosaic
generator, connected to said registration processor, for combining said
images in said plurality of images using said final translation parameters
and said final motion flow field to form said image mosaic and for
generating said shape mosaic containing the final parallax field.
Description
The invention relates to image processing systems, and more particularly,
the invention relates to an image processing system that combines multiple
images into a mosaic using a parallax-based technique.
BACKGROUND OF THE DISCLOSURE
Until recently, image processing systems have generally processed images,
such as frames of video, still photographs, and the like, on an
individual, image-by-image basis. Each individual frame or photograph is
typically processed by filtering, warping, and applying various parametric
transformations. In order to form a panoramic view of the scene, the
individual images are combined to form a two-dimensional mosaic, i.e., an
image that contains a plurality of individual images. Additional image
processing is performed on the mosaic to ensure that the seams between the
images are invisible such that the mosaic looks like a single large image.
The alignment of the images and the additional processing to remove seams
is typically accomplished manually by a technician using a computer
workstation, i.e., the image alignment and combination processes are
computer aided. In such computer aided image processing systems, the
technician manually selects processed images, manually aligns those
images, and a computer applies various image combining processes to the
images to remove any seams or gaps between the images. Manipulation of the
images is typically accomplished using various computer input devices such
as a mouse, trackball, keyboard and the like. Since manual mosaic
generation is costly, those skilled in the art have developed automated
systems for generating image mosaics.
In automated systems for constructing mosaics, the information within a
mosaic is generally expressed as two-dimensional motion fields. The motion
is represented as a planar motion field, e.g., an affine or projective
motion field. Such a system is disclosed in U.S. patent application Ser.
No. 08/339,491, entitled "Mosaic Based Image Processing System", filed
Nov. 14, 1994, now U.S. Pat. No. 5,649,032, and herein incorporated by
reference. The image processing approach disclosed in the '491 application
automatically combines multiple image frames into one or more
two-dimensional mosaics. However, that system does not account for
parallax motion that may cause errors in the displacement fields
representing motion in the mosaic.
In other types of image processing systems, multiple images are analyzed in
order to recover photogrammetric information such as relative orientation
estimation, range map recovery and the like without generating a mosaic.
These image analysis techniques assume that the internal camera parameters
(e.g., focal length, pixel resolution, aspect ratio, and image center) are
known. In automated image processing systems that use alignment and
photogrammetry, the alignment and photogrammetric process involves two
steps: (1) establishing correspondence between pixels within various
images via some form of area- or feature-based matching scheme, and (2)
analyzing pixel displacement in order to recover three-dimensional scene
information.
Other image processing systems have analyzed image motion within a
three-dimensional scene that is imaged from multiple viewpoints to
determine the range or depth of objects within the scene. Such an approach
is disclosed in K. J. Hanna, "Direct Multi-Resolution Estimation of
Ego-Motion and Structure From Motion", Proceedings of the IEEE Workshop on
Visual Motion, Princeton, N.J., Oct. 7-9, 1991, pp. 156-162, and K. J.
Hanna et al., "Combining Stereo and Motion Analysis for Direct Estimation
of Scene Structure", Proceedings of the Fourth International Conference on
Computer Vision (ICCV'93), Berlin, Germany, May, 1993. The disclosures
within both these papers are incorporated herein by reference. The prior
art methods of generating three-dimensional representations have
heretofore not been used in conjunction with systems that generate
two-dimensional mosaics. Consequently, these approaches are used to
analyze the three-dimensional geometry of a scene, but do not form useful
representations of combinations of images such as mosaics.
Therefore, a need exists in the art for a system that automatically
generates, from a plurality of images, a three-dimensional mosaic that
accurately represents both the two-dimensional image information and the
three-dimensional geometry within a scene.
SUMMARY OF THE INVENTION
The disadvantages associated with the prior art are overcome by the present
invention of a system for generating three-dimensional mosaics from a
plurality of input images. The plurality of input images contains at least two
images of a single scene, where at least two of the images have
overlapping regions but, in general, depict the scene from differing
viewpoints. The input images are generated by either a single camera
producing a series of video frames or a plurality of cameras generating
still or video frames from differing viewpoints of the same scene. In
either case, the input images to the system are digital images that are
either digitized by the camera or digitized after the camera generates the
image. The system combines the input images using a parallax-based
approach that generates a three-dimensional mosaic comprising an image
mosaic representing a panoramic view of the scene and a shape mosaic
representing the three-dimensional geometry of the scene. From this
three-dimensional mosaic, any viewpoint of the scene can be synthetically
derived, i.e., viewpoints that are not collocated with the camera(s) that
originally imaged the scene. Furthermore, such a three-dimensional mosaic
can be used to estimate object height within the imaged scene as well as
be used for efficient compression of video information for transmission or
storage.
More specifically, the system generates the three-dimensional mosaic using
a sequence of image processing techniques. First, the images and any
existing three-dimensional mosaic into which the images are to be
incorporated are subsampled to form conventional multi-resolutional image
pyramids. Then, the system uses a sequential image registration process
dubbed a plane-then-parallax (P-then-P) process to compute image alignment
parameters and the parallax motion that exists between images. Lastly, the
full alignment and parallax field generation is achieved using a
simultaneous image registration process dubbed a plane-and-parallax
(P-and-P) process. After each step of processing, the degree of image
alignment is monitored such that, if accurate alignment is attained,
subsequent processing is avoided. In the broadest use of the invention,
either P-then-P or P-and-P processing can be used alone to register the
images. These image registration processes compute both alignment and
motion parameters (e.g., translation parameters for alignment and both a
parallax field and a planar motion field for motion estimation) that are
useful for aligning images to generate an image mosaic and for capturing
the three-dimensional geometry of the scene to generate a shape mosaic. As
such, the result of the registration processes can be used to generate a
three-dimensional mosaic containing a two-dimensional image mosaic and a
shape mosaic. From the information contained in the three-dimensional
mosaic, a synthetic viewpoint of the scene can be generated that would
take into account any parallax within the scene to produce a realistic
view. The system uses pose estimation processing of the three-dimensional
mosaic to achieve an image from the synthetic viewpoint. Also, the system
contains a process for detecting occluded points in the scene such that
these occluded points can be further processed to achieve a realistic
synthetic image.
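The pyramid-based, coarse-to-fine portion of the processing described above can be illustrated with a minimal sketch. The sketch is not the P-then-P or P-and-P process itself: it assumes a translation-only motion model, a simple 2x2 averaging filter for subsampling, and a wrap-around shift as a stand-in for image warping. All function names (downsample, build_pyramid, ssd, shift_columns, estimate_shift) are hypothetical and chosen only for illustration. The sum of squared differences is used as the alignment measure, in keeping with the accuracy measure recited in the claims.

```python
# Illustrative sketch only. The 2x2 box filter, the translation-only motion
# model and every function name here are assumptions made for this example.

def downsample(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return a list [finest, ..., coarsest] containing `levels` images."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def shift_columns(image, dx):
    """Shift an image horizontally by dx pixels with wrap-around (toy warp)."""
    width = len(image[0])
    return [[row[(c - dx) % width] for c in range(width)] for row in image]


def estimate_shift(reference, inspection, levels=3, search=1):
    """Coarse-to-fine search for the horizontal shift that minimizes the SSD."""
    ref_pyr = build_pyramid(reference, levels)
    ins_pyr = build_pyramid(inspection, levels)
    shift = 0
    for level in range(levels - 1, -1, -1):
        # A shift of s pixels at one level is 2s pixels at the next finer level.
        shift *= 2
        candidates = range(shift - search, shift + search + 1)
        shift = min(
            candidates,
            key=lambda s: ssd(ref_pyr[level], shift_columns(ins_pyr[level], -s)),
        )
    return shift


# Usage: a 16x16 image with a bright band, displaced by four pixels.
reference = [[1.0 if c < 4 else 0.0 for c in range(16)] for _ in range(16)]
inspection = shift_columns(reference, 4)
estimated = estimate_shift(reference, inspection)
```

In this sketch the estimate found at a coarse level seeds a narrow search at the next finer level, so only a few candidates are evaluated per level. A fuller implementation would replace the single shift with the translation parameters, planar motion field and parallax field described above.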
Such three-dimensional mosaics find use in estimating object height within
a scene; in achieving efficient image and video compression, storage and
retrieval; in detecting object motion or image changes without detecting
parallax motion as an image change; as well as many other applications.
BRIEF DESCRIPTION OF THE DRAWINGS
The teachings of the present invention can be readily understood by
considering the following detailed description in conjunction with the
accompanying drawings, in which:
FIG. 1 depicts a block diagram of an imaging system incorporating an image
processing system of the invention;
FIG. 2 schematically depicts the input images and output mosaics of the
system of FIG. 1;
FIG. 3 is a geometric representation of the relationship amongst a
reference image generated by a reference camera, an inspection image
generated by an inspection camera, and an arbitrary parametric surface
within a scene imaged by the cameras;
FIG. 4 is a flow chart of a P-then-P routine for registering images and
extracting parallax information from the registered images;
FIG. 5 is a flow chart of a P-and-P routine for registering images and
extracting parallax information from the registered images;
FIG. 6 is a functional block diagram of an image processing system of the
present invention;
FIG. 7 is a flow chart of a pose estimation routine;
FIG. 8 is a flow chart of a three-dimensional corrected mosaic construction
routine;
FIG. 9 is a two-dimensional geometric representation of the plane OMP of
FIG. 3 where the scene contains an object that occludes points within the
image;
FIG. 10 depicts an experimental set-up for estimating heights of objects
within a scene using the system of the present invention; and
FIG. 11 depicts a block diagram of an application for the inventive system
that synthesizes a new view of existing three-dimensional mosaics.
To facilitate understanding, identical reference numerals have been used,
where possible, to designate identical elements that are common to the
figures.
DETAILED DESCRIPTION
The present invention is an image processing system that combines a
plurality of images representing an imaged scene to form a
three-dimensional (3D) mosaic, where the 3D mosaic contains an image
mosaic representing a panoramic view of the scene and a shape mosaic
representing the three-dimensional geometry of the scene. The shape mosaic
defines a relationship between any two images by a motion field that is
decomposed into two-dimensional image motion of a two-dimensional,
parametric surface and a residual parallax field. Although many techniques
may be useful in generating the motion fields and the parametric
translation parameters, the following disclosure discusses two
illustrative processes. The first process, known as plane-then-parallax
(P-then-P), initially registers the images along a parametric surface
(plane) in the scene and then determines a parallax field representing the
three-dimensional geometry of the scene. The second illustrative process,
known as plane-and-parallax (P-and-P), simultaneously registers the images
and determines the parallax field. With either process, the results of
registration are translation parameters for achieving image alignment
along the parametric surface, a parallax field representing the
three-dimensional geometry (motion) of the scene with respect to the
parametric surface, and a planar motion field representing motion within
the parametric surface. These results can be used to combine the input
images to form a three-dimensional mosaic.
Image motion of a parametric surface is, in essence, a conventional
representation of a 2D mosaic. Motion of the parametric surface is
generally expressed as a parametric motion field that is estimated using
one of the many available techniques for directly estimating
two-dimensional motion fields. For an overview of such techniques, see
Bergen et al., "Hierarchical Model-Based Motion Estimation," Proceedings
2nd European Conference on Computer Vision-92, Springer-Verlag, Santa
Margherita Ligure, Italy, May 1992. Generally speaking, a direct approach
is sufficient for aligning and combining a plurality of images to form a
two-dimensional mosaic. Such a two-dimensional mosaic represents an
alignment of a two-dimensional parametric surface within a scene captured
by the image sequence. This parametric surface can either be an actual
surface in the scene within which lie most objects of the scene or the
parametric surface can be a virtual surface that is arbitrarily selected
within the scene. All objects within the scene generate what is known as
parallax motion as a camera moves with respect to the parametric surface.
This parallax motion is represented by a parallax motion field (also
referred to herein as a parallax field). The parallax field has a nonzero
value for objects within the scene that do not lie in the plane of the surface.
Although objects lying in the plane of the surface are represented in the
parallax field, those objects have zero parallax. More particularly, the
parallax field represents the objects that lie in front of and behind the
parametric surface and the distance (height) of these objects from the
surface, i.e., the three-dimensional geometry of the scene. As such, using
the parallax field in combination with the parametric surface and its
planar motion field, the system can generate a three-dimensional
reconstruction of the scene up to an arbitrary collineation. If camera
calibration parameters such as focal length and optical center are known,
then this three-dimensional reconstruction of the scene is Euclidean.
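The decomposition described above can be illustrated with a minimal sketch. It is not the disclosed estimation procedure. It assumes a pinhole camera with focal length f, a planar surface n.P = d expressed in the reference camera frame, and a known rigid motion (R, T2) between the two views. The helper names (project, plane_homography, residual_parallax) and all numeric values are hypothetical. The sketch aligns the two views along the plane and reports the displacement that remains, which vanishes for a point on the surface and is nonzero for a point off it.

```python
import numpy as np

def project(P, f):
    """Pinhole projection of a 3D point (X, Y, Z) to image coordinates (x, y)."""
    return f * P[:2] / P[2]

def plane_homography(R, T2, n, d, f):
    """Homography taking reference-image points of the plane n.P = d
    to their positions in the inspection image (assumed pinhole model)."""
    K = np.diag([f, f, 1.0])
    return K @ (R + np.outer(T2, n) / d) @ np.linalg.inv(K)

def residual_parallax(P1, R, T2, n, d, f):
    """Displacement left in the reference image after aligning the plane."""
    p = project(P1, f)                    # image of P in the reference view
    p_insp = project(R @ P1 + T2, f)      # image of P in the inspection view
    H = plane_homography(R, T2, n, d, f)
    # Warp the inspection-image point back to the reference view.
    q_h = np.linalg.solve(H, np.append(p_insp, 1.0))
    q = q_h[:2] / q_h[2]
    return q - p

# Hypothetical setup: small rotation about the Y axis plus a translation.
f = 1.0
angle = 0.05
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
T2 = np.array([0.2, 0.0, 0.1])
n = np.array([0.0, 0.0, 1.0])   # plane Z = 5 in the reference frame
d = 5.0

on_plane = residual_parallax(np.array([1.0, 0.5, 5.0]), R, T2, n, d, f)
off_plane = residual_parallax(np.array([1.0, 0.5, 3.0]), R, T2, n, d, f)
print("parallax on the surface: ", on_plane)
print("parallax off the surface:", off_plane)
```

In this sketch the point at Z = 5 lies on the surface and is fully aligned by the planar warp, while the point at Z = 3 lies in front of the surface and retains a residual displacement.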
FIG. 1 depicts a block diagram of the image processing system 100 as it is
used to generate 3D mosaics from a plurality of images. The image
processing system is typically a general purpose computer that is
programmed to function as an image processing system as described herein.
The system further contains one or more cameras 104.sub.n that image a
scene 102. In the illustrative system, two cameras, 104.sub.1 and
104.sub.2, are shown. Each camera, for simplicity, is assumed to be a
digital video camera that generates a series of frames of digitized video
information. Alternatively, the cameras could be still cameras,
conventional video cameras, or some other form of imaging sensor such as
an infrared sensor, an ultrasonic sensor, and the like, whose output
signal is separately digitized before the signal is used as an input to
the image processing system 100. In any event, each camera 104.sub.1 and
104.sub.2 generates an image having a distinct view of the scene.
Specifically, the images could be selected frames from each camera imaging
a different view of the scene or the images could be a series of frames
from a single camera as the camera pans across the scene. In either case,
the input signal to the image processing system of the present invention
is at least two images taken from different viewpoints of a single scene.
Each of the images partially overlaps the scene depicted in at least one
other image. The image processing system 100 combines the images into a 3D
mosaic and presents the mosaic to an output device 106. The output device
could be a video compression system, a video storage and retrieval system,
or some other application for the 3D mosaic.
FIG. 2 schematically depicts the input images 200.sub.n to the system of
FIG. 1 and the output 3D mosaic 202 generated by that system in response
to the input images. The input images, as mentioned above, are a series of
images of a scene, where each image depicts the scene from a different
viewpoint. The system aligns the images and combines them to form an image
mosaic 204, e.g., a two-dimensional mosaic having the images aligned along
an arbitrary parametric surface extending through all the images. Aligning
the images to form the image mosaic requires both the parametric
translation parameters and the planar motion field. In addition to the
image mosaic, the system generates a shape mosaic 206 that contains the
motion field that relates the three-dimensional objects within the images
to one another and to the parametric surface. The shape mosaic contains a
parallax motion field 208. The planar motion field represents motion
within the parametric surface that appears in the images from image to
image, while the parallax motion field represents motion due to parallax of
three-dimensional objects in the scene with respect to the parametric
surface.
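One possible in-memory layout for the 3D mosaic 202 is sketched below. The class names (Mosaic3D, ShapeMosaic), the array shapes, and the requirement that both mosaics share one pixel grid are assumptions made for illustration. The disclosure does not prescribe a storage format.

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class ShapeMosaic:
    """Shape mosaic 206: holds the parallax motion field 208."""
    parallax_field: np.ndarray  # assumed layout: (height, width, 2)

@dataclass
class Mosaic3D:
    """3D mosaic 202: an image mosaic 204 paired with a shape mosaic 206."""
    image_mosaic: np.ndarray    # assumed layout: (height, width)
    shape_mosaic: ShapeMosaic

    def __post_init__(self):
        # Assumption: both mosaics are sampled on the same pixel grid.
        grid = self.shape_mosaic.parallax_field.shape[:2]
        if self.image_mosaic.shape[:2] != grid:
            raise ValueError("image and shape mosaics must cover the same grid")

    def off_surface_mask(self, tol=1e-6):
        """Pixels whose parallax magnitude exceeds tol, i.e. scene points
        that do not lie on the parametric surface."""
        magnitude = np.linalg.norm(self.shape_mosaic.parallax_field, axis=-1)
        return magnitude > tol

# Hypothetical 4 x 6 mosaic with a single off-surface pixel.
parallax = np.zeros((4, 6, 2))
parallax[1, 2] = [0.5, -0.25]
mosaic = Mosaic3D(image_mosaic=np.zeros((4, 6)),
                  shape_mosaic=ShapeMosaic(parallax))
mask = mosaic.off_surface_mask()
```

Keeping the parallax field in its own container mirrors the separation between image mosaic 204 and shape mosaic 206 in FIG. 2.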
A. Determining A Residual Parallax Field
Consider two camera views, one denoted as the "reference" camera and the
other denoted the "inspection" camera (e.g., respectively cameras
104.sub.1 and 104.sub.2 of FIG. 1). In general, the image processing
system maps any three-dimensional (3D) point P.sub.1 in the reference
camera coordinate system to a 3D point P.sub.2 in the inspection camera
coordinate system using a rigid body transformation represented by
Equation 1.
P.sub.2 = R(P.sub.1) + T.sub.2 = R(P.sub.1 - T.sub.1)   (1)
The mapping is a rotation (R) followed by a translation (T.sub.2) or,
equivalently, a translation (T.sub.1) followed by a rotation (R), so that
T.sub.2 = -R(T.sub.1). Using perspective projection, the image coordinates (x,y) of a
projected point P are given by the vector p of Equation 2.
##EQU1##
where f is the focal length of the camera.
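As a hedged illustration of Equations 1 and 2, the sketch below applies both forms of the rigid body transformation and confirms that they agree when T2 = -R T1. Equation 2 is not reproduced in the text above, so the standard pinhole form x = fX/Z, y = fY/Z is assumed. The function names and numeric values are hypothetical.

```python
import numpy as np

def to_inspection(P1, R, T2):
    """Equation 1, first form: rotate, then translate by T2."""
    return R @ P1 + T2

def to_inspection_alt(P1, R, T1):
    """Equation 1, second form: subtract T1, then rotate."""
    return R @ (P1 - T1)

def project(P, f):
    """Assumed pinhole form of Equation 2: (x, y) = (f X / Z, f Y / Z)."""
    X, Y, Z = P
    return np.array([f * X / Z, f * Y / Z])

# Hypothetical motion: 10 degree rotation about the Z axis.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta), np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
T1 = np.array([0.5, -0.2, 0.1])
T2 = -R @ T1                      # relation implied by Equation 1

P1 = np.array([1.0, 2.0, 4.0])    # point in reference camera coordinates
P2_a = to_inspection(P1, R, T2)
P2_b = to_inspection_alt(P1, R, T1)
p1 = project(P1, f=2.0)           # image of P1 in the reference view

print("P2 (rotate then translate):", P2_a)
print("P2 (translate then rotate):", P2_b)
print("reference image point:", p1)
```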
FIG. 3 is a geometric representation of the relationship amongst a
reference image 302 generated by the reference camera, an inspection image
304 generated by the inspection camera, and an arbitrary parametric
surface 300 within the imaged scene. Let S denote the surface of interest
(a real or virtual parametric surface 300), P denote an environmental
point (e.g., a location of an object) within the scene that is not located
on S, and O and M denote the center locations (focal points) of the
reference and inspection cameras, respectively. The image of P on the reference view 302 is p. Let the ray MP
intersect the surface S at location Q. A conventional warping process,
used to align the images 302 and 304 by aligning all points on the surface
S, warps p', the image of P on the inspection image 304, to q, the image
of Q on the reference image 302. Therefore, the residual parallax vector
is pq, which is the image of line PQ. It is immediately obvious from the
figure that vector pq lies on the plane OMP, which is the epipolar plane
passing through p. Since such a vector is generated for any point P in the
scene, it can be said that the collection of all parallax vectors forms a
parallax displacement field. Since the parallax displacement vector
associated with each image point lies along the epipolar plane associated
with that point, the field is referred to as an epipolar field. This
field has a radial structure, each vector appearing to emanate from a
common origin in the image dubbed the "epipole" (also known as the focus of
expansion (FOE)). In FIG. 3, the epipole is located at point "t". From FIG. 3, it is
obvious that the epipole t lies at the intersection of the line OM with
the image plane 302. The parallax displacement field is also referred to
herein simply as a parall