Method and system for image combination using a parallax-based technique    
United States Patent 5963664
Inventor(s)     Kumar; Rakesh (Dayton, NJ); Hanna; Keith James (Princeton, NJ); Bergen; James R. (Hopewell, NJ); Anandan; Padmanabhan (Lawrenceville, NJ); Irani; Michal (Princeton Jct., NJ)
Abstract     A system for generating three-dimensional mosaics from a plurality of input images representing an imaged scene. The plurality of input images contain at least two images of a single scene, where at least two of the images have overlapping regions. The system combines the images using a parallax-based approach that generates a three-dimensional mosaic comprising an image mosaic representing a panoramic view of the scene and a shape mosaic representing the three-dimensional geometry of the scene. Specifically, in one embodiment, the system registers the input images along a parametric surface within the imaged scene and derives translation vectors useful in aligning the images into a two-dimensional image mosaic. Once registered, the system generates a shape mosaic representing objects within the scene.
Owner/Assignee     Sarnoff Corporation (Princeton, NJ)
Publication Date     October 5, 1999
Application Number     08/493,632
Filing Date     June 22, 1995
US Classification     382/154 348/47 382/284 382/294
Int'l Classification     G06K 009/60 G06T 017/00
Examiner     Chang; Jon
Attorney/Law Firm     Burke; William J.
USPTO Field of Search     382/284 382/294 382/154 382/42 382/47 382/48
U.S. References

Reference   Inventor   Classification   Date
5682198     Katayama   348/47           Oct. 1997
5568384     Robb       715/532          Oct. 1996
5550937     Bell       382/293          Aug. 1996
5530774     Fogel      382/154          Jun. 1996
5202928     Tomita     382/154          Apr. 1993
5187754     Currin     382/284          Feb. 1993
4797942     Burt       382/284          Jan. 1989
Claims
 


What is claimed is:

1. A method of processing a plurality of images to generate a three-dimensional mosaic of a scene comprising the steps of:

providing a plurality of images of the scene; and

registering said images along a non-planar parametric surface to construct said three-dimensional mosaic containing an image mosaic of registered images and a shape mosaic, where said image mosaic represents a panoramic view of the scene and said shape mosaic represents a three-dimensional geometry of the scene.

2. The method of claim 1 wherein said registering step further comprises the steps of:

registering each image in said plurality of images along said non-planar parametric surface to produce registered images;

determining, in response to said registered images, translation parameters and a parametric motion field useful in aligning the images along the non-planar parametric surface; and

generating a parallax field representing parallax of objects within the scene.

3. The method of claim 2 further comprising the step of converting said plurality of images into a plurality of multi-resolutional pyramids, where each image pyramid contains a plurality of levels.

4. The method of claim 3 wherein said registering and determining steps are iterated over each of said levels within said multi-resolutional pyramids until said plurality of images are registered to a predefined degree of accuracy.

5. The method of claim 4 wherein said predefined degree of accuracy is a sum of the squares difference measure integrated over selected regions within each of said levels of said multi-resolutional pyramids.

6. The method of claim 1 wherein said shape mosaic contains a parallax motion field.

7. The method of claim 1 wherein said image mosaic and said shape mosaic are multi-resolutional pyramids.

8. The method of claim 1 further comprising the steps of:

converting said image mosaic and said shape mosaic into multi-resolutional pyramids;

converting a new image into a multi-resolutional pyramid; and

determining pose parameters for relating the new image with the image mosaic and the shape mosaic, where the pose parameters contain translation parameters, a planar motion field, and a parallax motion field for the new image.

9. The method of claim 8 further comprising the step of generating an updated image mosaic and an updated shape mosaic, each containing the new image and the pose parameters.

10. The method of claim 8 further comprising the steps of:

providing an existing three-dimensional mosaic;

determining pose parameters for a new image with respect to said existing three-dimensional mosaic;

warping said existing three-dimensional mosaic to image coordinates of said new image to create a synthetic image, where said synthetic image represents a view of the three-dimensional mosaic from the coordinates of the new image; and

merging said synthetic image into said new image to produce a new three-dimensional mosaic that is a combination of said new image and said existing three-dimensional mosaic.

11. The method of claim 10 further comprising the steps of:

providing a next image that sequentially follows said new image;

detecting changes between said new image, said existing three-dimensional mosaic, and said next image, where said changes represent motion within the scene without detecting parallax due to viewpoint change as said motion.

12. The method of claim 1 further comprising the steps of:

detecting points within said three-dimensional mosaic that are occluded within the scene by objects in the scene; and

image processing the detected occluded points such that said occluded points do not produce artifacts in said three-dimensional mosaic.

13. The method of claim 1 further comprising the step of:

estimating a height of points within said three-dimensional mosaic relative to said parametric surface, where said height of said points form a height map that represents the height of object points within said scene.

14. The method of claim 1 further comprising the steps of:

providing a plurality of three-dimensional mosaics representing a scene from different viewpoints, where a three-dimensional mosaic has been generated at each viewpoint;

warping said plurality of three-dimensional mosaics to a reference coordinate system;

merging said plurality of three-dimensional mosaics to form a composite three-dimensional mosaic;

providing coordinates for a new viewpoint of said scene;

determining parameters to relate said new viewpoint coordinates to said composite three-dimensional mosaic; and

warping said composite three-dimensional mosaic to said viewpoint coordinates to create a synthetic image, where said synthetic image represents a new view of the composite three-dimensional mosaic taken from the new viewpoint.

15. The method of claim 1 further comprising the steps of:

providing a plurality of three-dimensional mosaics representing a scene from different viewpoints, where a three-dimensional mosaic has been generated at each viewpoint;

providing coordinates for a new viewpoint of said scene;

determining parameters to relate said new viewpoint coordinates to a plurality of the three-dimensional mosaics;

warping said plurality of three-dimensional mosaics to said viewpoint coordinates to create a synthetic image, where said synthetic image represents a new view of the three-dimensional mosaic taken from the new viewpoint; and

merging said plurality of three-dimensional mosaics to form said synthetic image.

16. The method of claim 1 wherein said registering step further comprises the steps of:

performing a plane-then-parallax process including the steps of registering each image in said plurality of images along a parametric surface to produce initially registered images; determining, in response to said initially registered images, initial translation parameters and an initial parametric motion field useful in initially aligning the images along the parametric surface; and generating an initial parallax field representing parallax of objects within the scene; and

simultaneously registering, using said initial translation parameters, initial parametric motion field and initial parallax field, said images in said plurality of images along said parametric surface to produce final registered images, determining, in response to said final registered images, final translation parameters and a final parametric motion field useful in aligning the images along the parametric surface, and generating a final parallax field representing parallax of objects within the scene.

17. The method of claim 16 further comprising the step of converting said plurality of images into a plurality of multi-resolutional pyramids, where each multi-resolutional pyramid contains a plurality of levels.

18. The method of claim 17 wherein said registering, determining and simultaneously registering steps are iterated over each of said levels within said multi-resolutional pyramids until said plurality of images are registered to a predefined degree of accuracy.

19. The method of claim 18 wherein said predefined degree of accuracy is a sum of the squares difference measure integrated over selected regions within each of said levels of said multi-resolutional pyramids.

20. The method of claim 16 wherein said image mosaic and said shape mosaic are multi-resolutional pyramids.

21. A method of processing a plurality of images to generate a three-dimensional mosaic of a scene comprising the steps of:

providing a plurality of images of the scene;

simultaneously registering said images in said plurality of images along a parametric surface to produce registered images, determining, in response to said registered images, translation parameters and a parametric motion field useful in aligning the images along the parametric surface, and generating a parallax field representing parallax of objects not lying within said parametric surface.

22. The method of claim 21 further comprising the step of converting said plurality of images into a plurality of multi-resolutional pyramids, where each multi-resolutional pyramid contains a plurality of levels.

23. The method of claim 22 wherein said registering and determining steps are iterated over each of said levels within said multi-resolutional pyramids until said plurality of images are registered to a predefined degree of accuracy.

24. The method of claim 23 wherein said predefined degree of accuracy is a sum of the squares difference measure integrated over selected regions within each of said levels of said image pyramids.

25. The method of claim 21 further comprising the steps of:

converting said image mosaic and said shape mosaic into multi-resolutional pyramids;

converting a new image into a multi-resolutional pyramid; and

determining pose parameters for relating the new image with the image mosaic and the shape mosaic, where the pose parameters contain translation parameters, a planar motion field, and a parallax motion field for the new image.

26. The method of claim 25 further comprising the step of generating an updated image mosaic and an updated shape mosaic, each containing the new image and the pose parameters.

27. The method of claim 25 further comprising the steps of:

providing an existing three-dimensional mosaic;

determining pose parameters for a new image with respect to said existing three-dimensional mosaic;

warping said existing three-dimensional mosaic to image coordinates of said new image to create a synthetic image, where said synthetic image represents a view of the three-dimensional mosaic from the coordinates of the new image; and

merging said synthetic image into said new image to produce a new three-dimensional mosaic that is a combination of said new image and said existing three-dimensional mosaic.

28. The method of claim 27 further comprising the steps of:

providing a next image that sequentially follows said new image;

detecting changes between said new image, said existing three-dimensional mosaic, and said next image, where said changes represent motion within the scene without detecting parallax due to viewpoint change as said motion.

29. The method of claim 21 further comprising the steps of:

detecting points within said three-dimensional mosaic that are occluded within the scene by objects in the scene; and

image processing the detected occluded points such that said occluded points do not produce artifacts in said three-dimensional mosaic.

30. The method of claim 21 further comprising the step of:

estimating a height of points within said three-dimensional mosaic relative to said parametric surface, where said height of said points form a height map that represents the height of object points within said scene.

31. The method of claim 21 further comprising the steps of:

providing a plurality of three-dimensional mosaics representing a scene from different viewpoints, where a three-dimensional mosaic has been generated at each viewpoint;

warping said plurality of three-dimensional mosaics to a reference coordinate system;

merging said plurality of three-dimensional mosaics to form a composite three-dimensional mosaic;

providing coordinates for a new viewpoint of said scene;

determining parameters to relate said new viewpoint coordinates to said composite three-dimensional mosaic; and

warping said composite three-dimensional mosaic to said viewpoint coordinates to create a synthetic image, where said synthetic image represents a new view of the composite three-dimensional mosaic taken from the new viewpoint.

32. The method of claim 21 further comprising the steps of:

providing a plurality of three-dimensional mosaics representing a scene from different viewpoints, where a three-dimensional mosaic has been generated at each viewpoint;

providing coordinates for a new viewpoint of said scene;

determining parameters to relate said new viewpoint coordinates to a plurality of the three-dimensional mosaics;

warping said plurality of three-dimensional mosaics to said viewpoint coordinates to create a synthetic image, where said synthetic image represents a new view of the three-dimensional mosaic taken from the new viewpoint; and

merging said plurality of three-dimensional mosaics to form said synthetic image.

33. The method of claim 21 wherein said registering step further comprises the steps of:

performing a plane-then-parallax process including the steps of registering each image in said plurality of images along a parametric surface to produce initially registered images; determining, in response to said initially registered images, initial translation parameters and an initial parametric motion field useful in initially aligning the images along the parametric surface; and generating an initial parallax field representing parallax of objects within the scene; and

simultaneously registering, using said initial translation parameters, initial parametric motion field and initial parallax field, said images in said plurality of images along said parametric surface to produce final registered images, determining, in response to said final registered images, final translation parameters and a final parametric motion field useful in aligning the images along the parametric surface, and generating a final parallax field representing parallax of objects within the scene.

34. The method of claim 33 further comprising the step of converting said plurality of images into a plurality of multi-resolutional pyramids, where each multi-resolutional pyramid contains a plurality of levels.

35. The method of claim 34 wherein said registering, determining and simultaneously registering steps are iterated over each of said levels within said multi-resolutional pyramids until said plurality of images are registered to a predefined degree of accuracy.

36. The method of claim 35 wherein said predefined degree of accuracy is a sum of the squares difference measure integrated over selected regions within each of said levels of said multi-resolutional pyramids.

37. The method of claim 34 wherein said image mosaic and said shape mosaic are multi-resolutional pyramids.

38. A method of processing a plurality of images to generate a three-dimensional mosaic of a scene comprising the steps of:

providing a plurality of images of the scene;

registering each image in said plurality of images along a non-planar parametric surface to produce registered images;

determining, in response to said registered images, translation parameters and a parametric motion field useful in aligning the images along the non-planar parametric surface;

generating a parallax field representing parallax of objects within the scene; and

constructing, in response to said translation parameters, parametric motion field, and said parallax field, said three-dimensional mosaic containing an image mosaic and a shape mosaic, where said image mosaic represents a panoramic view of the scene and said shape mosaic represents a three-dimensional geometry of the scene.

39. The method of claim 38 further comprising the step of converting said plurality of images into a plurality of multi-resolutional pyramids, where each multi-resolutional pyramid contains a plurality of levels.

40. The method of claim 39 wherein said registering and determining steps are iterated over each of said levels within said multi-resolutional pyramids until said plurality of images are registered to a predefined degree of accuracy.

41. The method of claim 40 wherein said predefined degree of accuracy is a sum of the squares difference measure integrated over selected regions within each of said levels of said multi-resolutional pyramids.

42. The method of claim 41 wherein said shape mosaic contains a parametric motion field and a parallax motion field.

43. The method of claim 38 wherein said image mosaic and said shape mosaic are multi-resolutional pyramids.

44. A method of processing a plurality of images to generate a three-dimensional mosaic of a scene comprising the steps of:

providing a plurality of images of the scene;

simultaneously registering said images in said plurality of images along a parametric surface to produce registered images, determining, in response to said registered images, translation parameters and a parametric motion field useful in aligning the images along the parametric surface, and generating a parallax field representing parallax of objects within the scene; and

constructing, in response to said translation parameters, parametric motion field, and said parallax field, said three-dimensional mosaic containing an image mosaic and a shape mosaic, where said image mosaic represents a panoramic view of the scene and said shape mosaic represents a three-dimensional geometry of the scene.

45. The method of claim 44 further comprising the step of converting said plurality of images into a plurality of multi-resolutional pyramids, where each multi-resolutional pyramid contains a plurality of levels.

46. The method of claim 45 wherein said registering and determining steps are iterated over each of said levels within said multi-resolutional pyramids until said plurality of images are registered to a predefined degree of accuracy.

47. The method of claim 46 wherein said predefined degree of accuracy is a sum of the squares difference measure integrated over selected regions within each of said levels of said multi-resolutional pyramids.

48. The method of claim 44 wherein said image mosaic and said shape mosaic are multi-resolutional pyramids.

49. An image processing system for generating a three-dimensional mosaic of a scene from a plurality of images of the scene, comprising:

means for storing said plurality of images;

a registration processor, connected to said storing means, for registering said images along a non-planar parametric surface to construct said three-dimensional mosaic containing an image mosaic and a shape mosaic, where said image mosaic represents a panoramic view of the scene and said shape mosaic represents a three-dimensional geometry of the scene.

50. The system of claim 49 wherein said registration processor further comprises:

a plane-then-parallax registration processor for aligning said images along said non-planar parametric surface that extends through the plurality of images to produce translation parameters and a parametric motion field used to align the images within the image mosaic and then for determining a parallax field representing objects within the scene.

51. An image processing system for generating a three-dimensional mosaic of a scene from a plurality of images of the scene, comprising:

means for storing said plurality of images;

a plane-and-parallax registration processor for simultaneously aligning said images along a parametric surface that extends through the plurality of images to produce translation parameters and a parametric motion field used to align the images within the image mosaic and for determining a parallax field representing objects within the scene.

52. An image processing system for generating a three-dimensional mosaic of a scene from a plurality of images of the scene, comprising:

means for storing said plurality of images;

a plane-then-parallax registration processor for aligning said images along a parametric surface that extends through the plurality of images to produce initial translation parameters and an initial parametric motion field used to align the images within the image mosaic and then for determining an initial parallax field representing objects within the scene that do not lie in the parametric surface; and

a plane-and-parallax registration processor, connected to an output of said plane-then-parallax registration processor, for simultaneously aligning said images along said parametric surface to produce final translation parameters and a final parametric motion field used to align the images within the image mosaic and for determining a final parallax field representing objects within the scene that do not lie in the parametric surface.

53. The system of claim 52 further comprising a three-dimensional mosaic generator, connected to said registration processor, for combining said images in said plurality of images using said final translation parameters and said final motion flow field to form said image mosaic and for generating said shape mosaic containing the final parallax field.
Description
 


The invention relates to image processing systems, and more particularly, the invention relates to an image processing system that combines multiple images into a mosaic using a parallax-based technique.

BACKGROUND OF THE DISCLOSURE

Until recently, image processing systems have generally processed images, such as frames of video, still photographs, and the like, on an individual, image-by-image basis. Each individual frame or photograph is typically processed by filtering, warping, and applying various parametric transformations. In order to form a panoramic view of the scene, the individual images are combined to form a two-dimensional mosaic, i.e., an image that contains a plurality of individual images. Additional image processing is performed on the mosaic to ensure that the seams between the images are invisible such that the mosaic looks like a single large image.

The alignment of the images and the additional processing to remove seams is typically accomplished manually by a technician using a computer workstation, i.e., the image alignment and combination processes are computer aided. In such computer aided image processing systems, the technician manually selects processed images, manually aligns those images, and a computer applies various image combining processes to the images to remove any seams or gaps between the images. Manipulation of the images is typically accomplished using various computer input devices such as a mouse, trackball, keyboard and the like. Since manual mosaic generation is costly, those skilled in the art have developed automated systems for generating image mosaics.

In automated systems for constructing mosaics, the information within a mosaic is generally expressed as two-dimensional motion fields. The motion is represented as a planar motion field, e.g., an affine or projective motion field. Such a system is disclosed in U.S. patent application Ser. No. 08/339,491, entitled "Mosaic Based Image Processing System", filed Nov. 14, 1994, now U.S. Pat. No. 5,649,032, and herein incorporated by reference. The image processing approach disclosed in the '491 application automatically combines multiple image frames into one or more two-dimensional mosaics. However, that system does not account for parallax motion that may cause errors in the displacement fields representing motion in the mosaic.
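As an illustration of a planar motion field, the following sketch evaluates a six-parameter affine model at every pixel of a small image. The function name affine_motion_field and the parameter names a1 to a6 are hypothetical and are not taken from the '491 application; a projective field would add two further parameters and a division.

```python
def affine_motion_field(params, width, height):
    """Return field[y][x] = (u, v), the displacement of pixel (x, y).

    Assumed affine model (illustrative only):
        u(x, y) = a1 + a2 * x + a3 * y
        v(x, y) = a4 + a5 * x + a6 * y
    """
    a1, a2, a3, a4, a5, a6 = params
    field = []
    for y in range(height):
        row = []
        for x in range(width):
            u = a1 + a2 * x + a3 * y
            v = a4 + a5 * x + a6 * y
            row.append((u, v))
        field.append(row)
    return field


# A pure translation: every pixel moves by the same (u, v) displacement.
translation = affine_motion_field((2.0, 0.0, 0.0, -1.0, 0.0, 0.0), width=3, height=2)

# A uniform scaling about the origin: displacement grows with distance.
scaling = affine_motion_field((0.0, 0.1, 0.0, 0.0, 0.0, 0.1), width=3, height=2)
```

Such a model describes the image motion of a planar surface only approximately (the projective model is exact for a plane); points off the plane leave a residual displacement, which is the parallax motion that the two-dimensional approach does not account for.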

In other types of image processing systems, multiple images are analyzed in order to recover photogrammetric information such as relative orientation estimation, range map recovery and the like without generating a mosaic. These image analysis techniques assume that the internal camera parameters (e.g., focal length, pixel resolution, aspect ratio, and image center) are known. In automated image processing systems that use alignment and photogrammetry, the alignment and photogrammetric process involves two steps: (1) establishing correspondence between pixels within various images via some form of area- or feature-based matching scheme, and (2) analyzing pixel displacement in order to recover three-dimensional scene information.

Other image processing systems have analyzed image motion within a three-dimensional scene that is imaged from multiple viewpoints to determine the range or depth of objects within the scene. Such an approach is disclosed in K. J. Hanna, "Direct Multi-Resolution Estimation of Ego-Motion and Structure From Motion", Proceedings of the IEEE Workshop on Visual Motion, Princeton, N.J., Oct. 7-9, 1991, pp. 156-162, and K. J. Hanna et al., "Combining Stereo and Motion Analysis for Direct Estimation of Scene Structure", Proceedings of the Fourth International Conference on Computer Vision (ICCV'93), Berlin, Germany, May, 1993. The disclosures within both these papers are incorporated herein by reference. The prior art methods of generating three-dimensional representations have heretofore not been used in conjunction with systems that generate two-dimensional mosaics. Consequently, these approaches are used to analyze the three-dimensional geometry of a scene, but do not form useful representations of combinations of images such as mosaics.

Therefore, a need exists in the art for a system that automatically generates, from a plurality of images, a three-dimensional mosaic that accurately represents both the two-dimensional image information and the three-dimensional geometry within a scene.

SUMMARY OF THE INVENTION

The disadvantages associated with the prior art are overcome by the present invention of a system for generating three-dimensional mosaics from a plurality of input images. The plurality of input images contain at least two images of a single scene, where at least two of the images have overlapping regions but, in general, depict the scene from differing viewpoints. The input images are generated by either a single camera producing a series of video frames or a plurality of cameras generating still or video frames from differing viewpoints of the same scene. In either case, the input images to the system are digital images that are either digitized by the camera or digitized after the camera generates the image. The system combines the input images using a parallax-based approach that generates a three-dimensional mosaic comprising an image mosaic representing a panoramic view of the scene and a shape mosaic representing the three-dimensional geometry of the scene. From this three-dimensional mosaic, any viewpoint of the scene can be synthetically derived, i.e., viewpoints that are not collocated with the camera(s) that originally imaged the scene. Furthermore, such a three-dimensional mosaic can be used to estimate object height within the imaged scene as well as be used for efficient compression of video information for transmission or storage.
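One way to picture the two-part output is as a container that pairs an image mosaic with a shape mosaic. The sketch below is illustrative only: the class name Mosaic3D, the use of one scalar parallax value per pixel, and the requirement that both mosaics share one pixel grid are assumptions made for the example, not details stated in this disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple

Grid = List[List[float]]


@dataclass
class Mosaic3D:
    """Hypothetical container for a three-dimensional mosaic."""

    image_mosaic: Grid  # panoramic intensity values
    shape_mosaic: Grid  # assumed: one parallax magnitude per pixel

    def __post_init__(self) -> None:
        # Assumption: both mosaics are sampled on the same pixel grid.
        same_rows = len(self.image_mosaic) == len(self.shape_mosaic)
        same_cols = all(
            len(img_row) == len(shape_row)
            for img_row, shape_row in zip(self.image_mosaic, self.shape_mosaic)
        )
        if not (same_rows and same_cols):
            raise ValueError("image and shape mosaics must share one pixel grid")

    def off_surface_pixels(self, threshold: float = 0.0) -> List[Tuple[int, int]]:
        """List (x, y) positions whose parallax magnitude exceeds the threshold."""
        return [
            (x, y)
            for y, row in enumerate(self.shape_mosaic)
            for x, parallax in enumerate(row)
            if abs(parallax) > threshold
        ]


mosaic = Mosaic3D(
    image_mosaic=[[10.0, 20.0], [30.0, 40.0]],
    shape_mosaic=[[0.0, 0.5], [0.0, 0.0]],
)
raised_points = mosaic.off_surface_pixels()
```

In this toy example, pixels with zero parallax are treated as lying on the parametric surface, and the single non-zero entry marks a point that stands off it.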

More specifically, the system generates the three-dimensional mosaic using a sequence of image processing techniques. First, the images and any existing three-dimensional mosaic into which the images are to be incorporated are subsampled to form conventional multi-resolutional image pyramids. Then, the system uses a sequential image registration process dubbed a plane-then-parallax (P-then-P) process to compute image alignment parameters and the parallax motion that exists between images. Lastly, the full alignment and parallax field generation is achieved using a simultaneous image registration process dubbed a plane-and-parallax (P-and-P) process. After each step of processing, the degree of image alignment is monitored such that, if accurate alignment is attained, subsequent processing is avoided. In the broadest use of the invention, either P-then-P or P-and-P processing can be used alone to register the images. These image registration processes compute both alignment and motion parameters (e.g., translation parameters for alignment and both a parallax field and a planar motion field for motion estimation) that are useful for aligning images to generate an image mosaic and for capturing the three-dimensional geometry of the scene to generate a shape mosaic. As such, the result of the registration processes can be used to generate a three-dimensional mosaic containing a two-dimensional image mosaic and a shape mosaic. From the information contained in the three-dimensional mosaic, a synthetic viewpoint of the scene can be generated that would take into account any parallax within the scene to produce a realistic view. The system uses pose estimation processing of the three-dimensional mosaic to achieve an image from the synthetic viewpoint. Also, the system contains a process for detecting occluded points in the scene such that these occluded points can be further processed to achieve a realistic synthetic image.
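The pyramid construction and alignment monitoring described above can be sketched as follows. This is a minimal illustration under stated assumptions: a 2x2 box average stands in for the conventional pyramid filter, the sum-of-squared-differences measure named in the claims is taken over a whole level rather than over selected regions, and all function names are hypothetical.

```python
def downsample(image):
    """Halve the resolution by averaging each 2x2 block (assumed box filter)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(cols)
        ]
        for y in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [finest, ..., coarsest]; stop early if the image gets too small."""
    pyramid = [image]
    while (len(pyramid) < levels
           and len(pyramid[-1]) >= 2 and len(pyramid[-1][0]) >= 2):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def ssd(a, b):
    """Sum of squared differences between two equally sized levels."""
    return sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def ssd_coarse_to_fine(pyr_a, pyr_b):
    """Alignment error per level, ordered from coarsest to finest."""
    return [ssd(a, b) for a, b in zip(reversed(pyr_a), reversed(pyr_b))]


reference = [[0, 0, 4, 4], [0, 0, 4, 4], [8, 8, 12, 12], [8, 8, 12, 12]]
inspection = [[value + 1 for value in row] for row in reference]

ref_pyr = build_pyramid(reference, levels=3)
insp_pyr = build_pyramid(inspection, levels=3)
errors = ssd_coarse_to_fine(ref_pyr, insp_pyr)
```

A registration loop would compare each entry of errors against a chosen tolerance and skip further refinement once the tolerance is met; the registration step itself is omitted here.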

Such three-dimensional mosaics find use in estimating object height within a scene; in achieving efficient image and video compression, storage and retrieval; in detecting object motion or image changes without detecting parallax motion as an image change; as well as many other applications.

BRIEF DESCRIPTION OF THE DRAWINGS

The teachings of the present invention can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:

FIG. 1 depicts a block diagram of an imaging system incorporating an image processing system of the invention;

FIG. 2 schematically depicts the input images and output mosaics of the system of FIG. 1;

FIG. 3 is a geometric representation of the relationship amongst a reference image generated by a reference camera, an inspection image generated by an inspection camera, and an arbitrary parametric surface within a scene imaged by the cameras;

FIG. 4 is a flow chart of a P-then-P routine for registering images and extracting parallax information from the registered images;

FIG. 5 is a flow chart of a P-and-P routine for registering images and extracting parallax information from the registered images;

FIG. 6 is a functional block diagram of an image processing system of the present invention;

FIG. 7 is a flow chart of a pose estimation routine;

FIG. 8 is a flow chart of a three-dimensional corrected mosaic construction routine;

FIG. 9 is a two-dimensional geometric representation of the plane OMP of FIG. 3 where the scene contains an object that occludes points within the image;

FIG. 10 depicts an experimental set-up for estimating heights of objects within a scene using the system of the present invention; and

FIG. 11 depicts a block diagram of an application for the inventive system that synthesizes a new view of existing three-dimensional mosaics.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures.

DETAILED DESCRIPTION

The present invention is an image processing system that combines a plurality of images representing an imaged scene to form a three-dimensional (3D) mosaic, where the 3D mosaic contains an image mosaic representing a panoramic view of the scene and a shape mosaic representing the three-dimensional geometry of the scene. The shape mosaic defines a relationship between any two images by a motion field that is decomposed into two-dimensional image motion of a two-dimensional, parametric surface and a residual parallax field. Although many techniques may be useful in generating the motion fields and the parametric translation parameters, the following disclosure discusses two illustrative processes. The first process, known as plane-then-parallax (P-then-P), initially registers the images along a parametric surface (plane) in the scene and then determines a parallax field representing the three-dimensional geometry of the scene. The second illustrative process, known as plane-and-parallax (P-and-P), simultaneously registers the images and determines the parallax field. With either process, the results of registration are translation parameters for achieving image alignment along the parametric surface, a parallax field representing the three-dimensional geometry (motion) of the scene with respect to the parametric surface, and a planar motion field representing motion within the parametric surface. These results can be used to combine the input images to form a three-dimensional mosaic.

Image motion of a parametric surface is, in essence, a conventional representation of a 2D mosaic. Motion of the parametric surface is generally expressed as a parametric motion field that is estimated using one of the many available techniques for directly estimating two-dimensional motion fields. For an overview of such techniques, see Bergen et al., "Hierarchical Model-Based Motion Estimation," Proceedings 2nd European Conference on Computer Vision-92, Springer-Verlag, Santa Margherita Ligure, Italy, May 1992. Generally speaking, a direct approach is sufficient for aligning and combining a plurality of images to form a two-dimensional mosaic. Such a two-dimensional mosaic represents an alignment of a two-dimensional parametric surface within a scene captured by the image sequence. This parametric surface can either be an actual surface in the scene within which lie most objects of the scene or the parametric surface can be a virtual surface that is arbitrarily selected within the scene. All objects within the scene generate what is known as parallax motion as a camera moves with respect to the parametric surface. This parallax motion is represented by a parallax motion field (also referred to herein as a parallax field). The parallax field has a non-zero value for objects within the scene that do not lie in the plane of the surface. Although objects lying in the plane of the surface are represented in the parallax field, those objects have zero parallax. More particularly, the parallax field represents the objects that lie in front of and behind the parametric surface and the distance (height) of these objects from the surface, i.e., the three-dimensional geometry of the scene. As such, using the parallax field in combination with the parametric surface and its planar motion field, the system can generate a three-dimensional reconstruction of the scene up to an arbitrary collineation. If camera calibration parameters such as focal length and optical center are known, then this three-dimensional reconstruction of the scene is Euclidean.
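
The claim that points on the parametric surface have zero parallax while points off the surface do not can be checked numerically. The sketch below is an illustration under stated assumptions, not the registration method of the disclosure: it assumes ideal pinhole cameras with a known focal length, a planar surface satisfying n.P = d in reference-camera coordinates, and the standard plane-induced homography. The function names and the sign convention of the residual (warped point minus reference point) are hypothetical.

```python
import numpy as np


def project(points, f=1.0):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) image points."""
    points = np.asarray(points, dtype=float)
    return f * points[:, :2] / points[:, 2:3]


def plane_homography(R, T2, n, d, f=1.0):
    """Homography mapping reference-image points of the plane n.P = d
    to their positions in the inspection image, for P2 = R P1 + T2."""
    K = np.diag([f, f, 1.0])
    return K @ (R + np.outer(T2, n) / d) @ np.linalg.inv(K)


def apply_homography(H, pts):
    """Apply a 3x3 homography to (N, 2) image points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def residual_parallax(P1, R, T2, n, d, f=1.0):
    """Residual image displacement left after aligning the plane.

    P1 holds (N, 3) scene points in reference-camera coordinates."""
    P1 = np.asarray(P1, dtype=float)
    P2 = P1 @ R.T + T2                      # same points in the inspection frame
    p = project(P1, f)                      # images in the reference view
    p_inspection = project(P2, f)           # images in the inspection view
    H = plane_homography(R, T2, n, d, f)
    # Warp inspection-image points back to the reference view via the plane.
    q = apply_homography(np.linalg.inv(H), p_inspection)
    return q - p                            # assumed sign convention
```

For example, with a small rotation about the y axis, a translation of (0.3, 0, 0.05), and the plane z = 5, points with z = 5 yield residuals that vanish to numerical precision, whereas points at other depths yield non-zero residual vectors.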

FIG. 1 depicts a block diagram of the image processing system 100 as it is used to generate 3D mosaics from a plurality of images. The image processing system is, in general, a general purpose computer that is programmed to function as an image processing system as described herein. The system further contains one or more cameras 104.sub.n that image a scene 102. In the illustrative system, two cameras, cameras 104.sub.1 and 104.sub.2, are shown. Each camera, for simplicity, is assumed to be a digital video camera that generates a series of frames of digitized video information. Alternatively, the cameras could be still cameras, conventional video cameras, or some other form of imaging sensor such as an infrared sensor, an ultrasonic sensor, and the like, whose output signal is separately digitized before the signal is used as an input to the image processing system 100. In any event, each camera 104.sub.1 and 104.sub.2 generates an image having a distinct view of the scene. Specifically, the images could be selected frames from each camera imaging a different view of the scene or the images could be a series of frames from a single camera as the camera pans across the scene. In either case, the input signal to the image processing system of the present invention is at least two images taken from different viewpoints of a single scene. Each of the images partially overlaps the scene depicted in at least one other image. The image processing system 100 combines the images into a 3D mosaic and presents the mosaic to an output device 106. The output device could be a video compression system, a video storage and retrieval system, or some other application for the 3D mosaic.

FIG. 2 schematically depicts the input images 200.sub.n to the system of FIG. 1 and the output 3D mosaic 202 generated by that system in response to the input images. The input images, as mentioned above, are a series of images of a scene, where each image depicts the scene from a different viewpoint. The system aligns the images and combines them to form an image mosaic 204, e.g., a two-dimensional mosaic having the images aligned along an arbitrary parametric surface extending through all the images. Aligning the images to form the image mosaic requires both the parametric translation parameters and the planar motion field. In addition to the image mosaic, the system generates a shape mosaic 206 that contains the motion field that relates the three-dimensional objects within the images to one another and to the parametric surface. The shape mosaic contains a parallax motion field 208. The planar motion field represents motion within the parametric surface that appears in the images from image to image, while the parallax motion field represents motion due to parallax of three-dimensional objects in the scene with respect to the parametric surface.

A. Determining A Residual Parallax Field

Consider two camera views, one denoted as the "reference" camera and the other denoted the "inspection" camera (e.g., respectively cameras 104.sub.1 and 104.sub.2 of FIG. 1). In general, the image processing system maps any three-dimensional (3D) point P.sub.1 in the reference camera coordinate system to a 3D point P.sub.2 in the inspection camera coordinate system using a rigid body transformation represented by Equation 1.

P.sub.2 =R(P.sub.1)+T.sub.2 =R(P.sub.1 -T.sub.1) (1)

The mapping vector is represented by a rotation (R) followed by a translation (T.sub.2) or by a translation (T.sub.1) followed by a rotation (R). Using perspective projection, the image coordinates (x,y) of a projected point P are given by the vector p of Equation 2.

p=[x, y].sup.T =(f/P.sub.z)[P.sub.x, P.sub.y].sup.T (2)

where f is the focal length of the camera and P.sub.x, P.sub.y and P.sub.z are the components of P.
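
The two forms of Equation 1 agree only if the translations are related by T.sub.2 = -R(T.sub.1). The following sketch checks this numerically and applies the perspective projection, assuming Equation 2 takes the standard pinhole form p = (f/P.sub.z)(P.sub.x, P.sub.y). The function names and the example rotation about the y axis are hypothetical choices made for illustration.

```python
import numpy as np


def rotation_about_y(angle):
    """3x3 rotation matrix for a rotation of `angle` radians about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def rotate_then_translate(P1, R, T2):
    """First form of Equation 1: P2 = R(P1) + T2."""
    return R @ P1 + T2


def translate_then_rotate(P1, R, T1):
    """Second form of Equation 1: P2 = R(P1 - T1)."""
    return R @ (P1 - T1)


def equivalent_T1(R, T2):
    """T1 that makes both forms agree: T2 = -R T1, so T1 = -R^T T2."""
    return -R.T @ T2


def project(P, f):
    """Assumed pinhole projection: (x, y) = (f / Pz) * (Px, Py)."""
    return f * P[:2] / P[2]


R = rotation_about_y(0.2)
T2 = np.array([0.3, -0.1, 0.05])
P1 = np.array([2.0, 1.0, 4.0])

P2_a = rotate_then_translate(P1, R, T2)
P2_b = translate_then_rotate(P1, R, equivalent_T1(R, T2))
print(np.allclose(P2_a, P2_b))   # True
print(project(P1, f=2.0))        # [1.  0.5]
```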

FIG. 3 is a geometric representation of the relationship amongst a reference image 302 generated by the reference camera, an inspection image 304 generated by the inspection camera, and an arbitrary parametric surface 300 within the imaged scene. Let S denote the surface of interest (a real or virtual parametric surface 300), P denote an environmental point (e.g., a location of an object) within the scene that is not located on S, and O and M denote the center locations (focal points) of each camera. The image of P on the reference view 302 is p. Let the ray MP intersect the surface S at location Q. A conventional warping process, used to align the images 302 and 304 by aligning all points on the surface S, warps p', the image of P on the inspection image 304, to q, the image of Q on the reference image 302. Therefore, the residual parallax vector is pq, which is the image of line PQ. It is immediately obvious from the figure that vector pq lies on the plane OMP, which is the epipolar plane passing through p. Since such a vector is generated for any point P in the scene, it can be said that the collection of all parallax vectors forms a parallax displacement field. Since the parallax displacement vector associated with each image point lies along the epipolar plane associated with that image point, the field is also referred to as an epipolar field. This field has a radial structure, each vector appearing to emanate from a common origin in the image dubbed the "epipole" (also known as the focus of expansion (FOE)). In FIG. 3 the epipole is located at point "t". From FIG. 3, it is obvious that the epipole t lies at the intersection of the line OM with the image plane 302. The parallax displacement field is also referred to herein simply as a parall