WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Aligning rectilinear images in 3D through projective registration and calibration    

Get related patents on CD
United States Patent6754379   
Link to this pagehttp://www.wikipatents.com/6754379.html
Inventor(s)Xiong; Yalin (Montrose, CA); Turkowski; Ken (Menlo Park, CA)
AbstractAn improved apparatus and method for creating high quality virtual reality panoramas is disclosed that yields dramatic improvements during the authoring and projecting cycles, with speeds up to several orders of magnitude faster than prior systems. In a preferred embodiment, a series of rectilinear images taken from a plurality of rows are pairwise registered with one another, and locally optimized using a pairwise objective function (local error function) that minimizes certain parameters in a projective transformation, using an improved iterative procedure. The local error function values for the pairwise registrations are then saved and used to construct a quadratic surface to approximate a global optimization function (global error function). The chain rule is used to avoid the direct evaluation of the global objective function, saving computation. In one embodiment concerning the blending aspect of the present invention, an improved procedure is described that relies on Laplacian and Gaussian pyramids, using a blend mask whose boundaries are determined by the grassfire transform. An improved iterative procedure is disclosed for the blending that also determines at what level of the pyramid to perform blending, and results in low frequency image components being blended over a wider region and high frequency components being blended over a narrower region. Human interaction and input is also provided to allow manual projective registration, initial calibration and feedback in the selection of photos and convergence of the system.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History Custom Search
Inventor     Xiong; Yalin (Montrose, CA); Turkowski; Ken (Menlo Park, CA)
Owner/Assignee     Apple Computer, Inc. (Cupertino, CA)
Patent assignment
All assignments
Company News
Publication Date     June 22, 2004
Application Number     10/370,280
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     February 18, 2003
US Classification     382/154 345/629 382/294
Int'l Classification     G06K 009/00
Examiner     Mehta; Bhavesh M.
Assistant Examiner     Chawan; Sheela
Attorney/Law Firm     Fenwick & West LLP
Address
Parent Case     CROSS REFERENCE TO RELATED APPLICATIONS This application is a continuation of U.S. patent application Ser. No. 10/077,102 for "Aligning Rectilinear Images in 3D Through Projective Registration and Calibration," filed Feb. 14, 2002, now U.S. Pat. No 6,549,651, which is a continuation of U.S. patent application Ser. No. 09/160,822 for "Aligning Rectilinear Images in 3D Through Projective Registration and Calibration," filed Sep. 25, 1998, now U.S. Pat. No. 6,434,265, the disclosures of which are incorporated herein by reference.
Priority Data    
USPTO Field of Search     382/305 382/284 382/294 382/254 382/154 382/299 382/260 382/252 382/253 382/275 382/276 382/345 345/427 345/433 345/629
Patent Tags     aligning rectilinear images 3d through projective registration and calibration
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
6549651
Xiong
382/154
Apr,2003

[0 after 0 votes]
6434265
Xiong
382/154
Aug,2002

[0 after 0 votes]
6097854
Szeliski

Aug,2000

[0 after 0 votes]
6078701
Hsu
382/294
Jun,2000

[0 after 0 votes]
6075905
Herman, deceased

Jun,2000

[0 after 0 votes]
6009190
Szeliski

Dec,1999

[0 after 0 votes]
5986668
Szeliski

Nov,1999

[0 after 0 votes]
5951475
Gueziec
600/425
Sep,1999

[0 after 0 votes]
5664082
Chen
345/648
Sep,1997

[0 after 0 votes]
5572235
Mical
345/600
Nov,1996

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B

[0 market size comments]
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%

[0 market share comments]
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%

[0 reasonable royalty comments]
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

[0 Guesstimation of Royalty Value Comments]
License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
[0 license availability comments]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
[0 owner/assignee comments]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

[0 competitive advantage comments]
Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

[0 commercial alternatives comments]
 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


We claim:

1. A method for creating a panoramic image from a plurality of rectilinear images obtained from a camera, the method comprising:

estimating a projective transformation for a pairwise registration process between at least two overlapping rectilinear images, said projective transformation relating the images and yielding at least one Hessian matrix describing a registration error landscape in the neighborhood of the projective transformation;

determining a vector of internal parameters and a vector of external parameters from the at least one Hessian matrix to minimize image discrepancy in an overlapping region between the at least two images; and

blending the at least two images to provide a smooth transition between the at least two images.

2. The method according to claim 1, wherein determining the vectors comprises determining the vectors to minimize image discrepancy in the overlapping region during a process of global optimization.

3. The method according to claim 1, wherein the projective transformation comprises a 3.times.3 matrix having nine projective parameters.

4. The method according to claim 1, wherein the pairwise registration process includes a local registration error function between the at least two images, ##EQU7##

5. The method according to claim 4, wherein the at least one Hessian matrix, C.sub.ij, between the at least two images is determined from the local registration a error function,

e.sub.ij (M.sub.ij).apprxeq.e.sub.ij.sup.0 +(M.sub.ij -M.sub.ij.sup.0).sup.T C.sub.ij (M.sub.ij -M.sub.ij.sup.0).

6. The method according to 5,

wherein determining the vectors comprises determining the vectors to minimize image discrepancy in the overlapping region during a process of global optimization,

and wherein a global error function that is optimized during the global optimization process is defined by a function of the sum of the products of overlap area between the at least two images and the local registration error function.

7. The method according to claim 1, wherein the vector of internal parameters comprises a representation of focal length.

8. The method according to claim 7, wherein the vector of internal parameters further comprises a representation of image center position.

9. The method according to claim 8, wherein the vector of internal parameters further comprises a representation of pixel aspect ratio.

10. The method according to claim 9, wherein the vector of internal parameters further comprises a representation of skew.

11. The method according to claim 1, wherein the vector of external parameters comprises a representation of camera orientation with respect to a common frame of reference.

12. The method according to claim 11, wherein the representation of camera orientation comprises Euler angles including pan, tilt, and roll.

13. The method according to claim 11, wherein the representation of camera orientation comprises Euler angles including roll, pitch, and yaw.

14. The method according to claim 11, wherein the representation of camera orientation comprises unit quaternions.

15. The method according to claim 1, wherein blending the at least two images comprises:

for signals in the overlapping region having a frequency higher than a threshold frequency, blending the signals over a first blend region range; and

for signals in the overlapping region having a frequency lower than the threshold frequency, blending the signals over a blend region range wider than the first blend region range.

16. A system for creating a panoramic image from a plurality of rectilinear images obtained from a camera, the system comprising:

a processor for estimating a projective transformation during a pairwise registration process between at least two overlapping rectilinear images obtained from a camera, said projective transformation relating the images and yielding at least one Hessian matrix describing a registration error landscape in the neighborhood of the projective transformation;

a global optimizer for determining a vector of internal parameters and a vector of external parameters from the at least one Hessian matrix to minimize image discrepancy in an overlapping region between the at least two overlapping rectilinear images; and

a blender for blending the at least two images to provide a smooth transition between the at least two images.

17. The system according to claim 16, wherein the projective transformation comprises a 3.times.3 matrix having nine projective parameters.

18. The system according to claim 16, wherein the pairwise registration process includes a local registration error function between the at least two images. ##EQU8##

19. The system according to claim 18, wherein the at least one Hessian matrix, C.sub.ij, between the at least two images is determined from the local registration error function,

e.sub.ij (M.sub.ij).apprxeq.e.sub.ij.sup.0 +(M.sub.ij -M.sub.ij.sup.0).sup.T C.sub.ij (M.sub.ij.sup.0).

20. The system according to claim 19, wherein a global error function that is optimized by the global optimizer is defined by a function of the product of overlap area between the at least two images and the local registration error function.

21. The system according to claim 16, wherein the vector of internal parameters comprises a representation of focal length.

22. The system according to claim 21, wherein the vector of internal parameters further comprises a representation of image center position.

23. The system according to claim 22, wherein the vector of internal parameters further comprises a representation of pixel aspect ratio.

24. The system according to claim 23, wherein the vector of internal parameters further comprises a representation of skew.

25. The system according to claim 16, wherein the vector of external parameters comprises a representation of camera orientation with respect to a common frame of reference.

26. The system according to claim 25, wherein the representation of camera orientation comprises Euler angles including pan, tilt, and roll.

27. The system according to claim 25, wherein the representation of camera orientation comprises Euler angles including roll, pitch, and yaw.

28. The system according to claim 25, wherein the representation of camera orientation comprises unit quaternions.

29. The system according to claim 16, wherein:

for signals in the overlapping region having a frequency higher than a threshold frequency, the blender blends the signals over a first blend region range; and

for signals in the overlapping region having a frequency lower than the threshold frequency, the blender blends the signals over a blend region range wider than the first blend region range.

30. A method for creating a panoramic image from a plurality of rectilinear images obtained from a camera, the method comprising:

estimating a projective transformation during a pairwise registration process between at least two overlapping rectilinear images obtained from a camera, said projective transformation providing at least one Hessian matrix relating the at least two rectilinear images;

determining a vector of internal parameters and a vector of external parameters from the at least one Hessian matrix to minimize image discrepancy in an overlapping region between the at least two overlapping rectilinear images;

blending the at least two images to provide a smooth transition between the at least two images; and

mapping the blended images onto a three-dimensional geometry surface.

31. The method according to claim 30, wherein determining the vectors comprises determining the vectors to minimize image discrepancy in the overlapping region during a process of global optimization.

32. The method according to claim 30, wherein the three-dimensional geometry surface is a sphere.

33. A computer program product for creating a panoramic image from a plurality of rectilinear images obtained from a camera, the computer program product comprising:

a computer-readable medium; and

computer program code, encoded on the medium, for:

estimating a projective transformation during a pairwise registration process between at least two overlapping rectilinear images obtained from a camera, said projective transformation relating the images and yielding at least one Hessian matrix describing a registration error landscape in the neighborhood of the projective transformation;

determining a vector of internal parameters and a vector of external parameters from the at least one Hessian matrix to minimize image discrepancy in an overlapping region between the at least two overlapping rectilinear images; and

blending the at least two images to provide a smooth transition between the at least two images.

34. The computer program product according to claim 33, wherein the computer program code for determining the vectors comprises computer program code for determining the vectors to minimize image discrepancy in the overlapping region during a process of global optimization.

35. The computer program product according to claim 33, wherein the projective transformation comprises a 3.times.3 matrix having nine projective parameters.

36. The computer program product according to claim 33, wherein the pairwise registration process includes a local registration error function between the at least two images. ##EQU9##

37. The computer program product according to claim 36, wherein the at least one Hessian matrix, C.sub.ij, between the at least two images is determined from the local registration error function,

e.sub.ij (M.sub.ij).apprxeq.e.sub.ij.sup.0 +(M.sub.ij -M.sub.ji.sup.0).sup.T C.sub.ij (M.sub.ij -M.sub.ij.sup.0).

38. The computer program product according to claim 37, wherein a global error function that is optimized during the global optimization process is defined by a function of the sum of the products of overlap area between the at least two images and the local registration error function.

39. The computer program product according to claim 33, wherein the vector of internal parameters comprises a representation of focal length.

40. The computer program product according to claim 39, wherein the vector of internal parameters further comprises a representation of image center position.

41. The computer program product according to claim 40, wherein the vector of internal parameters further comprises a representation of pixel aspect ratio.

42. The computer program product according to claim 41, wherein the vector of internal parameters further comprises a representation of skew.

43. The computer program product according to claim 33, wherein the vector of external parameters comprises a representation of camera orientation with respect to a common frame of reference.

44. The computer program product according to claim 43, wherein the representation of camera orientation comprises Euler angles including pan, tilt and roll.

45. The computer program product according to claim 43, wherein the representation of camera orientation comprises Euler angles including roll, pitch, and yaw.

46. The computer program product according to claim 43, wherein the representation of camera orientation comprises unit quarternions.

47. The computer program product according to claim 33, wherein the computer program code for blending the at least two images comprises computer program code for:

for signals in the overlapping region having a frequency higher than a threshold frequency, blending the signals over a first blend region range; and

for signals in the overlapping region having a frequency lower than the threshold frequency, blending the signals over a blend region range wider than the first blend region range.

48. A computer program product for creating a panoramic image from a plurality of rectilinear images obtained from a camera, the computer program product comprising:

a computer-readable medium; and

computer program code, encoded on the medium, for:

estimating a projective transformation during a pairwise registration process between at least two overlapping rectilinear images obtained from a camera, said projective transformation providing at least one Hessian matrix relating the at least two rectilinear images;

determining a vector of internal parameters and a vector of external parameters from the at least one Hessian matrix to minimize image discrepancy in an overlapping region between the at least two overlapping rectilinear images;

blending the at least two images to provide a smooth transition between the at least two images; and

mapping the blended images onto a three-dimensional geometry surface.

49. The computer program product according to claim 48, wherein the computer program code for determining the vectors comprises computer program code for determining the vectors to minimize image discrepancy in the overlapping region during a process of global optimization.

50. The computer program product according to claim 48, wherein the three-dimensional geometry surface is a sphere.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

1. Field of Invention

The present invention relates generally to an improved system for creating a full 360-degree virtual reality panorama from rectilinear images.

A panorama is a compact representation of the environment viewed from a 3D position. While an ordinary image can capture only a small portion of the environment, a panorama can capture it all, or any portion of it, depending on the geometry in which the panoramas are represented. Recently there has been an explosive popularity of panoramas on the world wide web and in multimedia as an effective tool to present a photo-realistic virtual reality. However, creating high-quality panoramas, especially those that completely enclose space, has been difficult.

2. Description of Related Art

Various systems have been proposed for simulating a virtual reality environment using photographic quality images. Many virtual reality environments use 3D models or mathematical equations to create a simulated world. The user explores this simulation in real time. Though 3D modeling via equations has certain advantages, such as a depiction of a scene from any arbitrary vantage point, creating images from equations generated by a computer is seriously limited by the speed of the computer. To avoid this problem, technology such as QuickTime.TM. VR from Apple Corporation uses images that have already been produced, either photographically or generated by a 3D modeling program, and stored in secondary memory. Software only has to read the image files from a disk and display the scene as needed, rather than calculating the scene from mathematical models. However, a limitation of the QuickTime.TM. VR program is that it requires that the view direction for photos reside in a single plane, such as that obtained by rotating a camera on a tripod. It also requires that the vertical field of view (or equivalently, the focal length) be known, and that there be roughly equal angular increments between one photo and the next.

Further, a panoramic movie or image can be created using specialized hardware, such as with a panoramic camera or a fisheye lens camera. However, such hardware is inconvenient for the average novice photographer. In the alternative, software can be used to simulate a panorama. This obviates the need for specialized hardware.

Though various software programs have been proposed to simulate panoramas without the use of special hardware, these programs have certain serious drawbacks that have not been successfully overcome to date. These include, but are not limited to, unrealistic representations of images, lack of proper registration and calibration of images, lack of proper blending of images, and slow speed in registering, calibrating and blending images to create a panorama.

SUMMARY OF THE INVENTION

Accordingly, one aspect of the present invention is to provide an improved system and method for overcoming the drawbacks of prior techniques discussed above.

Another aspect of the present invention is to provide for the registration, calibration and global optimization of images, preferably captured from a substantially single nodal position. The solution to creating a full 360-degree panorama quickly and seamlessly is divided into three steps. The first step registers all overlapping images projectively. A combination of a gradient-based optimization method and a correlation-based linear search has proved to be robust in cases of drastic exposure differences and small amount of parallax. The second step takes the projective matrices and their associated Hessian matrices as inputs, and calibrates the internal and external parameters of every image through a global optimization. The objective is to minimize the overall image discrepancies in all overlap regions while converting projective matrices into camera parameters such as focal length, aspect ratio, image center, 3D orientation and the like. Improved techniques for global optimization are disclosed that give order of magnitude improvements over prior systems of optimization. The third step re-projects all images onto a panorama by a method employing Laplacian-pyramid based blending using a Gaussian blend mask generated by the grassfire transform. The purpose of the blending is to provide a smooth transition between images and eliminate small residues of misalignments resulting from parallax or imperfect pairwise registrations. The invention further provides for human interaction, where necessary, for initialization, feedback and manual options.

Further, the present invention, unlike some of the prior art, allows for multiple views, from multiple planes and rows of images, and allows for the arbitrary orientation of photographic images to be constructed into a panorama, without specialized hardware such as a tripod or fisheye lens. In addition, the present system and method can be several orders of magnitude faster than the prior art.

The numerous aspects of the invention described herein result in a system for registration, calibration and blending that creates high quality panoramas from rectilinear images that is up to several orders of magnitude faster than prior systems. In one calculation, the present invention is up to 100,000 times faster than prior techniques. As a consequence, the present invention could be used to construct panoramas much quicker than previous methods. These panoramas can be used in applications where real-time image rendering is important, such as in real-time 3D virtual reality, the construction of background images, computer animation, multimedia, and the like.

The above described and many other features and attendant advantages of the present invention will become apparent from a consideration of the following detailed description when considered in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Detailed description of preferred embodiments of the invention will be made with reference to the accompanying drawings.

FIGS. 1(a) and 1(b) is an artist's rendition of a composite photograph before and after the application of the present invention.

FIG. 2 is a generalized flowchart of the various function modules comprising one embodiment of the present invention.

FIG. 3 is a generalized flowchart of the method of operation of the Pairwise Registration function module of one embodiment of the present invention.

FIG. 4 is a generalized flowchart for the method of operation of the Calibration and Global Optimization function module of one embodiment of the present invention.

FIG. 5 is a generalized flowchart for the method of operation of the blending function module of one embodiment of the present invention.

FIG. 6 is a screen shot of a user interface dialog window for the user interface of one embodiment of the present invention.

FIG. 7 is a conceptual illustration on the problem of finding the proper Laplacian pyramid level using the minor axis of an inertial ellipse.

FIG. 8 is a graphical illustration of the transition lengths for different frequency image components (low, middle and high) used in one embodiment of the blending function module of the present invention.

FIG. 9 conceptually illustrates the weighted average method for blending.

FIGS. 10(a) and (b) illustrate the blend mask used for blending two images during the blending phase of one embodiment of the present invention.

FIGS. 11(a) and (b) illustrate a particular problem overcome during blending in one embodiment of the present invention.

FIG. 12 illustrates a particular virtual reality orientation of images for the user interface for one embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Disclosed herein is a detailed description of the best presently known mode of carrying out the invention. This description is not to be taken in a limiting sense, but is made merely for the purpose of illustrating the general principles of the invention. The section titles and overall organization of the present detailed description are for the purpose of convenience only and are not intended to limit the present invention.

Turning to FIG. 1, there is shown a simulation of different overlapping rectilinear images, or 2D photographs, framed by dashed lines during the authoring portion of the present invention, as indicated by dashed lines 110. Where the images overlap there is potential for misalignment when constructing a 3D panorarma, as indicated by blurry lines 112, for a variety of reasons, including the arbitrary position of the camera, errors in internal and external camera pararmeters, and, distortions that occur when warping a 2D image to construct a 3D image space. The present invention is designed to calibrate and align all such 2D rectilinear images with respect to one another and globally, blend the images where they overlap, and construct a reconstructed and relatively error free 3D panorama image, shown conceptually in 2D form as FIG. 1(b), for any arbitrary geometry.

FIG. 2 discloses a generalized flowchart for overall operation of the invention. The invention is part of a system 210 comprising a computer, having all necessary hardware, such as processing unit 212, e.g., a G3 microprocessor chip; I/O 214, such as a keyboard 216, video monitor 218, and mouse 217; memory 220, which may be any sort of memory buffer, preferably primary memory, e.g. RAM, that can be cached to secondary memory, such as a hard drive. The system 210 is controlled by a program residing in system memory 220, which also stores output data and other data. The program is preferably written in the C or C++ language, using classes, structures, functions, calls, translation units, headers, subroutines, modules, and other features of structured programming, where appropriate, of both data and source code, suitably compiled and in executable form, in accordance with the teachings of the present disclosure to practice the invention. The present invention is also suitable for implementation with an interpreted language such as Java.

In constructing a panorama from rectilinear images, the system finds solutions to three sub-problems: (1) the projective registrations of overlapping images (shown as the "local pairwise registration" box 222 in FIG. 2), (2) calibration and global optimization of these images, a self-calibration in which 2D image planes are positioned as 3D planes in space (shown as the "calibration and global optimization" box 224 in FIG. 2), and (3) the composing or blending problem in which images are ready to be reprojected to a 3D environment map with pixels in overlap regions being composed from multiple images, to smooth any transitional discontinuities (shown as the "blending" box 226 in FIG. 2). Finally, there is the projection or construction of the assembled panorama onto a 3D geometry surface, such as a cylinder, cube or sphere (defined as the "projection" box 228 in FIG. 2).

The solutions to these sub-problems are performed by software function modules 222, 224, 226, 228 residing in memory 220 and operating the processor 212. The modules are designated, as explained further herein, the pairwise registration function module 222, the calibration and global optimization function module 224, the blending function module 226, and the projection function module 228. A user interface module 230, also residing in memory 220, may interact with the other modules to pass data to and from the modules, and accept input from a human user of the system. The modules may receive data from memory, manipulate that data as described herein, and output the data to other modules. The three modules 222, 224 and 226 may perform feedback to pass data back to previous modules, as indicated by arrows 233, and as described below. Although in a preferred embodiment the modules are programmed as separate software routines or classes, the modules may be combined into one module performing all the designated tasks performed by separate modules.

As a final step, the fourth module, the projection function module 228, constructs a panoramic scene by projecting the blended image onto any designated geometry view surface, typically a cubic, polyhedral, cylindrical or spherical surface. The projection module may be controlled through the user interface 230 as well, to allow a user to select what geometry will be projected onto and to control and modify other factors, including the use of photo re-touching software such as PhotoShop.TM. for modifying the final panorama.

Generally, the local registration, self-calibration and global optimization, and blending involve a multi-step procedure.

First, regarding the initial local registration, and referring generally to the generalized flowchart of FIG. 3, the system 210 reads in each overlapping rectilinear image into main memory 220, as indicated by step 312. The images are assumed to roughly share a common nodal point (i.e., that point in the three-space where all rays of light converge through a lens) with other overlapping rectilinear images. The object of the program during local registration is to register the locally overlapping images, by comparing common overlapping areas between overlapping images at certain predetermined resolution levels on a Gaussian pyramid representing the overlapped images. Different combinations of overlapping areas are tried to achieve the optimal overlap between images (or, equivalently, the smallest error in the error function or pairwise objective function described herein) using the steps described herein, which generally minimizes the average squared pixel intensity (e.g., brightness and contrast) difference with respect to certain transformation parameters. Initial values for parameters used in optimizing the pairwise objective function are assumed by the computer, as indicated in step 314. The initial values may optionally be input by a user, e.g., with a user interface 230 as in FIG. 2, and in response to a user dialog window such as of the kind shown in FIG. 6. Besides the global orientation (pan, tilt and roll) the other parameters that are most likely to give instability in convergence of the error function are bad initial estimates of the brightness and contrast, as well as of the geometric image center of projection of the overlapping rectilinear images. Certain parameters most likely to create instability in the convergence of the local error functions can be controlled (e.g., progressively dampened at different levels of the Gaussian pyramid) to ensure convergence, as indicated by step 318. The overlapping images are then perturbed and the local error function with respect to these and other variables is calculated until a minimal local error function is found, as indicated by step 320. The minimal local error function is then stored for a particular level of the Gaussian pyramid, as indicated in step 322, for each pairwise registration, and is saved and later used to compute a global error function for all the overlapping images. The local pairwise registration module 222 iterates until the entire Gaussian pyramid is traversed, starting from the coarsest level of the pyramid (sometimes called the bottom, where the pyramid can be standing on its inverted top) and working to the finest level resolution, as indicated in decision box 324. It should be noted that at any stage throughout the registration, and throughout the invention in general, the system may check for a user interruption, through the user interface, that would require immediate attention from the processor, such as to allow the user to interactively adjust the parameters to avoid divergence or convergence to an undesired local minimum.

As indicated in box 316 of FIG. 3, it must be determined at what level in the Gaussian pyramid to start the local pairwise registration. One way to find the lowest level is to select the resolution level at which it is found that the images share at least some arbitrary number of overlapping pixels, e.g., preferably about 30 pixels from each side, e.g., preferably no less than 30 pixels across the overlapping area. If greater than 60 pixels of overlap is found in these areas, the size (resolution) of the overlap region is decreased by half (going deeper into the pyramid) and the procedure of the present invention is reiterated again. If, on the other hand, the overlap is less than 30 pixels, then the size (resolution) of the overlap region is increased by doubling. By utilizing multi-resolution registration of overlapping images by way of the Gaussian pyramid, convergence to the desired optimum is accelerated, and false local minima are avoided.

On occasion, it may be visually apparent to a user that during registration the images are not converging optimally. In this case user input may manually abort the pairwise registration procedure, and the user may manually help align the images closer before resuming automatic registration, as before. This manual intervention is true for all aspects of the invention. Nevertheless, the present invention is surprisingly robust, and manual intervention is not a prerequisite for the invention to work.

Non-optimal convergence or divergence has sometimes been found to be the case whenever images for a spherical projection are used, especially those in the "pole" regions of the sphere (though in general the invention can adjust quite nicely for images that wrap around the poles). Divergence sometimes results when the initial default parameters chosen are wildly off or not suitable for convergence. During such instability, the images will appear to a user to "run away" from each other. In this case, and throughout the invention, provision may be provided in the user interface 230 of the embodiment of FIG. 2 of the present invention for manual intervention, such as to abort the program, for the manual selection and relative positioning of the images to be pairwise registered, and for the selection and relative positioning of overlapping images for blending.

The iterative method of moving down a pyramid when an overlap region is greater than, say, 30 pixels, is an attempt to prevent instability in the error function due to problematic parameters, such as initial value errors in the image center of projection of the images being registered, and errors in setting initial brightness and contrast values. Techniques of damping and annealing of problematic parameters (with damping progressively diminished and finally set to zero as one moves up the pyramid to finer levels) can be used to stabilize the local error function for these problematic terms, as explained further herein.

One improvement over prior techniques has been to save the local error function values and use them to compute and optimize the global error function needed for optimization. This improvement also avoids having to evaluate the entire global error function (global objective function) from scratch. The pairwise objective functions (local error functions) are approximated by a quadratic Taylor series, and, together with the chain rule, the global objective function (global error function) is minimized. Calculation of the global error function is greatly speeded up by this procedure.

Further regarding rectilinear images taken in a non-arbitrary manner (such as from a tripod that is rotated, or a photographer who manually "pans" a field of view), the number of pyramid levels and optimal direction for the blending of images overlapping in a region can be computed by the present invention by computing the minimum eigenvalue of the 2.times.2 inertial tensor of the overlapping region between two images. It has been found in practice that for an arbitrary polygon representing an overlapping region, the optimal direction for blending, as well as the width of the blending region (which determines the level in the pyramid at which to start the method of registration and optimization) is found along the minor axis of the inertial ellipse found from solving for the inertial tensor of the overlapping images. A similar method of finding the proper pyramid level is by solving for the smallest eigenvalue of an inertial tensor of the overlap region between images. Conceptually, such an ellipse is shown in FIG. 7. The blending region in an arbitrarily shaped polygon region 710, which represents the area of overlap between overlapping images, lies along the width and direction of the minor axis 712 of the ellipse 714, which is-calculated from the inertial tensor of the overlapping images forming the polygon.

Thus, the results from computing the inertial tensor are used to determine the pyramid level, blending width, and blending direction. The smallest inertial eigenvalue is used to determine the number of pyramid levels. One could also use the eigenvalue vector (eigenvector) to determine the direction, or, preferably, use a blending mask, as explained herein, that yields a grayscale ramp, which defines direction in a direction field from taking the grayscale ramp gradient.

Next, after pairwise local registrat