WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Multiple processor visibility search system and method    
United States Patent6437796   
Link to this pagehttp://www.wikipatents.com/6437796.html
Inventor(s)Sowizral; Henry A. (Bellevue, WA); Zikan; Karel (Seattle, WA); Keller; Randall G. (San Carlos, CA)
AbstractA system and method for performing visible object determination based upon a dual search of a cone hierarchy and a bound hierarchy performed by multiple processors. Each processor is configured to read a (global and/or local) problem queue to access a bound-cone pair. The bound-cone pair points to a bound in the bound hierarchy and a cone in the cone hierarchy. The processor computes a bound-cone distance between the bound and the cone, and compares the bound-cone distance to a visibility distance associated with the cone. If the bound-cone distance is smaller than the visibility distance, the processor may write two or more refined bound-cone pairs corresponding to a refinement of the original pair to the global or local problem queue. When the processor detects a leaf bound and a leaf cone, it updates a nearest object pointer and the visibility distance associated with the leaf cone.



 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Inventor     Sowizral; Henry A. (Bellevue, WA); Zikan; Karel (Seattle, WA); Keller; Randall G. (San Carlos, CA)
Owner/Assignee     Sun Microsystems, Inc. (Palo Alto, CA)
Patent assignment
All assignments
Publication Date     August 20, 2002
Application Number     09/895,665
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     June 29, 2001
US Classification     345/622 345/419
Int'l Classification     G09G 005/00
Examiner     Zimmerman; Mark
Assistant Examiner     Santiago; Enrique L
Attorney/Law Firm     Tayon, Brightwell; Mark K. Conley, Rose &
Address
Parent Case     CROSS REFERENCES TO RELATED APPLICATIONS This application claims the benefit of U.S. Provisional Application No. 60/250,823 filed on Dec. 1, 2000 titled "Multiple Processor Visibility Search System and Method". This application is a continuation-in-part of U.S. patent spplication Ser. No. 09/247,466 filed on Feb. 9, 1999 titled "Visible-Object Determination For Interactive Visualization", which claims the benefit of U.S. Provisional Application No. 60/074,868 filed on Feb. 17, 1998 titled "Visible-Object Determination for Interactive Visualization".
Priority Data    
USPTO Field of Search     345/419 345/420 345/421 345/428 345/620 345/622
Patent Tags     multiple processor visibility search
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
6300965
Sowizral
345/622
Oct,2001

[0 after 0 votes]
6259451
Tesler
345/419
Jul,2001

[0 after 0 votes]
6223145
Hearst
703/22
Apr,2001

[0 after 0 votes]
6137499
Tesler

Oct,2000

[0 after 0 votes]
5295243
Robertson

Mar,1994

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed is:

1. A system for displaying visible objects on a display device, the system comprising:

a shared memory configured to store a bound hierarchy and a cone hierarchy;

a plurality of processors coupled to the shared memory, wherein the plurality of processors are operable to perform a search of the cone hierarchy and bound hierarchy to identify one or more nearest objects for cones in the cone hierarchy;

a rendering agent configured to receive an indication of the one or more nearest objects for each of said cones and to transmit pixel data to a display device in response to said one or more nearest object indications for each of said cones.

2. The system of claim 1, wherein the shared memory is further configured to store a first problem queue, wherein each of the plurality of processors is configured to:

(a) read a first bound-cone pair from the first problem queue, wherein the first bound-cone pair points to a first bound in the bound hierarchy and a first cone in the cone hierarchy,

(b) compute a bound-cone distance between the first bound and the first cone;

(c) determine if the bound-cone distance is smaller than a first visibility distance associated with the first cone; and

(d) write two or more second bound-cone pairs to the first problem queue in response to the bound-cone distance being smaller than the first visibility distance.

3. The system of claim 2, wherein each of said plurality of processors is further operable to compute a bound size for the first bound and a cone size for the first cone, to compare the bound size and the cone size, wherein the two or more second bound-cone pairs correspond to the first bound and subcones of the first cone if the cone size is larger than the bound size.

4. The system of claim 3, wherein the two or more second bound-cone pairs correspond to the first cone and subbounds of the first bound if the cone size is smaller than the bound size.

5. The system of claim 2, wherein each of the plurality of processors is further operable:

to determine if the first cone is a leaf cone and the first bound is a leaf bound;

to perform (c) and (d) if the first cone is not a leaf cone or the first bound is not a leaf bound.

6. The system of claim 5, wherein each of the plurality of processors is further operable to update a nearest object pointer and the first visibility distance associated with the first cone if the first cone is a leaf cone and the first bound is a leaf bound.

7. The system of claim 5, wherein each of said plurality of processors is further operable to update the first visibility distance with the bound-cone distance and the nearest object pointer with a pointer associated with the first bound.

8. The system of claim 1, wherein the rendering agent comprises a graphics acceleration device.

9. The system of claim 1, wherein the rendering agent comprises a software renderer executing on a programmable processor.

10. The system of claim 1, wherein the shared memory is further configured to store a global problem queue, wherein each of the plurality of processors is configured to:

(a) read a corresponding first bound-cone pair from the global problem queue, wherein the first bound-cone pair points to a first bound in the bound hierarchy and a first cone in the cone hierarchy,

(b) compute a bound-cone distance between the first bound and the first cone;

(c) determine if the bound-cone distance is smaller than a first visibility distance associated with the first cone; and

(d) write two or more second bound-cone pairs to a local memory corresponding to the processor in response to the bound-cone distance being smaller than the first visibility distance.

11. The system of claim 10, wherein each of the plurality of processors is further configured to:

(e) read a corresponding third bound-cone pair from the corresponding local memory, wherein the third bound-cone pair points to a third bound in the bound hierarchy and a third cone in the cone hierarchy;

(f) compute a bound-cone distance between the third bound and the third cone;

(c) determine if the bound-cone distance is smaller than a third visibility distance associated with the third cone; and

(d) write two or more fourth bound-cone pairs to the corresponding local memory in response to the bound-cone distance being smaller than the third visibility distance.

12. A method for displaying visible objects on a display device, the method comprising:

storing a bound hierarchy and a cone hierarchy in a shared memory;

a plurality of processors reading bounds from the bound hierarchy and cones from the cone hierarchy and searching the cones with respect to the bounds to identify one or more nearest objects for the cones in the cone hierarchy;

said processors providing indications of the one or more nearest objects for each of said cones to a rendering agent;

said rendering agent generating pixel data in response to said nearest object indications and transmitting the pixel data to a display device.

13. The method of claim 12 further comprising each processor of said plurality of processors:

(a) reading a first bound-cone pair from a first problem queue, wherein each bound-cone pair points to a corresponding first bound in the bound hierarchy and a corresponding first cone in the cone hierarchy;

(b) computing a bound-cone distance between the first bound and the first cone;

(c) determining if the bound-cone distance is smaller than a first visibility distance associated with the first cone; and

(d) writing two or more second bound-cone pairs to the first problem queue in response to the bound-cone distance being smaller than the first visibility distance.

14. The method of claim 13 further comprising each processor of said plurality of processors:

computing a bound size for the first bound and a cone size for the first cone;

comparing the bound size and the cone size, wherein the two or more second bound-cone pairs correspond to the first bound and subcones of the first cone if the cone size is larger than the bound size.

15. The method of claim 14, wherein the two or more second bound-cone pairs correspond to the first cone and subbounds of the first bound if the cone size is smaller than the bound size.

16. The method of claim 13 further comprising each processor of said plurality:

determining if the first cone is a leaf cone and the first bound is a leaf bound;

perform (c) and (d) if the first cone is not a leaf cone or the first bound is not a leaf bound.

17. The method of claim 16 further comprising each processor of said plurality updating a nearest object pointer and the first visibility distance associated with the first cone if the first cone is a leaf cone and the first bound is a leaf bound.

18. The method of claim 16 further comprising each processor of said plurality updating the first visibility distance with the bound-cone distance and the nearest object pointer with a pointer associated with the first bound.

19. The method of claim 12 further comprising each processor of said plurality of processors:

(a) reading a corresponding first bound-cone pair from the global problem queue, wherein the first bound-cone pair points to a first bound in the bound hierarchy and a first cone in the cone hierarchy,

(b) computing a bound-cone distance between the first bound and the first cone;

(c) determining if the bound-cone distance is smaller than a first visibility distance associated with the first cone; and

(d) writing two or more second bound-cone pairs to a local memory corresponding to the processor in response to the bound-cone distance being smaller than the first visibility distance.

20. The method of claim 19 further comprising each processor of said plurality of processors:

(e) reading a corresponding third bound-cone pair from the corresponding local memory, wherein the third bound-cone pair points to a third bound in the bound hierarchy and a third cone in the cone hierarchy;

(f) computing a bound-cone distance between the third bound and the third cone;

(g) determining if the bound-cone distance is smaller than a third visibility distance associated with the third cone; and

(h) writing two or more fourth bound-cone pairs to the corresponding local memory in response to the bound-cone distance being smaller than the third visibility distance.

21. A computer-readable memory medium for storing program instructions, wherein said program instructions are executable by each processor in a collection of processors to implement the operations of:

reading a first bound-cone pair from a shared memory, wherein each bound-cone pair points to a corresponding first bound in a bound hierarchy and a corresponding first cone in a cone hierarchy;

computing a bound-cone distance between the first bound and the first cone;

determining if the bound-cone distance is smaller than a first visibility distance associated with the first cone;

writing two or more second bound-cone pairs to the first problem queue in response to (a) the bound-cone distance being smaller than the first visibility distance, and (b) the first bound being a non-terminal bound of the bound hierarchy or the first cone being a non-terminal cone of the cone hierarchy;

updating (c) the first visibility distance of the first cone with the bound-cone distance and (d) a nearest object indicator for the first cone with a pointer associated with the first bound in response to (e) the bound-cone distance being smaller than the first visibility distance and (f) the first bound being a terminal bound and the first cone being a terminal cone;

wherein the nearest object indicators of the leaf cones of the cone hierarchy are usable to generate a rendered image on a display device.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to the field of computer graphics, and more particularly, to the problem of determining the set of objects (and portions of objects) visible from a defined viewpoint in a graphics environment.

2. Description of the Related Art

Visualization software has proven to be very useful in evaluating three-dimensional designs long before the physical realization of those designs. In addition, visualization software has shown its cost effectiveness by allowing engineering companies to find design problems early in the design cycle, thus saving them significant amounts of money. Unfortunately, the need to view more and more complex scenes has outpaced the ability of graphics hardware systems to display them at reasonable frame rates. As scene complexity grows, visualization software designers need to carefully use the rendering resource provided by graphic hardware pipelines.

A hardware pipeline wastes rendering bandwidth when it discards rendered triangle work. Rendering bandwidth waste can be decreased by not asking the pipeline to draw triangles that it will discard. Various software methods for reducing pipeline waste have evolved over time. Each technique reduces waste at a different point within the pipeline. As an example, software culling of objects falling outside the view frustum can significantly reduce discards in a pipeline's clipping computation. Similarly, software culling of backfacing triangles can reduce discards in a pipeline's lighting computation.

The z-buffer is the final part of the graphics pipeline that discards work. In essence, the z-buffer retains visible surfaces, and discards those not visible because they are behind another surface (i.e. occluded). As scene complexity increases, especially in walk-through and CAD environments, the number of occluded surfaces rises rapidly and as a result the number of surfaces that the z-buffer discards rises as well. A frame's average depth complexity determines roughly how much work (and thus rendering bandwidth) the z-buffer discards. In a frame with a per-pixel depth complexity of d the pipeline's effectiveness is 1/d. As depth complexity rises, the hardware pipeline thus becomes proportionally less and less effective.

Software occlusion culling has been proposed as an additional tool for improving rendering effectiveness. A visualization program which performs occlusion culling effectively increases the overall rendering bandwidth of the graphics hardware by not asking the hardware pipeline to draw occluded objects. Computing a scene's visible objects is the complementary problem to that of occlusion culling. Rather than removing occluded objects from the set of objects in a scene or frustum-culled scene, a program instead computes which objects are visible and instructs the rendering hardware to draw just those. A simple visualization program can compute the set of visible objects and draw those objects from the current viewpoint, thus allowing the pipeline to focus on removing backfacing polygons and the z-buffer to remove any non-visible surfaces of those objects.

One technique for computing the visible object set uses ray casting as shown in FIG. 1. RealEyes [Sowizral, H. A., Zikan, K., Esposito, C., Janin, A., Mizell, D., "RealEyes: A System for Visualizing Very Large Physical Structures", SIGGRAPH '94, Visual Proceedings, 1994, p. 228], a system that implemented the ray casting technique, was demonstrated in SIGGRAPH 1994's BOOM room. At interactive rates, visitors could "walk" around the interior of a Boeing 747 or explore the structures comprising Space Station Freedom's lab module.

The intuition for the use of rays in determining visibility relies on the properties of light. The first object encountered along a ray is visible since it alone can reflect light into the viewer's eye. Also, that object interposes itself between the viewer and all succeeding objects along the ray making them not visible. In the discrete world of computer graphics, it is difficult to propagate a continuum of rays. So a discrete subset of rays is invariably used. Of course, this implies that visible objects or segments of objects smaller than the resolution of the ray sample may be missed and not discovered. This is because rays guarantee correct determination of visible objects only up to the density of the ray-sample. FIG. 1 illustrates the ray-based method of visible object detection. Rays that interact with one or more objects are marked with a dot at the point of their first contact with an object. It is this point of first contact that determines the value of the screen pixel corresponding to the ray. Also observe that the object 10 is small enough to be entirely missed by the given ray sample.

Visible-object determination has its roots in visible-surface determination. Foley et al. [Foley, J., van Dam, A., Feiner, S. and Hughes, J. Computer Graphics: Principles and Practice, 2nd ed., Addison-Wesley, Chapter 15, pp.649-718, 1996] classify visible-surface determination approaches into two broad groups: image-precision and object-precision algorithms. Image precision algorithms typically operate at the resolution of the display device and tend to have superior performance computationally. Object precision approaches operate in object space--usually performing object to object comparisons.

A prototypical image-precision visible-surface-determination algorithm casts rays from the viewpoint through the center of each display pixel to determine the nearest visible surface along each ray. The list of applications of visible-surface ray casting (or ray tracing) is long and distinguished. Appel ["Some Techniques for Shading Machine Rendering of Solids", SJCC'68, pp. 37-45, 1968] uses ray casting for shading. Goldstein and Nagel [Mathematical Applications Group, Inc., "3-D Simulated Graphics Offered by Service Bureau," Datamation, 13(1), February 1968, p. 69.; see also Goldstein, R. A. and Nagel, R., "3-D Visual Simulation", Simulation, 16(1), pp.25-31, 1971] use ray casting for boolean set operations. Kay et al. [Kay, D. S. and Greenberg, D., "Transparency for Computer Synthesized Images," SIGGRAPH'79, pp.158-164] and Whitted ["An Improved Illumination Model for Shaded Display", CACM, 23(6), pp.343-349, 1980] use ray tracing for refraction and specular reflection computations. Airey et al. [Airey, J. M., Rohlf, J. H. and Brooks, Jr. F. P., "Towards Image Realism with Interactive Update Rates in Complex Virtual Building Environments", ACM SIGGRAPH Symposium on Interactive 3D Graphics, 24, 2(1990), pp. 41-50] uses ray casting for computing the portion of a model visible from a given cell.

Another approach to visible-surface determination relies on sending beams or cones into a database of surfaces [see Dadoun et al., "Hierarchical approachs to hidden surface intersection testing", Proceeedings of Graphics Interface '82, Toronto, May 1982, 49-56; see also Dadoun et al., "The geometry of beam tracing", In Joseph O'Rourke, ed., Proceeedings of the Symposium on Computational Geometry, pp.55-61, ACM Press, New York, 1985]. Essentially, beams become a replacement for rays. The approach usually results in compact beams decomposing into a set of possibly non-connected cone(s) after interacting with an object.

A variety of spatial subdivision schemes have been used to impose a spatial structure on the objects in a scene. The following four references pertain to spatial subdivision schemes: (a) Glassner, "Space subdivision for fast ray tracing," IEEE CG&A, 4(10):15-22, October 1984; (b) Jevans et al., "Adaptive voxel subdivision for ray tracing," Proceedings Graphics Interface '89, 164-172, June 1989; (c) Kaplan, M. "The use of spatial coherence in ray tracing," in Techniques for Computer Graphics . . . , Rogers, D. and Earnshaw, R. A. (eds), Springer-Verlag, New York, 1987; and (d) Rubin, S. M. and Whitted, T. "A 3-dimensional representation for fast rendering of complex scenes," Computer Graphics, 14(3):110-116, July 1980.

Kay et al. [Kay, T. L. and Kajiya, J. T., "Ray Tracing Complex Scenes", SIGGRAPH 1986, pp. 269-278,1986], concentrating on the computational aspect of ray casting, employed a hierarchy of spatial bounding volumes in conjunction with rays, to determine the visible objects along each ray. Of course, the spatial hierarchy needs to be precomputed. However, once in place, such a hierarchy facilitates a recursive computation for finding objects. If the environment is stationary, the same data-structure facilitates finding the visible object along any ray from any origin.

Teller et al. [Teller, S. and Sequin, C. H., "Visibility Preprocessing for Interactive Walkthroughs," SIGGRAPH '91, pp.61-69] use preprocessing to full advantage in visible-object computation by precomputing cell-to-cell visibility. Their approach is essentially an object precision approach and they report over 6 hours of preprocessing time to calculate 58 Mbytes of visibility information for a 250,000 polygon model on a 50 MIP machine [Teller, S. and Sequin. C. H., "Visibility computations in polyhedral three-dimensional environments," U.C. Berkeley Report No. UCB/CSD 92/680, April 1992].

In a different approach to visibility computation, Greene et al. [Greene, N., Kass, M., and Miller, G., "Hierarchical z-Buffer Visibility," SIGGRAPH '93, pp.231-238] use a variety of hierarchical data structures to help exploit the spatial structure inherent in object space (an octree of objects), the image structure inherent in pixels (a Z pyramid), and the temporal structure inherent in frame-by-frame rendering (a list of previously visible octree nodes). The Z-pyramid permits the rapid culling of large portions of the model by testing for visibility using a rapid scan conversion of the cubes in the octree.

As used herein, the term "octree" refers to a data structure derived from a hierarchical subdivision of a three-dimensional space based on octants. The three-dimensional space may be divided into octants based on three mutually perpendicular partitioning planes. Each octant may be further partitioned into eight sub-octants based on three more partitioning planes. Each sub-octant may be partitioned into eight sub-suboctants, and so forth. Each octant, sub-octant, etc., may be assigned a node in the data structure. For more information concerning octrees, see pages 550-555, 559-560 and 695-698 of Computer Graphics: principles and practice, James D. Foley et al., 2.sup.nd edition in C, ISBN 0-201-84840-6, T385.C5735, 1996.

The depth complexity of graphical environments continues to increase in response to consumer demand for realism and performance. Thus, the efficiency of an algorithm for visible object determination has a direct impact on the marketability of a visualization system. The computational bandwidth required by the visible object determination algorithm determines the class of processor required for the visualization system, and thereby affects overall system cost. Thus, a system and method for improving the efficiency of visible object determination is greatly desired.

SUMMARY OF THE PRESENT INVENTION

Various embodiments of a system and method for performing visible object determination based upon a dual search of a cone hierarchy and a bound hierarchy are herein disclosed. In one embodiment, the system may comprise a plurality of processors, a display device, a shared memory, and optionally a graphics accelerator. The multiple processors execute a parallel visibility algorithm which operates on a collection of graphical objects to determine a visible subset of the objects from a defined viewpoint. The objects may reside in a three-dimensional space and thus admit the possibility of occluding one another.

The parallel visibility algorithm represents space in terms of a hierarchy of cones emanating from a viewpoint. In one embodiment, the leaf-cones of the cone hierarchy, i.e. the cones at the ultimate level of refinement, subtend an area which corresponds to a fraction of a pixel in screen area. For example, two cones may conveniently fill the area of a pixel. In other embodiments, a leaf-cone may subtend areas which include one or more pixels.

An initial view frustum or neighborhood of the view frustum may be recursively tessellated (i.e. refined) to generate a cone hierarchy. Alternatively, the entire space around the viewpoint may be recursively tessellated to generate the cone hierarchy. In this embodiment, the cone hierarchy is recomputed for changes in the viewpoint and view-direction.

The multiple processors or some subset thereof, or another set of one or more processors, may also generate a hierarchy of bounds from the collection of objects. In particular, the bound hierarchy may be generated by: (a) recursively grouping clusters starting with the objects themselves as order-zero clusters, (b) bounding each object and cluster (of all orders) with a corresponding bound, e.g. a polytope hull, (c) allocating a node in the bound hierarchy for each object and cluster, and (d) organizing the nodes in the bound hierarchy to reflect cluster membership. For example if node A is the parent of node B, the cluster corresponding to node A contains a subcluster (or object) corresponding to node B. Each node stores parameters which characterize the bound of the corresponding cluster or object.

The cone hierarchy and bound hierarchy may be stored in the shared memory. In addition, the shared memory may store a global problem queue. The global problem queue is initially loaded with a collection of bound-cone pairs. Each bound-cone pair points to a bound in the bound hierarchy and a cone in the cone hierarchy.

The multiple processors may couple to the shared memory, and may perform a search of the cone and bound hierarchies to identify one or more nearest objects for a subset of cones (e.g. the leaf cones) in the cone hierarchy. After the multiple processors complete the search of the cone and bound hierarchies, a transmission agent (e.g. the multiple processors, some subset thereof, or another set of one or more processors) may transmit graphics primitives, e.g. triangles, corresponding to the nearest objects of each cone in the subset, to a rendering agent. The rendering agent (e.g. the graphics accelerator, or a software renderer executing on the multiple processors, some subset thereof, or another set of one or more processors) is operable to receive the graphics primitives, to perform rendering computations on the graphics primitives to generate a stream of pixels, and to transmit the pixel stream to the display device.

In some embodiments, each leaf-cone may be assigned a visibility distance value which represents the distance to the closest known object as perceived from within the leaf-cone. Each leaf-cone may also be assigned an object pointer which specifies the closest known object within view of the leaf-cone. Similarly, each non-leaf cone may be assigned a visibility distance value. However, the visibility distance value of a non-leaf cone may be set equal to the maximum of the visibility distance values for its subcone children. This implies that the visibility distance value for each non-leaf cone equals the maximum of the visibility distance values of its leaf-cone descendents.

In one embodiment, each of the plurality of processors is operable to: (a) read a bound-cone pair (H,C) from the global work queue, (b) compute the distance between the bound H and the cone C, (c) to compare the bound-cone distance to a visibility distance associated with the cone C, (d) to write two or more dependent bound-cone pairs to the global problem queue if the bound-cone distance is smaller than the visibility distance of the cone C. The two or more dependent bound-cone pairs may be pairs generated from bound H and the subcones of cone C, or pairs generated from cone C and subbounds of bound H.

Furthermore, when the processor detects that the hull H is a leaf bound of the bound hierarchy and the cone C is a leaf cone of the cone hierarchy, the processor may update the visibility information for the leaf cone, i.e. may set the visibility distance value for cone C equal to the cone-hull distance computed in (b) above, and may set the nearest object pointer associated with cone C equal to a pointer associated with hull H.

In one alternative embodiment, each processor may couple to a local memory containing a local problem queue. Each processor may read and write bound-cone pairs from/to its local problem queue, and access the global problem queue to read initial bound-cone pairs.

In another alternative embodiment, a collection of cones may be selected from the cone hierarchy, i.e. a collection of non-overlapping cones which fill the space of the root cone (i.e. top level cone). The cones of the collection may be distributed among the multiple processors. Each of the multiple processors may perform a search of its assigned cones (i.e. the subtrees of the cone hierarchy defined by these assigned cones) against the hull tree.

BRIEF DESCRIPTION OF THE FIGURES

The foregoing, as well as other objects, features, and advantages of this invention may be more completely understood by reference to the following detailed description when read together with the accompanying drawings in which:

FIG. 1 illustrates the ray-based method of visible object detection according to the prior art;

FIG. 2A illustrates one embodiment of a graphical computing system for performing visible object determination;

FIG. 2B is a block diagram illustrating one embodiment of the graphical computing system 80;

FIG. 3 is a flowchart for processing operations performed in one embodiment of graphical computing system 80;

FIG. 4A illustrates a collection of objects in a graphics environment;

FIG. 4B illustrates a first step in one embodiment of a method for forming a hull hierarchy, i.e. the step of bounding objects with containing hulls and allocating hull nodes for the containing hulls;

FIG. 4C illustrates one embodiment of the process of grouping together hulls to form higher order hulls, and allocating nodes in the hull hierarchy which correspond to the higher order hulls;

FIG. 4D illustrates a final stage in the recursive grouping process wherein all objects are contained in a universal containing hull which corresponds to the root node of the hull hierarchy;

FIG. 5A illustrates the mathematical expressions which describe lines and half-planes in two dimensional space;

FIG. 5B illustrates the description of a rectangular region as the intersection of four half-planes in a two dimensional space;

FIG. 6 illustrates a two-dimensional cone C partitioned into a two subcones C.sub.1 and C.sub.2 which interact with a collection of objects;

FIG. 7 illustrates polyhedral cones with rectangular and triangular cross-section emanating from the origin of a three-dimensional space;

FIG. 8A illustrates mathematical expressions which describe a line through the origin and a corresponding half-plane given a normal vector in two-dimensional space;

FIG. 8B illustrates the specification of a two-dimensional conic region as the intersection of two half-planes;

FIGS. 9A-9C illustrate the formation of a cone hierarchy based on repeated subdivision of an initial cone with rectangular cross-section;

FIG. 10A illustrates one embodiment of a program thread 250 which is executed by each of multiple processors to accomplish a dual search of the hull hierarchy and cone hierarchy;

FIG. 10B illustrates an embodiment of graphical computing system 80 where each of multiple processors reads initial hull-cone pairs from a global problem queue, and accesses non-initial hull-cone pairs from a corresponding local queue;

FIG. 10C illustrates an embodiment of graphical computing system 80 where cones from the cone hierarchy are distributed among a plurality of processors, and each processor searches the assigned cones (and their descendents) with respect to the hull hierarchy;

FIG. 10D illustrates a cone C which has a small normalized size compared to a bound hull H;

FIG. 10E illustrates a hull H which has a small normalized size compared to a cone C;

FIG. 11 illustrates one embodiment of the process of recursively clustering a collection of objects to form a bounding hierarchy.

While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. Please note that the section headings used herein are for organizational purposes only and are not meant to limit the description or claims. The word "may" is used in this application in a permissive sense (i.e., having the potential to, being able to), not a mandatory sense (i.e., must). Similarly, the word include, and derivations thereof, are used herein to mean "including, but not limited to."

DETAILED DESCRIPTION OF SEVERAL EMBODIMENTS

FIG. 2A presents one embodiment of a graphical computing system 80 for performing visible object determination. Graphical computing system 80 may include a system unit 82, and a display device 84 coupled to the system unit 82. The display device 84 may be realized by any of various types of video monitors or graphical displays. Graphics computer system 80 may include one or more input devices such as a keyboard 86, a mouse 88, a trackball, a digitizing pad, a joystick, etc.

FIG. 2B is a block diagram illustrating one embodiment of graphical computing system 80. Graphical computing system 80 may include a plurality of processors PR.sub.1 through PR.sub.M and a shared memory 106 coupled to a high-speed system bus 104. Graphical computing system 80 may also include a graphics accelerator 112 coupled to system bus 104 and display device 84.

Each processor PR.sub.1 may couple to a dedicated local memory (not shown) for storing local code and/or local data. (The notation PR.sub.1 refers to an arbitrary one of the processors PR.sub.1 through PR.sub.M.) The shared memory 106 may include any of various types of memory subsystems including random access memory, read only memory, and/or mass storage devices. Processors PR.sub.1 through PR.sub.M operate on a set of objects to determine a subset of the objects which are visible from a particular viewpoint in a three-dimensional scene. Each object in the original set may comprise a collection of graphics primitives (e.g. triangles). In one embodiment, objects may be described in terms of a system of equations and/or geometric constraints, e.g. polynomial equations. In this case, the visible objects may need to be decomposed (i.e. partitioned) into graphics primitives (e.g. tessellated into triangles) prior to pixel rendering and display. The object decomposition may be performed by processors PR.sub.1 through PR.sub.M, some subset thereof, and/or by a second set of one or more processors (not shown).

Graphics primitives (e.g. triangles) corresponding to the visible objects may be transmitted to graphics accelerator 112 for rendering and display on display device 84. Since graphics accelerator 112 operates on primitives corresponding to the visible objects, a higher percentage of rendered pixels (or supersamples) survive the z-comparison than if the graphics accelerator 112 were supplied with primitives corresponding to the full object set. In other words, the rendering hardware in graphics accelerator 112 may operate with increased efficiency.

In one alternative embodiment, processors PR.sub.1 through PR.sub.M, or some subset thereof, and/or another set of one or more processors (not shown) may perform pixel rendering computations on the graphics primitives corresponding to the visible objects, and may generate a stream of pixels which are transmitted to display device 84 for image display. In this alternative embodiment, graphics accelerator 112 may not be included in graphics computing system 80.

In one embodiment, graphics accelerator 112 comprises a plurality of graphics processors, and these graphics processors may perform the visible object determination instead of processors PR.sub.1 through PR.sub.M. In this embodiment, graphics accelerator 112 may receive a set of objects (or pointers to the objects) in a 3D scene. The graphics processors may operate on the set of objects to determine the subset of visible objects with respect to a current viewpoint in the 3D scene. In another embodiment, processors PR.sub.1 through PR.sub.M and graphics processor in the graphics accelerator may cooperate to determine the set of visible objects.

As mentioned above, 3-D graphics accelerator 112 may couple to system bus 104, and display device 84 may couple to graphics accelerator 112. 3-D graphics accelerator 112 may be a specialized graphics rendering subsystem which is designed to off-load the 3-D rendering functions from the host system 82, thus providing improved system performance. It is assumed that various other peripheral devices, or other buses, may be connected to system bus 104, as is well known in the art. If 3D accelerator 112 is not included in graphical computing system 80, display device 84 may couple directly to system bus 104.

Processor devices (e.g. processors PR.sub.1 through PR.sub.M) coupled to system bus 104 may transfer information to and from graphics accelerator 112 according to a programmed input/output (I/O) protocol over the system bus 104. In one embodiment, graphics accelerator 112 may access system memory 106 according to a direct memory access (DMA) protocol or through intelligent bus mastering. In another embodiment, graphics accelerator 112 may couple to system memory 106 through an Advanced Graphics Port connection.

Processors PR.sub.1 through PR.sub.M may operate under the control of visualization software stored in shared memory 106 and/or the local memories of the individual processors.

FIG. 3 is a flowchart for one embodiment of the processing performed by graphics computing system 80 in response to the visualization software. In an initial step 210, the graphics computing system 80 may receive a plurality of objects and construct an object hierarchy from the plurality of objects. (The object hierarchy construction is discussed in more detail below). It is noted that the object hierarchy may have been precomputed, in which case step 210 may be skipped.

In step 220, graphical computing system 80 may discover the set of visible objects in the scene with respect to a current viewpoint. In the preferred embodiment, graphical computing system 80 may be configured to compute visibility for three-dimensional objects from a view point in a three-dimensional coordinate space. However, the methodologies herein described naturally generalize to spaces of arbitrary dimension.

In one embodiment of graphical computing system 80, the viewpoint and view direction in the graphical environment may be changed in response to user input. For example, by manipulating mouse 88, depressing keys on keyboard 86, manipulating a joystick or game control pad, the user may cause the viewpoint and/or view direction to change. Thus, graphical computing system 80 may recompute the set of visible objects whenever the viewpoint and/or the view orientation changes. Furthermore, it is quite often the case that objects may move within the 3D scene. Thus, graphical computing system 80 may recomputed the set of visible objects when the objects in the scene move.

In step 225, graphics computing system 80 may transmit graphics primitives (e.g. triangles) corresponding to the visible objects to a rendering agent for pixel rendering.

In step 230, the rendering agent may perform rendering computations on the graphics primitives to generate a stream of pixels. In one embodiment, the rendering agent may be graphics accelerator 112. In another embodiment, the rendering agent may be a software renderer running on one or more processors (e.g. processors PR.sub.1 through PR.sub.M or some subset thereof) configured within graphical computing system 80.

In step 230, the rendering agent may transmit the pixel stream to display device 84 for image display.

Visible object determination step 220 may be performed repeatedly as the viewpoint and/or view direction (i.e. orientation) changes, and/or as the objects themselves evolve in time.

In some embodiments, objects may be modeled as opaque convex polytopes. A three-dimensional solid is said to be convex if any two points in the solid (or on the surface of the solid) may be connected with a line segment which resides entirely within the solid. Thus a solid cube is convex, while a donut (i.e. solid torus) is not. A polytope is an object with planar sides (e.g. cube, tetrahedron, etc.). The methodologies described herein for opaque objects naturally extend to transparent or semi-transparent objects by not allowing such objects to terminate a cone computation. Although not all objects are convex, every object can be approximated as a union of convex polytopes. It is helpful to note that the visible-object-set computation does not require an exact computation, but rather a conservative one. In other words, it is permissible to estimate a superset of the set of visible objects.

Constructing the Object Hierarchy

Initially, the objects in a scene may be organized into a hierarchy that groups objects spatially. An octree is one possibility for generating the object hierarchy. However, in the preferred embodiment, a clustering algorithm is used which groups nearby objects then recursively clusters pairs of groups into larger containing spaces. The clustering algorithm employs a simple distance measure and threshold operation to achieve the object clustering. FIGS. 4A-4D illustrate one embodiment of a clustering process for a collection of four objects J00 through J11. The objects are indexed in a fashion which anticipates their ultimate position in a binary tree of object groups. The objects are depicted as polygons situated in a plane (see FIG. 4A). However, the reader may imagine these objects as arbitrary three-dimensional objects. In one embodiment, the objects are three-dimensional polytopes.

Each object may be bounded, i.e. enclosed, by a corresponding bounding surface referred to herein as a bound. In the preferred embodiment, the bound for each object is a polytope hull (i.e. a hull having planar faces) as shown in FIG. 4B. The hulls H00 through H11 are given labels which are consistent with the objects they bound. For example, hull H00 bounds object J00. The hulls are illustrated as rectangles with sides parallel to a pair of coordinate axes. These hulls are intended to represent rectangular boxes (parallelepipeds) in three dimensions whose sides are normal to a fixed set of coordinate axes. For each hull a corresponding node data structure is generated. The node stores parameters which characterize the corresponding hull.

Since a hull has a surface which is comprised of a finite number of planar components, the description of a hull is intimately connected to the description of a plane in three-space. In FIG. 5A, a two dimensional example is given from which the equation of an arbitrary plane may be generalized. A unit vector n [any vector suffices but a vector of length one is convenient for discussion] defines a line L through the origin of the two dimensional space. By taking the dot product v.multidot.n of a vector v with the unit vector n, one obtains the length of the projection of vector v in the direction defined by unit vector n. Thus, given a real constant c, it follows that the equation x.multidot.n=c, where x is a vector variable, defines a line M perpendicular to line L and situated at a distance c from the origin along line L. In the context of three-dimensional space, this same equation defines a plane perpendicular to the line L, again displaced distance c from the origin along line L. Observe that the constant c may be negative, in which case the line (or plane) M is displaced from the origin at distance .vertline.c.vertline. along line L in the direction opposite to unit vector n.

The line x.multidot.n=c divides the plane into two half-planes. By replacing the equality in the above equation with an inequality, one obtains the description of one of these half-planes. The equality x.multidot.n<c defines the half-plane which contains the negative infinity end of line L. [The unit vector n defines the positive direction of line L.] In three dimensions, the plane x.multidot.n=c divides the three-dimensional space into two half-spaces. The inequality x.multidot.n<c defines the half-space which contains the negative infinity end of line L.

FIG. 5B shows how a rectangular region may be defined as the intersection of four half-planes. Given four normal vectors n.sub.1, through n.sub.4, and four corresponding constants c.sub.1 through c.sub.4, a rectangular region is defined as the set of poin