The present invention is embodied in an apparatus, and related method, for sensing a person's facial movements, features, characteristics, and the like to generate and animate an avatar image based on facial sensing. The avatar apparatus uses an image processing technique based on model graphs and bunch graphs that efficiently represent image features as jets. The jets are composed of wavelet transforms processed at node or landmark locations on an image corresponding to readily identifiable features. The nodes are acquired and tracked to animate an avatar image in accordance with the person's facial movements. Also, the facial sensing may use jet similarity to determine the person's facial features and characteristics, thus allowing tracking of a person's natural characteristics without any unnatural elements that may interfere with or inhibit those characteristics.
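As an illustration of the jet representation described above, the following is a minimal sketch and not the claimed implementation. It assumes a jet is a list of complex Gabor-like wavelet responses sampled at one node location, and that jet similarity is the normalized dot product of the response magnitudes. The function names, the wavelengths, the number of orientations, and the envelope width are all hypothetical choices made for this example.

```python
import cmath
import math


def gabor_response(image, x, y, wavelength, theta, sigma=2.0, radius=4):
    """Complex response of one Gabor-like wavelet centred on pixel (x, y).

    `image` is a list of rows of gray values. Pixels outside the image
    are ignored. sigma and radius are illustrative defaults.
    """
    kx = 2 * math.pi / wavelength * math.cos(theta)
    ky = 2 * math.pi / wavelength * math.sin(theta)
    height, width = len(image), len(image[0])
    total = 0j
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            px, py = x + dx, y + dy
            if 0 <= px < width and 0 <= py < height:
                envelope = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
                wave = cmath.exp(1j * (kx * dx + ky * dy))
                total += image[py][px] * envelope * wave
    return total


def compute_jet(image, x, y, wavelengths=(4.0, 8.0), n_orientations=4):
    """Collect wavelet responses at one node into a jet (list of complex)."""
    return [
        gabor_response(image, x, y, w, k * math.pi / n_orientations)
        for w in wavelengths
        for k in range(n_orientations)
    ]


def jet_similarity(jet_a, jet_b):
    """Normalized dot product of response magnitudes (assumed measure)."""
    mags_a = [abs(c) for c in jet_a]
    mags_b = [abs(c) for c in jet_b]
    dot = sum(a * b for a, b in zip(mags_a, mags_b))
    norm = math.sqrt(sum(a * a for a in mags_a) * sum(b * b for b in mags_b))
    return dot / norm if norm > 0 else 0.0
```

Under these assumptions, comparing a jet with itself gives a similarity of 1, while jets taken from differently oriented textures score lower. A tracker could compare the jet at a candidate position against a stored node jet and keep the best-scoring position.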
This application is a continuation of U.S. Ser. No. 09/188,079, filed Nov. 6, 1998, which claims priority from U.S. Patent Application Ser. No. 60/081,615, filed Apr. 13, 1998.
A method and system for controlling an avatar using computer vision is presented. A video stream representing a background and a foreground is received. A user in the foreground is segmented from the background and classified to produce effector information. An avatar may be controlled based on the effector information.
The present invention provides a method for fitting a surface to a point group using a computer. The method prevents improper or erroneous modification even when the point group includes a point of low reliability, and thus achieves proper modification. The method comprises a first step of judging the reliability of each of the points composing the point group, and a second step of changing the method for fitting the surface to the point group based on the result of the judgment in the first step. In the second step, the fitting method is changed by varying the weights of the points based on their reliabilities.
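The reliability-weighted fitting described above can be sketched as a weighted least-squares fit. The example below is only an illustration under stated assumptions: the surface is taken to be a plane z = a*x + b*y + c, the reliabilities are supplied by the caller (how they are judged is not specified here), and each point's weight is simply its reliability. The function names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination
    with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough reliable points")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_plane_weighted(points, reliabilities):
    """Fit z = a*x + b*y + c, weighting each point by its reliability.

    points: iterable of (x, y, z); reliabilities: one weight per point
    (assumed to be in [0, 1], where 0 means the point is ignored).
    Returns (a, b, c).
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atz = [0.0] * 3
    for (x, y, z), w in zip(points, reliabilities):
        row = (x, y, 1.0)
        for i in range(3):
            atz[i] += w * row[i] * z
            for j in range(3):
                ata[i][j] += w * row[i] * row[j]
    return tuple(solve_linear(ata, atz))
```

With this weighting, a point judged unreliable (weight near 0) has almost no influence on the fitted plane, whereas giving every point the same weight lets a single outlier pull the surface away from the reliable points.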
A method for automatically recognizing one or more structures in digitized image data includes determining sub-jets of the image graph at the nodes defined by its structure. The sub-jets correspond to at least part of the determined sub-jets of the specific reference graph. The projection of the net-like structure of the specific reference graph is varied until a graph comparison function, which compares the jets of the image graph with the corresponding jets of the specific reference graph, becomes optimal. The structure is associated with a reference image corresponding to the reference graph for which the graph comparison function is optimal, with respect to the optimal image graph determined for the reference graph.
Methods and devices for processing captured video frames to detect specific changes observable in video using three consecutive video frames. The images in the first video frame are compared with those of the second frame, and the images of the second frame are compared with those of the third frame, to produce two intermediate images which show the first order change observable in video. These intermediate images are then analyzed. A geometric transformation is found such that, when the transformation is applied to one of these intermediate images, the number of pixels which match between the two intermediate images is maximized. This geometric transformation, which may include a linear as well as a rotational component, is then applied to one of the intermediate images to result in a transformed image. The transformed image is then subtracted from the other intermediate image to arrive at an end image which shows the second order change, or the change in a change, observable in video. The second order change image will show only those specific pixels which have changed in the images between the three original video frames. The invention may be used to detect changes in the state of a subject's eyes. A subject's blinking can thus be used for sending binary commands to a computer remotely. In particular, a double blink, i.e., two consecutive blinks, of a person can be used as a hands-free substitute for clicking a mouse.
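The three-frame procedure described above can be sketched as follows. This is a simplified illustration, not the claimed method: the geometric transformation is restricted to integer translations found by exhaustive search (no rotational component), a "match" is assumed to mean equal values at the same position in the two thresholded difference images, and the threshold and search range are arbitrary. All function names are hypothetical.

```python
def frame_difference(frame_a, frame_b, threshold=10):
    """First order change: 1 where two gray frames differ by more
    than threshold, else 0."""
    return [
        [1 if abs(pa - pb) > threshold else 0 for pa, pb in zip(ra, rb)]
        for ra, rb in zip(frame_a, frame_b)
    ]


def shift_image(image, dy, dx):
    """Translate a binary image by (dy, dx), filling exposed pixels with 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


def count_matches(image_a, image_b):
    """Number of positions at which the two images hold the same value."""
    return sum(
        pa == pb for ra, rb in zip(image_a, image_b) for pa, pb in zip(ra, rb)
    )


def best_translation(first, second, max_shift=3):
    """Exhaustively search for the shift of `first` that best matches `second`."""
    best, best_score = (0, 0), -1
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = count_matches(shift_image(first, dy, dx), second)
            if score > best_score:
                best, best_score = (dy, dx), score
    return best


def second_order_change(frame1, frame2, frame3, threshold=10, max_shift=3):
    """Align the two first order change images, then subtract them."""
    diff12 = frame_difference(frame1, frame2, threshold)
    diff23 = frame_difference(frame2, frame3, threshold)
    dy, dx = best_translation(diff12, diff23, max_shift)
    aligned = shift_image(diff12, dy, dx)
    return [
        [abs(pa - pb) for pa, pb in zip(ra, rb)]
        for ra, rb in zip(aligned, diff23)
    ]
```

In this sketch, change caused by steady motion appears in both intermediate images and cancels after alignment, so the end image keeps only pixels whose change is not explained by that motion, such as a local change like an eyelid closing.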
A method, apparatus, and computer program product for estimating face direction using a single gray-level image (110, 150) of a face are described. Given the single image (110, 150), a face direction can be determined by computing a nose axis (140, 180) maximising a correlation measure between left and right sides (120, 130; 160, 170) of the face. The correlation measure is computed by comparing one of the two sides (A) with another synthetic side (C) derived from the other side (B) using symmetry and perspective transforms. Optionally, this process can be accelerated using a contrast enhancement algorithm taking advantage of the circumstances that the nose is the part of a face reflecting the most light and that this reflected light is represented as a line-like region close to the real nose axis. The computation result is a word describing the spatial position of the face and combining height ("up", "normal", "down") and neck-rotation ("left", "frontal", "right").
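The nose-axis search described above can be illustrated with a much-reduced sketch. It assumes the axis is a vertical image column, uses a plain mirror of one side in place of the symmetry and perspective transforms, and takes the Pearson correlation as the correlation measure. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
import math


def symmetry_score(image, axis, half_width):
    """Pearson correlation between the strip left of `axis` and the
    mirrored strip right of it. `image` is a list of rows of gray values."""
    left, right = [], []
    for row in image:
        for d in range(1, half_width + 1):
            left.append(row[axis - d])
            right.append(row[axis + d])
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    var_l = sum((l - mean_l) ** 2 for l in left)
    var_r = sum((r - mean_r) ** 2 for r in right)
    if var_l == 0 or var_r == 0:
        return 0.0  # flat strips carry no symmetry information
    return cov / math.sqrt(var_l * var_r)


def find_nose_axis(image, half_width):
    """Return the column whose two sides are most strongly correlated."""
    width = len(image[0])
    candidates = range(half_width, width - half_width)
    return max(candidates, key=lambda a: symmetry_score(image, a, half_width))
```

For a frontal face the best-scoring column would be expected near the middle of the face region. How far it lies from the middle could then feed a coarse rotation label, though that mapping is not sketched here.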