Authoring and use systems for sound synchronized animation    
United States Patent 5111409
Link to this page: http://www.wikipatents.com/5111409.html
Inventor(s): Gasper; Elon (12849 - 67th St., Bellevue, WA 98006); Matthews, III.; Joseph H. (16522 NE. 135th Pl., Redmond, WA 98052)
Abstract: A general purpose computer, such as a personal computer, is programmed for sound-synchronized random access and display of synthesized actors ("synactors") on a frame-by-frame basis. The interface between a user and the animation system is defined as a stage or acting metaphor. The user interface provides the capability to create files defining individually accessible synactors representing real or imaginary persons, animated characters and objects or scenes which can be programmed to perform speech synchronized action. Synactor speech is provided by well-known speech synthesis techniques or, alternatively, by inputting speech samples and communication characteristics to define a digital model of the speech and related animation for a particular synactor. A synactor is defined as a combination of sixteen predefined images; eight images to be synchronized with speech and eight images to provide additional animated expression. Once created, a synactor may be manipulated similarly to a file or document in any application. Once created, a synactor is controlled with scripts defined and edited by a user via the user interface.

Title Information
Publication Date: May 5, 1992
Application Number: 07/384,243
Filing Date: July 21, 1989
US Classification: 715/500.1 434/167 434/169 434/185 434/307R 704/260 704/270 704/278
Int'l Classification: G09B 019/04
Examiner: Herndon; Heather R.
Attorney/Law Firm: Davis & Schroeder
USPTO Field of Search: 364/518 364/521 340/729 434/169 434/185 434/167 434/172
Patent Tags: authoring sound synchronized animation

References
U.S. References
Patent No.   Inventor   US Class   Date
4884972      Gasper     434/185    Dec, 1989
4333152      Best       715/716    Jun, 1982
4305131      Best       715/716    Dec, 1981
4569026      Best       715/716    Feb, 1986
4445187      Best       463/31     Dec, 1969

Claims
 


We claim:

1. Apparatus for generating and displaying user created animated objects having synchronized visual and audio characteristics, said apparatus comprising:

a program-controlled microprocessor;

first means coupled to said microprocessor and responsive to user input signals for generating a first set of signals defining visual characteristics of a desired animated object;

second means coupled to said microprocessor and to said first means and responsive to user input signals for generating a second set of signals defining audio characteristics of said desired animated object; and

controller means coupled to said first and second means and to said microprocessor for generating a set of instructions collating and synchronizing said visual characteristics with said audio characteristics thereby defining said animated object having synchronized visual and audio characteristics.

2. The apparatus as in claim 1 further comprising:

integrator means coupled to said microprocessor and responsive to command signals generated by said microprocessor for producing signals representing encoded elements of sound and encoded elements of constituent object parts, said constituent object parts associated with said visual characteristics, said microprocessor responsive to user input signals and to said set of instructions for generating said command signals;

audio means coupled to said microprocessor and to said integrator means responsive to said signals representing encoded elements of sound for producing sounds associated with said signals representing encoded elements of sound; and

display means coupled to said microprocessor, to said integrator means and to said sound emitting means responsive to said signals representing encoded elements of constituent object parts for displaying visual images of said desired animated object, said visual images having said visual characteristics synchronized with said audio characteristics.

3. Apparatus as in claim 2 wherein said first means is further coupled to said display means, said display means responsive to said user input signals for displaying images of said visual characteristics as said first set of signals is being generated.

4. Apparatus as in claim 3 wherein said second means is further coupled to said display means and includes testing and editing means responsive to user input for displaying said desired animated object and testing and editing the synchronization of said audio characteristics with said visual characteristics as said second set of signals is being generated.

5. Apparatus as in claim 4 further comprising storage means coupled to said microprocessor for storing a plurality of data sets, at least one of said data sets defining the visual characteristics of a predetermined prototype animated object.

6. Apparatus as in claim 5 wherein said plurality of data sets include at least one data set defining the audio characteristics of selectable predetermined text.

7. Apparatus as in claim 5 wherein said plurality of data sets include at least one data set defining the audio characteristics of selectable prerecorded sounds.

8. Apparatus as in claim 2 wherein said audio means includes speech synthesizer means for digitally synthesizing signals representing sounds associated with said signals representing encoded elements of sound.

9. A method for generating user created animated objects having synchronized visual and audio characteristics, said method comprising the steps of:

generating a first set of signals defining visual characteristics of a desired animated object in response to user input signals;

generating a second set of signals defining audio characteristics of said desired animated object in response to user input signals; and

generating a set of instructions collating and synchronizing said visual characteristics with said audio characteristics thereby defining said desired animated object having synchronized visual and audio characteristics.

10. The method of claim 9 including the step of displaying visual images of said desired animated object during the generation of said first set of signals.

11. A method of synchronizing sound with visual images of animated objects pronouncing the sound, said method comprising the steps of:

defining a text string representing a desired sound to be synchronized with visual images of a speaking animated object;

translating said text string into a phonetic text string representative of said text string; and

translating said phonetic text string into a recite command, said recite command including phonetic/timing pairs, each of said phonetic/timing pairs comprising a phonetic code corresponding to an associated phonetic code of said phonetic text string and a number defining a predetermined time value, said phonetic code representative of a sound element to be pronounced and an associated image to be displayed while said sound element is being pronounced and said predetermined time value defines the amount of time said associated image is to be displayed.

12. A method as in claim 11 including the step of displaying said associated images during the pronunciation of said desired sound for testing the accuracy of the synchronization between said animated object and said pronounced desired sound.

13. A method as in claim 12 wherein said time value is adjustable, the further step of adjusting the value of said time value to edit and tune the accuracy of the synchronization between said animated object and said pronounced desired sound.
Description
 


BACKGROUND OF THE INVENTION

The present invention relates generally to computerized animation methods and, more specifically, to a method and apparatus for creation and control of random access sound-synchronized talking synthetic actors and animated characters.

It is well-known in the prior art to provide video entertainment or teaching tools employing time synchronized sequences of pre-recorded video and audio. The prior art is best exemplified by tracing the history of the motion picture and entertainment industry from the development of the "talkies" to the recent development of viewer interactive movies.

In the late nineteenth century the first practical motion pictures comprising pre-recorded sequential frames projected onto a screen at 20 to 30 frames per second to give the effect of motion were developed. In the 1920's techniques to synchronize a pre-recorded audio sequence or sound track with the motion picture were developed. In the 1930's animation techniques were developed to produce hand drawn cartoon animations including animated figures having lip movements synchronized with an accompanying pre-recorded soundtrack. With the advent of computers, more and more effort has been channeled towards the development of computer generated video and speech including electronic devices to synthesize human speech and speech recognition systems.

In a paper entitled "KARMA: A system for Storyboard Animation" authored by F. Gracer and M. W. Blasgen, IBM Research Report RC 3052, dated Sep. 21, 1970, an interactive computer graphics program which automatically produces the intermediate frames between a beginning and ending frame is disclosed. The intermediate frames are calculated using linear interpolation techniques and then produced on a plotter. In a paper entitled "Method for Computer Animation of Lip Movements", IBM Technical Disclosure Bulletin, Vol. 14 No. 10 Mar., 1972, pages 3039, 3040, J. D. Bagley and F. Gracer disclosed a technique for computer generated lip animation for use in a computer animation system. A speech-processing system converts a lexical presentation of a script into a string of phonemes and matches it with an input stream of corresponding live speech to produce timing data. A computer animation system, such as that described hereinabove, given the visual data for each speech sound, generates intermediate frames to provide a smooth transition from one visual image to the next to produce smooth animation. Finally the timing data is utilized to correlate the phonetic string with the visual images to produce accurately timed sequences of visually correlated speech events.
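
The linear interpolation of intermediate frames mentioned above can be illustrated with a short sketch. This is only an illustration of the general technique, not the KARMA program itself; the function name, the representation of a frame as a list of (x, y) points, and the even spacing of the in-between frames are all assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def inbetween_frames(start: List[Point], end: List[Point], count: int) -> List[List[Point]]:
    """Return `count` intermediate frames between two keyframes.

    Each frame is a list of (x, y) points; point i of `start` moves in a
    straight line to point i of `end` (hypothetical representation).
    """
    if len(start) != len(end):
        raise ValueError("keyframes must have the same number of points")
    frames = []
    for i in range(1, count + 1):
        t = i / (count + 1)  # fraction of the way from start to end
        frames.append([
            (sx + (ex - sx) * t, sy + (ey - sy) * t)
            for (sx, sy), (ex, ey) in zip(start, end)
        ])
    return frames
```

For example, three in-between frames for a single point moving from (0, 0) to (4, 8) place it at one quarter, one half and three quarters of the way along the line.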

Recent developments in the motion picture and entertainment industry relate to active viewer participation as exemplified by video arcade games and branching movies. U.S. Pat. Nos. 4,305,131; 4,333,152; 4,445,187 and 4,569,026 relate to remote-controlled video disc devices providing branching movies in which the viewer may actively influence the course of a movie or video game story. U.S. Pat. No. 4,569,026 entitled "TV Movies That Talk Back" issued on Feb. 4, 1986 to Robert M. Best discloses a video game entertainment system by which one or more human viewers may vocally or manually influence the course of a video game story or movie and conduct a simulated two-way voice conversation with characters in the game or movie. The system comprises a special-purpose microcomputer coupled to a conventional television receiver and a random-access videodisc reader which includes automatic track seeking and tracking means. One or more hand-held input devices each including a microphone and visual display are also coupled to the microcomputer. The microcomputer controls retrieval of information from the videodisc and processes viewers' commands input either vocally or manually through the input devices and provides audio and video data to the television receiver for display. At frequent branch points in the game, a host of predetermined choices and responses are presented to the viewer. The viewer may respond using representative code words either vocally or manually or a combination of both. In response to the viewer's choice, the microprocessor manipulates pre-recorded video and audio sequences to present a selected scene or course of action and dialogue.

In a paper entitled "Soft Machine: A Personable Interface", "Graphics Interface '84", John Lewis and Patrick Purcell disclose a system which simulates spoken conversation between a user and an electronic conversational partner. An animated person-likeness "speaks" with a speech synthesizer and "listens" with a speech recognition device. The audio output of the speech synthesizer is simultaneously coupled to a speaker and to a separate real-time formant-tracking speech processor computer to be analyzed to provide timing data for lip synchronization and limited expression and head movements. A set of pre-recorded visual images depicting lip, eye and head positions are properly sequenced so that the animated person-likeness "speaks" or "listens". The output of the speech recognition device is matched against pre-recorded patterns until a match is found. Once a match is found, one of several pre-recorded responses is either spoken or executed by the animated person-likeness.

Both J. D. Bagley et al and John Lewis et al require a separate formant-tracking speech processor computer to analyze the audio signal to provide real-time data to determine which visual image or images should be presented to the user. The requirement for this additional computer adds cost and complexity to the system and introduces an additional source of error.

SUMMARY OF THE INVENTION

The present invention provides a method and apparatus for a random access user interface referred to as hyperanimator, which enables a user to create and control animated lip-synchronized images or objects utilizing a personal computer. The present invention may be utilized as a general purpose learning tool, interface device between a user and a computer, in video games, in motion pictures and in commercial applications such as advertising, information kiosks and telecommunications. Utilizing a real-time random-access interface driver (RAVE) together with a descriptive authoring language called RAVEL (real-time random-access animation and vivification engine language), synthesized actors, hereinafter referred to as "synactors", representing real or imaginary persons and animated characters, objects or scenes can be created and programmed to perform actions including speech which are not sequentially pre-stored records of previously enacted events. Animation and sound synchronization are produced automatically and in real-time.

The communications patterns--the sounds and visual images of a real or imaginary person or of an animated character associated with those sounds--are input to the system and decomposed into constituent parts to produce fragmentary images and sounds. Alternatively, or in conjunction with this, well known speech synthesis methods may also be employed to provide the audio. That set of communications characteristics is then utilized to define a digital model of the motions and sounds of a particular synactor or animated character. A synactor that represents the particular person or animated character is defined by a RAVEL program containing the coded instructions for dynamically accessing and combining the video and audio characteristics to produce real-time sound and video coordinated presentations of the language patterns and other behavior characteristics associated with that person or animated character. The synactor can then perform actions and read or say words or sentences which were not prerecorded actions of the person or character that the synactor models. Utilizing these techniques, a synactor may be defined to portray a famous person or other character, a member of one's family or a friend or even oneself.

In the preferred embodiment, hyperanimator, a general purpose system for random access and display of synactor images on a frame-by-frame basis that is organized and synchronized with sound is provided. Utilizing the hyperanimator system, animation and sound synchronization of a synactor is produced automatically and in real time. Each synactor is made up of sixteen images, eight devoted to speaking and eight to animated expressions.

The eight speaking images correspond to distinct speech articulations and are sufficient to create realistic synthetic speaking synactors. The remaining eight images allow the synactor to display life-like expressions. Smiles, frowns and head turns can all be incorporated into the synactor's appearance.
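
The sixteen-image organization of a synactor can be sketched as a small data structure. This is a minimal sketch under stated assumptions: the class name, the use of strings to stand in for image data, and the layout (speaking images in slots 0 to 7, expression images in slots 8 to 15) are hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass
from typing import List

SPEECH_IMAGES = 8       # images synchronized with speech
EXPRESSION_IMAGES = 8   # images for smiles, frowns, head turns, etc.
TOTAL_IMAGES = SPEECH_IMAGES + EXPRESSION_IMAGES


@dataclass
class Synactor:
    """Hypothetical model: a named actor made of sixteen images."""
    name: str
    images: List[str]  # placeholder for image data; slot order is an assumption

    def __post_init__(self) -> None:
        if len(self.images) != TOTAL_IMAGES:
            raise ValueError(f"a synactor needs exactly {TOTAL_IMAGES} images")

    def speech_image(self, index: int) -> str:
        """Return speaking image `index` (0..7)."""
        if not 0 <= index < SPEECH_IMAGES:
            raise IndexError("speech image index out of range")
        return self.images[index]

    def expression_image(self, index: int) -> str:
        """Return expression image `index` (0..7), stored after the speaking images."""
        if not 0 <= index < EXPRESSION_IMAGES:
            raise IndexError("expression image index out of range")
        return self.images[SPEECH_IMAGES + index]
```

Keeping both groups in one fixed-length list mirrors the idea of one card per image, while the two accessors keep callers from depending on the assumed slot order.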

The hyperanimator system provides the capability to use both synthetic speech and/or digitized recording to provide the speech for the synactors. Speech synthesizers can provide unlimited vocabulary while utilizing very little memory. To make a synactor speak, the text to be spoken is typed or otherwise input to the system. Then the text is first broken down into its phonetic components. Then the sound corresponding to each component is generated through a speaker as an image of the synactor corresponding to that component is simultaneously presented on the display device. Digitized recording provides digital data representing actual recorded sounds which can be utilized in a computer system. Utilizing a "synchronization lab" defined by the hyperanimator system, a synactor can speak with any digitized sound or voice that is desired.
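
The text-to-speech path described above (text, then phonetic components, then one image per component) can be sketched as follows. The lexicon, the phonetic codes and the code-to-image table are toy values invented for illustration; the patent text here does not give the actual phonetic alphabet or image assignments.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: word -> phonetic codes (not from the patent).
LEXICON: Dict[str, List[str]] = {
    "hi": ["HH", "AY"],
    "mom": ["M", "AA", "M"],
}

# Hypothetical mapping from phonetic code to one of the eight speaking images.
IMAGE_FOR_CODE: Dict[str, int] = {"HH": 1, "AY": 2, "M": 3, "AA": 4}
REST_IMAGE = 0  # assumed fallback image for codes without an entry


def to_phonetic(text: str) -> List[str]:
    """Break text into phonetic codes using the toy lexicon."""
    codes: List[str] = []
    for word in text.lower().split():
        if word not in LEXICON:
            raise KeyError(f"no phonetic entry for {word!r}")
        codes.extend(LEXICON[word])
    return codes


def speak_plan(text: str) -> List[Tuple[str, int]]:
    """Pair each phonetic code with the speaking image shown while it sounds."""
    return [(code, IMAGE_FOR_CODE.get(code, REST_IMAGE)) for code in to_phonetic(text)]
```

A real system would replace the lexicon with text-to-phoneme rules from a speech synthesizer; the point of the sketch is only that sound and image are both driven from the same phonetic sequence.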

The interface between the user and the hyperanimator system is defined as a stage or acting metaphor. The hyperanimator system allows the user to shift or navigate between a number of display screens or cards to create and edit synactor files. While other paradigms are possible, this one works well and allows relatively inexperienced users to understand and operate the hyperanimator system to create, edit and work with the synactors.

The dressing room is where synactors are created and edited and is where users and synactors spend most of their time. The dressing room comprises 16 cards, 1 for each of the synactor images describing a synactor. Buttons are provided on each card to allow the user to navigate between the cards by pressing or clicking on a button with a mouse or other input device. Within the dressing room, the image of the synactor is placed in a common area named the Synactor Easel. Utilizing separate utilities such as "paint tools" or "face clip art", the user can create and edit the synactor. With a paint tool, a synactor may be drawn from scratch or, with clip art, a synactor may be created by copying and "pasting" eyes, ears, noses and even mouths selected from prestored sets of the different features.

Once the synactor has been created or built in the dressing room, the user can transfer the synactor to a stage screen where the lip synchronization and animation of the actor may be observed. The stage screen includes a text field wherein a user can enter text and watch the synactor speak. If the synactor thus created needs additional work, the user can return the synactor to the dressing room for touchup. If the user is satisfied with the synactor, the synactor can be then saved to memory for future use.

In the hyperanimator system, the synactor file is manipulated like a document in any application. Copying, editing (transferring a synactor file to the dressing room) and deleting actors from memory is accomplished in the casting call screen. The casting call screen displays a stagehand clipboard and provides buttons for manipulating the synactor files.

Copying and deleting sound resources comprising digitized sounds is accomplished in the sound booth screen. The digitized sound resources are synchronized with the image of the synactor in the screen representing the hyperanimator speech synchronization lab. The speech sync lab examines the sound and automatically creates a phonetic string which is used to create the animation and sound synchronization of the synactor. The speech sync lab generates a command called a RECITE command which tells the RAVE driver which sound resource to use and the phonetic string with associated timing values which produces the desired animation. The speech sync lab also provides for testing and refinement of the animation. If the synchronization process is not correct, the user can modify the RECITE command manually.
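
A RECITE command that names a sound resource and carries phonetic/timing pairs might be built, parsed and hand-tuned as sketched below. The textual layout ("RECITE", sound name, then alternating code and time value) is a hypothetical format chosen for illustration; the actual RECITE syntax is not given in this section.

```python
from typing import List, Tuple

PhoneticTiming = Tuple[str, int]  # (phonetic code, time value)


def build_recite(sound_name: str, pairs: List[PhoneticTiming]) -> str:
    """Serialize a sound name and phonetic/timing pairs (hypothetical syntax)."""
    body = " ".join(f"{code} {ticks}" for code, ticks in pairs)
    return f"RECITE {sound_name} {body}".rstrip()


def parse_recite(command: str) -> Tuple[str, List[PhoneticTiming]]:
    """Parse the hypothetical syntax back into a sound name and pairs."""
    tokens = command.split()
    if len(tokens) < 2 or tokens[0] != "RECITE":
        raise ValueError("not a RECITE command")
    sound_name, rest = tokens[1], tokens[2:]
    if len(rest) % 2 != 0:
        raise ValueError("phonetic codes and time values must come in pairs")
    pairs = [(rest[i], int(rest[i + 1])) for i in range(0, len(rest), 2)]
    return sound_name, pairs


def retime(pairs: List[PhoneticTiming], index: int, ticks: int) -> List[PhoneticTiming]:
    """Return a copy of `pairs` with one time value changed, as in manual tuning."""
    if ticks < 0:
        raise ValueError("time value must be non-negative")
    tuned = list(pairs)
    code, _ = tuned[index]
    tuned[index] = (code, ticks)
    return tuned
```

The `retime` helper corresponds to the manual refinement step: the user keeps the phonetic string and adjusts only the time value of the pair that looks out of sync.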

The above described functions and screens are tied together and accessed essentially from a menu screen. The menu screen contains six buttons allowing a user easy navigation through the screens to the hyperanimator system features. At the center of the menu screen is displayed a synactor called the Hyperanimator Navigator who serves as a guide for a user through the hyperanimator system. The RAVE system is responsible for the animation and sound synchronization of the synactors. RAVEL defines and describes the synactor while the RAVE scripting language is an active language which controls the synactor after it is created by a user. RAVE scripting language commands enable a programmer to control the RAVE for an application program created by the programmer utilizing a desired programming system. Utilizing facilities provided in the programming system to call external functions, the programmer invokes the RAVE and passes RAVE scripting language commands as parameters to it. The RAVE script command controller 43 interprets these commands to control the synactor.

Once a synactor is created, it is controlled in a program by scripts through the RAVE scripting language level. All of the onscreen animation is controlled by scripts in the host system through the RAVE scripting language. Various subroutines called external commands ("XCMD") and external functions ("XFCN") are utilized to perform functions not available in the host language, for example creating synactors from the dressing room. The RAVE XCMD processes information between the scripts and the RAVE driver. Fifteen separate commands are utilized to enable users to open, close, move, hide, show and cause the synactor to speak. A program may have these commands built in, selected among or generated by the RAVE driver itself at runtime.
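
How a host script might pass commands to a controller that opens, closes, moves, hides, shows and speaks a synactor can be sketched with a small dispatcher. Only these six actions are named in the text; the command spelling, the argument order and the class below are hypothetical and do not reproduce the RAVE scripting language.

```python
from typing import Dict, List


class Stage:
    """Hypothetical command controller tracking synactor state."""

    def __init__(self) -> None:
        self.actors: Dict[str, dict] = {}
        self.spoken: List[str] = []  # record of speech requests, for inspection

    def execute(self, line: str) -> None:
        """Run one command of the assumed form 'VERB actor [args...]'."""
        parts = line.split()
        if len(parts) < 2:
            raise ValueError("expected a verb and an actor name")
        verb, name, args = parts[0].lower(), parts[1], parts[2:]
        handler = getattr(self, "_cmd_" + verb, None)
        if handler is None:
            raise ValueError(f"unknown command: {parts[0]}")
        handler(name, args)

    def _actor(self, name: str) -> dict:
        if name not in self.actors:
            raise ValueError(f"synactor not open: {name}")
        return self.actors[name]

    def _cmd_open(self, name: str, args: List[str]) -> None:
        self.actors[name] = {"visible": True, "pos": (0, 0)}

    def _cmd_close(self, name: str, args: List[str]) -> None:
        self._actor(name)
        del self.actors[name]

    def _cmd_move(self, name: str, args: List[str]) -> None:
        x, y = (int(a) for a in args)  # raises ValueError unless exactly two integers
        self._actor(name)["pos"] = (x, y)

    def _cmd_hide(self, name: str, args: List[str]) -> None:
        self._actor(name)["visible"] = False

    def _cmd_show(self, name: str, args: List[str]) -> None:
        self._actor(name)["visible"] = True

    def _cmd_speak(self, name: str, args: List[str]) -> None:
        self._actor(name)
        self.spoken.append(f"{name}: {' '.join(args)}")
```

A host program would call `execute` once per script line, for example `stage.execute("OPEN bob")` followed by `stage.execute("SPEAK bob hello there")`.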

The hyperanimator system of the present invention is user friendly and easily understood by inexperienced users. It provides a user with the capability to create animated talking agents which can provide an interface between people and computers.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a system which displays computer generated visual images with real time synchronized computer generated speech according to the principles of the present invention;

FIG. 2 is a conceptual block diagram illustrating the hyperanimator synactor authoring and use as implemented in the system shown in FIG. 1;

FIG. 3 is a functional block diagram illustrating the major data flows and processes for the system shown in FIG. 1;

FIG. 4 is a functional block diagram illustrating a hierarchical overview of the Hyperanimator screens;

FIGS. 5a-5h are presentations illustrating the screen layout of the display screens corresponding to the major Hyperanimator screens shown in FIG. 4;

FIG. 6a is a presentation of the Face Clip Art menu screen;

FIGS. 6b and 6c are detailed presentations illustrating the screen layout for example display screens subordinate to the menu screen of FIG. 6a;

FIGS. 7a and 7b are detailed presentations illustrating the screen layout for a second preferred embodiment of display screens subordinate to the dressing room screen;

FIG. 8 is a diagram illustrating the fields of a synactor model table record;

FIG. 9 is a conceptual block diagram illustrating the flow of speech editing and testing during the process of speech synchronization;

FIGS. 10a-10g are detailed presentations illustrating the screen layout for the speech synchronization process shown in FIG. 9; and

FIGS. 11a-11d are presentations of animation sequences illustrating the effects of coarticulation.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Referring now to FIG. 1, in one preferred embodiment of the present invention, a special purpose microcomputer comprises a program controlled microprocessor 10 (a Motorola MC68000 is suitable for this purpose), random-access memory (RAM) 20, read-only memory (ROM) 11, disc drive 13, video and audio input devices 7 and 9, user input devices such as keyboard 15 or other input devices 17 and output devices such as video display 19 and audio output device 25. RAM 20 is divided into four blocks which are shared by the microprocessor 10 and the various input and output devices.

The video output device 19 may be any visual output device such as a conventional television set or the CRT for a personal computer. The video output 19 and video generation 18 circuitry are controlled by the microprocessor 10 and share display RAM buffer space 22 to store and access memory mapped video. The video generation circuits also provide a 60 Hz timing signal interrupt to the microprocessor 10.
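
One way a 60 Hz timing interrupt could pace animation is to count display durations in ticks of 1/60 second. The sketch below assumes that convention, which is not stated in this section; the function names and the use of phonetic/timing pairs as input are likewise assumptions.

```python
from typing import List, Optional, Tuple

TICK_HZ = 60  # rate of the timing interrupt described above


def schedule(pairs: List[Tuple[str, int]]) -> Tuple[List[Tuple[int, str]], int]:
    """Return (start_tick, code) entries and the total length in ticks."""
    start = 0
    entries: List[Tuple[int, str]] = []
    for code, ticks in pairs:
        entries.append((start, code))
        start += ticks
    return entries, start


def code_at(pairs: List[Tuple[str, int]], tick: int) -> Optional[str]:
    """Return the code whose image should be on screen at `tick`, or None when done."""
    elapsed = 0
    for code, ticks in pairs:
        elapsed += ticks
        if tick < elapsed:
            return code
    return None


def duration_seconds(pairs: List[Tuple[str, int]]) -> float:
    """Total playing time, assuming one time unit equals one 60 Hz tick."""
    return sum(ticks for _, ticks in pairs) / TICK_HZ
```

An interrupt handler would increment a tick counter and call something like `code_at` to decide whether the displayed image needs to change on that tick.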

Also sharing the audio RAM buffer space 23 with the microprocessor 10 is the audio generation circuitry 26 which drives the audio output device 25. Audio output device 25 may be a speaker or some other type of audio transducer such as a vibrator to transmit to the hearing impaired.

Disc controller 12 shares the disc RAM 21 with the microprocessor 10 and controls reading from and writing to a suitable non-volatile mass storage medium, such as floppy disc drive 13, for long-term storing of synactors that have been created using the hyperanimator system and to allow transfer of synactor resources between machines.

Input controller 16 for the keyboard 15 and other input devices 17 is coupled to microprocessor 10 and also shares disc RAM 21 with the disc controller 12. This purpose may be served by a Synertek SY6522 Versatile Interface Adaptor. Input controller 16 also coordinates certain tasks among the various controllers and other microprocessor support circuitry (not shown). A pointing input device 17 such as a mouse or light pen is the preferred input device because it allows maximum interaction by the user. Keyboard 15 is an optional input device in the preferred embodiment, but in other embodiments may function as the pointing device, or be utilized by an instructor or programmer to create or modify instructional programs or set other adjustable parameters of the system. Other pointing and control input devices such as a joy stick, a finger tip (in the case of a touch screen) or an eyemotion sensor are also suitable.

RAM 24 is the working memory of microprocessor 10. The RAM 24 contains the system and applications programs and other information used by the microprocessor 10. Microprocessor 10 also accesses ROM 11 which is the system's permanent read-only memory. ROM 11 contains the operational routines and subroutines required by the microprocessor 10 operating system, such as the routines to facilitate disc and other device I/O, graphics primitives and real time task management, etc. These routines may be additionally supported by extensions and patches in RAM 24 and on disc.

Controller 5 is a serial communications controller such as a Zilog Z8530 SCC chip. Digitized samples of video and audio may be input into the system in this manner to provide characteristics for the talking heads and synthesized speech. Digitizer 8 comprises an audio digitizer and a video digitizer coupled to the video and audio inputs 7 and 9, respectively. Standard microphones, videocameras and VCRs will serve as input devices. These input devices are optional since digitized video and audio samples may be input into the system by keyboard 15 or disc drive 13 or may be resident in ROM 11.

Referring now also to FIG. 2, a conceptual block diagram of the animated synthesized actor, hereinafter referred to as synactor, editing or authoring and application system according to the principles of the present invention is shown. The animation system of the present invention, hereinafter referred to as "hyperanimator", is a general purpose system which provides a user with the capability to create and/or edit synactors and corresponding speech scripts and to display on a frame-by-frame basis the synactors thus created. The hyperanimator system provides animation and sound synchronization automatically and in real time. To accomplish this, the hyperanimator system interfaces with a real time random access driver (hereinafter referred to as "RAVE") together with a descriptive authoring language called "RAVEL" which is implemented by the system shown in FIG. 1.

Prototype models, up to eight different models, for synactors are input via various input devices 31. The prototype models may comprise raw video and/or audio data which is converted to digital data in video and audio digitizers 33 and 35 or any other program data which is compiled by a RAVEL compiler 37. The prototype synactors are saved in individual synactor files identified by the name of the corresponding synactor. The synactor files are stored in memory 39 for access by the hyperanimator system as required. Memory 39 may be a disk storage or other suitable peripheral storage device.

To create a new synactor or to edit an existing prototype synactor, the hyperanimator system is configured as shown by the blocks included in the CREATE BOX 30. The author system shell 41 allows the user to access a prototype synactor file via RAM 20 and display the synactor on a number of screens which will be described in detail hereinbelow. Utilizing the various tools provided by the screens and the script command controller 43, the user is able to create a specific synactor and/or create and test speech and behavior scripts to use in an application. The new synactor thus created may be saved in the original prototype file or in a new file identified by a name for the new synactor. The synactor is saved as a part of a file called a resource. Scripting thus created (for example, digitized sound "recite" commands) can be saved to application source files by means of "clipboard" type copy and paste utilities. The microprocessor 10 provides coordination of the processes and control of the I/O functions for the system.

When a synactor is used, for example as an interactive agent between a user and an applications program, the hyperanimator system is configured as shown by the USE BOX 40. User input to the applications controller 45 will call the desired synactor resource from a file in memory 39 via RAM 20. The script command controller 43 interprets script from the application controller 45 and provides the appropriate instructions to the display and the microprocessor 10 to use. Similarly, as during the create (and test) process, the microprocessor 10 provides control and coordination of the processes and I/O functions for the hyperanimator system.

Referring now to FIG. 3, a functional block diagram illustrating the major data flows, processes and events required to provide speech and the associated synchronized visual animation is shown. A detailed description of the processes and events that take place in the RAVE system is given in co-pending U.S. patent application Ser. No. 06/935,298 which is incorporated by reference as if fully set forth herein and will not be repeated. The hyperanimator system comprises the author system shell 41, the application controller 45, the script command processor 49 and associated user input devices 47 and is interfaced with the RAVE system at the script command processor 49. In response to a user input, the application controller 45 or the author system shell 41 calls on the microprocessor 10 to fetch from a file in memory 39 a synactor resource containing the audio and visual characteristics of a particular synactor. As required by user input, the microprocessor will initiate the RAVE sound and animation processes. Although the author system shell 41 and the application controller 45 both access the script command processor 49, the normal mode of operation would be for a user to utilize the author system shell 41 to create/edit a synactor and at a subsequent time utilize the application controller 45 to call up a synactor for use (i.e., speech and visual display) either alone or coordinated with a particular application.

The hyperanimator system is a "front end" program that interfaces the system shown in FIG. 1 to the RAVE system to enable a user to create and edit synactors. The system comprises a number of screen images (sometimes referred to as "cards") which have activatable areas referred to as buttons that respond to user actions to initiate preprogrammed actions or call up other subroutines. The buttons may be actuated by clicking a mouse on them or other suitable methods, using a touch-screen for example. The screen images also may have editable text areas, referred to as "fields". The hyperanimator system comprises a number of screens or cards which the user moves between by activating or "pressing" buttons to create, edit and work with synactors.

Referring now to FIGS. 4, 5a-5i and 6a-6f, FIG. 4 is a functional block diagram illustrating a hierarchical overview of the hyperanimator screens. The startup screen 51 comprises one card and informs a user that he or she is running the hyperanimator system. The startup screen also provides the user with bibliographic information and instructions to begin use of the hyperanimator system. Once the initiate button (not shown) has been pressed, the RAVE driver is called to perform system checks. The RAVE driver is a portion of the hyperanimator system that handles much of the programmatic functions and processes of the synactor handling. It introduces itself with a box message (not shown) which includes a "puppet" icon. After the initial checks have been passed, a star screen 53 is shown which provides a transition between the startup screen 51 and the menu screen 55. The menu screen 55 is then shown after the star screen 53. The startup screen 51 also includes a button (not shown) for taking the user to the hyperanimator credit screen 57. The credit screen 57 comprises one card and provides additional bibliographic information to the user. The credit screen 57 can be accessed three ways: from the startup screen 51, from the menu screen 55 and from the first card in the dressing room 59. Pressing or clicking anywhere on the credit screen 57 will take the user back to the card he or she was at before going to the credit screen 57.
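The credit screen behavior described above (reachable from three places, and returning the user to wherever he or she came from) can be illustrated with a minimal sketch. The class name, method names and card identifiers below are hypothetical and chosen only for illustration; the specification does not disclose how the hyperanimator system actually tracks the previous card.

```python
class CardNavigator:
    """Illustrative model of credit screen navigation (all names are assumptions)."""

    # Cards from which the text says the credit screen can be reached.
    CREDIT_ENTRY_POINTS = {"startup", "menu", "dressing_room_first_card"}

    def __init__(self, start="startup"):
        self.current = start
        self._return_to = None  # card to restore when leaving the credit screen

    def go_to_credits(self):
        if self.current not in self.CREDIT_ENTRY_POINTS:
            raise ValueError(f"no credit button on card {self.current!r}")
        self._return_to = self.current
        self.current = "credits"

    def click_anywhere(self):
        # Clicking anywhere on the credit screen returns to the prior card;
        # on any other card this sketch simply does nothing.
        if self.current == "credits":
            self.current = self._return_to
            self._return_to = None
```

A navigator started on the menu card moves to "credits" after go_to_credits() and back to "menu" after click_anywhere().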

The menu screen 55 (also shown in FIG. 5a) comprises one card and is provided to allow the user to navigate among the hyperanimator system features. Upon first entering the menu screen 55, the HyperAnimator Navigator 510 greets the user. The menu screen 55 contains seven buttons for accessing the hyperanimator system.

The seven buttons allow the user to: go to the dressing room 59, go to the casting call screen 61, go to the sound booth screen 63, go to the speech sync screen 65, go to the credit screen 57, and quit 513 the hyperanimator system. With the exception of the quit button, the buttons take the user to different cards within the hyperanimator system. The quit button closes hyperanimator and returns the user to the host operating system shell level in the host program. Anytime the user returns to the menu screen 55 from within the hyperanimator system, the HyperAnimator Navigator 510 will greet him or her.

The casting call screen 61 (also shown in FIG. 5b) comprises functions which allow the synactor files to be copied or deleted from memory 39 or placed in the dressing room 59. Appropriately designed buttons 521, 523 and 535 represent and initiate these tasks. Copying a synactor file takes the file resource of a selected synactor from an application program or synactor file and places an exact copy in a destination application program or synactor file. (A synactor file is defined as a file containing synactor resources only.) Placing a synactor into the dressing room 59 (also shown in FIG. 5c) allows the user to edit an existing synactor. The user selects a synactor from an application program or synactor file stored in memory 39. Deleting a synactor removes a selected synactor resource from an application program or synactor file in memory 39. The RAVE driver includes special commands to accomplish the tasks initiated at the casting call screen 61.
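The copy and delete tasks can be sketched as follows, under the simplifying assumption that an application program or synactor file is modeled as a dictionary mapping synactor names to resource bytes. The function names are hypothetical; the actual RAVE driver commands are not given in this passage.

```python
def copy_synactor(source_file, name, destination_file):
    """Place an exact copy of the named resource in the destination.

    The source is left untouched, matching the description of copying.
    """
    destination_file[name] = bytes(source_file[name])


def delete_synactor(resource_file, name):
    """Remove the named synactor resource from the given file model."""
    del resource_file[name]
```

Both functions raise KeyError if the named synactor is not present, which stands in for whatever error handling the real system performs.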

The sound booth screen 63 (also shown in FIG. 5f) comprises functions which allow sound resources to be copied or deleted from a file. Sound resources are portions of files which are sequential prerecorded digital representations of actual sound. They are input to the system via digital recording devices and stored as resource files in memory 39. An appropriately identified button 527, 529 initiates these functions. The sound booth screen also provides buttons 531, 533 to allow the user to return to the menu screen 55 and the speech sync screen 65.

The dressing room screen or dressing room 59 begins with an animated sequence (not shown) showing a door opening into a room. The dressing room 59 is used to create new synactors or to edit existing synactors. A user can access the dressing room 59 from the menu screen 55, from any Face Clip Art card 75, from the stage screen 77, from the spotlight screen 79 or from the casting call screen 61. The dressing room proper comprises sixteen cards 71. Placing a synactor into the dressing room 59 places each image 83 of the selected synactor in the synactor easel 85 on the respective cards 71 in the dressing room 59. For example, the REST image 83 is placed on the REST card 87 and the REST button 89 is highlighted. Each synactor will have sixteen images corresponding to respective ones of the sixteen cards 71 of the dressing room 59. Each of the sixteen cards 71 contains two buttons allowing the user to return to the menu screen 55 and go to the stage screen 77. Each of the sixteen cards 71 also includes a button 95 for taking the user to the Face Clip Art menu screen 73. Each of the sixteen cards 71 contains a field 97 at the top informing the user that he or she is currently in the dressing room 59. Each of the sixteen cards 71 includes a representation of a painter's easel called the synactor easel 85. Each of the sixteen cards 71 includes sixteen buttons 72 which represent each of the sixteen cards 71.

With these buttons 72, the user can immediately go to any of the sixteen cards 71 from any of the sixteen cards 71 within the dressing room 59. For each of the sixteen cards 71, the button that represents itself is highlighted showing the user where they are within the dressing room 59. Each of the sixteen cards 71 has a field 99 which labels which of the sixteen cards it is. The sixteen cards 71 which make up the dressing room 59 are labeled as follows: REST, F, M, R, W, IH, AH, E, A1, A2, A3, A4, A5, A6, A7, and A8.
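As an illustration only, the sixteen card labels and the self-highlighting rule for the buttons 72 might be represented as below. The constant and function names are assumptions made for this sketch; the split into two groups of eight follows the description of lip position cards and expression cards given in the next paragraph.

```python
# Labels taken from the text: eight lip-position cards, then eight expression cards.
LIP_CARDS = ["REST", "F", "M", "R", "W", "IH", "AH", "E"]
EXPRESSION_CARDS = [f"A{i}" for i in range(1, 9)]  # A1 through A8
DRESSING_ROOM_CARDS = LIP_CARDS + EXPRESSION_CARDS


def button_highlights(current_card):
    """Return {label: highlighted?} for the sixteen navigation buttons.

    Only the button representing the card currently shown is highlighted.
    """
    if current_card not in DRESSING_ROOM_CARDS:
        raise ValueError(f"unknown dressing room card: {current_card!r}")
    return {label: label == current_card for label in DRESSING_ROOM_CARDS}
```

Calling button_highlights("REST") marks only the REST button, which corresponds to the highlighted REST button 89 mentioned earlier.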

The first eight cards deal with specific lip positions which correspond to the sounds of the letters that the cards represent. The last eight cards deal with any type of expression. The first eight cards each contain a picture in the field 99 of representative lips which indicate the lip position corresponding to the letter that card represents. The last eight cards contain the saying "Expressions" because expressions are not predefined (the user can design the expressions as desired; smiles or frowns, for example). The REST card 87 also has a special button 101 which enables the user to copy the image 83 that resides on the synactor easel 85 within the REST card 87 to the synactor easel 85 on every card within the dressing room 59. This button 101 is only present on the REST card 87. Each of the sixteen cards 71 in the dressing room 59 also includes a menu 103 which allows access to additional tools such as paint tool or scrapbook applications which the user can manipulate to create or edit synactors. Pressing the stage button 93 on any of the dressing room's sixteen cards will initiate the building and copying of the synactor in the dressing room 59 into a temporary memory (not shown) and take the user to the stage screen 77 to display that synactor. No matter where the user is located within the dressing room 59, pressing the stage button 93 always selects the REST card 87 to begin building and copying the synactor into memory. When building the synactor, the art that is within the frame of the synactor easel 85 on the REST card 87 is selected and copied first. The hyperanimator system then calls on an external command (XCMD) which provides the memory location where that image is stored. The next dressing room card is then selected and the above procedure is repeated. Each of the sixteen dressing room cards is selected in sequence and the art within the frame of the synactor easel is copied. When all of the images have been copied, a list of the memory locations for the images is sent to the RAVE driver where a synactor resource is built of those images in memory. At the completion of the synactor resource file building process, the user is transferred to the stage screen 77 to view the synactor thus created.
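The build procedure described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: the store_image callback stands in for the external command (XCMD) that reports where a copied image is stored, and build_synactor_resource stands in for the RAVE driver step. Neither name, nor the data shapes used, comes from the specification.

```python
CARD_ORDER = ["REST", "F", "M", "R", "W", "IH", "AH", "E"] + [
    f"A{i}" for i in range(1, 9)
]


def collect_image_locations(easel_art, store_image):
    """Copy the easel art of each card in sequence and gather the locations.

    easel_art:   dict mapping card label -> image data (hypothetical shape)
    store_image: callable that copies one image and returns its location
                 (stand-in for the XCMD mentioned in the text)
    The loop always begins at REST, whichever card the user was viewing.
    """
    locations = []
    for label in CARD_ORDER:
        locations.append(store_image(easel_art[label]))
    return locations


def build_synactor_resource(locations):
    """Stand-in for the RAVE driver: pair each card label with its location."""
    if len(locations) != len(CARD_ORDER):
        raise ValueError("expected one location per dressing room card")
    return dict(zip(CARD_ORDER, locations))
```

For example, with a store_image that appends each image to a list and returns its index, the REST image receives location 0 and the A8 image receives location 15.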

The stage screen 77, 78, 81 is a display for examining the lip-synchronization of newly constructed synactors. It is entered by pressing the appropriate button 93 found on any of the sixteen cards 71 of the dressing room 59. The stage screen consists of eight cards 77, 78, 81 of which seven are used for animation purposes (not shown). The first five cards 77 show stage curtains opening up. The sixth card 78 (also shown in FIG. 5d) is an open stage 105 where a newly created synactor 107 is displayed.

The stage screen 78 provides a button 109 and a field 111 which allow the user to enter in any text string and see and hear the synactor 107 speak. The "Read Script" button 109 takes the text string entered in the field 111 and calls the RAVE driver to create the animation and speak the text string through the RAVE system. The stage screen 78 contains three buttons 113, 115, 117 allowing the user to return to the menu screen, return to the dressing room, or go on to the spotlight screen 79, respectively, to save the newly constructed synactor 107.

If the user chooses to return to the menu screen, the newly constructed synactor is retired and the HyperAnimator Navigator 510 is returned. If the user chooses to return to the dressing room 59, the two remaining cards 81 in the stage screen are called showing the synactor being pulled from the stage 105 with a stage hook. If the user would like to save the synactor to a destination program or synactor file, the user should click or press on the spotlight screen button 117.

The spotlight screen 79 consists of one card (also shown in FIG. 5e) and allows the user to save a newly constructed synactor as a resource file. A newly constructed synactor exists as temporary data in RAM memory and must be saved permanently to a file or be lost. The spotlight screen 79 provides a field 119 where the user can type in a text string that will be the new synactor's file name. The text string must be one continuous word. The spotlight screen 79 has a "Save Actor" button 121 that allows the user to select a destination program or synactor file to save the newly constructed synactor resource in. If the destination program or synactor file already contains a synactor with the same name as the text string in the spotlight screen field 119, a different name must be selected or the existing synactor file will be lost. After the newly constructed synactor is saved, the user is taken back to the menu screen 55. The spotlight screen 79 also includes two buttons 123, 125 which allow the user to return to the menu screen 55 or to return to the dressing room 59.
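The two naming constraints stated above (the name must be one continuous word, and saving under an existing name loses the existing synactor) can be illustrated with a short sketch. The destination is modeled as a dictionary, and all names here are hypothetical. Note one deliberate departure: the text implies that the existing synactor is simply overwritten, whereas this sketch refuses to overwrite unless explicitly asked, which is an assumption made to make the hazard visible.

```python
class NameCollisionError(Exception):
    """Raised when the destination already holds a synactor of that name."""


def save_synactor(destination, name, resource, overwrite=False):
    """Save a synactor resource under `name` in the destination file model."""
    # "One continuous word": non-empty and free of whitespace.
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("synactor name must be one continuous word")
    if name in destination and not overwrite:
        raise NameCollisionError(f"{name!r} already exists in destination")
    destination[name] = resource
```

Passing overwrite=True reproduces the behavior the text warns about, where the previously stored synactor of the same name is lost.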

Art which can be used to create synactors is provided within the hyperanimator system in a Face Clip Art screen 73, 75. The Face Clip Art screen comprises seventeen cards; one, shown in FIG. 6a, serves as a menu for navigating among the Face Clip Art cards 75 and the other sixteen cards 75 contain the actual art, examples of which are shown in FIGS. 6b and 6c. The Face Clip Art screen can be entered from any of the dressing room cards 71 through a Face Clip Art button 95. Upon entering the Face Clip Art screen, the user is first taken to the Face Clip Art Menu 73. From the Face Clip Art Menu 73, the user can directly acces