System and method for developing interactive speech applications    
United States Patent 6173266
Link to this page: http://www.wikipatents.com/6173266.html
Inventor(s): Marx; Matthew T. (Everett, MA), Carter; Jerry K. (Somerville, MA), Phillips; Michael S. (Belmont, MA), Holthouse; Mark A. (Newton, MA), Seabury; Stephen D. (Boston, MA), Elizondo-Cecenas; Jose L. (Boston, MA), Phaneuf; Brett D. (Marshfield, MA)
Abstract: Dialogue modules are provided, with each dialogue module including computer readable instructions for accomplishing a predefined interactive dialogue task in an interactive speech application. In response to user input, a subset of the plurality of dialogue modules are selected to accomplish their respective interactive dialogue tasks in the interactive speech application and are interconnected in an order defining the call flow of the application, and the application is generated. A graphical user interface represents the stored plurality of dialogue modules as icons in a graphical display in which icons for the subset of dialogue modules are selected in the graphical display in response to user input, the icons for the subset of dialogue modules are graphically interconnected into a graphical representation of the call flow of the interactive speech application, and the interactive speech application is generated based upon the graphical representation. Using the graphical display, the method further includes associating configuration parameters with specific dialogue modules. Each configuration parameter causes a change in operation of the dialogue module when the interactive speech program executes. A window is displayed for setting the value of the configuration parameter in response to user input, when an icon for a dialogue module having an associated configuration parameter is selected.

Title Information

Owner/Assignee: SpeechWorks International, Inc. (Boston, MA)
Publication Date: January 9, 2001
Application Number: 09/081,719
Filing Date: May 6, 1998
US Classification: 704/270 704/272
Examiner: Dorvil; Richemond
Attorney/Law Firm: Mintz, Levin, Cohn, Ferris, Glovsky and Popeo, P.C.
Parent Case: RELATED APPLICATIONS This application claims priority from provisional application, U.S. Ser. No. 60/045,741, filed on May 6, 1997, which is incorporated herein by reference.
USPTO Field of Search: 704/270 704/272 704/275 704/231 704/200 704/276 704/255 704/256 704/257

References

U.S. References

6058166, Osder et al., May 2000
6035275, Brode et al., Mar 2000
5905774, Tatchell et al., May 1999
5842193, Reilly, Nov 1998
5774860, Bayya et al., Jun 1998
5694558, Sparks et al., Dec 1997
5652789, Miner et al., Jul 1997
5638425, Meador, III et al., Jun 1997
5615296, Stanford et al., Mar 1997
5594638, Iliff, Jan 1997
5566272, Brems et al., Oct 1996
5544305, Ohmaye et al., Aug 1996
5479488, Lennig et al., Dec 1995
5357596, Takebayashi et al., Oct 1994
4625081, Lotito et al., Nov 1986

Claims
 


What is claimed is:

1. A computer-implemented method of constructing, by a developer, an interactive speech application for use by an application user, the method comprising:

providing a plurality of dialogue modules, wherein each dialogue module includes computer readable instructions for accomplishing a predefined interactive dialogue task;

selecting, in response to developer input, at least one of the plurality of dialogue modules to accomplish at least one respective interactive dialogue task; and

establishing, in response to developer input, at least one relationship between the at least one dialogue module and a dialogue-processing unit other than the at least one dialogue module to define a flow of dialogue between the application user and the interactive speech application.

2. The method of claim 1, further comprising:

setting, in response to developer input, at least one configuration parameter associated with at least one of the dialogue modules, wherein each configuration parameter affects how an associated dialogue module operates when the interactive speech application executes.

3. The method of claim 2, wherein an interactive dialogue task associated with a dialogue module includes outputting a prompt to the application user and receiving a response from the application user, and at least one of the at least one configuration parameters is a timeout parameter defining a period of time for the application user to respond after a prompt is output.

4. The method of claim 2, wherein an interactive dialogue task associated with a dialogue module includes outputting a prompt to the application user and receiving a response from the application user, and at least one of the at least one configuration parameter is a prompt parameter defining a prompt to be output.

5. The method of claim 2, wherein an interactive dialogue task associated with a dialogue module includes outputting a prompt to the application user and receiving a response from the application user, and at least one of the at least one configuration parameters is an apology prompt parameter for defining an apology prompt to be output if the application user's response is not recognized.

6. The method of claim 2, wherein an interactive dialogue task associated with a dialogue module includes outputting a prompt to the application user and receiving a response from the application user, and at least one of the at least one configuration parameters is a parameter for designating recognizable responses from the application user.

7. The method of claim 1, further comprising storing the selected at least one dialogue module and an indication of the at least one relationship.

8. The method of claim 1, wherein an interactive dialogue task associated with a dialogue module includes:

instructions for outputting a prompt to the application user;

instructions for receiving a response from the application user; and

instructions for interacting with a speech recognition engine for recognizing the received response using recognition models.

9. The method of claim 8, wherein an interactive dialogue task associated with a dialogue module further includes instructions for updating the recognition models used by the speech recognition engine based on recognized responses during execution of the interactive speech application.

10. The method of claim 1, further comprising:

graphically representing the plurality of dialogue modules as icons in a graphical display, wherein:

the selecting includes receiving an indication of the at least one dialogue module; and

the establishing includes graphically interconnecting the icon representing the at least one dialogue module with a graphical indication representing the other dialogue-processing unit according to the at least one relationship.

11. The method of claim 10, further comprising:

displaying a window in the graphical display for setting a value of a configuration parameter when an icon for a dialogue module having an associated configuration parameter is selected in response to developer input; and

setting the value of the configuration parameter in response to developer input;

wherein the configuration parameter affects how a dialogue module operates when the interactive speech application executes.

12. The method of claim 1 wherein the selecting includes selecting at least two dialogue modules.

13. The method of claim 12 wherein the another dialogue-processing unit is a selected dialogue module.

14. The method of claim 13 wherein the selecting includes selecting at least two different dialogue modules and the another dialogue-processing unit is different from the selected dialogue module with which the another dialogue-processing unit has a relationship established.
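
As an informal illustration of claims 1 through 6 and 12 through 14 (not the patented implementation), the sketch below models a dialogue module as a record carrying the configuration parameters the claims name, and a dialogue flow as developer-set relationships between selected modules. All class, field, and method names here are hypothetical, and the sketch assumes the other dialogue-processing unit is itself a selected dialogue module, as in claim 13.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogueModule:
    """One predefined dialogue task with its configuration parameters (hypothetical names)."""
    name: str
    prompt: str                                   # prompt parameter (claim 4)
    apology_prompt: str = "I'm sorry, I didn't understand."  # apology prompt (claim 5)
    timeout_seconds: float = 5.0                  # timeout parameter (claim 3)
    recognizable_responses: List[str] = field(default_factory=list)  # claim 6


@dataclass
class SpeechApplication:
    """Selected modules plus the relationships that define the dialogue flow."""
    modules: Dict[str, DialogueModule] = field(default_factory=dict)
    flow: Dict[str, Optional[str]] = field(default_factory=dict)

    def select(self, module: DialogueModule) -> None:
        # "Selecting, in response to developer input" a module for the application.
        self.modules[module.name] = module
        self.flow.setdefault(module.name, None)

    def connect(self, source: str, target: str) -> None:
        # "Establishing a relationship" between two selected modules.
        if source not in self.modules or target not in self.modules:
            raise KeyError("both modules must be selected before connecting")
        self.flow[source] = target

    def call_flow(self, start: str) -> List[str]:
        # Follow the relationships from the start module, stopping on a cycle.
        order: List[str] = []
        seen = set()
        current: Optional[str] = start
        while current is not None and current not in seen:
            seen.add(current)
            order.append(current)
            current = self.flow.get(current)
        return order


app = SpeechApplication()
app.select(DialogueModule("ask_name", "Who would you like to speak to?",
                          recognizable_responses=["Michael Smith", "Operator"]))
app.select(DialogueModule("confirm", "Do you mean Michael Smith?",
                          timeout_seconds=3.0,
                          recognizable_responses=["Yes", "No"]))
app.connect("ask_name", "confirm")
print(app.call_flow("ask_name"))
```

Keeping the flow as data separate from the modules means the same module definitions could be rewired into a different call flow without being edited.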

15. A memory device storing computer-readable instructions for enabling a developer to construct an interactive speech application in a speech processing system, the instructions being for causing a computer to:

perform a plurality of predefined interactive dialogue tasks, at least the instructions associated with the tasks forming a respective plurality of dialogue module templates;

produce, in response to developer input, a plurality of dialogue module instances for use in an interactive speech application, wherein each dialogue module instance is based on, and is a customized version of, one of the dialogue module templates, the dialogue module templates and the dialogue module instances being forms of dialogue modules; and

establish, in response to developer input, at least one relationship between at least two dialogue modules to define a dialogue flow.

16. The memory device of claim 15, further comprising instructions for causing a computer to:

set, in response to developer input, a value of at least one configuration parameter associated with at least one of the dialogue modules, wherein each configuration parameter affects how an associated dialogue module operates when the interactive speech application executes.

17. The memory device of claim 16, wherein an interactive dialogue task associated with a dialogue module includes outputting a prompt to the application user and receiving a response from the application user, and at least one of the at least one configuration parameter is a parameter for designating recognizable responses from the application user.

18. The memory device of claim 15, further comprising instructions for causing a computer to store the at least two dialogue modules and an indication of the relationship between the at least two dialogue modules.

19. The memory device of claim 15, wherein an interactive dialogue task associated with a dialogue module includes instructions for causing a computer to:

output a prompt to the application user;

receive a response from the application user; and

interact with a speech recognition engine for recognizing the received response using recognition modules.

20. The memory device of claim 19, wherein an interactive dialogue task associated with a dialogue module further includes instructions for causing a computer to update the recognition models used by the speech recognition engine based on recognized responses during execution of the interactive speech application.

21. The memory device of claim 15, further comprising:

instructions for causing a computer to graphically represent the plurality of dialogue modules as icons in a graphical display,

wherein:

the instructions for causing a computer to produce the dialogue module instances include instructions for causing a computer to select the plurality of dialogue modules templates in response to developer input and instructions for causing a computer to graphically represent the dialogue module instances as icons in the graphical display; and

the instructions for causing a computer to establish at least one relationship between at least two dialogue modules include instructions for causing a computer to graphically interconnect the icons representing the dialogue modules into a graphical representation of the dialogue flow of the interactive speech application in accordance with the at least one relationship.
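
The template and instance relationship in claim 15 can be pictured, very loosely, as copying a predefined module and overriding some of its parameters. The sketch below is one possible reading under that assumption. The class name, the `customize` method, and the example values are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class DialogueModule:
    """Both templates and instances are forms of dialogue modules, so they share one type."""
    task: str
    prompt: str
    timeout_seconds: float = 5.0
    recognizable_responses: Tuple[str, ...] = ()

    def customize(self, **overrides) -> "DialogueModule":
        # Produce an instance: a copy of this module with developer-chosen values.
        return replace(self, **overrides)


# A predefined template for a yes/no question (illustrative values).
YES_NO_TEMPLATE = DialogueModule(
    task="yes_no",
    prompt="Please say yes or no.",
    recognizable_responses=("yes", "no"),
)

# A customized instance based on the template; the template itself is unchanged.
confirm_transfer = YES_NO_TEMPLATE.customize(
    prompt="Do you mean Michael Smith?",
    timeout_seconds=3.0,
)
print(confirm_transfer)
```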

22. A computer program product, residing on a computer-readable medium, for enabling a speech-application developer to construct an interactive speech application for use by an application user, the computer program product comprising instructions for causing a computer to:

provide a plurality of dialogue modules, each dialogue module including computer-readable instructions for causing a computer to accomplish a predefined interactive dialogue task;

select, in response to developer input, at least one of the plurality of dialogue modules to accomplish at least one respective interactive dialogue task; and

establish, in response to developer input, at least one relationship between the at least one dialogue module and a dialogue-processing unit other than the at least one dialogue module to define a flow of dialogue between the application user and the interactive speech application.

23. The computer program product of claim 22 wherein the instructions for causing a computer to select include instructions for causing a computer to select, in response to developer input, at least two dialogue modules.

24. The computer program product of claim 23 wherein the another dialogue-processing unit is a selected dialogue module.

25. The computer program product of claim 22 further comprising instructions for causing a computer to store the selected at least one dialogue module and an indication of the at least one relationship.
Description
 


FIELD OF THE INVENTION

The present invention relates generally to a system and method for developing computer-executed interactive speech applications.

BACKGROUND

Computer-based interactive speech applications are designed to provide automated interactive communication, typically for use in telephone systems to answer incoming calls. Such applications can be designed to perform tasks of varying complexity including, for example, gathering information from callers, providing information to callers, and connecting callers with appropriate people within the telephone system. However, using past approaches, developing these applications has been difficult.

FIG. 1 shows a call flow of an illustrative interactive speech application 100 for use by a Company A to direct an incoming call. Application 100 is executed by a voice processing unit or PBX in a telephone system. The call flow is activated when the system receives an incoming call, and begins by outputting a greeting, "Welcome to Company A" (110).

The application then lists available options to the caller (120). In this example, the application outputs an audible speech signal to the caller by, for example, playing a prerecorded prompt or using a speech generator such as a text-to-speech converter: "If you know the name of the person you wish to speak to, please say the first name followed by the last name now. If you would like to speak to an operator, please say `Operator` now."

The application then waits for a response from the caller (130) and processes the response when received (140). If the caller says, for example, "Mike Smith," the application must be able to recognize what the caller said and determine whether there is a Mike Smith to whom it can transfer the call. Robust systems should recognize common variations and permutations of names. For example, the application of FIG. 1 may identify members of a list of employees of Company A by their full names--for example, "Michael Smith." However, the application should also recognize that a caller asking for "Mike Smith" (assuming there is only one employee listed that could match that name) should also be connected to the employee listed as "Michael Smith."
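
The name-variation requirement described above can be sketched as a lookup that normalizes common nicknames before comparing against the employee list. This is a minimal illustration only: the nickname table, the function names, and the sample directory are assumptions, and a real system would match on recognized speech instead of typed text.

```python
from typing import Dict, List

# Illustrative nickname table; a real deployment would need a far larger one.
NICKNAMES: Dict[str, str] = {"mike": "michael", "bob": "robert", "liz": "elizabeth"}


def canonical(name: str) -> str:
    """Lowercase the name and replace a known nickname in the first-name position."""
    parts = name.lower().split()
    if not parts:
        return ""
    first, rest = parts[0], parts[1:]
    return " ".join([NICKNAMES.get(first, first), *rest])


def find_employees(spoken: str, directory: List[str]) -> List[str]:
    """Return every listed employee whose canonical name equals the spoken one."""
    target = canonical(spoken)
    return [entry for entry in directory if canonical(entry) == target]


directory = ["Michael Smith", "Robert Jones"]
print(find_employees("Mike Smith", directory))
```

Returning a list instead of a single name leaves room for the case the text mentions, where more than one employee could match and the application would have to disambiguate.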

Assuming the application finds such a person, the application outputs a confirming prompt: "Do you mean `Michael Smith`?" (150). The application once again waits to receive a response from the caller (160) and when received (170), takes appropriate action (180). In this example, if the caller responded "Yes," the application might say "Thank you. Please hold while I transfer your call to Michael Smith," before taking the appropriate steps to transfer the call.

FIG. 2 shows some of the steps that are performed for each interactive step of the interactive application of FIG. 1. Specifically, applying the process of FIG. 2 to the first interaction of the application described in FIG. 1, the interactive speech application outputs the prompt of step 120 of FIG. 1 (210). The application then waits for the caller's response (220, 130). This step should be implemented not only to process a received response, as shown in the example of FIG. 1 (140), but also to handle a lack of response. For example, if no response is received within a predetermined time, the application can be implemented to "time out" (230) and reprompt the caller (step 215) with an appropriate prompt such as "I'm sorry, I didn't hear your response. Please repeat your answer now," and return to waiting for the caller's response (220, 130).
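
The prompt, wait, time out, and reprompt cycle of FIG. 2 might be sketched as the loop below. The `listen` and `say` callables stand in for telephony functions that the text does not specify, and the `max_attempts` limit is an added assumption so that the loop terminates; the description itself does not state a retry limit.

```python
from typing import Callable, List, Optional

TIMEOUT_REPROMPT = ("I'm sorry, I didn't hear your response. "
                    "Please repeat your answer now.")


def collect_response(prompt: str,
                     listen: Callable[[float], Optional[str]],
                     say: Callable[[str], None],
                     timeout_seconds: float = 5.0,
                     max_attempts: int = 3) -> Optional[str]:
    """Output a prompt, then wait for a response, reprompting after each timeout."""
    say(prompt)                               # step 210
    for attempt in range(max_attempts):
        response = listen(timeout_seconds)    # step 220; None models a timeout (230)
        if response is not None:
            return response                   # step 240: a response was detected
        if attempt < max_attempts - 1:
            say(TIMEOUT_REPROMPT)             # step 215
    return None                               # gave up after max_attempts timeouts


# Simulated caller: silent on the first attempt, answers on the second.
scripted = iter([None, "Mike Smith"])
spoken: List[str] = []
result = collect_response(
    "Please say the first name followed by the last name now.",
    listen=lambda timeout: next(scripted),
    say=spoken.append,
)
print(result)
print(spoken)
```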

When the application detects a response from the caller (240), step 140 of FIG. 1 attempts to recognize the caller's speech, which typically involves recording the waveform of the caller's speech, determining a phonetic representation for the speech waveform, and matching the phonetic representation with an entry in a database of recognized vocabulary. If the application cannot determine any hypothesis for a possible match (250), it reprompts the caller (215) and returns to waiting for the caller's response (220). Generally, the reprompt is varied at different points in the call flow of the application. For example, in contrast to the reprompt when no response is received during the time out interval, the reprompt when a caller's response is received but not matched with a recognized response may be "I'm sorry, I didn't understand your response. Please repeat the name of the person to whom you wish to speak, or say `Operator.`"
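
To illustrate how the reprompt can depend on the kind of failure, the sketch below distinguishes a timeout from an unmatched response and looks up recognized vocabulary by a key. The `phonetic_key` function is a deliberately crude stand-in for a real phonetic representation, working on text instead of a waveform, and every name in the block is hypothetical.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Failure(Enum):
    TIMEOUT = "timeout"
    NO_MATCH = "no_match"


REPROMPTS: Dict[Failure, str] = {
    Failure.TIMEOUT: "I'm sorry, I didn't hear your response. "
                     "Please repeat your answer now.",
    Failure.NO_MATCH: "I'm sorry, I didn't understand your response. Please repeat "
                      "the name of the person to whom you wish to speak, or say 'Operator.'",
}


def phonetic_key(text: str) -> str:
    """Toy key: keep letters only, lowercase, drop vowels after the first letter."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return ""
    return letters[0] + "".join(c for c in letters[1:] if c not in "aeiou")


def handle_response(heard: Optional[str],
                    vocabulary: Dict[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (matched entry, None) on success or (None, reprompt text) on failure."""
    if heard is None:
        return None, REPROMPTS[Failure.TIMEOUT]
    match = vocabulary.get(phonetic_key(heard))
    if match is None:
        return None, REPROMPTS[Failure.NO_MATCH]
    return match, None


# Vocabulary indexed by the toy key.
vocabulary = {phonetic_key(entry): entry for entry in ["Operator", "Michael Smith"]}
print(handle_response("operater", vocabulary))
```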

If the application comes up with one or more hypotheses of what the caller said (260, 270), it determines a confidence parameter for each hypothesis, reflecting the likelihood that it is correct. FIG. 2 shows that the interpretation step (280) may be applied for both low confidence and high confidence hypotheses. For example, if the confidence level falls within a range determined to be "high" (step 260), an application may be implemented to perform the appropriate action (290, 180) without going through the confirmation process (150, 160, 170). Alternatively, an application can be implemented to use the confirmation process for both low and high confidence hypotheses. For example, the application of FIG. 1 identifies the best hypothesis to the caller and asks whether it is correct.

If the application interprets the hypothesis to be incorrect (for example, if the caller responds "No" to the confirmation prompt of step 150), the application rejects the hypothesis and reprompts the caller to repeat his or her response (step 215). If the application interprets the hypothesis to be correct (for example, if the caller responds affirmatively to the verification prompt), the application accepts the hypothesis and takes appropriate action (290), which in the example of FIG. 1, would be to output the prompt of 180 and transfer the caller to Michael Smith.
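
The confidence handling described in the two preceding paragraphs could be sketched as follows. The 0.85 threshold, the `Hypothesis` type, and the `confirm` callback are assumptions made for illustration; the text only says that a confidence level may fall within a range determined to be "high" and that confirmation may be used for low confidence hypotheses or for all of them.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

HIGH_CONFIDENCE = 0.85  # assumed threshold; the source does not give a value


@dataclass
class Hypothesis:
    text: str
    confidence: float


def resolve(hypotheses: List[Hypothesis],
            confirm: Callable[[str], bool],
            always_confirm: bool = False) -> Optional[str]:
    """Return the accepted text, or None if the caller should be reprompted."""
    if not hypotheses:
        return None                                  # step 250: no hypothesis at all
    best = max(hypotheses, key=lambda h: h.confidence)
    if best.confidence >= HIGH_CONFIDENCE and not always_confirm:
        return best.text                             # step 260: act without confirming
    # Step 270 (or always_confirm): ask "Do you mean ...?" and interpret the answer.
    return best.text if confirm(best.text) else None


candidates = [Hypothesis("Michael Smith", 0.62), Hypothesis("Michelle Smythe", 0.31)]
accepted = resolve(candidates, confirm=lambda text: text == "Michael Smith")
print(accepted)
```

The `always_confirm` flag corresponds to the alternative the text mentions, in which the confirmation process is used for both low and high confidence hypotheses.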

As exemplified by application 100 of FIGS. 1 and 2, interactive speech applications are complex. Implementing an interactive speech application such as that described with reference to FIGS. 1 and 2 using past application development tools requires a developer to design the entire call flow of the application, including defining vocabularies to be recognized by the application in response to each prompt of the application. In some cases, vocabulary implementation can require the use of an additional application such as a database application. In past approaches, it has been time consuming and complicated for the developer to ensure compatibility between the interactive speech application and any external applications and data it accesses.

Furthermore, the developer must design the call flow to account for different types of responses for the same prompt in an application. In general, past approaches require that the developer define a language model of the language to be recognized, typically including grammar rules to generally define the language and to more specifically define the intended call flow of the interactive conversation to be carried on with callers. Such definition is tedious.

Because of the inevitable ambiguities and errors in understanding speech, an application developer also needs to provide error recovery capabilities, including error handling and error prevention, to gracefully handle speech ambiguities and errors without frustrating callers. This requires the application developer not only to provide as reliable a speech recognition system as possible, but also to design alternative methods for successfully eliciting and processing the desired information from callers. Such alternative methods may include designing helpful prompts to address specific situations and implementing different methods for a caller to respond, such as allowing callers to spell their responses or input their responses using the keypad of a touch-tone phone. In past approaches, an application developer is required to manually prepare error handling, error prevention, and any alternative methods used in them. This is time consuming and may lead to omissions of functions or critical steps.
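
One of the alternative input methods mentioned above, entering a response on a touch-tone keypad, might look like the sketch below. It uses the standard letter layout of a telephone keypad; matching on the last name only, the function names, and the sample directory are assumptions made for illustration.

```python
from typing import Dict, List

# Standard telephone keypad letter assignments.
KEYPAD: Dict[str, str] = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT: Dict[str, str] = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def to_digits(word: str) -> str:
    """Spell a word as the keypad digits a caller would press, ignoring non-letters."""
    return "".join(LETTER_TO_DIGIT[c] for c in word.lower() if c in LETTER_TO_DIGIT)


def lookup_by_keypad(digits: str, directory: List[str]) -> List[str]:
    """Return directory entries whose last name spells out the pressed digits."""
    return [name for name in directory if to_digits(name.split()[-1]) == digits]


directory = ["Michael Smith", "Robert Jones"]
print(lookup_by_keypad("76484", directory))
```

Because several letters share each key, different names can map to the same digits, so a result with more than one entry would still need a confirmation step.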

Based on the foregoing, there is a clear need in this field for an interactive speech development system and method that overcome these shortcomings.

SUMMARY

In general, in one aspect, the invention features a computer-implemented method of constructing an interactive speech