A web-based voice dialog interface communicates dialog information between a user at a client machine and one or more servers coupled to the client machine via the Internet or other computer network. The interface in an illustrative embodiment includes a web page interpreter for receiving information relating to one or more web pages. The web page interpreter generates a rendering of at least a portion of the information for presentation to a user in an audibly-perceptible format. A grammar processing device utilizes interpreted web page information received from the web page interpreter to generate syntax information and semantic information. A speech recognizer processes received user speech in accordance with the syntax information, and a natural language interpreter processes the resulting recognized speech in accordance with the semantic information to generate output for delivery to a web server in conjunction with a voice dialog which includes the user speech and the rendering of the web page(s). The output may be processed by a common gateway interface (CGI) formatter prior to delivery to a CGI associated with the web server.
PRIORITY CLAIM
The present application claims the priority of U.S. Provisional Application No. 60/135,130 filed May 20, 1999 and entitled "Web-Based Voice Dialog Interface."
A virtual speech interface system and method (100) for controlling a client device (103) using speech commands, for electronic devices that do not include integrated speech control capability, includes a virtual speech interface client program (113) installed within the client device for controlling a client device application (104). A virtual speech interface server device (101) that is separate from the client device (103) is then used to interface with the client program (113). The virtual speech interface server device (101) includes at least one server (111) for sending command information to and/or receiving command information from the virtual speech interface client program (113) for controlling the client device (103) using speech commands.
An apparatus and method are provided for estimating the grade of service (52) and offered traffic (51) for voice over internet protocol calls at a gateway (2) bridging calls between a public switched telephone network (3) and an internet protocol network (4), the gateway (2) having a dial-control management information base. The method comprises the steps of periodically polling the dial-control management information base for dial peer traffic statistics (44), storing the polled data, estimating the carried traffic using the polled data (501), estimating the grade of service (52) by utilizing the Erlang-B formula in an inverse manner (502), operating on the estimated carried traffic obtained in the first estimating step (501), and estimating the offered traffic (51) using the estimated values for the carried traffic and the grade of service (52) obtained in the previous estimation steps (503). In a second embodiment of the invention, a system (FIG. 4) utilizing the method continuously monitors the grade of service (52) and offered traffic (51) at gateways (2) in an internet protocol telecom network supporting voice over internet protocol. An enhancement of the system further comprises a World Wide Web interface (46) for generating monitoring reports.
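The two estimation steps that follow the carried-traffic estimate can be illustrated with a minimal Python sketch. It is not taken from the cited reference: the function names, the use of bisection to invert the Erlang-B formula, and the assumption that the number of channels at the gateway is known are all hypothetical choices made for illustration. Traffic values are in erlangs.

```python
def erlang_b(offered: float, channels: int) -> float:
    """Blocking probability from the Erlang-B formula (stable recurrence)."""
    b = 1.0
    for k in range(1, channels + 1):
        b = offered * b / (k + offered * b)
    return b


def estimate_gos_and_offered(carried: float, channels: int) -> tuple[float, float]:
    """Estimate grade of service and offered traffic from carried traffic.

    Assumption (not from the source): the channel count is known, and the
    inversion is done by bisection on carried(A) = A * (1 - B(A, channels)),
    which increases with A and approaches `channels` from below.
    """
    if channels < 1 or not 0.0 <= carried < channels:
        raise ValueError("carried traffic must satisfy 0 <= carried < channels")

    def carried_for(offered: float) -> float:
        return offered * (1.0 - erlang_b(offered, channels))

    # Bracket the offered traffic that would produce the observed carried traffic.
    lo, hi = carried, max(1.0, carried)
    while carried_for(hi) < carried:
        hi *= 2.0

    # Fixed number of bisection steps; enough for double precision.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if carried_for(mid) < carried:
            lo = mid
        else:
            hi = mid

    # Grade of service from the inverted formula, then offered traffic
    # from the carried traffic and the grade of service.
    gos = erlang_b((lo + hi) / 2.0, channels)
    offered = carried / (1.0 - gos)
    return gos, offered
```

In this sketch the carried traffic would come from the polled dial peer statistics; how those statistics are converted to erlangs is outside what the abstract states and is left out here.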
A method of dynamically formatting a speech menu construct can include a series of steps. A markup language document containing a reference to a server-side program can be provided. The server-side program can be programmed to dynamically format data using a voice-enabled markup language. A database can be accessed using the server-side program. The database can have a plurality of data items. Using the voice-enabled markup language, selected data items can be formatted, thereby creating speech menu items. The speech menu items can specify a speech menu construct, resulting in a menu interface that is dynamically generated from data in a data store, rather than being written by a programmer, and that allows the user to "speak to the data."
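A server-side program of this kind might look like the following Python sketch. It assumes VoiceXML 2.0 as the voice-enabled markup language and SQLite as the data store; neither is named in the abstract, and the table name `items`, its columns `title` and `target`, and the function name `build_speech_menu` are hypothetical.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr


def build_speech_menu(conn: sqlite3.Connection, prompt: str = "Please say one of:") -> str:
    """Format the rows of a (hypothetical) `items` table as a VoiceXML menu."""
    rows = conn.execute("SELECT title, target FROM items ORDER BY title").fetchall()

    # Each data item becomes one speech menu item; text and attribute
    # values are escaped so arbitrary data cannot break the markup.
    choices = "\n".join(
        f"    <choice next={quoteattr(target)}>{escape(title)}</choice>"
        for title, target in rows
    )

    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">\n'
        "  <menu>\n"
        f"    <prompt>{escape(prompt)} <enumerate/></prompt>\n"
        f"{choices}\n"
        "  </menu>\n"
        "</vxml>\n"
    )
```

Because the menu is rebuilt from the table on every request, adding or removing a row changes what the caller can say without any change to the markup template.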
It is one object of the present invention to provide a method and an apparatus whereby a Web page creator can easily specify, in a Web page, a desired text area for audible reading, merely by using an input device, such as a keyboard or a mouse. A Web page creator need only use an input device to mark and specify a desired text portion that is to be read audibly, so that tag information and a program, which are required when a browser reads a specific portion of text on a Web page, are automatically inserted into a Web page that is being created. Further, a tag for visually displaying an area to be read audibly is automatically inserted, so that a creator and a reader of a Web page can easily identify a portion that is to be read audibly.
Disclosed is a system and method for generating a spoken dialog service from website data. Spoken dialog components typically include an automatic speech recognition module, a language understanding module, a dialog management module, a language generation module and a text-to-speech module. These components are capable of being automatically trained from processed website data. A website analyzer converts a website into a structured text data set and a structured task knowledge base. The website analyzer further extracts linguistic items from the website data. The dialog components are automatically trained from the structured text data set, structured task knowledge base and linguistic items.