Computer programs such as word processing, spreadsheet, and electronic mail applications are run from a remote telecommunications terminal, without any user interaction with a screen display, by software associated with the computer that translates a combination of remotely generated tone and voice signals into executable application program commands. The communication protocols use a conversant system's query-response sequences to call up an application and to run selected portions of the program. Pre-recorded voiceprints of a particular user's voice commands and utterances required to operate the software are stored at the computer and compared during usage to the actual utterances. Insufficient matches result in suspension or termination of access.
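The access rule in the last two sentences can be illustrated with a minimal sketch. The abstract does not say how utterances are scored or how many mismatches are tolerated, so the similarity score, the `match_threshold`, and the `suspend_after` and `terminate_after` limits below are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class VoiceprintGate:
    """Tracks voiceprint mismatches for one session (all limits are assumed)."""
    match_threshold: float = 0.8  # assumed minimum similarity for a match
    suspend_after: int = 2        # assumed mismatch count that suspends access
    terminate_after: int = 4      # assumed mismatch count that terminates access
    misses: int = 0
    status: str = "active"

    def check(self, similarity: float) -> str:
        """Record one comparison of an utterance against the stored voiceprint."""
        if self.status == "terminated":
            return self.status
        if similarity < self.match_threshold:
            self.misses += 1
        if self.misses >= self.terminate_after:
            self.status = "terminated"
        elif self.misses >= self.suspend_after:
            self.status = "suspended"
        return self.status
```

In this sketch the similarity score is assumed to come from some external voiceprint comparison; only the bookkeeping that leads to suspension or termination is shown.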
Communication devices, methods, and computer program products may be used to transmit information to a destination by associating a voice command with the destination and associating a signaling message with the voice command and with the destination. After establishing communication with the destination, speech input that is received from a user is compared with the voice command to determine if there is a match. If the speech input matches the voice command, then the signaling message associated with the voice command is transmitted to the destination. A user may therefore be relieved of the burden of having to remember the keystrokes to perform a specific operation or function by embedding the signal(s) corresponding to the operation in the signaling message. Moreover, because the signaling message is associated with both the voice command and the destination, the same voice command may be used to perform an operation on more than one destination. For example, a user may have multiple bank accounts such that when a first bank is called, speaking the command "balance" results in a first signaling message (e.g., a specific key sequence) being transmitted to the first bank's automated account information system. Likewise, when a second bank is called, speaking the command "balance" results in a second signaling message, different from the first signaling message, being transmitted to the second bank's automated account information system.
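The association described above, in which a signaling message is keyed by both the destination and the voice command, can be sketched as a lookup table. This is only an illustration: the destination names, the key sequences, and the `lookup_signal` helper are hypothetical, and plain text normalization stands in for actual speech matching.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: (destination, voice command) -> signaling message,
# here an invented key sequence for each bank's automated system.
SignalTable = Dict[Tuple[str, str], str]


def normalize(utterance: str) -> str:
    """Lower-case and collapse whitespace (a stand-in for speech matching)."""
    return " ".join(utterance.lower().split())


def lookup_signal(table: SignalTable, destination: str,
                  speech_input: str) -> Optional[str]:
    """Return the signaling message to transmit, or None if nothing matches."""
    return table.get((destination, normalize(speech_input)))


table: SignalTable = {
    ("first_bank", "balance"): "1#2",   # invented key sequence
    ("second_bank", "balance"): "*43",  # invented key sequence
}
```

Because the key includes the destination, the same spoken command "balance" resolves to a different signaling message for each bank, as in the example in the abstract.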
A method for enabling dictation into a speech application from an audio-only interface, in accordance with an inventive arrangement, can include the following steps. First, a dictation template can be created with a plurality of named dictation fields and respective audio prompts identifying each of the dictation fields by a respective name. Second, the template can be opened in response to a command spoken through the audio-only interface. Third, a first one of the audio prompts corresponding to a first one of the dictation fields can be transmitted through the audio-only interface. Fourth, dictation can be accepted into the first one of the dictation fields through the audio-only interface. Fifth, a subsequent one of the dictation fields can be opened in response to another command spoken through the audio-only interface. Sixth, a subsequent one of the audio prompts corresponding to the subsequent one of the dictation fields can be transmitted through the audio-only interface. Seventh, dictation can be accepted into the subsequent one of the dictation fields through the audio-only interface. Finally, each of the fifth, sixth and seventh steps can be repeated until the dictation is complete.
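The prompt-and-dictate loop over the template fields can be sketched as follows. This is a simplified illustration that starts once the template has been opened: recognized utterances are represented as strings, and the `DictationField` class, the `run_dictation` function, and the "next field" command wording are assumptions not taken from the abstract.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class DictationField:
    name: str       # field name announced by the prompt
    prompt: str     # text standing in for the audio prompt
    text: str = ""  # dictation accumulated so far


def run_dictation(fields: List[DictationField], utterances: Iterable[str],
                  next_command: str = "next field") -> Tuple[List[str], Dict[str, str]]:
    """Play each field's prompt and route dictation to the open field.

    Returns the prompts played, in order, and the filled fields by name.
    """
    played: List[str] = []
    if not fields:
        return played, {}
    index = 0
    played.append(fields[index].prompt)  # prompt for the first field
    for utterance in utterances:
        if utterance.strip().lower() == next_command:
            if index + 1 >= len(fields):
                break                    # no further fields: dictation complete
            index += 1                   # open the subsequent field
            played.append(fields[index].prompt)
        else:
            current = fields[index]      # accept dictation into the open field
            current.text = (current.text + " " + utterance).strip()
    return played, {f.name: f.text for f in fields}
```

For example, with fields named "recipient" and "body", the utterances "Dr. Smith", "next field", "Please call", "me back" would fill the first field with "Dr. Smith" and the second with "Please call me back".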
A method for programming a grammar code for a speech-enabled computer program, comprising the steps of: enabling a plurality of natural command grammars containing respective representations of valid expressions for said speech-enabled computer program; providing as one of said valid expressions a plurality of ordered words virtually certain not to be uttered by someone not already aware of said plurality of ordered words; and, causing said program to automatically initiate a perceptible action in response to recognition of said plurality of ordered words. The method can further comprise the step of providing translation rules for said plurality of command grammars, wherein speech recognition of said plurality of ordered words invokes one of said translation rules, said invoked one of said translation rules initiating said perceptible action. The action can be activation of a predetermined printer function, displaying a predetermined message on a display device, or printing a predetermined message.
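A minimal sketch of this idea treats the command grammar as a table from valid expressions to translation rules, one of which is keyed by an improbable ordered word sequence. The phrases, the message text, and the `handle_utterance` helper are invented for illustration, and exact string matching stands in for speech recognition.

```python
from typing import Callable, Dict, Optional


def show_message() -> str:
    """Perceptible action: display a predetermined (here, invented) message."""
    return "display: hello from the development team"


# Valid expressions mapped to translation rules. The last key is a
# hypothetical ordered word sequence that an unaware user is very
# unlikely to utter by chance.
grammar: Dict[str, Callable[[], str]] = {
    "open file": lambda: "action: open file",
    "print document": lambda: "action: print document",
    "purple walrus sings at midnight": show_message,
}


def handle_utterance(utterance: str) -> Optional[str]:
    """Invoke the translation rule for a recognized expression, if any."""
    phrase = " ".join(utterance.lower().split())
    rule = grammar.get(phrase)
    return rule() if rule else None
```

Because the lookup is on the full ordered phrase, the same words in a different order do not trigger the action.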
A control system for a modular, mixed initiative, human-machine interface. The control system comprises moves, the moves defining units of interaction about a topic of information. The moves comprise at least one system move and at least one user move. Each system move is structured such that it contains information regarding pre-processing to be performed, information to develop a prompt to be issued to the user and information that enables possible user moves which can follow the system move to be listed. Each user move is structured such that it contains information relating to interpretation grammars that trigger the user move, information relating to processing to be performed based upon received and recognized data and information regarding the next move to be invoked. A corresponding method is provided.
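The move structure described above can be sketched as two record types. This is an illustrative data model only: the class and field names, the use of regular expressions in place of interpretation grammars, and the travel-booking example are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class UserMove:
    grammar: str                                     # regex standing in for an interpretation grammar
    process: Callable[[Dict[str, str], dict], None]  # processing on recognized data
    next_move: Optional[str]                         # next system move, or None when done


@dataclass
class SystemMove:
    preprocess: Callable[[dict], None]  # pre-processing to perform
    prompt: Callable[[dict], str]       # develops the prompt for the user
    user_moves: List[UserMove]          # user moves that can follow


def enter(move: SystemMove, state: dict) -> str:
    """Run the pre-processing and return the prompt to issue."""
    move.preprocess(state)
    return move.prompt(state)


def respond(moves: Dict[str, SystemMove], current: str,
            utterance: str, state: dict) -> Optional[str]:
    """Find the user move triggered by the utterance; stay put if none matches."""
    text = " ".join(utterance.lower().split())
    for user_move in moves[current].user_moves:
        match = re.fullmatch(user_move.grammar, text)
        if match:
            user_move.process(match.groupdict(), state)
            return user_move.next_move
    return current


MOVES: Dict[str, SystemMove] = {
    "ask_city": SystemMove(
        preprocess=lambda state: state.update(attempts=state.get("attempts", 0) + 1),
        prompt=lambda state: "Which city?",
        user_moves=[UserMove(r"(?:to )?(?P<city>london|paris)",
                             lambda slots, state: state.update(city=slots["city"]),
                             "confirm")],
    ),
    "confirm": SystemMove(
        preprocess=lambda state: None,
        prompt=lambda state: f"Travel to {state['city']}?",
        user_moves=[
            UserMove(r"yes", lambda slots, state: None, None),
            UserMove(r"no", lambda slots, state: state.pop("city", None), "ask_city"),
        ],
    ),
}
```

Each system move lists the user moves that may follow it, and each user move names the next move to invoke, so new topics can be added by adding entries rather than changing the control loop.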
A method for enrolling a user in a speech recognition system, without requiring reading, comprises the steps of: generating an audio user interface having an audible output and an audio input; audibly playing a text phrase; audibly prompting the user to speak the played phrase; repeating the steps of audibly prompting the user not to speak, audibly playing the phrase and audibly prompting the user to speak, for a plurality of further phrases; and, processing enrollment of the user based on the audibly prompted and subsequently spoken phrases. A graphical user interface can also be generated for: displaying text corresponding to the phrases and to the audible prompts; displaying a plurality of icons for user activation; and, selectively distinguishing different ones of the icons at different times by at least one of: color; shape; and, animation.
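The audio-only enrollment loop can be sketched as below. The `play` and `record` callables are hypothetical stand-ins for the audible output and audio input, the prompt wording is invented, and for simplicity the sketch issues the listen-only cue before every phrase, including the first.

```python
from typing import Callable, List, Tuple


def enroll(phrases: List[str],
           play: Callable[[str], None],
           record: Callable[[], str]) -> List[Tuple[str, str]]:
    """Play each phrase, prompt the user to repeat it, and collect the samples.

    Returns (phrase, recording) pairs to hand to enrollment processing.
    """
    samples: List[Tuple[str, str]] = []
    for phrase in phrases:
        play("Please listen.")            # prompt the user not to speak yet
        play(phrase)                      # audibly play the text phrase
        play("Now repeat the phrase.")    # prompt the user to speak
        samples.append((phrase, record()))
    return samples
```

Passing the audio functions in as parameters keeps the loop independent of any particular audio device, and the collected pairs can then be processed to complete the enrollment.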