A production script for interactions between participants and an automated data collection system is created by selectively tuning an experimental script through successive trials until a recognition rate of the system is at least an acceptability threshold. The data collection system uses a semi-constrained grammar, which is a practical accommodation of a larger range of possible inputs than a menu. The data collection system collects data by recognizing utterances from participants in accordance with the production script.
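The tuning loop described above can be sketched as follows. This is a minimal illustration, not the claimed method: the names `tune_script`, `run_trial`, `revise`, and `max_trials` are hypothetical, and how a trial is run or how a script is revised is left to caller-supplied functions.

```python
from typing import Callable, Tuple, TypeVar

Script = TypeVar("Script")


def tune_script(
    script: Script,
    run_trial: Callable[[Script], float],
    revise: Callable[[Script, float], Script],
    threshold: float,
    max_trials: int = 10,
) -> Tuple[Script, float, int]:
    """Revise an experimental script until its recognition rate meets the threshold.

    run_trial (hypothetical) returns the recognition rate observed for one trial.
    revise (hypothetical) returns a selectively tuned script given the last rate.
    max_trials is an assumed safety limit that is not part of the source text.
    """
    for trial in range(1, max_trials + 1):
        rate = run_trial(script)
        if rate >= threshold:
            # The script that passes becomes the production script.
            return script, rate, trial
        script = revise(script, rate)
    raise RuntimeError("recognition rate never reached the acceptability threshold")
```

The trial limit is only there so the sketch terminates; the abstract itself places no bound on the number of successive trials.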
A method of modeling a speech recognition system includes decoding a speech signal produced from a training text to produce a sequence of predicted speech units. The training text comprises a sequence of actual speech units that is used with the sequence of predicted speech units to form a confusion model. In further embodiments, the confusion model is used to decode a text to identify an error rate that would be expected if the speech recognition system decoded speech based on the text.
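One way to picture the confusion model is as a table of conditional frequencies P(predicted unit | actual unit). The sketch below assumes the actual and predicted sequences are already aligned one-to-one (substitutions only, no insertions or deletions), which is a simplification not stated in the abstract. The function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence


def build_confusion_model(
    actual: Sequence[str], predicted: Sequence[str]
) -> Dict[str, Dict[str, float]]:
    """Estimate P(predicted | actual) from aligned speech-unit sequences."""
    if len(actual) != len(predicted):
        raise ValueError("this sketch assumes one-to-one aligned sequences")
    counts = defaultdict(Counter)
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    return {
        a: {p: n / sum(c.values()) for p, n in c.items()}
        for a, c in counts.items()
    }


def expected_error_rate(
    model: Dict[str, Dict[str, float]], text_units: Sequence[str]
) -> float:
    """Average probability that a unit of the text would be misrecognized.

    Units never seen in training are skipped (an assumption of this sketch).
    """
    seen = [u for u in text_units if u in model]
    if not seen:
        raise ValueError("no unit of the text appears in the confusion model")
    errors = sum(1.0 - model[u].get(u, 0.0) for u in seen)
    return errors / len(seen)
```

With a model built this way, a new text can be scored without producing or decoding any speech for it, which matches the further embodiment described above.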
A customer survey design system allows survey organizations to design and modify a survey easily and in a shorter period of time. In particular, a telephone survey creation method usable by a non-technical person employs a single web site and an audio-responsive system such as a telephone. The client is allowed to create a customer survey to be taken over a phone. Specifically, in a first step the client accesses a web site and begins creating the survey by typing in survey questions and entries. In a second step, the client completes the creation of the survey by providing a voice script for each question typed onto the web site. Creating the survey also includes the step of allowing the client to modify the questions created on the web site. Additionally, the step of accessing the web site further comprises the steps of a) selecting an entry from a pre-set entry list for the survey, and b) determining whether the selected entry is a question and, if so, prompting the client to type the question to be asked in the survey, thereafter completing the question criteria, and returning to the step of selecting a pre-set entry.
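The entry-selection loop and the later voice-script step might look like the sketch below. Everything named here is hypothetical: the entry types in `PRESET_ENTRIES`, the `"end"` entry used to stop the loop, the dictionary layout, and the handling of non-question entries are illustrative assumptions, since the abstract only specifies what happens when the selected entry is a question.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical pre-set entry list; the source does not enumerate the entries.
PRESET_ENTRIES = ("question", "message", "end")


def build_survey(selections: Iterable[Tuple[str, Any]]) -> List[Dict[str, Any]]:
    """First step: collect typed entries, looping back to selection after each."""
    survey: List[Dict[str, Any]] = []
    for entry_type, payload in selections:
        if entry_type not in PRESET_ENTRIES:
            raise ValueError(f"not a pre-set entry: {entry_type}")
        if entry_type == "end":
            break  # assumed way of leaving the selection loop
        if entry_type == "question":
            text, criteria = payload  # typed question plus its question criteria
            survey.append(
                {"type": "question", "text": text,
                 "criteria": criteria, "voice_script": None}
            )
        else:
            survey.append({"type": entry_type, "text": payload})
    return survey


def attach_voice_script(survey: List[Dict[str, Any]], index: int, recording: str) -> None:
    """Second step: provide a voice script for a question typed earlier."""
    entry = survey[index]
    if entry["type"] != "question":
        raise ValueError("voice scripts are attached to questions in this sketch")
    entry["voice_script"] = recording
```

Modifying a question, as the abstract allows, would amount to editing the `text` or `criteria` of an existing entry before its voice script is recorded.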
According to an apparatus form of the invention, an apparatus of a size and weight that is suitable to be carried easily in a pocket or purse includes telecommunications circuitry operable to wirelessly send and receive voice communications directly to and from a cell phone network. The apparatus also has a processor communicatively coupled to the telecommunications circuitry for receiving the voice communications. A memory of the apparatus has program instructions for speech recognition stored therein and the processor is operable under control of the speech recognition program instructions to generate a transcript of the voice communications. The apparatus also includes a display and the memory has program instructions for a display function stored therein. The processor is operable under control of the display program instructions to project the transcript on the display.
A method and apparatus for personalizing voice messages to be used by a voice mail system in interacting with a user based on information provided by the user in an interactive communication between the voice mail system and the user. The method comprises the steps of creating sets of recorded messages according to distinct personalities for interacting with the voice mail system, selecting a recorded message from the plurality of sets of recorded messages based on interactive inquiries between the user and the voice mail system, and personalizing the selected recorded message responsive to the information provided by the user.