A speech model is produced for use in determining whether a speaker associated with the speech model produced an unidentified speech sample. First, a sample of speech of a particular speaker is obtained. Next, the contents of the sample of speech are identified using speech recognition. Finally, a speech model associated with the particular speaker is produced using the sample of speech and its identified contents. The speech model is produced without using an external mechanism to monitor the accuracy with which the contents were identified.
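The three steps in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the `recognize` callable, the frame-level labels, and the per-label mean vector used as the "model" are all assumptions introduced here. The recognizer output is used as-is, with no external check on its accuracy.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

Frame = Sequence[float]  # one feature vector per speech frame (assumed representation)


def build_speaker_model(
    frames: List[Frame],
    recognize: Callable[[List[Frame]], List[str]],
) -> Dict[str, List[float]]:
    """Build a toy speaker model: the mean feature vector per recognized label.

    `recognize` is a hypothetical stand-in for a speech recognizer that
    returns one content label per frame. Its output is trusted without
    any external verification.
    """
    labels = recognize(frames)
    if len(labels) != len(frames):
        raise ValueError("recognizer must return one label per frame")

    # Group the speaker's frames by the content the recognizer assigned.
    grouped: Dict[str, List[Frame]] = defaultdict(list)
    for label, frame in zip(labels, frames):
        grouped[label].append(frame)

    # Summarize each group by its component-wise mean.
    model: Dict[str, List[float]] = {}
    for label, group in grouped.items():
        dim = len(group[0])
        model[label] = [sum(f[d] for f in group) / len(group) for d in range(dim)]
    return model
```

A real system would likely use a statistical model per unit rather than a plain mean; the mean is used here only to keep the sketch short.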
A method is provided for recognizing a speaker of an input speech according to the distance between an input speech pattern, obtained by converting the input speech to a feature parameter series, and a reference pattern preliminarily registered as a feature parameter series for each speaker. The contents of the input and reference speech patterns are obtained by recognition. An identical section, in which the contents of the input and reference speech patterns are identical, is determined. The distance between the input and reference speech patterns in the determined identical content section is calculated. The speaker of the input speech is recognized on the basis of the calculated distance.
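The matching procedure described in this abstract can be sketched as follows, under several assumptions that are not in the source: speech patterns are represented as lists of (content label, frames) segments produced by some recognizer, frames are compared with Euclidean distance, segments are aligned by simple linear time scaling (a stand-in for whatever alignment the method actually uses), and a hypothetical `threshold` decides acceptance.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Frame = Sequence[float]
Segment = Tuple[str, List[Frame]]  # (recognized content label, feature frames)


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def segment_distance(xs: List[Frame], ys: List[Frame]) -> float:
    """Mean frame distance after linear time alignment of two segments."""
    n = max(len(xs), len(ys))
    total = 0.0
    for i in range(n):
        # Map position i onto each segment proportionally to its length.
        total += frame_distance(xs[i * len(xs) // n], ys[i * len(ys) // n])
    return total / n


def identical_content_distance(
    input_segs: List[Segment], ref_segs: List[Segment]
) -> Optional[float]:
    """Average distance over sections whose recognized contents match."""
    ref_by_label: Dict[str, List[Frame]] = {}
    for label, frames in ref_segs:
        ref_by_label.setdefault(label, frames)  # keep first occurrence

    dists = [
        segment_distance(frames, ref_by_label[label])
        for label, frames in input_segs
        if label in ref_by_label and frames and ref_by_label[label]
    ]
    if not dists:
        return None  # no identical-content section was found
    return sum(dists) / len(dists)


def recognize_speaker(
    input_segs: List[Segment],
    references: Dict[str, List[Segment]],
    threshold: float,
) -> Optional[str]:
    """Return the closest registered speaker, or None if none is close enough."""
    best: Optional[Tuple[str, float]] = None
    for speaker, ref_segs in references.items():
        d = identical_content_distance(input_segs, ref_segs)
        if d is not None and (best is None or d < best[1]):
            best = (speaker, d)
    if best is not None and best[1] <= threshold:
        return best[0]
    return None
```

Restricting the comparison to sections with matching content means the distance reflects differences between speakers rather than differences between the words spoken.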
Application speech (AS) is included in the set of identification speech data (ISD) used to train a speaker-identification process. This allows the set of initial identification speech data (IISD) collected during the initial enrolment phase to be reduced, making registration or enrolment more convenient for the user.
A technique is provided for improved score calculation and normalization in a framework of recognition with phonetically structured speaker models. The technique involves determining, for each frame and each level of phonetic detail of a target speaker model, a non-interpolated likelihood value, and then resolving the at least one likelihood value to obtain a likelihood score.
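The abstract does not state how the per-level likelihood values are resolved into a score. As one illustrative possibility only, the sketch below backs off to the most detailed level available for each frame (selection rather than interpolation across levels) and averages the selected log-likelihoods over frames. The data layout, the integer level indices, and the resolution rule are all assumptions introduced here.

```python
from typing import Dict, List

# Per frame: level of phonetic detail (higher = more detailed, an assumed
# convention) mapped to the log-likelihood computed at that level.
FrameLevels = Dict[int, float]


def resolve_frame(levels: FrameLevels) -> float:
    """Pick the value from the most detailed level present for this frame."""
    if not levels:
        raise ValueError("frame has no likelihood values")
    return levels[max(levels)]


def likelihood_score(frames: List[FrameLevels]) -> float:
    """Average the resolved per-frame log-likelihoods into one score."""
    if not frames:
        raise ValueError("no frames to score")
    return sum(resolve_frame(levels) for levels in frames) / len(frames)
```

Other resolution rules, such as taking the maximum value across levels, would fit the same interface.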
A method, system, and program for identifying a context for a call are provided. Multiple context clues for a call are detected from a line number, a line subscriber profile, a caller profile, and other parameters associated with the call. A context for the call is identified from the context clues, such that at least one party to the call is enabled to receive the context of the call. Context for the call preferably includes at least one of who is placing a call, who is receiving a call, identities of devices utilized for the call, locations of those devices, the path of line numbers accessed for a call, a billing plan for the call, and a subject matter of the call.
A method, system, and program for controlling advertising output during hold periods are provided. A context for a call on hold is detected. An advertisement is selected for output during a hold period of the call according to the context. Output of the advertisement during the hold period is controlled, wherein the advertisement is specified according to the context. The advertisement may include text messages, audio messages, or video messages for advertising a product or service or making an announcement.