In a method and a device for reliably detecting at least the beginning of a voice command for voice control of functional elements, signals from a first microphone, which picks up contact sound from the operating person, are used to trigger a second microphone directed towards the operating person's mouth, in order to improve the reliability of the voice control in high ambient noise.
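The triggering idea in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes both microphones deliver time-aligned frames of samples and that the start of a command can be approximated by the contact-microphone frame energy crossing a fixed threshold. The names `frame_energy`, `detect_command_start`, `gated_mouth_signal` and the threshold value are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def detect_command_start(contact_frames, threshold):
    """Return the index of the first contact-microphone frame whose
    energy exceeds the threshold, or None if no frame does.

    Assumption: an energy threshold stands in for whatever detector
    the actual device uses.
    """
    for i, frame in enumerate(contact_frames):
        if frame and frame_energy(frame) > threshold:
            return i
    return None


def gated_mouth_signal(contact_frames, mouth_frames, threshold):
    """Pass mouth-microphone frames on only from the frame at which
    the contact microphone indicates that speech has begun."""
    start = detect_command_start(contact_frames, threshold)
    if start is None:
        return []  # no speech detected: second microphone never triggered
    return mouth_frames[start:]


# Hypothetical, time-aligned frames from the two microphones.
contact = [[0.0, 0.01], [0.0, 0.02], [0.5, -0.6], [0.4, 0.3]]
mouth = [[1], [2], [3], [4]]
triggered = gated_mouth_signal(contact, mouth, threshold=0.05)
```

Because the contact microphone responds mainly to the speaker's own body-conducted sound, ambient noise alone is unlikely to cross the threshold, which is the property the abstract relies on.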
A hands-free structure with an anti-background-noise function, comprising: a male plug, coupled to an output of a mobile phone; a voice receiving device, coupled to said male plug by a wire, for picking up a voice signal of a user; a reaction-type voice receiving device, coupled to said male plug by said wire, for picking up said voice signal of said user by sensing the vibration of the skin of said user; and an earphone set, coupled to said male plug by said wire, for transforming an electric signal received from said mobile phone into a voice signal; wherein, while said mobile phone is receiving a call, said user can switch said hands-free structure from said voice receiving device to said reaction-type voice receiving device, or from said reaction-type voice receiving device to said voice receiving device, according to the background noise of the environment, to obtain a clearer voice signal and thereby improve communication quality in a high-background-noise environment.
The recognition rate of a speech recognition system is improved by compensating for changes in the user's speech that result from factors such as emotion, anxiety or fatigue. A speech signal derived from a user's utterance, and a bio-signal, which is indicative of the user's emotional state, are provided to a speech recognition system. The bio-signal is used to provide a reference frequency that changes when the user's emotional state changes. An utterance is identified by examining the relative magnitudes of its frequency components and the position of the frequency components relative to the reference frequency.
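One way to picture the reference-frequency idea is to express each frequency component relative to a reference that moves with the bio-signal. The sketch below is only an illustration under stated assumptions: the linear mapping in `reference_frequency` (including `base_hz` and `gain_hz`) is hypothetical and not taken from the abstract, and an utterance is represented as a non-empty list of `(frequency_hz, magnitude)` pairs with at least one non-zero magnitude.

```python
def reference_frequency(bio_signal, base_hz=100.0, gain_hz=50.0):
    """Hypothetical linear mapping from a bio-signal reading to a
    reference frequency; the abstract does not specify the mapping."""
    return base_hz + gain_hz * bio_signal


def normalized_features(components, ref_hz):
    """Describe an utterance by the position of each frequency
    component relative to the reference frequency and by its
    magnitude relative to the strongest component.

    components: non-empty list of (frequency_hz, magnitude) pairs
    with a non-zero peak magnitude (assumed input format).
    """
    peak = max(m for _, m in components)
    return [(f / ref_hz, m / peak) for f, m in components]


# The same word spoken in two emotional states: in the second, the
# components are shifted up and louder, and the reference moves too.
calm = normalized_features([(200.0, 2.0), (400.0, 1.0)],
                           reference_frequency(0.0))
stressed = normalized_features([(250.0, 4.0), (500.0, 2.0)],
                               reference_frequency(0.5))
```

In this toy case both utterances yield the same features, which shows how a recognizer that compares relative positions and relative magnitudes could stay stable when the user's emotional state shifts the raw spectrum.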
The recognition rate of a speech recognition system is improved by compensating for changes in the user's speech that result from factors such as emotion, anxiety or fatigue. A speech signal derived from a user's utterance is modified by a preprocessor and provided to a speech recognition system to improve the recognition rate. The speech signal is modified based on a bio-signal which is indicative of the user's emotional state.
A bio-signal is monitored while a speech recognition system is trained to recognize a word or utterance. An utterance is identified for retraining when the bio-signal is above an upper threshold or below a lower threshold while the recognition system is being trained to recognize the utterance.
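The retraining rule described here reduces to a band check on the bio-signal. The sketch below assumes one scalar bio-signal reading is recorded per training utterance; the `TrainingSample` type, the function name, and the threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrainingSample:
    utterance: str
    bio_signal: float  # assumed: one scalar reading taken during training


def utterances_to_retrain(samples, lower, upper):
    """Return the utterances whose bio-signal was above the upper
    threshold or below the lower threshold during training,
    in first-seen order and without duplicates."""
    flagged = []
    for s in samples:
        out_of_band = s.bio_signal > upper or s.bio_signal < lower
        if out_of_band and s.utterance not in flagged:
            flagged.append(s.utterance)
    return flagged


samples = [
    TrainingSample("yes", 0.50),   # inside the band: keep
    TrainingSample("no", 0.95),    # above the upper threshold
    TrainingSample("stop", 0.05),  # below the lower threshold
]
retrain = utterances_to_retrain(samples, lower=0.1, upper=0.9)
```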
An electronic device, e.g. a mobile telephone, includes speech recognition means for controlling the operation of the device in response to a voice command. The device includes a proximity sensor, e.g. a capacitive, inductive, or infrared (IR) proximity sensor, for providing a control signal indicating whether an object is in proximity to the device, and control means for controlling the voice-controlled operation of the device in response to the control signal.
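The control logic in this abstract can be sketched as a gate on voice commands driven by the proximity signal. The abstract does not say whether proximity should enable or disable voice control, so the sketch makes the policy a parameter; the class `VoiceControl` and its `enable_when_near` flag are hypothetical.

```python
class VoiceControl:
    """Accepts voice commands only when the proximity control signal
    matches the configured policy."""

    def __init__(self, enable_when_near=True):
        # Assumption: the policy direction is configurable, since the
        # abstract leaves it open.
        self.enable_when_near = enable_when_near
        self.executed = []

    def handle(self, command, object_in_proximity):
        """Execute the command if voice control is currently active.

        Returns True if the command was accepted, False otherwise.
        """
        active = object_in_proximity == self.enable_when_near
        if active:
            self.executed.append(command)
        return active


vc = VoiceControl(enable_when_near=True)
accepted_near = vc.handle("call home", object_in_proximity=True)
accepted_far = vc.handle("hang up", object_in_proximity=False)
```

Setting `enable_when_near=False` inverts the behaviour, for example to suppress voice control while the device is held against the user's head.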