A method for converting speech to text and vice versa. The method for converting speech to text includes receiving a spoken input having a non-verbal characteristic, and automatically generating a text output, responsive to the spoken input, having a variable format characteristic corresponding to the non-verbal characteristic of the spoken input. The method for converting text to speech includes receiving a text input having a given variable format characteristic and synthesizing speech corresponding to the text input and having a non-verbal characteristic corresponding to the variable format characteristic of the text input.
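The two directions of this conversion can be illustrated with a minimal sketch. It assumes that the non-verbal characteristic is a per-word loudness value and that the variable format characteristic is capitalization or parentheses. The thresholds, the `SpokenWord` type, and the function names are hypothetical and do not come from the abstract.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed thresholds in decibels; the abstract does not specify any values.
LOUD_DB = 70.0
QUIET_DB = 40.0


@dataclass
class SpokenWord:
    """A recognized word plus one non-verbal characteristic (hypothetical type)."""
    text: str
    loudness_db: float


def format_word(word: SpokenWord) -> str:
    """Map loudness to a format characteristic: upper case, parentheses, or plain."""
    if word.loudness_db >= LOUD_DB:
        return word.text.upper()
    if word.loudness_db <= QUIET_DB:
        return f"({word.text})"
    return word.text


def speech_to_text(words: list[SpokenWord]) -> str:
    """Generate text whose formatting reflects how each word was spoken."""
    return " ".join(format_word(w) for w in words)


def text_to_prosody(token: str) -> tuple[str, str]:
    """Reverse direction: derive a loudness label for synthesis from the formatting."""
    if token.startswith("(") and token.endswith(")") and len(token) > 2:
        return token[1:-1], "quiet"
    if token.isupper() and len(token) > 1:
        return token.lower(), "loud"
    return token, "normal"
```

A real system would likely use richer characteristics (pitch, tempo, pauses) and richer formatting (font size, bold, color); the sketch only shows the correspondence between the two.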
A variable voice rate apparatus for controlling the reproduction rate of voice includes a voice data generation unit configured to generate voice data from the voice, a text data generation unit configured to generate text data indicating the content of the voice data, a division information generation unit configured to generate division information used for dividing the text data into a plurality of linguistic units, each of which is characterized by a linguistic form, a reproduction information generation unit configured to generate reproduction information set for each of the linguistic units, and a voice reproduction controller which controls reproduction of each of the linguistic units based on the reproduction information and the division information.
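As a rough illustration of per-unit rate control, the sketch below assumes that the division information is a list of time-aligned units tagged with a linguistic form, and that the reproduction information is a playback-rate multiplier looked up by that form. The `Unit` type, the rate table, and the default rate are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed reproduction information: playback-rate multiplier per linguistic form.
RATE_BY_FORM = {"noun": 1.0, "verb": 1.0, "particle": 2.0}
DEFAULT_RATE = 1.5  # assumed rate for forms not listed above


@dataclass
class Unit:
    """One linguistic unit with its position in the original voice data."""
    text: str
    form: str
    start_s: float
    end_s: float


def reproduction_plan(units: list[Unit]) -> list[tuple[str, float, float]]:
    """Return (text, rate, playback duration in seconds) for each unit."""
    plan = []
    for unit in units:
        rate = RATE_BY_FORM.get(unit.form, DEFAULT_RATE)
        original = unit.end_s - unit.start_s
        plan.append((unit.text, rate, original / rate))
    return plan


def total_duration(units: list[Unit]) -> float:
    """Total playback time after applying the per-unit rates."""
    return sum(duration for _, _, duration in reproduction_plan(units))
```

In this sketch, content-bearing units play at normal speed while less important units are compressed, which is one plausible use of per-unit reproduction information.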
A closed caption display controller controls the display mode of a closed caption corresponding to speech in an audio signal. The controller comprises an analysis unit to analyze the speech quality of the speech, an examination unit configured to examine a speech listening level according to a given rule based on the analysis result of the analysis unit, and a determination unit to determine a display mode according to the examination result of the examination unit.
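The analysis, examination, and determination stages can be sketched as three small functions. The sketch assumes that speech quality is summarized by a signal-to-noise ratio and a speaking rate, and that the rule is a pair of fixed thresholds. The measures, thresholds, level names, and display modes are all hypothetical.

```python
from __future__ import annotations

from enum import Enum


class ListeningLevel(Enum):
    """Assumed three-step scale of how hard the speech is to follow by ear."""
    EASY = "easy"
    MODERATE = "moderate"
    HARD = "hard"


def analyze(snr_db: float, words_per_minute: float) -> dict[str, float]:
    """Analysis unit: package the measured speech-quality features."""
    return {"snr_db": snr_db, "wpm": words_per_minute}


def examine(analysis: dict[str, float]) -> ListeningLevel:
    """Examination unit: apply an assumed threshold rule to the analysis result."""
    if analysis["snr_db"] < 10.0 or analysis["wpm"] > 200.0:
        return ListeningLevel.HARD
    if analysis["snr_db"] < 20.0 or analysis["wpm"] > 160.0:
        return ListeningLevel.MODERATE
    return ListeningLevel.EASY


def determine_display_mode(level: ListeningLevel) -> str:
    """Determination unit: choose a caption display mode for the listening level."""
    modes = {
        ListeningLevel.EASY: "captions_off",
        ListeningLevel.MODERATE: "captions_normal",
        ListeningLevel.HARD: "captions_large",
    }
    return modes[level]
```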
A voice discriminating tag, which distinguishes received voice from transmitted voice, is added to the voice input into a cell phone. A volume discriminating tag is further added based on the detection result of a volume detector. A CPU converts the voice into character data and image data on the basis of the voice discriminating tag and the volume discriminating tag, referring to various files stored in file devices. The converted data is output to a display panel, on which the character data corresponding to the transmitted voice and the character data corresponding to the received voice are shown in time series in different colors. The character data is shown in a character type corresponding to the volume level.
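The tag-driven display can be sketched as a lookup from the two tags to a color and a character style. The tag values (`"tx"`, `"rx"`, volume levels 0 to 2), the color and style tables, and the `TaggedUtterance` type are hypothetical stand-ins for the files the abstract mentions.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed lookup tables standing in for the files stored in the file devices.
COLOR_BY_VOICE = {"tx": "blue", "rx": "red"}
STYLE_BY_VOLUME = {0: "small", 1: "normal", 2: "bold"}


@dataclass
class TaggedUtterance:
    """Recognized text with its voice tag, volume tag, and time of utterance."""
    time_s: float
    text: str
    voice_tag: str   # "tx" = transmitted voice, "rx" = received voice (assumed codes)
    volume_tag: int  # 0 = quiet, 1 = normal, 2 = loud (assumed levels)


def render(utterances: list[TaggedUtterance]) -> list[tuple[str, str, str]]:
    """Return (text, color, style) entries ordered in time series for display."""
    ordered = sorted(utterances, key=lambda u: u.time_s)
    return [
        (u.text, COLOR_BY_VOICE[u.voice_tag], STYLE_BY_VOLUME[u.volume_tag])
        for u in ordered
    ]
```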
The system includes a first computer for transmitting news information and a second computer, communicably connected to the first computer, for receiving the news information. The second computer outputs the content of the news information as voice in an order predetermined based upon the content of the received news information, and displays an animation, which imitates a speaking individual, in conformity with the voice output. The user is thus able to acquire desired news information with ease.
Web server controls are provided for generating client-side markup with recognition and/or audible prompting to enable telephone call controls such as making, transferring, and disconnecting telephone calls.