A speech synthesis apparatus for synthesizing speech to read out a text in place of a reader at a speed corresponding to a set time and the text volume, a medium on which is recorded a computer program for reading out a text in place of a reader, an apparatus for calculating time necessary for a reader to finish reading out a text on the basis of the reader's speech characteristic data of a prescribed word or sentence, and a medium on which is recorded a computer program for calculating the read out time.
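The two calculations named in this abstract can be illustrated with a minimal sketch. It assumes that text volume is measured as a character count and that the reader's speech characteristic data is the time taken to read a prescribed sample aloud; the abstract fixes neither choice, and the function names are hypothetical.

```python
def required_rate(text: str, set_time_s: float) -> float:
    """Characters per second needed to finish `text` within `set_time_s`.

    Assumption: text volume is the character count of `text`.
    """
    if set_time_s <= 0:
        raise ValueError("set_time_s must be positive")
    return len(text) / set_time_s


def estimated_read_time(text: str, sample: str, sample_time_s: float) -> float:
    """Estimate the read-out time of `text` in seconds.

    Assumption: the reader's speech characteristic is the measured time
    `sample_time_s` taken to read the prescribed `sample` aloud.
    """
    if not sample or sample_time_s <= 0:
        raise ValueError("sample must be non-empty and its time positive")
    chars_per_second = len(sample) / sample_time_s
    return len(text) / chars_per_second
```

A real apparatus would more likely count morae, syllables, or phonemes than characters; the character count only keeps the sketch short.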
A plurality of display control portions display multiple text sets simultaneously on a display screen, while a single reading portion converts a selected text set into voice to be read out. While the reading portion is not reading out a text set, a request to change the selected text set is accepted; while it is reading out a text set, such a request is rejected. Further, the title of the text set selected to be read out is displayed on the display. Since only one reading portion is provided, exclusive control over reading out the text sets is simplified. In addition, the text set being read out can easily be identified among the multiple text sets, because the title of the work currently being read out is displayed.
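The accept/reject rule described above might be sketched as follows. The class and method names are hypothetical, and rejecting a request for an unknown text set is an added assumption that the abstract does not state.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ReadingPortion:
    """Single reading portion shared by all displayed text sets."""
    titles: Dict[str, str]            # text-set id -> title shown on the display
    selected: Optional[str] = None    # id of the text set chosen for read-out
    reading: bool = False             # True while a read-out is in progress

    def request_select(self, text_id: str) -> bool:
        """Accept a selection change only while nothing is being read out."""
        # Assumption: requests for unknown text sets are also rejected.
        if self.reading or text_id not in self.titles:
            return False
        self.selected = text_id
        return True

    def start(self) -> bool:
        """Begin reading the selected text set, if one is selected."""
        if self.reading or self.selected is None:
            return False
        self.reading = True
        return True

    def stop(self) -> None:
        """End the current read-out so that selection changes are accepted."""
        self.reading = False

    def displayed_title(self) -> Optional[str]:
        """Title of the text set selected for read-out, if any."""
        return self.titles.get(self.selected) if self.selected else None
```

Because there is exactly one `ReadingPortion`, a single boolean flag is enough to enforce the exclusion; no per-text-set locking is needed.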
An apparatus and method for screening an individual's ability to process acoustic events are provided. The invention presents sequences (or trials) of acoustically processed target and distractor phonemes to a subject for identification. The acoustic processing includes amplitude emphasis of selected frequency envelopes, stretching (in the time domain) of selected portions of phonemes, and phase adjustment of selected portions of phonemes relative to a base frequency. After a number of trials, the method of the present invention develops a profile for the individual that indicates whether the individual's ability to process acoustic events is within a normal range and, if not, what processing can provide the individual with optimal hearing. The individual's profile can then be used by a listening or processing device to selectively emphasize, stretch, or otherwise manipulate an audio stream to give the individual an optimal chance of distinguishing between similar acoustic events.
On receiving a tagged file, as a tagged document, at step S1, a document processing apparatus derives, at step S2, the attribute information for read-out from the tags of the tagged file and embeds the attribute information to generate a speech read-out file. Then, at step S3, the document processing apparatus performs processing suited to a speech synthesis engine, using the generated speech read-out file. At step S4, the document processing apparatus performs processing in accordance with the user's operations through a user interface.
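Step S2 can be illustrated with a minimal sketch that walks a tagged document and emits read-out attributes alongside the text. The tag names (`h1`, `p`), the `pause_ms` attribute, and the tuple representation of the read-out file are all assumptions made for illustration; the abstract does not specify the tag set or the file format.

```python
from html.parser import HTMLParser

# Hypothetical mapping from document tags to a pause (in milliseconds)
# that the read-out should insert before the tagged content.
PAUSE_MS = {"h1": 800, "p": 500}


class ReadOutBuilder(HTMLParser):
    """Collects (kind, value) items that stand in for a speech read-out file."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Derive the read-out attribute from the tag, if one is defined.
        if tag in PAUSE_MS:
            self.items.append(("pause_ms", PAUSE_MS[tag]))

    def handle_data(self, data):
        # Keep non-empty text so the attributes are embedded next to it.
        text = data.strip()
        if text:
            self.items.append(("text", text))


def build_read_out(tagged: str):
    """Return the read-out items derived from a tagged document string."""
    builder = ReadOutBuilder()
    builder.feed(tagged)
    builder.close()
    return builder.items
```

The resulting item list is the input that step S3 would then convert into whatever control codes a particular speech synthesis engine expects.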
In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. In a speech synthesizer, a synthesis speech signal is generated by reading out, from the storage, those synthesis units that correspond to the phonetic context clusters including the phonetic contexts of input phonemes, and by connecting the selected synthesis units.
Embodiments of the present invention provide a method and apparatus for determining audience affinity and/or aptitude in portions of media works and for developing information that represents measures of the audience affinity and/or aptitude. Further embodiments of the present invention provide a method and apparatus for utilizing the information to create altered media works and/or to present the altered media works to an audience. One embodiment of the present invention is a method for inferring audience affinity or aptitude with regard to content or properties of portions of a media work, which includes: (a) presenting the media work to an audience; (b) obtaining user input regarding presentation rates for the portions of the media work; (c) correlating content or properties of the portions with the presentation rates; and (d) associating audience affinity or aptitude with the correlated content or properties.
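Steps (c) and (d) might be sketched as a per-property average of the presentation rates chosen by the user, compared against the overall average. The data layout and function names are hypothetical, and the abstract does not say whether a faster or slower rate indicates greater affinity, so the sketch reports only the signed deviation and leaves its interpretation open.

```python
from collections import defaultdict
from statistics import mean


def rate_by_property(portions):
    """Mean presentation rate for each content property.

    `portions` is a list of (properties, rate) pairs, where `properties`
    is a set of labels for one portion of the media work and `rate` is
    the presentation rate the user chose for that portion.
    """
    rates = defaultdict(list)
    for properties, rate in portions:
        for prop in properties:
            rates[prop].append(rate)
    return {prop: mean(values) for prop, values in rates.items()}


def rate_deviation(portions):
    """Deviation of each property's mean rate from the overall mean rate."""
    overall = mean(rate for _, rate in portions)
    return {prop: r - overall for prop, r in rate_by_property(portions).items()}
```

A deployed system would need many more observations per property, and probably a proper correlation measure, before associating any deviation with affinity or aptitude.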