Results for FIELD_OF_SEARCH: 704/240
Showing 1 - 10 of 836
A speech recognizing device. Natural speech recognizing means recognizes speech input in an application program by dictation. Recognition result converting means converts a recognition result from said natural speech recognizing means into a final recognition result processable by said application program on the basis of a grammar to be used for recognizing said input speech in a grammar method. The recognition result converting means further comprises candidate sentence generating means for evo...
A method and apparatus are provided for storing parameters of a deleted interpolation language model as parameters of a backoff language model. In particular, the parameters of the deleted interpolation language model are stored in the standard ARPA format. Under one embodiment, the deleted interpolation language model parameters are formed using fractional counts.
Apparatus and methods are provided for producing a zoomed image signal from an input image signal. The input image signal is separated into a number of block signals each representing a subarea of the input image. Class codes are produced based on the block signals. Each class code identifies predetermined image data of a zoomed image portion which corresponds to the subarea represented by the block signal on which the class code is based. The predetermined image data is generated in response to...
A video cassette recorder has a video screen zooming function, capable of zooming an optional region of one of plural still pictures sequentially displayed on a screen, as one picture. A digital signal processing unit converts composite video signals into digital video signals and a first memory unit stores the digital video signals by frames. A zoom signal processing unit zooms only digital video signals corresponding to a predetermined zoom region from the digital video signals so as to reconf...
Techniques and tools for selectively using multiple entropy models in adaptive coding and decoding are described herein. For example, for multiple symbols, an audio encoder selects an entropy model from a first model set that includes multiple entropy models. Each of the multiple entropy models includes a model switch point for switching to a second model set that includes one or more entropy models. The encoder processes the multiple symbols using the selected entropy model and outputs results....
Systems and methods for determining a confidence score associated with a decoding output of a speech recognition engine. In one embodiment, a method of determining the confidence score comprises arranging time frame and acoustic score data into an array, determining a phoneme sequence in the array that yields the highest sum of acoustic scores under certain constraints, e.g., minimum number of time frames and order of phonemes in a phoneme string. A relative score is derived by applying a functi...
Hidden Markov models (HMMs) rely on high-dimensional feature vectors to summarize the short-time properties of speech. Correlations between features can arise when the speech signal is non-stationary or corrupted by noise. These correlations are modeled using factor analysis, a statistical method for dimensionality reduction. Factor analysis is used to model acoustic correlation in automatic speech recognition by introducing a small number of parameters to model the covariance structure of a...
An audio encoder implements multi-channel coding decision, band truncation, multi-channel rematrixing, and header reduction techniques to improve quality and coding efficiency. In the multi-channel coding decision technique, the audio encoder dynamically selects between joint and independent coding of a multi-channel audio signal via an open-loop decision based upon (a) energy separation between the coding channels, and (b) the disparity between excitation patterns of the separate input channels...
An active labeling process is provided that aims to minimize the number of utterances to be checked again by automatically selecting the ones that are likely to be erroneous or inconsistent with the previously labeled examples. In one embodiment, the errors and inconsistencies are identified based on the confidences obtained from a previously trained classifier model. In a second embodiment, the errors and inconsistencies are identified based on an unsupervised learning process. In both embodime...
A method and apparatus for speaker recognition is provided that matches the noise in training data to noise in testing data using spectral addition. Under spectral addition, the mean and variance for a plurality of frequency components are adjusted in the training data and the test data so that each mean and variance is matched in a resulting matched training signal and matched test signal. The adjustments made to the training data and test data add to the mean and variance of the training data ...