Joshi et al., 2016 - Google Patents
Modified mean and variance normalization: transforming to utterance-specific estimatesJoshi et al., 2016
- Document ID
- 7249241965735153674
- Author
- Joshi V
- Prasad N
- Umesh S
- Publication year
- Publication venue
- Circuits, systems, and signal processing
External Links
Snippet
Cepstral mean and variance normalization (CMVN) is an efficient noise compensation technique popularly used in many speech applications. CMVN eliminates the mismatch between training and test utterances by transforming them to zero mean and unit variance …
- 238000010606 normalization 0 title abstract description 59
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Li et al. | An overview of noise-robust automatic speech recognition | |
| Grozdić et al. | Whispered speech recognition using deep denoising autoencoder | |
| Kwon et al. | Phoneme recognition using ICA-based feature extraction and transformation | |
| US20080294432A1 (en) | Signal enhancement and speech recognition | |
| US20150317990A1 (en) | Deep scattering spectrum in acoustic modeling for speech recognition | |
| Takiguchi et al. | Robust feature extraction using kernel PCA | |
| Wu et al. | Exemplar-based voice conversion using joint nonnegative matrix factorization | |
| Takiguchi et al. | PCA-Based Speech Enhancement for Distorted Speech Recognition. | |
| JP2019090930A (en) | Sound source enhancement device, sound source enhancement learning device, sound source enhancement method and program | |
| Joshi et al. | Modified mean and variance normalization: transforming to utterance-specific estimates | |
| Sadjadi et al. | Mean Hilbert envelope coefficients (MHEC) for robust speaker recognition | |
| Selva Nidhyananthan et al. | Noise robust speaker identification using RASTA–MFCC feature with quadrilateral filter bank structure | |
| Ahmad et al. | Improving Children's Speech Recognition Through Explicit Pitch Scaling Based on Iterative Spectrogram Inversion. | |
| Sainath et al. | Raw multichannel processing using deep neural networks | |
| Ghai et al. | Pitch adaptive MFCC features for improving children’s mismatched ASR | |
| Zhang et al. | Distant-talking speaker identification by generalized spectral subtraction-based dereverberation and its efficient computation | |
| Zouhir et al. | A bio-inspired feature extraction for robust speech recognition | |
| Obuchi et al. | Normalization of time-derivative parameters using histogram equalization. | |
| Sarangi et al. | Improved speech-signal based frequency warping scale for cepstral feature in robust speaker verification system | |
| Peng et al. | Effective Phase Encoding for End-To-End Speaker Verification. | |
| Han et al. | Reverberation and noise robust feature compensation based on IMM | |
| Shukla et al. | A subspace projection approach for analysis of speech under stressed condition | |
| Kaur et al. | Optimizing feature extraction techniques constituting phone based modelling on connected words for Punjabi automatic speech recognition | |
| Chatterjee et al. | Auditory model-based design and optimization of feature vectors for automatic speech recognition | |
| Pradhan et al. | Speaker verification in sensor and acoustic environment mismatch conditions |