Gales et al., 2008 - Google Patents
Discriminative classifiers with generative kernels for noise robust speech recognitionGales et al., 2008
View PDF- Document ID
- 9355372852900490359
- Author
- Gales M
- Flego F
- Publication year
External Links
Snippet
Discriminative classifiers are a popular approach to solving classification problems. However one of the problems with these approaches, in particular kernel based classifiers such as Support Vector Machines (SVMs), is that they are hard to adapt to mismatches …
- 102100010552 AURKA 0 abstract description 9
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6296—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Poddar et al. | Speaker verification with short utterances: a review of challenges, trends and opportunities | |
| Gales | Maximum likelihood linear transformations for HMM-based speech recognition | |
| US7328154B2 (en) | Bubble splitting for compact acoustic modeling | |
| US8515758B2 (en) | Speech recognition including removal of irrelevant information | |
| US8484024B2 (en) | Phonetic features for speech recognition | |
| US20150149174A1 (en) | Differential acoustic model representation and linear transform-based adaptation for efficient user profile update techniques in automatic speech recognition | |
| EP2189976A1 (en) | Method for adapting a codebook for speech recognition | |
| Mao et al. | Automatic training set segmentation for multi-pass speech recognition | |
| Li et al. | Speaker verification using simplified and supervised i-vector modeling | |
| Mirsamadi et al. | A study on deep neural network acoustic model adaptation for robust far-field speech recognition. | |
| US7574359B2 (en) | Speaker selection training via a-posteriori Gaussian mixture model analysis, transformation, and combination of hidden Markov models | |
| WO2007105409A1 (en) | Reference pattern adapter, reference pattern adapting method, and reference pattern adapting program | |
| Gales et al. | Discriminative classifiers with adaptive kernels for noise robust speech recognition | |
| Gales et al. | Support vector machines for noise robust ASR. | |
| Soldi et al. | Short-Duration Speaker Modelling with Phone Adaptive Training. | |
| Aradilla | Acoustic models for posterior features in speech recognition | |
| Yu et al. | Bayesian adaptive inference and adaptive training | |
| US8185490B1 (en) | Class-specific iterated subspace classifier | |
| Gales et al. | Discriminative classifiers with generative kernels for noise robust speech recognition | |
| Hsu et al. | Mispronunciation Detection Leveraging Maximum Performance Criterion Training of Acoustic Models and Decision Functions. | |
| Lee et al. | Using discrete probabilities with Bhattacharyya measure for SVM-based speaker verification | |
| Gales et al. | Discriminative classifiers with generative kernels for noise robust ASR. | |
| Gales et al. | Combining VTS model compensation and support vector machines | |
| Kim et al. | Constrained MLE-based speaker adaptation with L1 regularization | |
| Kumar et al. | Class specificity and commonality based discriminative dictionary for speaker verification |