Oukas et al., 2024 - Google Patents

A novel dataset for arabic speech recognition recorded by tamazight speakers

Oukas et al., 2024

Document ID: 10805256195022974057
Author: Oukas N; Chabi T; Sari T
Publication year: 2024
Publication venue: Authorea Preprints

External Links

Cited by

Snippet

Automatic Speech Recognition (ASR) is an area of research that's constantly evolving, thanks to important advancements like machine learning and deep learning techniques. Its applications are wide-ranging, touching fields like healthcare, public services, and interfaces …

Continue reading at www.authorea.com (PDF) (other versions)

238000000034 method 0 abstract description 5

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3074—Audio data retrieval
- G06F17/30743—Audio data retrieval using features automatically derived from the audio content, e.g. descriptors, fingerprints, signatures, MEP-cepstral coefficients, musical score, tempo
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Taking into account non-speech caracteristics
- G10L2015/228—Taking into account non-speech caracteristics of application context
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems

Similar Documents

Publication	Publication Date	Title
Liberman	2019	Corpus phonetics
Javed et al.	2023	Indicsuperb: A speech processing universal performance benchmark for indian languages
Hanani et al.	2020	Spoken Arabic dialect recognition using X-vectors
Nasr et al.	2023	End-to-end speech recognition for arabic dialects
Ferraro et al.	2023	Benchmarking open source and paid services for speech to text: an analysis of quality and input variety
Polat et al.	2020	Building a speech and text corpus of Turkish: large corpus collection with initial speech recognition results
Filipowicz et al.	2023	Rediscovering automatic detection of stuttering and its subclasses through machine learning—the impact of changing deep model architecture and amount of data in the training set
Zhou et al.	2025	Childmandarin: A comprehensive mandarin speech dataset for young children aged 3-5
Chen et al.	2024	A proof-of-concept study for automatic speech recognition to transcribe AAC speakers’ speech from high-technology AAC systems
Qharabagh et al.	2025	Manatts persian: a recipe for creating tts datasets for lower resource languages
Bawitlung et al.	2025	Mizo Automatic Speech Recognition: Leveraging Wav2vec 2.0 and XLS-R for Enhanced Accuracy in Low-Resource Language Processing
Zealouk et al.	2020	Pathological detection using HMM speech recognition-based Amazigh digits
Oukas et al.	2024	A novel dataset for arabic speech recognition recorded by tamazight speakers
Dielen	2023	Improving the automatic speech recognition model whisper with voice activity detection
Thennattil et al.	2016	Phonetic engine for continuous speech in Malayalam
Li et al.	2024	Phonetic and lexical discovery of a canine language using hubert
Soundarya et al.	2023	Analysis of mispronunciation detection and diagnosis based on conventional deep learning techniques
Tripathi et al.	2022	VEP detection for read, extempore and conversation speech
Ricketts	2023	Speech Recognition Application With Tone Analyzer
Brhanemeskel et al.	2022	Amharic speech search using text word query based on automatic sentence-like segmentation
Pallavi Reddy et al.	2024	Identification of disfluency among children using efficient machine learning techniques
Shah et al.	2023	It’s What You Say and How You Say It: Exploring Audio and Textual Features for Podcast Data
Heydarova	2021	Compiling of Phonetic Database Structure
Fu et al.	2023	Speaker diarization with variants of self-attention and joint speaker embedding extractor
Devi et al.	2024	Disambiguation of isolated manipuri tonal contrast word pairs using acoustic features