CN119068898B - Adaptive noise reduction method based on frequency point gain smoothing and post-filter - Google Patents


Info

Publication number
CN119068898B
Authority
CN
China
Prior art keywords
gain
smoothing
noise reduction
signal
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411557581.9A
Other languages
Chinese (zh)
Other versions
CN119068898A (en)
Inventor
周智
仇健乐
于欣
蒋寿美
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Time Intelligence Technology Shanghai Co ltd
Original Assignee
Time Intelligence Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Time Intelligence Technology Shanghai Co ltd filed Critical Time Intelligence Technology Shanghai Co ltd
Priority to CN202411557581.9A priority Critical patent/CN119068898B/en
Publication of CN119068898A publication Critical patent/CN119068898A/en
Application granted granted Critical
Publication of CN119068898B publication Critical patent/CN119068898B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0232Processing in the frequency domain
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • G10K11/1785Methods, e.g. algorithms; Devices
    • G10K11/17853Methods, e.g. algorithms; Devices of the filter
    • G10K11/17854Methods, e.g. algorithms; Devices of the filter the filter being an adaptive filter
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10KSOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • G10K11/1787General system configurations
    • G10K11/17885General system configurations additionally using a desired external signal, e.g. pass-through audio such as music or speech
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0224Processing in the time domain

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The present invention relates to an adaptive noise reduction method based on frequency point gain smoothing, and to a post-filter. A speech signal is converted into a spectrum signal and a gain is applied; the gain is smoothed, where the smoothing includes forward smoothing and/or backward smoothing. Through forward and backward smoothing, the frequency point gains of the converted speech are smoothed algorithmically and the new gain replaces the original gain, yielding smooth and natural frequency point data. Accordingly, the method improves the naturalness of speech after noise reduction, especially at low signal-to-noise ratios; it significantly reduces the "musical noise" caused by excessive noise reduction and abrupt spectral changes; it is a general method that can be applied, with different parameters or variants, to different noise reduction algorithms (models); and it is simple to implement, has low computational complexity, and can be combined with other post-filters to achieve better noise reduction and speech preservation.

Description

Adaptive noise reduction method based on frequency point gain smoothing and post-filter
Technical Field
The invention relates to a processing technology of audio frequency points, in particular to a technology for avoiding distortion caused by excessively strong gain of the audio frequency points through gain smoothing processing.
Background
Audio signals generally contain noise. In audio processing scenarios such as intercom calls and recording, adaptive noise suppression (ANS) is one of the most widely used techniques. Noise reduction techniques exist for many different models, and a key problem they share is speech fidelity: many algorithms damage the speech while reducing the noise. As academic work, a scheme that achieves a high signal-to-noise ratio is considered feasible, but in practical applications the audible damage to the speech, and the musical noise that appears when the signal-to-noise ratio is low, are unacceptable.
To address this problem, some prior art solutions exist. A widely used one is binary masking, which sets a gain of only 0 or 1 at each frequency point: each point is judged to be either speech or noise; noise is removed completely and speech is passed through completely. The resulting speech can be an improvement in some cases, but it still does not sound good enough.
To better solve the problem, the present method exploits the correlation between adjacent frequency points, namely that the gain should not vary too sharply between adjacent frequency points, to improve speech quality on top of the gain of an existing noise reduction algorithm, while avoiding abrupt gain changes such as those of a binary mask.
Disclosure of Invention
The main object of the invention is to provide a method that smooths the gain of audio frequency points algorithmically, so as to avoid distortion caused by abrupt changes in the data.
In order to achieve the above object, the present invention provides an adaptive noise reduction method based on frequency point gain smoothing, which converts a speech signal into a spectrum signal and then performs gain, which performs smoothing processing on a gain process, wherein the smoothing processing includes forward smoothing and/or backward smoothing;
Wherein, forward smoothing:
Take $g'(K) = 0$, where $K$ is the number of frequency points;
For $k = K-1, K-2, \dots, 0$: $g'(k) = \max\{\alpha\, g'(k+1),\ g(k)\}$;
where $\alpha$ is the gain attenuation coefficient;
the original gain is replaced by the new gain, $g(k) = g'(k)$;
where $g'(K)$ is the virtual gain of the frequency point just outside the highest frequency, used as the initial point of the calculation;
for each actual frequency point $k$, $g(k)$ is its original gain, obtained from a noise reduction algorithm;
$g'(k+1)$ is the adjusted gain of the preceding, higher frequency point;
In addition, backward smoothing:
Take $g'(-1) = 0$;
For $k = 0, 1, \dots, K-1$: $g'(k) = \max\{\alpha\, g'(k-1),\ g(k)\}$;
the original gain is replaced by the new gain, $g(k) = g'(k)$;
where $g'(-1)$ is the virtual gain of the frequency point just outside the lowest frequency, used as the initial point of the calculation;
for each actual frequency point $k$, $g(k)$ is its original gain, obtained from a noise reduction algorithm or from a previous adjustment;
$g'(k-1)$ is the adjusted gain of the preceding, lower frequency point.
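Preferably, the method further comprises time smoothing:
Take $g'_{-1}(k) = 0$;
For $t = 0, 1, 2, \dots$: $g'_t(k) = \max\{\alpha\, g'_{t-1}(k),\ g_t(k)\}$;
the original gain is replaced by the new gain, $g_t(k) = g'_t(k)$;
Here $g_t(k)$ is the gain of the $t$-th frame at the current frequency point $k$.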
Preferably, this is accomplished by the steps of:
preprocessing: the real-time speech signal is framed and converted into a spectrum by FFT (fast Fourier transform), i.e. steps 1 and 2;
ANS noise reduction: a gain is obtained using a noise reduction algorithm, i.e. step 3;
post-filtering: the gains are adjusted, i.e. steps 4, 5 and 6;
return to the time-domain signal: the spectrum is converted to the time domain by an inverse FFT and synthesized into an audio signal, i.e. steps 7 and 8.
Preferably, this is accomplished by the steps of:
Step 1, real-time speech signal framing:
the speech signal is a signal stream $x(n)$ obtained from device sampling or from a preceding algorithm;
each sample has a certain bit depth; samples at that bit depth are normalized so that $|x(n)| \le 1$;
in real-time processing, every $N$ samples form one frame, i.e. the $t$-th frame signal is $x_t(n) = x(tM + n)$, where $n = 0, 1, \dots, N-1$ and $M$ is the frame shift in samples;
Step 2, the speech signal $x_t(n)$ is transformed by FFT into the spectrum $X_t(k)$, where $X_t = \mathcal{F}(w \cdot x_t)$,
$w$ is the analysis window;
$\mathcal{F}$ is the discrete Fourier transform;
$X_t(k)$ is the complex spectrum, where $k = 0, 1, \dots, K-1$ and $k$ is the frequency point index;
Step 3, a signal processing noise reduction algorithm is used to obtain the gain $g_t(k)$ at each frequency point. In general $g_t(k) \in \mathbb{R}$, i.e. the gain is real; with $Y_t(k)$ denoting the noise-reduced spectrum, $Y_t(k) = g_t(k)\, X_t(k)$;
Step 4, namely adopting the forward smoothing step;
step 5, namely adopting the backward smoothing step;
Step 6, namely adopting the time smoothing.
Preferably, the method further comprises the following steps:
Step 7, updating the spectrum signal: $Y_t(k) = g_t(k)\, X_t(k)$, using the smoothed gain;
Step 8, inverse Fourier transform back to the time-domain signal: $y_t = \mathcal{F}^{-1}(Y_t)$;
the synthesis window used to synthesize the signal is determined by the analysis windowing scheme.
Preferably, when $N = 2M$ and $w$ is a Hanning window, the synthesis window is a unit window;
the speech signal is synthesized frame by frame by overlap-add: $y(tM + n) = y_t(n) + y_{t-1}(n + M)$, for $n = 0, 1, \dots, M-1$.
Preferably, for an audio file at a 16 kHz sampling rate, 32 ms is taken as one frame with a 16 ms frame shift, i.e. $N = 512$, $M = 256$.
Preferably, the attenuation coefficient $\alpha$ takes the same or different values in forward smoothing and backward smoothing;
the attenuation coefficient $\alpha$ is adjusted according to the signal-to-noise ratio, i.e. $\alpha$ is reduced when the signal-to-noise ratio is high and increased when the signal-to-noise ratio is low;
the exponential decay term may be replaced by a linear decay term.
The invention also provides a self-adaptive noise reduction post-filter based on the frequency point gain smoothing, and the post-filter is processed by adopting the noise reduction method.
Preferably, the processing is performed by using independently arranged post-filters at the time of forward smoothing, backward smoothing and time smoothing.
The method has the beneficial effects that through the forward smoothing and backward smoothing processing mode, the frequency point gain of voice conversion is processed according to an algorithm to realize smoothing processing, and the original gain is replaced by the new gain, so that smooth and natural frequency point data are obtained. Correspondingly, the method can improve the naturalness of the noise-reduced voice, particularly in a scene with low signal-to-noise ratio, obviously reduces the music noise caused by excessive noise reduction and frequency spectrum mutation, is a general method, can be applied to different noise reduction algorithms (models) based on different parameters or varieties, is simple to realize and low in calculation complexity, and can be combined with other post-filtering to achieve better noise reduction and voice preservation effects.
Drawings
Fig. 1 is a diagram showing the audio data before and after processing, in which the upper side is a time axis ranging from 25ms to 39ms, and the processing variation of the audio is shown in four areas.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
The principle of the invention is explained as follows:
A recorded sound is a time sequence $x(n)$ that can be seen as the sum of two sounds, the speech $s(n)$ and the noise $d(n)$, i.e. $x(n) = s(n) + d(n)$. The purpose of a noise reduction or speech separation algorithm is to compute $s(n)$ from $x(n)$.
Because sound is processed piece by piece in real time, each pass handles only a small segment of data, e.g. tens of milliseconds; this is step 1, framing. The speech is usually processed in the frequency domain; this is step 2, which gives the frequency-domain noisy speech $X(k) = S(k) + D(k)$, where $S(k)$ and $D(k)$ are the spectra of the speech and the noise, respectively. Estimating $S(k)$ is the job of the noise reduction algorithm, step 3. It is assumed here that the speech and the noise are mutually independent random variables and that the signal processing algorithm does not change the phase, so estimating the speech as $Y(k)$ means taking $Y(k) = g(k)\, X(k)$ with a real gain $g(k)$. For example, let $P_X(k)$ and $P_D(k)$ be the power spectra of the noisy speech and of the noise, respectively; then $g(k) = \bigl(P_X(k) - P_D(k)\bigr) / P_X(k)$ is a gain estimate. As a numerical example, if the energy of the noisy (original) speech is 100 and the noise energy is 10, one of the simplest estimates is a speech energy of 90, i.e. a gain g = 0.9. Step 7 applies the gain to $X(k)$ to obtain the estimated speech, and step 8 returns from the frequency domain to the time domain, giving the speech stream we normally hear.
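The numerical example above can be checked with a few lines of Python. This is a minimal sketch, assuming the simplest power-subtraction estimate; the function name `power_subtraction_gain` and the clipping to the range [0, 1] are illustrative choices, not part of the patent text.

```python
def power_subtraction_gain(noisy_power, noise_power):
    """Share of the noisy power attributed to speech, clipped to [0, 1].

    Assumption: the simplest estimate, speech power = noisy power - noise power.
    """
    if noisy_power <= 0.0:
        return 0.0
    speech_power = max(noisy_power - noise_power, 0.0)
    return speech_power / noisy_power


# Noisy energy 100, noise energy 10: speech energy 90, so the gain is about 0.9.
gain = power_subtraction_gain(100.0, 10.0)
print(gain)
```

A practical algorithm would estimate the noise power per frequency point and per frame; the function above only reproduces the arithmetic of the example.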
A practical noise reduction algorithm may need to take into account how fast the noise changes, how accurate the noise estimate is, and so on (subject to some statistical model), and most algorithms tend to reduce the noise aggressively, so that the noise is almost gone but the speech is also badly damaged. This is not a problem in a paper, which reports metrics rather than listening quality, but it is a problem in practice, where we would rather allow some residual noise than damage the speech severely. Steps 4, 5 and 6 below therefore preserve speech by preventing abrupt changes in the gain. The basic principle is that a gain is never reduced, only raised. If the previous gain was 0.9 and the gain suddenly becomes 0.1, the drop is considered too large. We set a coefficient $\alpha$, for example 0.8; if the previous gain was 0.9, the minimum gain at the next point is 0.9 × 0.8 = 0.72. Steps 4, 5 and 6 apply this "next" rule in three directions: toward lower frequencies, toward higher frequencies, and toward the next frame. The signal is then more stable and sounds better.
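The single-step rule described above can be sketched as follows. The function name `limit_drop` is hypothetical; the rule itself (the new gain is at least the previous gain multiplied by the coefficient) follows the 0.9 × 0.8 = 0.72 example in the text.

```python
def limit_drop(previous_gain, current_gain, alpha):
    """Raise current_gain to at least alpha * previous_gain; never lower it."""
    return max(alpha * previous_gain, current_gain)


# Previous gain 0.9, raw gain 0.1, coefficient 0.8: the result is about 0.72.
print(limit_drop(0.9, 0.1, 0.8))

# A raw gain that is already above the floor passes through unchanged.
print(limit_drop(0.9, 0.85, 0.8))
```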
For each actual frequency point $k$, $g(k)$ is its original gain, obtained from a noise reduction algorithm, i.e. an existing noise reduction algorithm yields the gain $g(k)$ at frequency point $k$. The noise reduction algorithm is computed per frequency point, iterating frame by frame, and does not consider the gains of adjacent frequency points, so it produces very aggressive gains, i.e. unsmooth or unnatural speech. The gain $g(k)$ therefore needs to be smoothed by the present algorithm.
As shown in fig. 1, which is a demonstration of the effect of the data algorithm processed by the method of the present invention, the first channel is the original noise reduction effect, and the second channel is enhanced by the method of the present invention. It can be seen that after the processing of the method herein, many details are recovered in the speech frequency domain, and it can be noted that the lack of fluency (empty space) is somewhat supplemented, such as 28-29 seconds (second region), 31-32 seconds (third region), etc.
Example 1
The embodiment provides a self-adaptive noise reduction method based on frequency point gain smoothing, which is used for converting a voice signal into a frequency spectrum signal and then carrying out gain, wherein the gain process is smoothed by the self-adaptive noise reduction method, and the smoothing process comprises forward smoothing and/or backward smoothing;
wherein, forward smoothing of step 4:
Take $g'(K) = 0$, where $K$ is the number of frequency points;
For $k = K-1, K-2, \dots, 0$: $g'(k) = \max\{\alpha\, g'(k+1),\ g(k)\}$;
where $\alpha$ is the gain attenuation coefficient; in this scheme $\alpha$ is a fixed gain attenuation coefficient. If $\alpha = 0$, the post-filtering does not affect the original signal; the smaller $\alpha$ is, the more the original gain tends to be preserved; the larger $\alpha$ is, the smoother the gain;
the original gain is replaced by the new gain, $g(k) = g'(k)$;
where $g'(K)$ is the virtual gain of the frequency point just outside the highest frequency. This frequency point is not applied in practice and serves only as the initial point of the calculation. Specifically, $k$ is the frequency point index; when $k$ takes the highest frequency point $K-1$, the algorithm refers to $g'(k+1) = g'(K)$, so $g'(K) = 0$ is taken for this outer frequency point.
For example, in one specific processing case, for a signal at a 16 kHz sampling rate processed with a 16 ms frame shift and a 32 ms frame length, $K = 256$ and the frequency points range over {0, 1, ..., 255}. The back-to-front calculation then starts at 256, which is outside this range. With $g'(256) = 0$, $\max\{0, g(255)\} = g(255)$; by induction, the value at 254 is obtained from the value at 255, and so on down to 1 and finally 0, which completes the calculation.
Likewise, the front-to-back calculation starts from -1, going from -1 to 0, from 0 to 1, and so on up to 254 to 255; smoothing over time proceeds in the same way.
$g'(k+1)$ is the adjusted gain of the preceding, higher frequency point. For the sake of audio quality, a large gap between the gains of adjacent frequency points is undesirable, so $\alpha\, g'(k+1)$ is formed with the smoothing coefficient;
$\max\{\alpha\, g'(k+1),\ g(k)\}$ is then set as the adjusted gain of the frequency point;
this gain is used for subsequent adjustment, hence $g(k) = g'(k)$ is set;
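The forward smoothing pass can be sketched in Python as below. This is an illustrative sketch rather than the patented implementation: the name `forward_smooth`, the plain-list representation of the gains, and the symbol names are assumptions. The virtual gain of 0 just outside the highest frequency point and the max rule follow the description above.

```python
def forward_smooth(gains, alpha):
    """Smooth gains from the highest frequency point down to the lowest.

    gains[k] is the original gain of frequency point k.
    Each output is max(alpha * adjusted gain of point k + 1, gains[k]).
    """
    smoothed = [0.0] * len(gains)
    previous = 0.0  # virtual gain of the point just outside the highest frequency
    for k in range(len(gains) - 1, -1, -1):
        previous = max(alpha * previous, gains[k])
        smoothed[k] = previous
    return smoothed


# A strong high-frequency gain limits how fast the gain may fall toward lower points.
print(forward_smooth([0.1, 0.1, 0.9], alpha=0.8))

# With alpha = 0 the original gains are returned unchanged.
print(forward_smooth([0.1, 0.1, 0.9], alpha=0.0))
```

Because every output is a maximum that includes the original gain, the pass can only raise gains, which matches the stated principle that the gain is never reduced.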
In addition, backward smoothing of step 5:
Take $g'(-1) = 0$;
For $k = 0, 1, \dots, K-1$: $g'(k) = \max\{\alpha\, g'(k-1),\ g(k)\}$;
the original gain is replaced by the new gain, $g(k) = g'(k)$;
where $g'(-1)$ is the virtual gain of the frequency point just outside the lowest frequency. This frequency point is not applied in practice and serves only as the initial point of the calculation: when $k$ takes the lowest frequency point 0, the algorithm refers to $g'(k-1) = g'(-1)$, so $g'(-1) = 0$ is taken for this outer frequency point. See the description of the processing case above.
For each actual frequency point $k$, $g(k)$ is its original gain, obtained from a noise reduction algorithm or from a previous adjustment;
The frequency-domain forward adjustment, the frequency-domain backward adjustment, and the time-domain adjustment are independent of one another. The input of each adjustment is a set of gains, i.e. the adjustments are nested functions, and their nesting order can be chosen freely.
If the gain of the original noise reduction algorithm is g, the frequency-domain forward filtering algorithm is Apre, the backward filtering algorithm is Apost, and the time-domain filtering is Atime, then the algorithm structure can be very flexible: Atime(Apre(Apost(g))) is the order described herein, Apre(Apost(Atime(g))) is another order, and so on. The "previous adjustment" mentioned above arises in this way.
The noise reduction algorithm is computed per frequency point, iterating frame by frame, and does not consider the gains of adjacent frequency points, so it produces very aggressive gains;
$g'(k-1)$ is the adjusted gain of the preceding, lower frequency point. For the sake of audio quality, a large gap between the gains of adjacent frequency points is undesirable, so $\alpha\, g'(k-1)$ is formed with the smoothing coefficient;
$\max\{\alpha\, g'(k-1),\ g(k)\}$ is then set as the adjusted gain of the frequency point;
this gain can be used for subsequent adjustment, hence $g(k) = g'(k)$ is set;
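The backward pass mirrors the forward one, running from the lowest frequency point upward. The sketch below makes the same assumptions as before (hypothetical function name `backward_smooth`, gains held in a plain list), with the virtual gain of 0 placed just outside the lowest frequency point.

```python
def backward_smooth(gains, alpha):
    """Smooth gains from the lowest frequency point up to the highest.

    Each output is max(alpha * adjusted gain of point k - 1, gains[k]).
    """
    smoothed = []
    previous = 0.0  # virtual gain of the point just outside the lowest frequency
    for gain in gains:
        previous = max(alpha * previous, gain)
        smoothed.append(previous)
    return smoothed


# A strong low-frequency gain limits how fast the gain may fall toward higher points.
print(backward_smooth([0.9, 0.1, 0.1], alpha=0.8))
```

The input may be the raw gains of the noise reduction algorithm or the output of an earlier smoothing pass, since each pass takes and returns a full set of gains.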
Preferably, the method further comprises the time smoothing of step 6:
Take $g'_{-1}(k) = 0$;
For $t = 0, 1, 2, \dots$: $g'_t(k) = \max\{\alpha\, g'_{t-1}(k),\ g_t(k)\}$;
the original gain is replaced by the new gain, $g_t(k) = g'_t(k)$;
The gain calculation of the noise reduction algorithm is derived from the estimate of the signal-to-noise ratio and involves little comparison or adjustment of the gain against the preceding and following frames. Since the signal is largely continuous from one frame to the next, abrupt changes in the gain between frames also tend to degrade the signal quality or the listening experience, so the gain needs to be kept reasonably smooth across adjacent frames.
As in the description above, $g_t(k)$ is the gain of the $t$-th frame at the current frequency point $k$. Specifically, $t$ is the frame index: starting from frame -1 (a virtual frame), frame 0 is calculated, then frame 1, then frame 2, and so on.
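Time smoothing needs state that persists from frame to frame. A minimal sketch, assuming a small class that remembers the adjusted gains of the previous frame; the class name `TimeSmoother` and its interface are hypothetical. The state starts at 0 for every frequency point, which plays the role of the virtual frame -1.

```python
class TimeSmoother:
    """Per-frequency-point gain smoothing across successive frames."""

    def __init__(self, num_bins, alpha):
        self.alpha = alpha
        self.previous = [0.0] * num_bins  # adjusted gains of virtual frame -1

    def process(self, gains):
        """Return the adjusted gains of the current frame and remember them."""
        self.previous = [
            max(self.alpha * prev, gain)
            for prev, gain in zip(self.previous, gains)
        ]
        return list(self.previous)


smoother = TimeSmoother(num_bins=2, alpha=0.8)
for frame_gains in ([0.9, 0.2], [0.1, 0.2], [0.1, 0.9]):
    print(smoother.process(frame_gains))
```

In the first frequency point the gain decays gradually over the frames instead of dropping straight to 0.1, while a rising gain (second frequency point, third frame) is followed immediately.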
Example 2
On the basis of example 1, this is preferably done by the following steps:
preprocessing, namely framing a real-time voice signal, and converting the signal into a frequency spectrum through FFT (Fourier transform) transformation, wherein the steps are step 1 and step 2;
ANS noise reduction (Automatic Noise Suppression, background noise suppression), gain is obtained by using a certain noise reduction algorithm, which is step 3;
post filtering, namely adjusting the gains, namely steps 4,5 and 6;
Returning to the time domain signal, the spectrum is converted to the time domain by an inverse FFT transformation and synthesized into an audio signal, steps 7 and 8.
Step 1, real-time speech signal framing:
the speech signal is a signal stream $x(n)$ obtained from device sampling or from a preceding algorithm;
each sample has a certain bit depth, for example 16-bit sampling; samples at that bit depth are normalized so that $|x(n)| \le 1$;
in real-time processing, every $N$ samples form one frame, i.e. the $t$-th frame signal is $x_t(n) = x(tM + n)$, where $n = 0, 1, \dots, N-1$. Preferably, for an audio file at a 16 kHz sampling rate, 32 ms is taken as one frame with a 16 ms frame shift, i.e. $N = 512$, $M = 256$, where $N$ is the number of samples in a frame and $M$ is the number of samples in the frame shift.
Step 2, the speech signal $x_t(n)$ is transformed by FFT into the spectrum $X_t(k)$, where $X_t = \mathcal{F}(w \cdot x_t)$,
$w$ is the analysis window;
$\mathcal{F}$ is the discrete Fourier transform;
$X_t(k)$ is the complex spectrum, where $k = 0, 1, \dots, K-1$ and $k$ is the frequency point index;
Step 3, an existing signal processing noise reduction algorithm is used to obtain the gain $g_t(k)$ at each frequency point. In general $g_t(k) \in \mathbb{R}$, i.e. the gain is real; with $Y_t(k)$ denoting the noise-reduced spectrum, $Y_t(k) = g_t(k)\, X_t(k)$;
Step 4, adopting the forward smoothing step in embodiment 1;
Step 5, adopting the backward smoothing step in embodiment 1;
Step 6, the time smoothing step of example 1 was employed.
Step 7, updating the spectrum signal: $Y_t(k) = g_t(k)\, X_t(k)$, using the smoothed gain;
Step 8, inverse Fourier transform back to the time-domain signal: $y_t = \mathcal{F}^{-1}(Y_t)$;
The synthesis window used to synthesize the signal is determined by the analysis windowing scheme. Preferably, when $N = 2M$ and $w$ is a Hanning window, the synthesis window is a unit window;
the speech signal is synthesized frame by frame by overlap-add: $y(tM + n) = y_t(n) + y_{t-1}(n + M)$, for $n = 0, 1, \dots, M-1$.
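The remark that no extra synthesis window is needed when the frame length is twice the frame shift can be checked numerically. The sketch below assumes a periodic Hann window (the text says only "Hanning window") and leaves out the FFT, the gain, and the inverse FFT so that only the windowing and overlap-add remain; all function names are illustrative.

```python
import math


def periodic_hann(frame_len):
    """Periodic Hann window: w(n) = 0.5 - 0.5 * cos(2 * pi * n / N)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_len)
            for n in range(frame_len)]


def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames and apply the analysis window."""
    window = periodic_hann(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [[window[n] * signal[s + n] for n in range(frame_len)] for s in starts]


def overlap_add(frames, hop):
    """Sum the frames at their hop offsets, with no synthesis window."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for t, frame in enumerate(frames):
        for n, value in enumerate(frame):
            out[t * hop + n] += value
    return out


N, M = 16, 8  # small stand-ins for the 512 / 256 samples used at 16 kHz
signal = [math.sin(0.3 * n) for n in range(64)]
rebuilt = overlap_add(split_frames(signal, N, M), M)

# Away from the first and last half frame, the overlapped windows sum to 1.
max_error = max(abs(rebuilt[n] - signal[n]) for n in range(M, len(signal) - M))
print(max_error)
```

In the full method each windowed frame would be transformed, multiplied by the smoothed gains, and transformed back before the overlap-add step.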
Example 3
Preferably, the attenuation coefficient $\alpha$ takes the same or different values in forward smoothing and backward smoothing;
the attenuation coefficient $\alpha$ is adjusted according to the signal-to-noise ratio, i.e. $\alpha$ is reduced when the signal-to-noise ratio is high and increased when the signal-to-noise ratio is low;
the exponential decay term may be replaced by a linear decay term: instead of the typical exponential decay of the Attack-Decay approach, a simple linear decay is also more effective in certain situations.
This gives an additional implementation. Attack-Decay refers to the original implementation, which tracks large values quickly (attack) and tracks small values with exponential decay (decay); here the exponential decay is replaced by a linear decay.
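The two decay styles can be compared side by side. The exact linear formula is not recoverable from the text, so the sketch below assumes one plausible form: the floor for the next point is the previous adjusted gain minus a fixed step. The function names and the `step` parameter are hypothetical.

```python
def smooth_exponential(gains, alpha):
    """Attack-decay with exponential decay: floor = alpha * previous gain."""
    previous, out = 0.0, []
    for gain in gains:
        previous = max(alpha * previous, gain)
        out.append(previous)
    return out


def smooth_linear(gains, step):
    """Attack-decay with linear decay (assumed form): floor = previous gain - step."""
    previous, out = 0.0, []
    for gain in gains:
        previous = max(previous - step, gain)
        out.append(previous)
    return out


raw = [0.9, 0.1, 0.1, 0.1]
print(smooth_exponential(raw, alpha=0.8))  # decays by a constant ratio
print(smooth_linear(raw, step=0.3))        # decays by a constant amount
```

Both variants follow a rising gain immediately (attack) and differ only in how quickly the floor falls afterwards (decay).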
Example 4
The invention also provides a self-adaptive noise reduction post-filter based on frequency point gain smoothing, and the post-filter is processed by adopting the noise reduction method in the embodiments 1-3.
Preferably, the processing is performed by using independently arranged post-filters at the time of forward smoothing, backward smoothing and time smoothing.
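Independently arranged post-filters can be modelled as small callable objects that each take and return a set of gains, so that they can be chained in any order. The sketch below is illustrative only: the class names, the `chain` helper, and the per-filter coefficients are assumptions, while the names Apre, Apost and Atime come from Example 1.

```python
def _running_max(gains, alpha):
    """Shared rule: each output is max(alpha * previous output, gain)."""
    previous, out = 0.0, []
    for gain in gains:
        previous = max(alpha * previous, gain)
        out.append(previous)
    return out


class ForwardFilter:  # Apre: highest frequency point down to the lowest
    def __init__(self, alpha):
        self.alpha = alpha

    def __call__(self, gains):
        return _running_max(gains[::-1], self.alpha)[::-1]


class BackwardFilter:  # Apost: lowest frequency point up to the highest
    def __init__(self, alpha):
        self.alpha = alpha

    def __call__(self, gains):
        return _running_max(gains, self.alpha)


class TimeFilter:  # Atime: frame to frame, one state value per frequency point
    def __init__(self, num_bins, alpha):
        self.alpha = alpha
        self.previous = [0.0] * num_bins

    def __call__(self, gains):
        self.previous = [max(self.alpha * p, g)
                         for p, g in zip(self.previous, gains)]
        return list(self.previous)


def chain(filters, gains):
    """Apply the filters in list order; the first listed is the innermost call."""
    for post_filter in filters:
        gains = post_filter(gains)
    return gains


# Atime(Apre(Apost(g))): Apost runs first, then Apre, then Atime.
filters = [BackwardFilter(0.8), ForwardFilter(0.8), TimeFilter(4, 0.8)]
print(chain(filters, [0.9, 0.1, 0.1, 0.8]))
print(chain(filters, [0.0, 0.0, 0.0, 0.0]))  # the next frame decays from the first
```

Reordering the list gives the other nesting orders mentioned in Example 1, and each filter may be given its own coefficient.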
In order to better illustrate the solution of the present invention, some prior art documents are given below.
The following summary contains a description of general steps (e.g., steps 1-3):
Mahdi Parchami, Wei-Ping Zhu, Benoit Champagne, and Eric Plourde, Recent Developments in Speech Enhancement in the Short-Time Fourier Transform Domain , July 2016 IEEE Circuits and Systems Magazine 16(3):45-77
the classical noise reduction algorithm is described below:
Y. Ephraim and D. Malah, Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator, IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, no. 6, pp. 1109–1121, December 1984.
a very common noise reduction method is referred to as follows:
Timo Gerkmann and Richard C. Hendriks, Unbiased MMSE-Based Noise Power Estimation with Low Complexity and Low Tracking Delay, IEEE Transactions on Audio, Speech, and Language Processing, 20(4):1383-1393, May 2012.
It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Claims (8)

1. The adaptive noise reduction method based on the frequency point gain smoothing converts a voice signal into a frequency spectrum signal and then carries out gain, and is characterized in that the gain process is smoothed, the smoothing process comprises forward smoothing and/or backward smoothing and/or time smoothing, and the adaptive noise reduction method based on the frequency point gain smoothing is completed through the following steps:
Step 1, real-time speech signal framing:
the speech signal is a signal stream $x(n)$ obtained from device sampling or from a preceding algorithm;
each sample has a certain bit depth; samples at that bit depth are normalized so that $|x(n)| \le 1$;
in real-time processing, every $N$ samples form one frame, i.e. the $t$-th frame signal is $x_t(n) = x(tM + n)$, where $n = 0, 1, \dots, N-1$ and $M$ is the frame shift in samples;
Step 2, the speech signal $x_t(n)$ is transformed by FFT into the spectrum $X_t(k)$, where $X_t = \mathcal{F}(w \cdot x_t)$,
$w$ is the analysis window;
$\mathcal{F}$ is the discrete Fourier transform;
$X_t(k)$ is the complex spectrum, where $k = 0, 1, \dots, K-1$ and $k$ is the frequency point index;
Step 3, a signal processing noise reduction algorithm is used to obtain the gain $g_t(k)$ at each frequency point, with $g_t(k) \in \mathbb{R}$, i.e. the gain is real; with $Y_t(k)$ denoting the noise-reduced spectrum, $Y_t(k) = g_t(k)\, X_t(k)$;
Step 4, namely adopting the forward smoothing step;
step 5, namely adopting the backward smoothing step;
step 6, adopting the time smoothing;
Wherein, forward smoothing:
Take $g'(K) = 0$;
For $k = K-1, K-2, \dots, 0$: $g'(k) = \max\{\alpha\, g'(k+1),\ g(k)\}$;
where $\alpha$ is the gain attenuation coefficient;
the original gain is replaced by the new gain, $g(k) = g'(k)$;
where $g'(K)$ is the virtual gain of the frequency point just outside the highest frequency, used as the initial point of the calculation;
for each actual frequency point $k$, $g(k)$ is its original gain, obtained from a noise reduction algorithm;
$g'(k+1)$ is the adjusted gain of the preceding, higher frequency point;
In addition, backward smoothing:
Take $g'(-1) = 0$;
For $k = 0, 1, \dots, K-1$: $g'(k) = \max\{\alpha\, g'(k-1),\ g(k)\}$;
the original gain is replaced by the new gain, $g(k) = g'(k)$;
where $g'(-1)$ is the virtual gain of the frequency point just outside the lowest frequency, used as the initial point of the calculation;
for each actual frequency point $k$, $g(k)$ is its original gain, obtained from a noise reduction algorithm or from a previous adjustment;
$g'(k-1)$ is the adjusted gain of the preceding, lower frequency point;
In addition, time smoothing:
Take $g'_{-1}(k) = 0$;
For $t = 0, 1, 2, \dots$: $g'_t(k) = \max\{\alpha\, g'_{t-1}(k),\ g_t(k)\}$;
the original gain is replaced by the new gain, $g_t(k) = g'_t(k)$;
Here $g_t(k)$ is the gain of the $t$-th frame at the current frequency point $k$.
2. The adaptive noise reduction method based on frequency point gain smoothing of claim 1, wherein the step of:
preprocessing, namely framing a real-time voice signal, converting the signal into a frequency spectrum through FFT (fast Fourier transform), and obtaining a step 1 and a step 2;
ANS noise reduction, gain is obtained by using a noise reduction algorithm, and the step 3 is performed;
post filtering, namely adjusting the gains, namely steps 4,5 and 6;
Returning to the time domain signal, the spectrum is converted to the time domain by an inverse FFT transformation and synthesized into an audio signal, steps 7 and 8.
3. The adaptive noise reduction method based on frequency point gain smoothing according to claim 1, further comprising the steps of:
Step 7: updating the frequency spectrum signal;
Step 8: transforming back to the time-domain signal through an inverse Fourier transform;
wherein the synthesis window used for synthesizing the signal is determined according to the windowing mode used for analysis.
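Steps 7 and 8 can be illustrated with a short Python sketch. The update formula is not legible in this text, so the form used here (each spectrum bin multiplied by its real-valued gain) is an assumption, and a naive DFT stands in for the FFT to keep the example dependency-free. The names `dft`, `idft` and `denoise_frame` are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(frame: Sequence[float]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform (stands in for the FFT)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum: Sequence[complex]) -> List[complex]:
    """Naive inverse transform, including the 1/n normalisation."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def denoise_frame(frame: Sequence[float], gains: Sequence[float]) -> List[float]:
    """Apply one real gain per bin, then return to the time domain."""
    spectrum = dft(frame)
    # Step 7 (assumed form): a real gain scales the magnitude and keeps the phase.
    updated = [g * x for g, x in zip(gains, spectrum)]
    # Step 8: inverse transform. The imaginary part is dropped; it is only
    # rounding noise when the gains are symmetric, i.e. gains[k] == gains[n - k].
    return [v.real for v in idft(updated)]


frame = [0.0, 1.0, 0.5, -0.25, 0.0, 0.75, -1.0, 0.25]
unchanged = denoise_frame(frame, [1.0] * 8)   # unit gains reproduce the frame
halved = denoise_frame(frame, [0.5] * 8)      # uniform gain scales the frame
```

Because the gain is real, only the magnitude of each bin changes, which is why the noisy phase can be reused when returning to the time domain.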
4. The adaptive noise reduction method based on frequency point gain smoothing according to claim 3, wherein, when the frame parameters satisfy the required conditions and a Hanning window is used, the synthesis window is a unit window;
and the speech signal is synthesized frame by frame in overlap-add mode.
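The reason a unit synthesis window can suffice is sketched below in Python. The exact conditions of claim 4 are not legible in this text; the sketch assumes a periodic Hann (Hanning) analysis window and a frame shift of half the frame length, in which case the shifted window copies sum to exactly one and plain overlap-add reconstructs the signal. The names `periodic_hann` and `overlap_sum` are hypothetical.

```python
import math
from typing import List


def periodic_hann(n: int) -> List[float]:
    """Periodic Hann window: w[i] = 0.5 - 0.5 * cos(2*pi*i / n)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def overlap_sum(window: List[float], hop: int) -> List[float]:
    """Sum of all shifted window copies at each position within one hop.

    Requires the hop to divide the window length evenly.
    """
    n = len(window)
    if n % hop != 0:
        raise ValueError("hop must divide the window length")
    return [sum(window[i + j] for j in range(0, n, hop)) for i in range(hop)]


# 512-sample window with a 256-sample shift (50 % overlap).
total = overlap_sum(periodic_hann(512), 256)
is_constant_one = all(abs(s - 1.0) < 1e-12 for s in total)
```

With any other overlap sum, for example a non-constant one, the synthesis stage would need its own compensating window rather than a unit window.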
5. The adaptive noise reduction method based on frequency point gain smoothing according to claim 1, wherein, for an audio file with a sampling rate of 16 kHz, one frame is 32 ms long (512 samples) and the frame shift is 16 ms (256 samples).
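The frame parameters of claim 5 follow directly from the sampling rate, as this small Python sketch shows; the helper names `ms_to_samples` and `frame_starts` are hypothetical.

```python
from typing import List


def ms_to_samples(sample_rate_hz: int, duration_ms: int) -> int:
    """Convert a duration in milliseconds to a whole number of samples."""
    return sample_rate_hz * duration_ms // 1000


def frame_starts(num_samples: int, frame_len: int, frame_shift: int) -> List[int]:
    """Start indices of all complete frames in a signal of num_samples samples."""
    return list(range(0, num_samples - frame_len + 1, frame_shift))


frame_len = ms_to_samples(16000, 32)     # 32 ms at 16 kHz
frame_shift = ms_to_samples(16000, 16)   # 16 ms at 16 kHz
starts = frame_starts(16000, frame_len, frame_shift)  # one second of audio
```

At these settings consecutive frames overlap by half a frame, and one second of audio yields 61 complete frames.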
6. The adaptive noise reduction method based on frequency point gain smoothing according to claim 1, wherein the attenuation coefficients used in the forward smoothing and the backward smoothing take the same or different values;
the attenuation coefficient is adjusted according to the signal-to-noise ratio, that is, it is reduced when the signal-to-noise ratio is high and increased when the signal-to-noise ratio is low;
Replaced by
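One way to picture the signal-to-noise-dependent attenuation coefficient of claim 6 is the Python sketch below. The claim only states the direction of the adjustment (smaller coefficient at high signal-to-noise ratio, larger at low); the linear mapping, the decibel thresholds and the bounds used here are all assumptions for illustration, and the name `adaptive_alpha` is hypothetical.

```python
def adaptive_alpha(snr_db: float,
                   snr_low_db: float = 0.0,
                   snr_high_db: float = 20.0,
                   alpha_max: float = 0.9,
                   alpha_min: float = 0.5) -> float:
    """Map an SNR estimate to an attenuation coefficient.

    Hypothetical rule: alpha falls linearly from alpha_max at or below
    snr_low_db to alpha_min at or above snr_high_db.
    """
    if snr_high_db <= snr_low_db:
        raise ValueError("snr_high_db must exceed snr_low_db")
    # Position of the SNR inside the [low, high] range, clamped to [0, 1].
    t = (snr_db - snr_low_db) / (snr_high_db - snr_low_db)
    t = min(1.0, max(0.0, t))
    return alpha_max - t * (alpha_max - alpha_min)


noisy = adaptive_alpha(-5.0)   # low SNR: strongest smoothing
clean = adaptive_alpha(30.0)   # high SNR: weakest smoothing
```

Separate calls with different bounds could supply different coefficients to the forward and backward passes, matching the claim's allowance for the two values to differ.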
7. An adaptive noise reduction post-filter based on frequency point gain smoothing, characterized in that the post-filter performs its processing using the noise reduction method according to any one of claims 1-6.
8. The adaptive noise reduction post-filter based on frequency point gain smoothing according to claim 7, wherein the post-filter is configured independently for each of the forward smoothing, the backward smoothing and the time smoothing.
CN202411557581.9A 2024-11-04 2024-11-04 Adaptive noise reduction method based on frequency point gain smoothing and post-filter Active CN119068898B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411557581.9A CN119068898B (en) 2024-11-04 2024-11-04 Adaptive noise reduction method based on frequency point gain smoothing and post-filter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411557581.9A CN119068898B (en) 2024-11-04 2024-11-04 Adaptive noise reduction method based on frequency point gain smoothing and post-filter

Publications (2)

Publication Number Publication Date
CN119068898A (en) 2024-12-03
CN119068898B (en) 2025-02-07

Family

ID=93637299

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411557581.9A Active CN119068898B (en) 2024-11-04 2024-11-04 Adaptive noise reduction method based on frequency point gain smoothing and post-filter

Country Status (1)

Country Link
CN (1) CN119068898B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10043530B1 (en) * 2018-02-08 2018-08-07 Omnivision Technologies, Inc. Method and audio noise suppressor using nonlinear gain smoothing for reduced musical artifacts
CN118824277A (en) * 2024-07-26 2024-10-22 昂思科技(定南)有限公司 Adaptive speech noise reduction method based on ternary microphone array

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6941263B2 (en) * 2001-06-29 2005-09-06 Microsoft Corporation Frequency domain postfiltering for quality enhancement of coded speech
US11373667B2 (en) * 2017-04-19 2022-06-28 Synaptics Incorporated Real-time single-channel speech enhancement in noisy and time-varying environments

Also Published As

Publication number Publication date
CN119068898A (en) 2024-12-03

Similar Documents

Publication Publication Date Title
CN110767244B (en) Speech enhancement method
KR100549133B1 (en) Noise reduction method and apparatus
Lebart et al. A new method based on spectral subtraction for speech dereverberation
EP0788089B1 (en) Method and apparatus for suppressing background music or noise from the speech input of a speech recognizer
CN108172231B (en) A Kalman Filter-Based Reverberation Method and System
JP4512574B2 (en) Method, recording medium, and apparatus for voice enhancement by gain limitation based on voice activity
US20070255535A1 (en) Method of Processing a Noisy Sound Signal and Device for Implementing Said Method
CN101083640A (en) Low complexity noise reduction method
KR20010019603A (en) Speech enhancement method
CN114566179A (en) Time delay controllable voice noise reduction method
JP2025503325A (en) Method and system for speech signal enhancement with reduced latency - Patents.com
CN119418712A (en) A noise reduction method for real-time speech at the edge
Wisdom et al. Enhancement and recognition of reverberant and noisy speech by extending its coherence
Subramanya et al. A graphical model for multi-sensory speech processing in air-and-bone conductive microphones
Schröter et al. Lightweight online noise reduction on embedded devices using hierarchical recurrent neural networks
CN119068898B (en) Adaptive noise reduction method based on frequency point gain smoothing and post-filter
CN113160842B (en) A speech dereverberation method and system based on MCLP
CN117037825A (en) Adaptive filtering and multi-window spectrum estimation spectrum subtraction combined noise reduction method
Meutzner et al. Binaural signal processing for enhanced speech recognition robustness in complex listening environments
CN119517059B (en) A lightweight speech enhancement method based on frame resampling and subband pruning
Wang et al. Speech recognition using blind source separation and dereverberation method for mixed sound of speech and music
Li RESEARCH ON ENHANCEMENT AND DENOISING OF ENGLISH SPEECH IN COMPLEX NOISE ENVIRONMENTS
Paikrao et al. Analysis modification synthesis based Optimized Modulation Spectral Subtraction for speech enhancement
CN121393458A (en) Real-time audio noise reduction method and device based on deep learning
CN121545511A (en) Synchronous Speech Recognition System and Method Based on Large Language Model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant