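Many speech enhancement systems aim to reduce the noise in a corrupted speech signal. Although they
can suppress the noise effectively, they run into two main problems: speech distortion and residual
noise. To suppress the residual noise effectively, the corrupted speech signal is attenuated more
heavily; this, however, also attenuates the speech itself and increases speech distortion.
Conversely, attenuating the corrupted speech signal less lowers the speech distortion but leaves
more residual noise. How to strike a suitable trade-off between reducing residual noise and reducing
speech distortion therefore remains a question worth investigating. In this research project, we
hope to build a three-stage speech enhancement system that both reduces residual noise and
reconstructs damaged speech, thereby improving speech quality. The first stage uses a perceptually
based speech enhancement system to suppress the interfering noise while damaging the speech
components of the corrupted signal as little as possible. To suppress musical residual noise
effectively, the second stage analyzes the signal produced by the first stage, examines how the
spectral motion vectors vary across adjacent subbands and across several adjacent frames, identifies
the spectrum of the musical residual noise from the amount of variation in those vectors, and
suppresses it appropriately. The third stage reconstructs the vowel harmonics that the first stage
removed by mistake, refining the enhanced speech so that the processed audio is more comfortable to
listen to.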
Many speech enhancement systems attempt to reduce the background noise in a noisy speech signal.
Although the noise can be reduced effectively, the enhanced speech signal suffers from two major
problems: speech distortion and residual noise. Suppressing more of the residual noise requires
attenuating the noisy speech more heavily, which also attenuates the speech components, so speech
distortion increases and speech quality deteriorates. Conversely, the less the noisy speech is
attenuated, the lower the speech distortion, but the more residual noise remains. Accordingly, how to
strike a trade-off between reducing residual noise and reducing speech distortion is still a major
problem in designing a speech enhancement system. In this project, we aim at reducing the residual
noise and reconstructing the
distorted speech signal of a noisy input using a three-stage speech enhancement system. The first
stage is a perceptually based speech enhancement algorithm, which suppresses the background noise
while preserving the speech components as far as possible. The second stage detects and suppresses
musical residual noise. We will investigate how the residual noise varies in the speech denoised by
the first stage by analyzing the motion vectors of each spectral bin across neighboring subbands and
across successive frames. Musical tones are then detected from these variations and suppressed
moderately, so that suppressing one musical tone does not give rise to another. To improve speech
quality further, the third stage reconstructs the harmonics that were destroyed by the corrupting
noise or removed by the first-stage speech enhancement algorithm. Consequently, the denoised speech
is refined both by reconstructing the low-energy vowel harmonics that were removed by the speech
enhancement system or destroyed by the corrupting noise, and by suppressing the musical residual
noise in the denoised speech signal. As a result, the processed speech is more comfortable to listen
to.
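
The trade-off between residual noise and speech distortion described above can be illustrated with a
small numerical sketch. This is not the perceptual first-stage algorithm of the project; it swaps in
plain power spectral subtraction with an over-subtraction factor and a spectral floor, which is only
an assumed stand-in. The names subtraction_gain, tradeoff, alpha and beta, the synthetic harmonic
spectrum, and the assumption that speech and noise powers simply add are all hypothetical choices
made for illustration.

```python
import math
import random


def subtraction_gain(noisy_power, noise_power_est, alpha, beta=0.01):
    """Per-bin gain of power spectral subtraction with a spectral floor.

    alpha is the over-subtraction factor; beta is the floor on the squared gain.
    """
    gains = []
    for y2 in noisy_power:
        g2 = 1.0 - alpha * noise_power_est / y2 if y2 > 0.0 else beta
        gains.append(math.sqrt(max(g2, beta)))
    return gains


def tradeoff(alpha, num_bins=64, seed=0):
    """Return (speech distortion, residual noise) energies for a given alpha."""
    rng = random.Random(seed)
    # Hypothetical clean magnitude spectrum: one harmonic on every 8th bin.
    speech = [10.0 if k % 8 == 0 else 0.0 for k in range(num_bins)]
    # Hypothetical noise magnitudes with roughly unit power on average.
    noise = [abs(rng.gauss(0.0, 1.0)) for _ in range(num_bins)]
    # Assumption: speech and noise are uncorrelated, so their powers add.
    noisy_power = [s * s + n * n for s, n in zip(speech, noise)]
    noise_power_est = sum(n * n for n in noise) / num_bins
    gains = subtraction_gain(noisy_power, noise_power_est, alpha)
    # Distortion: speech energy lost because the gain is below one.
    distortion = sum(((1.0 - g) * s) ** 2 for g, s in zip(gains, speech))
    # Residual noise: noise energy that survives the gain.
    residual = sum((g * n) ** 2 for g, n in zip(gains, noise))
    return distortion, residual


for alpha in (1.0, 2.0, 4.0):
    d, r = tradeoff(alpha)
    print(f"alpha={alpha}: distortion={d:.4f}, residual noise={r:.4f}")
```

Because the per-bin gain can only fall as alpha grows, the residual noise energy shrinks while the
speech distortion energy grows in this simplified model, which is the tension the three-stage design
is meant to relieve.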