Microphone array speech signal processing is widely used in tasks such as telephone and video conferencing, smart speakers, and human-computer voice interaction. This paper is devoted to front-end speech signal processing for microphone arrays and focuses on three parts: speech endpoint detection, three-dimensional stereo microphone array design, and de-reverberation system design. The main research work of this paper is summarized as follows:

1. Voice endpoint detection (VAD) under low signal-to-noise ratio: As a very important part of speech signal processing, voice endpoint detection can greatly reduce the time and resources required for subsequent speech signal processing. Under low signal-to-noise ratio, endpoint detection algorithms such as the dual-threshold method and the short-time energy-entropy ratio method suffer varying degrees of distortion and offset in the discriminant curve and the discriminant threshold, which significantly reduces the accuracy of endpoint detection. To address these problems, first, the empirical mode decomposition (EMD) algorithm is used to decompose and reconstruct the speech signal, retaining layers 2 to 5 of the intrinsic mode functions (IMFs), which mainly contain the speech signal. Second, the existing energy-entropy ratio method is improved: subband spectral entropy replaces the original spectral entropy, and a new energy calculation formula and a new spectral entropy probability calculation formula are adopted. Combining these two points, a speech endpoint detection algorithm based on EMD and an improved energy-entropy ratio is proposed. Experiments show that the proposed method outperforms traditional endpoint detection algorithms and still achieves more than 70% endpoint detection accuracy at a signal-to-noise ratio of -15 dB.

2. Configuration design of a three-dimensional stereo microphone array: Because the selection of
microphone arrays currently on the market cannot meet the needs of some scenarios with high spatial-information requirements, targeted research on and design of three-dimensional stereo arrays is carried out. First, based on certain criteria, a sixteen-element three-ring stereo microphone array is obtained through simulation design. Second, considering that the array design is difficult to optimize, the genetic algorithm is innovatively applied to the design of the stereo microphone array. The main steps are: constructing an objective function that takes into account both the main lobe width and the maximum side lobe level, designing a genetic algorithm procedure suited to the population coding, evolution, mutation and selection of the stereo microphone array, and optimizing a Hat-shaped stereo microphone array. The simulation results for the final array pattern and for direction of arrival (DOA) estimation show that the Hat-shaped stereo microphone array and the sixteen-element three-ring stereo microphone array designed in this paper perform well in beam directivity, main lobe width, maximum side lobe level and DOA resolution, and outperform the common microphone arrays on the market. Of the two, the Hat-shaped stereo microphone array performs better, which demonstrates the feasibility of the genetic algorithm in the design of stereo microphone arrays.

3. Design of the de-reverberation system for the stereo microphone array: To support follow-up research on applying the array to complex problems such as the "cocktail party" problem, sound source localization and speech recognition, the designed stereo microphone array is implemented as a hardware acquisition sound card, and the host computer is designed. In addition, considering the impact of room reverberation on subsequent algorithms, the system uses an improved variance-normalized delayed linear prediction (NDLP) algorithm, based on delayed linear prediction (DLP), to perform multi-channel de-reverberation and ensure high quality and high reliability. The
final measured results show that, in different scenarios, de-reverberation of both single-speaker and multi-speaker speech performs very well, which provides a favorable basis for subsequent extended applications.
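
To illustrate the kind of feature that the endpoint detection work in item 1 builds on, the following is a minimal sketch of a per-frame energy-entropy ratio computed with subband spectral entropy. It is not the improved method proposed in the paper: the function name `energy_entropy_ratio`, the frame length, hop size, number of subbands, and the particular ratio form `sqrt(1 + energy / entropy)` are all assumptions chosen for illustration, and the paper's new energy and probability formulas, as well as the EMD pre-processing, are not reproduced here.

```python
import numpy as np

def energy_entropy_ratio(signal, frame_len=256, hop=128, n_subbands=16):
    """Per-frame ratio of energy to subband spectral entropy (illustrative).

    All parameter defaults are assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    ratios = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Group FFT bins into equal-width subbands and sum the power in each.
        usable = (len(power) // n_subbands) * n_subbands
        band_power = power[:usable].reshape(n_subbands, -1).sum(axis=1)
        energy = band_power.sum()
        # Treat normalized subband power as a probability distribution.
        prob = band_power / (energy + 1e-12)
        entropy = -np.sum(prob * np.log(prob + 1e-12))
        # Speech-like frames tend to have high energy and low entropy,
        # so the ratio is large for them and small for noise-like frames.
        ratios[i] = np.sqrt(1.0 + energy / (entropy + 1e-12))
    return ratios
```

In a detector of this general type, the per-frame ratio would be compared against one or more thresholds to mark speech onsets and offsets. How those thresholds are chosen, and how they behave at low signal-to-noise ratio, is exactly the part the paper's improvements target and is left out of this sketch.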