
Research On Feature Selection And Classification Based On Audio Parameters

Posted on: 2021-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: H F Sun
Full Text: PDF
GTID: 2518306200953049
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of information technology, audio of many data types is being integrated into today's computer networks, and research on audio signal recognition and classification has produced a series of intelligent classification systems. A review of the existing literature on audio classification shows that the two most common signal types, speech and music, still face the following problems in classification: when a system uses few feature parameters, its accuracy needs improvement; when it uses many feature parameters, accuracy is higher, but the high feature dimensionality degrades real-time performance. In addition, music signals, an important part of audio data, are large in volume and rich in variety, so users spend considerable time searching for music they like. Music genre, as a high-level label, provides an effective basis for music retrieval. Most existing research on genre classification applies deep learning to public datasets, but the accuracy of current deep learning models in this field still leaves much room for improvement, and when the training set is small it is difficult to obtain a well-performing model.

To address these problems, this thesis conducts the following research:

1. For the classification of speech and music signals, a model is proposed that extracts only two feature parameters and uses no classifier. Experiments show that its accuracy is about 7.9% higher than that of methods that extract at most two audio features without a classifier, and 5.7% higher than that of methods that extract multiple audio features with a classifier, demonstrating that the model can still improve classification accuracy when few features are extracted.

2. For music genre classification based on deep learning, the audio signal is visualized (converted to a log-mel spectrogram), and an attention mechanism assigns different weights to different parts of the image. The feature parameters that contribute most to classification are selected by exploiting the inherent correspondence between the input and output terms.

3. A prototype network is used to identify the category of a signal more accurately when sample data are scarce. A model combining the prototype network with the attention mechanism is applied to the public GTZAN dataset. Experiments show that the prototype network alone reaches over 90% classification accuracy; adding the attention mechanism improves accuracy by a further 1%-2% and speeds up model convergence, validating the model on the GTZAN dataset.
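The attention idea above, weighting different regions of a log-mel spectrogram and pooling the weighted result, can be sketched as a simple softmax attention over time frames. This is an illustrative NumPy sketch, not the thesis's actual architecture: the scoring vector `w` stands in for learned attention parameters, and the input is random data in place of a real spectrogram.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(spectrogram, w):
    """Weight each time frame of an (n_mels, n_frames) log-mel
    spectrogram by an attention score, then pool over time.

    spectrogram : (n_mels, n_frames) array
    w           : (n_mels,) scoring vector (assumed learned)
    """
    scores = w @ spectrogram      # (n_frames,) raw frame scores
    alpha = softmax(scores)       # attention weights, sum to 1
    pooled = spectrogram @ alpha  # (n_mels,) weighted summary
    return pooled, alpha

rng = np.random.default_rng(0)
spec = rng.standard_normal((64, 100))  # stand-in 64-mel, 100-frame input
w = rng.standard_normal(64)
feat, alpha = attention_pool(spec, w)
print(feat.shape)  # (64,)
```

Frames with higher scores dominate the pooled feature vector, which is how attention can emphasize the spectrogram regions that contribute most to classification.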
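A prototype network classifies a query by its distance to class prototypes, where each prototype is the mean embedding of that class's support examples; this is why it works with little sample data. A minimal NumPy sketch under stated assumptions: the 2-D points below are random stand-ins for the embeddings a trained network would produce.

```python
import numpy as np

def prototypes(support, labels, n_classes):
    """Mean embedding per class from an (n, d) support set."""
    return np.stack([support[labels == c].mean(axis=0)
                     for c in range(n_classes)])

def classify(query, protos):
    """Assign each of the (m, d) queries to the nearest prototype,
    via a softmax over negative squared Euclidean distances."""
    d2 = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    probs = np.exp(-d2)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs

# Toy 3-way, 5-shot episode with well-separated clusters.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
labels = np.repeat(np.arange(3), 5)
support = centers[labels] + 0.1 * rng.standard_normal((15, 2))
protos = prototypes(support, labels, 3)
query = centers + 0.1 * rng.standard_normal((3, 2))
pred, _ = classify(query, protos)
print(pred)  # [0 1 2]
```

Because only the class means must be estimated, a few support examples per class suffice, which matches the small-training-set setting the thesis targets.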
Keywords/Search Tags:Audio classification, speech and music classification, genre classification, attention mechanism, prototype network