Study On The Classification Of Xiaomi Voice Text Based On Deep Learning

Posted on: 2020-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: X F Liu
GTID: 2428330596981772
Subject: Master of Applied Statistics
Abstract/Summary:
With the rapid improvement of network technology and mobile phone performance, the mobile phone has become a common part of daily life. Mobile phones generate a large amount of text every day, such as SMS messages, chat records, comments, and news. On smartphones in particular, people often chat by voice in WeChat and QQ and use the phone's built-in voice assistant, so the volume of voice text data is growing rapidly. If computers could automatically identify and process this information, efficiency would improve greatly. We therefore use deep learning to classify the text data from the Xiaomi phone's voice assistant. This allows users' daily needs to be analyzed accurately, which can guide the improvement and development of particular smartphone functions.
This paper first reviews the traditional theory and techniques of speech text classification, and then introduces the Word2vec, CNN, and LSTM models used in deep learning. The experiments use voice text data from the voice assistant of Xiaomi phones. The data set consists of text data and classification labels, with 10,000 training samples, 2,000 validation samples, and 1,000 test samples. The raw data is segmented into words with jieba, and words irrelevant to classification are then removed to obtain the data in the required form.
Two families of methods are compared. In the first, a vector space model is adopted: the text is vectorized with TF-IDF and classified with machine learning classification algorithms. In the second, word vectors are obtained with the Word2vec model, and the speech text is then classified with the CNN and LSTM models. The most effective of these methods is identified and used to classify the test set, and suggestions are put forward for the function corresponding to the most frequent category. The conclusions of the empirical analysis are as follows:
First, when short speech texts are classified with traditional machine learning methods on a vector space model, the support vector machine performs best, followed by random forest and logistic regression, and naive Bayes performs worst. Second, when short speech texts are classified with word vectors from the Word2vec model and the CNN and LSTM deep learning models, the long short-term memory network performs slightly better than the convolutional neural network. Deep learning outperforms the traditional machine learning algorithms by 7 percentage points. Based on this, two suggestions are proposed. For similar short voice text data, Word2vec combined with a long short-term memory network can be used for classification. For Xiaomi phones, more attention should be paid to improving or innovating the music (music), alarm (alarm clock), and samrtMiot (smart home) functions.
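As an illustration of the vector space step described in the abstract, the following is a minimal sketch of TF-IDF vectorization written in plain Python. It is not the thesis's implementation: the tokens and labels are invented English stand-ins for jieba-segmented Chinese utterances, the smoothed IDF formula is one common variant chosen here as an assumption, and the 1-nearest-neighbour rule is a simple stand-in for the classifiers the thesis compares (support vector machine, random forest, logistic regression, naive Bayes).

```python
import math
from collections import Counter

# Hypothetical, already-segmented utterances. The thesis segments Chinese text
# with jieba; these English tokens and labels are invented for illustration.
docs = [
    (["play", "a", "song"], "music"),
    (["play", "some", "music"], "music"),
    (["set", "an", "alarm"], "alarm"),
    (["wake", "me", "at", "seven"], "alarm"),
]

def fit_idf(token_lists):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(token_lists)
    # Count each token once per document to get document frequencies.
    df = Counter(tok for toks in token_lists for tok in set(toks))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def tfidf(tokens, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen tokens are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def cosine(u, v):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

idf = fit_idf([toks for toks, _ in docs])
train = [(tfidf(toks, idf), label) for toks, label in docs]

def predict(tokens):
    """1-nearest-neighbour by cosine similarity (a stand-in classifier)."""
    query = tfidf(tokens, idf)
    return max(train, key=lambda pair: cosine(query, pair[0]))[1]

print(predict(["play", "music"]))
print(predict(["set", "alarm"]))
```

In practice a library vectorizer and classifier would likely replace the hand-written functions. The sketch only shows how a short utterance becomes a weighted sparse vector that a classifier can consume.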
Keywords/Search Tags: Voice text data, Deep learning, Vector space model, Voice text classification