
Research On Chinese Short Text Classification Methods

Posted on: 2021-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Q Yang
Full Text: PDF
GTID: 2518306095490184
Subject: Measuring and Testing Technology and Instruments
Abstract/Summary:
With the rapid development of the Internet and the "Internet Plus" economic model, the information of daily human life has migrated to the network on a massive scale, and valuable information is buried in this flood of data. How to organize this information and extract its value quickly and efficiently has become a hot topic for both business and government. This paper studies this problem in three steps. Firstly, according to the characteristics of Chinese, the existing word vector model is improved to strengthen the representational power of the word vectors. Secondly, aiming at the shortcomings of existing feature extraction networks on Chinese text, corresponding improvements are made to raise the networks' feature extraction performance. Finally, on the basis of the improved word vectors and feature extraction networks, a text classification model based on multi-feature fusion is proposed. The main work of this paper includes the following aspects:

1. Aiming at the characteristics of Chinese, the existing word vector model is improved by introducing character information. The traditional Word2vec model takes the word as the smallest semantic unit, but in Chinese a single character often carries rich semantic information, and introducing character information can effectively improve the representational power of the word vector. To this end, a character-word vector model is proposed: a neural network first extracts features from the characters that make up a word, and these features are then concatenated with the pretrained word vector and compressed into a new word vector (see the sketch below). Experimental results show that the new word vector expresses semantics more strongly than the original word vector.

2. Aiming at the shortcomings of the traditional Convolutional Neural Network (CNN) in extracting features from text, a Dual Attention Convolutional Neural Network (DACNN) is constructed by introducing a non-local attention network and a channel attention network (see the sketch below). The non-local attention network effectively expands the receptive field of the convolution kernel and helps the network extract global features of the text. The channel attention network dynamically re-weights the features extracted by different filters, so that better text features are built in the final feature construction. The results show that DACNN has a stronger feature extraction ability than the traditional CNN.

3. In view of the fact that the traditional Long Short-Term Memory (LSTM) network ignores the order information among its neurons, this paper introduces the Ordered Neurons LSTM (ON-LSTM) network. By sorting and partitioning the neurons, ON-LSTM models the hierarchical structure of the sentence, and on this basis a hierarchical update mechanism is put forward (see the sketch below), which makes the updating of information in the LSTM memory cells more reasonable and helps the network extract better text features. The results show that this model has a stronger feature extraction ability.

4. On the basis of the above research, a Chinese text classification model based on multi-feature fusion is proposed (see the sketch below). Firstly, the improved word vectors are used to represent the text. Secondly, the improved feature extraction networks DACNN and ON-LSTM are combined into a dual-channel feature extraction network, avoiding the limitations of a single-channel network in feature extraction. Finally, an attention network is used to fuse the features extracted by the different channels effectively. The model thus improves on existing models in both word vector representation and feature extraction. The experimental results show that the proposed model is competitive.
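As an illustration of point 1, the following is a minimal PyTorch sketch of a character-enhanced word vector: character features are extracted by a small neural network and concatenated with a pretrained Word2vec vector. The single-convolution character encoder and all dimensions here are assumptions for illustration, not the exact architecture used in the thesis.

```python
import torch
import torch.nn as nn

class CharEnhancedEmbedding(nn.Module):
    """Concatenate a pretrained word vector with features extracted from the
    characters that make up the word (a sketch of point 1; the 1-D convolution
    character encoder is an assumption, not the thesis's exact network)."""

    def __init__(self, word_vectors, n_chars, char_dim=50, char_feat_dim=50):
        super().__init__()
        # Pretrained Word2vec vectors, frozen; shape (vocab_size, word_dim).
        self.word_emb = nn.Embedding.from_pretrained(word_vectors, freeze=True)
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # Character-level feature extractor (assumed: one 1-D convolution).
        self.char_conv = nn.Conv1d(char_dim, char_feat_dim, kernel_size=2, padding=1)

    def forward(self, word_ids, char_ids):
        # word_ids: (B, L); char_ids: (B, L, max_chars_per_word)
        w = self.word_emb(word_ids)                            # (B, L, word_dim)
        b, l, c = char_ids.shape
        ch = self.char_emb(char_ids.reshape(b * l, c))         # (B*L, C, char_dim)
        ch = self.char_conv(ch.transpose(1, 2))                # (B*L, feat, C')
        ch = ch.max(dim=-1).values.reshape(b, l, -1)           # (B, L, char_feat_dim)
        # Splice word-level and character-level features into the new vector.
        return torch.cat([w, ch], dim=-1)                      # (B, L, word_dim + char_feat_dim)
```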
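The two attention blocks that point 2 adds to the convolutional channel can be sketched as follows: a non-local (position-wise self-attention) block that lets every position attend to the whole sequence, and a squeeze-and-excitation style channel attention block that re-weights filter outputs. Kernel sizes, the reduction ratio, and the residual wiring are assumptions; the thesis's DACNN may differ in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalAttention(nn.Module):
    """Position-wise self-attention over a 1-D feature map, so every position
    can attend to every other one, widening the effective receptive field
    beyond the convolution kernel."""

    def __init__(self, channels):
        super().__init__()
        self.theta = nn.Conv1d(channels, channels, 1)
        self.phi = nn.Conv1d(channels, channels, 1)
        self.g = nn.Conv1d(channels, channels, 1)

    def forward(self, x):                                 # x: (B, C, L)
        q = self.theta(x).transpose(1, 2)                 # (B, L, C)
        k = self.phi(x)                                   # (B, C, L)
        v = self.g(x).transpose(1, 2)                     # (B, L, C)
        attn = F.softmax(q @ k / q.size(-1) ** 0.5, dim=-1)   # (B, L, L)
        return x + (attn @ v).transpose(1, 2)             # residual connection

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gate that re-weights the feature maps
    produced by different convolution filters."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                                 # x: (B, C, L)
        w = self.fc(x.mean(dim=-1))                       # (B, C) per-channel weights
        return x * w.unsqueeze(-1)
```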
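Point 3 relies on the ordered-neurons idea, whose hierarchical update can be written compactly with the cumulative-softmax (cumax) master gates of the published ON-LSTM formulation. The cell below sketches one time step under that formulation; the grouping of neurons into chunks and other engineering details of the thesis's network are omitted.

```python
import torch
import torch.nn as nn

def cumax(x, dim=-1):
    """Cumulative softmax: produces monotonically increasing gate values,
    which is what lets ON-LSTM rank neurons into a hierarchy."""
    return torch.cumsum(torch.softmax(x, dim=dim), dim=dim)

class ONLSTMCell(nn.Module):
    """One step of an ordered-neurons LSTM cell with the hierarchical
    update mechanism (a sketch following the published ON-LSTM equations)."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # Four standard LSTM gates plus two master gates, from one projection.
        self.proj = nn.Linear(input_size + hidden_size, 6 * hidden_size)
        self.hidden_size = hidden_size

    def forward(self, x, state):
        h, c = state
        gates = self.proj(torch.cat([x, h], dim=-1))
        i, f, o, g, mf, mi = gates.chunk(6, dim=-1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        f_master = cumax(mf)               # high-level information is kept longer
        i_master = 1.0 - cumax(mi)         # low-level information is rewritten more often
        omega = f_master * i_master        # overlap of the two master gates
        c = omega * (f * c + i * g) + (f_master - omega) * c + (i_master - omega) * g
        h = o * torch.tanh(c)
        return h, c
```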
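Finally, the dual-channel, attention-fused classifier of point 4 can be outlined as below. A plain convolution and a plain LSTM stand in for DACNN and ON-LSTM, and the single-layer attention over the two channel vectors is an assumed, simplified form of the fusion network described in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualChannelClassifier(nn.Module):
    """Dual-channel text classifier: a convolutional channel and a recurrent
    channel extract features in parallel, and an attention layer fuses them
    (a sketch of point 4; the real model uses the improved word vectors,
    DACNN, and ON-LSTM described above)."""

    def __init__(self, embed_dim, hidden, n_classes):
        super().__init__()
        self.cnn = nn.Conv1d(embed_dim, hidden, kernel_size=3, padding=1)
        self.rnn = nn.LSTM(embed_dim, hidden, batch_first=True)  # stand-in for ON-LSTM
        self.attn = nn.Linear(hidden, 1)   # scores each channel's feature vector
        self.cls = nn.Linear(hidden, n_classes)

    def forward(self, emb):                # emb: (B, L, embed_dim), already embedded text
        f_cnn = F.relu(self.cnn(emb.transpose(1, 2))).max(dim=-1).values  # (B, hidden)
        f_rnn, _ = self.rnn(emb)
        f_rnn = f_rnn[:, -1]                                               # (B, hidden)
        feats = torch.stack([f_cnn, f_rnn], dim=1)                         # (B, 2, hidden)
        w = torch.softmax(self.attn(feats), dim=1)                         # (B, 2, 1)
        fused = (w * feats).sum(dim=1)                                     # attention fusion
        return self.cls(fused)
```

For example, `DualChannelClassifier(embed_dim=350, hidden=128, n_classes=10)` would accept the concatenated character-word vectors from the first sketch and produce class logits; all of these sizes are illustrative only.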
Keywords/Search Tags:Chinese short text, Attention mechanism, Hierarchical update mechanism, Multi-feature fusion