Safe production concerns the interests of the people and the stable development of society, and it is also the basis for the stable development of enterprises. With the deepening of informatization and intelligent technology and the adoption of hidden danger self-examination and self-reporting platforms, supervision departments and enterprises have accumulated a large amount of hidden danger text data. Making full use of these data to identify irresponsible enterprises gives government supervision departments a basis for accurate inspection and accurate law enforcement under limited human resources, and it is of great significance for enterprises seeking to improve their awareness of self-prevention and avoid production accidents. Hidden danger texts are short, their feature matrices are sparse, and expanding their features with an external corpus easily introduces noisy data. To address this, a feature extension method based on the WTTM (Word-network Triangle Topic Model) topic model is adopted. On this basis, category features are added, so that the original text features are supplemented with additional features that enrich the semantic information of the text and improve the classification effect. Because the features extended by the topic model should not all be weighted equally, a weight calculation formula for topic model extended features is proposed. It combines the semantic correlation between the extended features and the original features, the importance of the topic in the original text, and the influence of word type on the features, so as to better express the importance of each extended feature. Because category features cannot be selected at random for feature expansion, a formula for calculating the importance of category features is proposed. It combines the influence of category feature words on the classification results with the correlation between the current feature word and the original text, so that better category features can be selected as supplements to the text content. To solve the feature selection problem, a feature selection model, DC-LSTM (Dilated Convolution and Convolutional Neural Networks and Long Short-Term Memory), is proposed based on the CTC2 (Connectionist Temporal Classification) feature selection model and the feature extension framework. DC-LSTM combines an improved one-dimensional convolutional network with LSTM (Long Short-Term Memory): the pooling layer of the one-dimensional convolution is removed and dilated convolution is added, which enlarges the receptive field of the convolution, strengthens its ability to extract context, and lets the subsequent LSTM play its full role. Finally, the channel features are fused and fed into the LSTM to extract features again. In summary, this paper approaches the problem from the perspective of text classification and puts forward FE-DCLSTM, a method that combines feature extension and feature selection. A large number of comparative experiments are designed on a competition dataset. The experimental results show that the method significantly improves the accuracy of score classification on the hidden danger text dataset used in this paper.
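The convolutional part of DC-LSTM described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function names `dilated_conv1d`, `receptive_field`, and `fuse_channels` are hypothetical, the zero padding and odd kernel size are assumptions made so that every channel keeps the input length, and channel fusion is assumed to be simple per-time-step concatenation. The sketch only shows why a dilated one-dimensional convolution without pooling sees a wider context while preserving the sequence length that a subsequent LSTM would consume.

```python
def dilated_conv1d(seq, kernel, dilation=1):
    """1-D convolution with `dilation` steps between kernel taps.

    Assumptions of this sketch: odd kernel length, zero padding on both
    sides, and no pooling, so the output has the same length as `seq`.
    """
    half = (len(kernel) // 2) * dilation
    padded = [0.0] * half + list(seq) + [0.0] * half
    return [
        sum(w * padded[t + k * dilation] for k, w in enumerate(kernel))
        for t in range(len(seq))
    ]


def receptive_field(kernel_size, dilations):
    """Number of input positions seen by one output of stacked conv layers."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


def fuse_channels(*channels):
    """Concatenate channel outputs per time step (assumed fusion strategy).

    Each returned tuple is the feature vector an LSTM would read at that step.
    """
    return list(zip(*channels))


if __name__ == "__main__":
    tokens = [1, 2, 3, 4, 5]          # toy one-dimensional feature sequence
    kernel = [1, 1, 1]                # toy kernel of size 3
    plain = dilated_conv1d(tokens, kernel, dilation=1)
    wide = dilated_conv1d(tokens, kernel, dilation=2)
    print(plain, wide)
    print(fuse_channels(plain, wide))
    # Three stacked layers: ordinary vs. dilated (1, 2, 4) receptive fields.
    print(receptive_field(3, [1, 1, 1]), receptive_field(3, [1, 2, 4]))
```

With the same kernel size and depth, the dilated stack covers roughly twice as many input positions as the ordinary one, and because no pooling is applied the fused sequence still has one feature vector per input position.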