This research focuses on natural language processing (NLP) methods for analyzing Chinese document data about private enterprises in the non-public economy. The extracted information helps the government or supervising departments assess the development of these enterprises and gauge public opinion about them. First, this paper defines the two NLP tasks that make up the major work of the research: the classification of enterprise development and the classification of public opinion. For each task, it proposes a measurement method, verified by experts in economics and social studies, for judging the sentiment of the document data. Second, this paper proposes an enterprise development analysis method based on a Naive Bayes classification model and a long-document enterprise public opinion analysis method based on BERT, which address the two tasks respectively. In addition, it proposes an event-based analysis method that uses the relation between documents and events to increase the accuracy of the classification models. Finally, named entity recognition is used to match document data to enterprises more accurately, and, together with other automation techniques including public opinion heat calculation, preprocessing, and keyword extraction, this yields a solution for scoring enterprise development status and issuing real-time warnings about enterprise public opinion. Experiments verify the performance of the two enterprise analysis models: the enterprise development classification model reaches an accuracy of 89.24% and the enterprise public opinion classification model reaches 96.51%, demonstrating the effectiveness of the work.
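
To make the Naive Bayes classification idea mentioned above concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. It is an illustration only, not the paper's implementation: the function names `train_naive_bayes` and `predict`, the two labels, and the toy English tokens are all assumptions made for the example, and real Chinese documents would first need word segmentation, which is not shown here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and token occurrences per class.

    docs is a list of token lists (already segmented); labels is a
    parallel list of class names. Both are hypothetical inputs.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log posterior for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        # Log prior: fraction of training documents in this class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one smoothing keeps unseen class/token pairs non-zero.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training data (not from the paper).
docs = [
    ["revenue", "growth", "expansion"],
    ["profit", "growth", "hiring"],
    ["loss", "layoffs", "debt"],
    ["debt", "lawsuit", "loss"],
]
labels = ["positive", "positive", "negative", "negative"]
model = train_naive_bayes(docs, labels)
```

Calling `predict(model, ["growth", "profit"])` on this toy model selects the class whose training documents contain those tokens most often. Log probabilities are summed rather than multiplying raw probabilities so that long documents do not underflow.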