Research On Regional Classification Of News Texts Oriented To Network Education

Posted on: 2020-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: M Meng
GTID: 2428330596979667
Subject: Computer system architecture
Abstract/Summary:
With the rapid development and popularization of the Internet, more and more educational news data are available online. To let readers who follow educational news conveniently obtain news for specific regions among China's 34 administrative regions according to their needs, it is of practical significance to collect a large number of educational news texts, build a labeled set of such texts, and design a regional classification method suited to them. This paper studies the regional classification of educational news texts. The main work can be summarized as follows:

(1) To reduce the cost of labeling text sets manually, this paper studies automatic labeling of web-based educational news text sets and proposes an automatic labeling method based on CGLTF-IDF feature extraction and semi-supervised clustering. First, texts are acquired by a web crawler and cleaned to provide a sample set for the research. Second, educational and geographical nouns are collected to form an educational geography lexicon, and the term frequency-inverse document frequency (TF-IDF) feature extraction method is improved; the resulting method, designed for educational news texts, is named CGLTF-IDF. Third, to form a high-quality labeled training set, a weight-based sample selection and marking strategy is proposed. Finally, an automatic marking model for educational news text sets based on CGLTF-IDF feature extraction and semi-supervised clustering is constructed and used to mark the text set. The experimental results show that this method can effectively mark the network education news text collection, thus providing a training data set for later research.

(2) To classify educational news texts by administrative region, this paper designs a voting-based regional classification method (V-ECM) for educational news texts. First, existing text classification algorithms and text representation methods are analyzed, and regional classification models based on Naive Bayes, a Convolutional Neural Network (CNN), and a Long Short-Term Memory network (LSTM) are designed and implemented. Second, according to the characteristics of these three classification models, a regional classification method based on a voting strategy is designed to realize the regional classification of educational news texts. Finally, V-ECM is applied to an actual education information system, where it meets the needs of users.
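The abstract does not state the exact voting rule used by V-ECM, so the following is only a minimal sketch of one common choice: plurality voting over the region labels predicted by the three models. The function name `vote`, the example labels, and the tie-break rule (prefer the earliest model in the list) are all assumptions for illustration, not details taken from the thesis.

```python
from collections import Counter
from typing import Sequence


def vote(predictions: Sequence[str]) -> str:
    """Combine per-model region labels by plurality vote.

    `predictions` holds one label per model, ordered by an assumed model
    priority (hypothetical: e.g. Naive Bayes, CNN, LSTM). If several labels
    share the highest vote count, the label from the highest-priority model
    among them is returned. This tie-break is an assumption.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")

    counts = Counter(predictions)
    best = max(counts.values())
    # Labels that received the highest number of votes.
    tied = {label for label, n in counts.items() if n == best}

    # Walk the models in priority order and return the first top-voted label.
    for label in predictions:
        if label in tied:
            return label
    raise AssertionError("unreachable: tied labels come from predictions")


if __name__ == "__main__":
    # Two of three models agree, so the majority label is chosen.
    print(vote(["Beijing", "Shanghai", "Beijing"]))
    # All three disagree, so the first model's label is used as the tie-break.
    print(vote(["Sichuan", "Beijing", "Shanghai"]))
```

With three models, a tie occurs only when all three disagree, so the tie-break effectively decides which single model is trusted in that case. A weighted variant would be a natural alternative if per-model accuracy were known.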
Keywords/Search Tags:Educational news, Text classification, Educational geography lexicon, Automatic marking, Feature extraction