
Research On Short Text Classification Method For Intelligence Analysis

Posted on: 2021-04-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhao
Full Text: PDF
GTID: 2518306047482254
Subject: Computer Science and Technology
Abstract/Summary:
Short text classification is an important task in natural language processing, and the FastText classifier is one of the most prominent text classifiers. Short text classification based on FastText therefore has considerable research value. Existing short text classification techniques share a common problem: the text carries too little semantic information. In addition, when the training data set is large and its class hierarchy is pronounced, the semantics of the classification labels become fuzzy and overlap. The two-stage training architecture of currently popular short text classification models, such as BERT, also has a high training time cost. Given the massive volume of short text in the field of Chinese intelligence, it is essential to extract high-quality intelligence quickly and so gain valuable time for further decision-making. To address these problems, this thesis studies short text classification methods for intelligence analysis.

First, this thesis proposes CSE (encoding semantic enhancement) and ASE (autonomous semantic enhancement) to address the lack of semantic information when the FastText classifier is applied to short text. The process has three stages: preprocessing the news intelligence data set, which includes data cleaning, format adjustment, Jieba word segmentation, and removal of stop words and of high-frequency uninformative words identified by the TF-IDF algorithm; encoding, which includes building a mathematical language model and encoding the extended corpus into short-text semantic information; and model training, which includes partitioning the data set and using high-quality data for model tuning. Classifying short news headlines with FastText CA, the improved classifier that combines CSE and ASE, yields higher accuracy than the original FastText classifier.

Second, to deal with category labels whose semantics tend to overlap, this thesis proposes MSOA, a multi-strategy hierarchical optimization algorithm. The algorithm proceeds as follows: manual feature selection and grouping of the classification labels; a mathematical description of the hierarchical structure of the labels; formulation of standardized optimization schemes; and execution of the MSOA optimization in local mode. The classification results produced by this algorithm are more precise; for example, the results of the experiments in this thesis are specific to each entity class.

Finally, using a Chinese news intelligence data set as experimental data, six experiments were carried out: the original FastText classifier, traditional classifiers, CSE-improved FastText, ASE-improved FastText, CSE+ASE-improved FastText (FastText CA), and FastText CA with the MSOA optimization algorithm. The results were compared against the corresponding baselines. The analysis shows that the proposed CSE and ASE semantic enhancement techniques and the MSOA optimization algorithm effectively improve the accuracy of the FastText classifier on short text classification and, when the data set has a pronounced hierarchical structure, effectively eliminate the semantic ambiguity and overlap of classification labels.
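The preprocessing stage described in the abstract (stop-word removal plus filtering of high-frequency uninformative words) can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the sample headlines, the stop-word list, and the `min_idf` threshold are all hypothetical, the headlines are assumed to be already segmented into tokens (for example by Jieba), and a plain inverse-document-frequency cutoff stands in for the TF-IDF filtering step. The `__label__` prefix is the labeling convention of FastText's supervised input format.

```python
import math
from collections import Counter


def filter_tokens(docs, stop_words, min_idf):
    """Drop stop words and tokens whose IDF is below min_idf.

    docs is a list of token lists (one per headline). A low IDF means the
    token appears in most headlines and carries little class information.
    """
    n_docs = len(docs)
    # Document frequency: number of headlines that contain each token.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}
    return [
        [tok for tok in doc if tok not in stop_words and idf[tok] >= min_idf]
        for doc in docs
    ]


def to_fasttext_line(label, tokens):
    """Format one example as a FastText supervised training line."""
    return f"__label__{label} " + " ".join(tokens)


# Hypothetical, already-segmented headlines with hypothetical labels.
docs = [
    ["news", "the", "army", "holds", "drill"],
    ["news", "market", "shares", "rise"],
    ["news", "the", "team", "wins", "final"],
]
labels = ["military", "finance", "sports"]
stop_words = {"the"}

cleaned = filter_tokens(docs, stop_words, min_idf=0.5)
lines = [to_fasttext_line(lab, toks) for lab, toks in zip(labels, cleaned)]
for line in lines:
    print(line)
```

In this toy data the token "news" occurs in every headline, so its IDF is zero and it is filtered out, while "the" is removed as a stop word. The resulting lines could be written to a file and passed to a FastText supervised trainer.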
Keywords/Search Tags:Text categorization, FastText, Semantic Enhancement, Semantic Intersection, News Intelligence