Research On Classification Algorithm Of Railway Equipment Fault Information Based On Text Recognition

Posted on:2022-06-30

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Li

GTID:2492306542489624

Subject:Power electronics and electric drive

Abstract/Summary:

As the backbone of China's transportation, railways play an important role in the national economy and people's livelihood, and safety is the precondition for the orderly, stable operation of the railway system. With the rapid development of railway technology, new kinds of equipment are continuously put into operation, which raises new questions: which types of railway equipment have higher failure rates, how different equipment faults can be described in a structured way, and how fault descriptions can be mined for their underlying patterns. Answering these questions requires a text classification method for railway equipment inertia faults that can identify and classify massive volumes of railway fault text.

This thesis starts from the source of the fault text and expands the word segmentation thesaurus before text vectorization. Equipment names, quality standards and station names related to the railway system are collected from authoritative websites such as those of the National Railway Corporation and China Railway Inspection and Certification Center Co., Ltd., and a specialized thesaurus for the railway equipment domain is generated. Using Jieba word segmentation combined with this railway thesaurus, the equipment fault descriptions are segmented and stop words are removed, so that the resulting segmented fault text is closer to the result of manual processing. The word2vec algorithm is then used to vectorize the segmented text and obtain word vectors that represent the fault text, after which the LDA algorithm is used to extract features from the generated text vectors, providing the data for the subsequent study of classification algorithms.

Next, single classification models, namely decision tree, KNN, support vector machine and gradient boosting decision tree, are built on the processed data set, and the overall accuracy, recall and F1 score of each model are used as the evaluation criteria for classification performance. Following the combination rules of ensemble classifiers, each single classifier is then used as a base classifier for stacking ensemble learning, with a decision tree as the meta-classifier. Because the data set used in this thesis is strongly imbalanced, the Borderline-SMOTE algorithm is used to expand the minority classes, each base classifier is weighted by the ratio of its classification accuracy on the minority classes to its overall classification accuracy, and a railway fault text classification model based on weighted stacking ensemble learning is established.

The results show that the railway-domain word segmentation thesaurus can effectively represent the semantics of the original text: the cosine similarity and the Pearson correlation coefficient can reach nearly 0.9. Analysis of the experimental results shows that the weighted stacking ensemble learning model effectively improves accuracy on the minority classes. Compared with the single classifiers, its overall performance is greatly improved, and compared with the traditional stacking model, the evaluation metrics are also improved.
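The weighting rule described in the abstract (minority-class accuracy divided by overall accuracy for each base classifier) can be illustrated with a minimal sketch. The abstract does not say how the weights are normalized or how they enter the stacking model, so several parts here are assumptions of this example rather than the thesis's method: the normalization to a sum of 1, the weighted-vote meta-features, the function names, and the toy labels `signal` and `catenary`. "Accuracy on the minority classes" is read here as the share of minority-class samples that are predicted correctly.

```python
def accuracy_on(y_true, y_pred, classes):
    """Share of samples whose true label is in `classes` that are predicted correctly."""
    hits = [t == p for t, p in zip(y_true, y_pred) if t in classes]
    return sum(hits) / len(hits) if hits else 0.0


def base_weights(y_true, predictions, minority_classes):
    """Weight each base classifier by minority accuracy / overall accuracy.

    `predictions` maps a classifier name to its list of predicted labels.
    Normalizing the ratios so they sum to 1 is an assumption of this sketch.
    """
    raw = {}
    for name, y_pred in predictions.items():
        overall = accuracy_on(y_true, y_pred, set(y_true))
        minority = accuracy_on(y_true, y_pred, minority_classes)
        raw[name] = minority / overall if overall > 0 else 0.0
    total = sum(raw.values())
    if total == 0:
        # Fall back to equal weights when no classifier gets a minority sample right.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: r / total for name, r in raw.items()}


def weighted_meta_features(predictions, weights, classes):
    """One row per sample: summed weight of the base classifiers voting for each class.

    This is one possible way to feed weights to a meta-classifier (assumed here).
    """
    n_samples = len(next(iter(predictions.values())))
    rows = []
    for i in range(n_samples):
        row = [0.0] * len(classes)
        for name, y_pred in predictions.items():
            row[classes.index(y_pred[i])] += weights[name]
        rows.append(row)
    return rows


# Hypothetical toy data: "catenary" is the minority class.
y_true = ["signal"] * 6 + ["catenary"] * 2
predictions = {
    "tree": ["signal"] * 7 + ["catenary"],      # accurate overall, weak on the minority
    "knn": ["signal"] * 4 + ["catenary"] * 4,   # less accurate overall, strong on the minority
}
weights = base_weights(y_true, predictions, {"catenary"})
meta = weighted_meta_features(predictions, weights, ["signal", "catenary"])
print({name: round(w, 3) for name, w in weights.items()})
print([[round(v, 3) for v in row] for row in meta[-4:]])
```

On this toy data the classifier that handles the minority class better receives the larger weight (about 0.7 against 0.3), even though its overall accuracy is lower. In a full pipeline the rows of `meta` would be passed to the meta-classifier, which the abstract names as a decision tree; the sketch stops before that step.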

Keywords/Search Tags:

Railway equipment, text vector, Borderline-SMOTE, weighted stacking ensemble learning

Related items

1	Fault Of Railway Signal Equipment Based On Text Mining Classification Research
2	Power Load Variable Weighted Comprehensive Forecasting Based On Ensemble Learning
3	A Method Of Recognizing Railway Text Abnormal Labeling Data Based On Ensemble Learning
4	Research And Application Of Urban Waterlogging Depth Prediction Method Based On Stacking Ensemble Learning
5	Prediction And Analysis Of Domestic Used Car Prices Based On Ensemble Learning
6	Research On Electricity Theft Detection Method Based On Stacking Ensemble Learning
7	Research And Application Of Runoff Based On Ensemble Learning
8	Photovoltaic Power Generation Output Forecasting Based On Ensemble Learning
9	Research On Intelligent Operation Control Of High-speed Maglev Train Based On Stacking Ensemble Learning Method
10	Research On Fusion Prediction Model For Shield Equipment Fault Based On Ensemble Learning