
Fault Diagnosis Of Signal Equipment On The Lanzhou-Xinjiang High-Speed Railway Using Machine Learning For Natural Language Processing

Posted on: 2023-10-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Zhu
Full Text: PDF
GTID: 2532306848480384
Subject: Transportation engineering
Abstract/Summary:
The Lanzhou-Xinjiang High-Speed Railway, also known as the Lanxin High-Speed Railway or the Lanxin Passenger Dedicated Line, is an important support for the national "One Belt One Road" strategy. However, because of the complex geography and volatile climate along the route, the signal equipment is prone to a variety of failures that seriously affect the safe and efficient operation of the line. Over long-term operation and maintenance, the railway signaling and communications department has recorded a large amount of unstructured fault text in natural language, covering the time and location of each fault, its symptoms, its category, and the troubleshooting steps that followed. When handling on-site faults, however, maintenance personnel have long diagnosed them manually from personal experience and expert knowledge, without effectively analyzing the accumulated fault data, so the value contained in these records has gone unused. To respond to China's big data development strategy and promote the application of big data in railway safety, it is therefore important to investigate a fault diagnosis method that makes effective use of fault record text, improving both the efficiency of signal equipment fault diagnosis and transport safety.

First, fault records for railway signal equipment in China are mostly unstructured short Chinese texts that contain many specialized railway signal terms mixed with numbers, letters, and special symbols, and they cannot be analyzed effectively with traditional techniques. This thesis therefore uses data mining to find high-frequency words and combines them with professional vocabulary from the railway signal domain to build a fault thesaurus for that domain. On this basis, the Jieba Chinese word segmentation tool, based on a Hidden Markov Model, is used to segment the fault text and remove stop words. The results show that the custom railway signal thesaurus effectively resolves incorrect segmentation and missed segmentation, which provides a sound basis for subsequent feature extraction.

Then, the VSM (Vector Space Model) method is used to transform the segmented fault text into the term feature space. Because traditional term features take insufficient account of the implicit semantic connections in the text, this thesis uses the LDA (Latent Dirichlet Allocation) topic model to extract features from the fault records. After an appropriate number of topics is selected through repeated experiments, the original fault information is transformed into the topic feature space, in which terms are associated with topics. This links semantics to term features and reduces the dimensionality of the fault text, which facilitates subsequent fault diagnosis.

Finally, statistics on the signal equipment fault data of the Lanzhou-Xinjiang High-Speed Railway show that fault samples are unevenly distributed. This thesis therefore diagnoses faults using machine learning combined with NLP (Natural Language Processing). Fault classifiers are trained by pairing the traditional vector space model and the topic space model with machine learning classification algorithms such as SVM (Support Vector Machine), NB (Naive Bayes), LR (Logistic Regression), RF (Random Forests), and KNN (K-Nearest Neighbor). Experiments on the Lanxin High-Speed Railway signal equipment fault text compare the different combinations by precision, recall, and F1 measure to verify the effectiveness of the proposed method. The experiments show that the SVM classifier combined with the LDA topic model reaches an accuracy of 0.84. This verifies that natural language processing can make effective use of the fault text that the signaling and communications department has recorded over a long period to diagnose signal equipment faults, and the result offers some guidance for the maintenance of field signal equipment.
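
The abstract reports that fault samples are unevenly distributed and that classifier combinations are compared by precision, recall, and F1 measure. The sketch below is a minimal, standard-library illustration of how those per-class metrics can be computed and why they are informative alongside accuracy when classes are imbalanced. It is not the thesis's evaluation code: the label names `common_fault` and `rare_fault`, the toy label lists, and the helper names `per_class_prf`, `macro_f1`, and `accuracy` are all assumptions made up for this example.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gains a false positive
            fn[truth] += 1      # true class gains a false negative
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 8 samples of a frequent class, 2 of a rare class,
# and a degenerate classifier that always predicts the frequent class.
y_true = ["common_fault"] * 8 + ["rare_fault"] * 2
y_pred = ["common_fault"] * 10

scores = per_class_prf(y_true, y_pred)
print("accuracy:", accuracy(y_true, y_pred))
print("per-class (precision, recall, F1):", scores)
print("macro F1:", macro_f1(scores))
```

In this made-up case the always-majority classifier looks reasonable on accuracy yet scores zero recall on the rare class, which is the kind of failure that per-class precision, recall, and F1 expose.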
Keywords/Search Tags:Railway Signal Equipment, Natural Language Processing, Topic Model, Support Vector Machine, Fault Diagnosis