
Research And Application Of Prediction And Classification Model Based On Sequential Feature Analysis

Posted on: 2021-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Liu
GTID: 2428330605968123
Subject: Electronic and communication engineering
Abstract/Summary:
With the advent of the big data era, handling sequential data has become a research hotspot in computer science. Sequential data is a set of observations with a sequential relationship and appears in many areas of life, for example ECG signals, daily temperatures, total weekly sales, fund and stock prices, and amino acid sequences. Its most important property is continuity: sequential data characterized by numerical values and continuity is treated as a whole rather than as a single numeric field. With the development of natural language processing (NLP) technology, more and more frontier work on sequential data draws on ideas and algorithms from NLP. Although research on sequential data is receiving increasing attention, most studies and applications outside the NLP field focus on predicting sequential outcomes, such as stock trend prediction and flight passenger prediction. In these studies the output of the model, that is, the prediction target, is sequential. The sequential features used during model construction, however, have not received enough attention.

To address this problem, this thesis borrows methods from natural language processing and proposes a word2vec-LSTM based framework for problems with sequential features. Project examples from two categories, fitting and classification, are used to describe the application and analyze the results.

For fitting problems with sequential features, this thesis proposes a gas load prediction model based on the Tem2vec-LSTM framework. The word2vec algorithm is used to map the time series feature (temperature) in the gas load forecasting problem to a high-dimensional dense vector that carries more latent information. A long short-term memory (LSTM) network, which performs well on time series data, is used for modeling. Load data from the same period and the month, year and day information are fully used for short-term load forecasting. This framework plays an important role in the subsequent energy deployment and operational decision-making of energy companies, and it adds to the available solutions for fitting problems with sequential features.

For classification problems with sequential features, a neoantigen classification model based on the AA2vec-GRU framework is proposed. AA2vec (amino acid to vector) is used to map the sequential feature, the amino acid sequence, to a high-dimensional dense vector that carries more latent information. A gated recurrent unit (GRU) neural network, a simplified version of the LSTM neural network with faster convergence, is used for modeling. The results were compared with currently popular methods, and the experiments showed that the proposed model performs better on this classification problem.

The highlights of the framework proposed in this thesis are as follows:
(1) For prediction and classification problems with sequential features, a framework based on word2vec-LSTM is proposed with reference to natural language processing methods. It enables the neural network to discover and learn the intrinsic relationships among the sequential features and improves the accuracy of the model output.
(2) For the gas load forecasting problem, a load forecasting model based on Tem2vec-LSTM is proposed, in which the sequential feature of this problem, temperature, is mapped to a high-dimensional vector. In comparison experiments, the model achieved better fitting results and higher accuracy.
(3) For the neoantigen classification problem, a classification model based on AA2vec-GRU is proposed that takes the sequential relationships within the amino acid sequence into account.
(4) In the model training stage, the thesis introduces several DenseNet layers into the network model. DenseNets have few parameters, high computing efficiency, good resistance to overfitting and strong generalization ability. The combination of LSTM layers and DenseNet layers made the network more efficient and accurate.
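The first step of a word2vec-style embedding, as used in the Tem2vec and AA2vec mappings described in the abstract, is to turn a sequential feature into tokens and then into (center, context) training pairs. The sketch below illustrates only that preparation step in plain Python. The abstract does not say how temperatures are tokenized or which window size is used, so the 1-degree binning, the window of 1, the function names, and the sample inputs (including the example peptide string) are all assumptions made for illustration, not the thesis's actual settings.

```python
from typing import List, Tuple


def temperature_tokens(temps: List[float], bin_width: float = 1.0) -> List[str]:
    """Discretize temperatures into string tokens.

    Assumption: fixed-width binning; the thesis does not specify its scheme.
    """
    return [f"T{int(t // bin_width)}" for t in temps]


def kmer_tokens(sequence: str, k: int = 1) -> List[str]:
    """Split an amino acid string into overlapping k-mers (k=1: single residues)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build (center, context) pairs for skip-gram training.

    Every token is paired with each neighbor at most `window` positions away.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical daily temperatures (degrees Celsius), not data from the thesis.
temps = [3.2, 4.8, 4.1, 7.9]
temp_tokens = temperature_tokens(temps)
temp_pairs = skipgram_pairs(temp_tokens, window=1)

# An arbitrary example peptide string, not data from the thesis.
peptide_tokens = kmer_tokens("SIINFEKL")
peptide_pairs = skipgram_pairs(peptide_tokens, window=1)

print(temp_tokens)
print(temp_pairs)
print(len(peptide_pairs))
```

Pairs of this kind would then be fed to a skip-gram model to learn one dense vector per token, and the resulting vector sequence would serve as input to the recurrent network (LSTM or GRU). Treating nearby temperatures or residues as context is what lets the embedding capture the sequential relationship the abstract emphasizes.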
Keywords/Search Tags: Machine learning, Sequential features, Word2vec, LSTM, GRU