
Research On Joint Embedded Multi-label Classification Algorithm

Posted on: 2020-08-02    Degree: Master    Type: Thesis
Country: China    Candidate: X Y Leng    Full Text: PDF
GTID: 2428330575454462    Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of network technology, data volumes are growing exponentially and we have entered an era of information explosion. Faced with such massive data, how to analyse, mine, and extract useful information to support decision-making in real-world applications has become a hot research topic and an urgent problem for both academia and industry. As a data analysis and mining technique, classification labelling provides a clear and concise summary of data: once data are labelled, they can be quickly indexed and retrieved by label, meeting practical needs in production and daily life. However, a single label often cannot adequately describe or summarise a data item, and thus cannot satisfy increasingly rich application requirements. Multi-label learning has therefore attracted wide attention from researchers at home and abroad in recent years. Yet for high-dimensional feature and label spaces, most existing algorithms are either computationally expensive or ineffective. Given the trend toward high-dimensional data, effectively solving multi-label classification on such data remains a challenge. To address this problem, this thesis proposes two novel joint embedded multi-label classification algorithms.

The first is a joint embedded multi-label classification model based on a stacked auto-encoder and matrix factorization (Deep AE-MF). To cope with the increasingly high dimensionality of multi-label data, the model uses deep learning and dimensionality-reduction techniques to learn low-dimensional embeddings of both features and labels, and applies Canonical Correlation Analysis (CCA) so that the two views interact during joint embedding learning. The model consists of three parts: low-dimensional feature embedding, low-dimensional label embedding, and shared-subspace learning. The feature part uses a stacked auto-encoder (SAE) to mine deep features; the label part uses matrix factorization (MF) to avoid the risk of mis-coding, which not only yields an effective latent representation of the labels but also indirectly exploits the dependencies between labels; the shared-subspace part uses CCA to link the feature and label views, so that label information is taken into account during feature learning and feature information is used during label learning. By combining deep learning, matrix factorization, and CCA, the model seamlessly integrates embedding learning and label prediction into a unified joint embedded multi-label classification model. Compared with other state-of-the-art multi-label approaches on six commonly used datasets, Deep AE-MF not only achieves better classification generalization but also effectively alleviates the inefficiency that arises with high-dimensional data.
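The abstract gives no implementation details, but the three-part structure it describes can be illustrated with a minimal PyTorch sketch: an auto-encoder for the feature view, an encoder-style matrix factorization for the label view, and a simple alignment penalty standing in for the CCA coupling. The layer sizes, the loss weights alpha/beta, and the names DeepAEMF and joint_loss are illustrative assumptions, not the thesis' actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepAEMF(nn.Module):
    """Minimal sketch in the spirit of Deep AE-MF; sizes are hypothetical."""

    def __init__(self, n_features, n_labels, k=64):
        super().__init__()
        # Feature view: stacked auto-encoder.
        self.enc = nn.Sequential(nn.Linear(n_features, 256), nn.ReLU(),
                                 nn.Linear(256, k))
        self.dec = nn.Sequential(nn.Linear(k, 256), nn.ReLU(),
                                 nn.Linear(256, n_features))
        # Label view: encoder-style matrix factorization Y ~ sigmoid(Z_y V^T).
        self.label_enc = nn.Linear(n_labels, k)
        self.label_basis = nn.Parameter(0.01 * torch.randn(n_labels, k))

    def forward(self, x, y=None):
        z_x = self.enc(x)                      # low-dimensional feature embedding
        x_rec = self.dec(z_x)                  # feature reconstruction
        logits = z_x @ self.label_basis.t()    # predict labels from the shared space
        z_y = self.label_enc(y) if y is not None else None
        return z_x, x_rec, logits, z_y


def joint_loss(model, x, y, alpha=1.0, beta=0.1):
    """Feature reconstruction + label factorization + prediction, with a
    simple alignment term standing in for the CCA-style coupling."""
    z_x, x_rec, logits, z_y = model(x, y)
    rec = F.mse_loss(x_rec, x)                                    # SAE term
    fact = F.binary_cross_entropy_with_logits(                    # MF term
        z_y @ model.label_basis.t(), y)
    cls = F.binary_cross_entropy_with_logits(logits, y)           # label prediction
    # Centre both embeddings and pull them together (a rough proxy for CCA).
    couple = F.mse_loss(z_x - z_x.mean(0), z_y - z_y.mean(0))
    return cls + fact + alpha * rec + beta * couple


# Usage sketch: one gradient step on random data.
if __name__ == "__main__":
    model = DeepAEMF(n_features=500, n_labels=20)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randn(32, 500)
    y = (torch.rand(32, 20) > 0.8).float()
    loss = joint_loss(model, x, y)
    loss.backward()
    opt.step()
```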
The second is a joint embedded multi-label classification algorithm based on an attentive BiGRU and matrix factorization. Text content is internally interdependent, important words are unevenly distributed, and documents vary in length; a BiGRU network can make full use of both historical and future information when learning the representation of the current word, capturing effective contextual representations. The algorithm therefore first uses a BiGRU network to extract deep text features; an attention mechanism then automatically identifies the important parts of the text and produces a weighted deep semantic representation; finally, this representation is combined with the MF method through interactive learning to obtain a shared subspace. Through word representation, the bidirectional GRU network, and the attention mechanism, the algorithm extracts an effective low-dimensional text representation even when informative content is unevenly distributed and document lengths vary, and, combined with the MF method, forms an efficient, unified joint embedded multi-label text classification model. Experimental results demonstrate that deep text features which take contextual semantics into account improve classification performance.
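Likewise, the attentive BiGRU encoder described above can be sketched as follows: token embeddings pass through a bidirectional GRU, additive attention weights the hidden states into a document vector, and a low-rank label basis (an MF-style decoder) produces multi-label logits. Vocabulary size, hidden sizes, and the class name AttentiveBiGRU are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveBiGRU(nn.Module):
    """Minimal sketch of a BiGRU-with-attention multi-label text classifier."""

    def __init__(self, vocab_size, n_labels, emb_dim=128, hid=128, k=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Bidirectional GRU: each position sees both past and future context.
        self.bigru = nn.GRU(emb_dim, hid, batch_first=True, bidirectional=True)
        # Additive attention scores one weight per time step.
        self.attn = nn.Linear(2 * hid, 1)
        self.proj = nn.Linear(2 * hid, k)          # low-dimensional text embedding
        self.label_basis = nn.Parameter(0.01 * torch.randn(n_labels, k))

    def forward(self, tokens):
        h, _ = self.bigru(self.embed(tokens))      # (B, T, 2*hid) hidden states
        w = F.softmax(self.attn(h), dim=1)         # attention weights over time
        doc = (w * h).sum(dim=1)                   # weighted sum = document vector
        z = self.proj(doc)                         # shared low-dimensional space
        return z @ self.label_basis.t()            # multi-label logits


# Usage sketch on a toy batch of padded token ids.
if __name__ == "__main__":
    model = AttentiveBiGRU(vocab_size=10000, n_labels=20)
    tokens = torch.randint(1, 10000, (8, 50))
    logits = model(tokens)
    probs = torch.sigmoid(logits)                  # per-label probabilities
```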
Keywords/Search Tags:Multi-label Classification, Dimension Embedding, Deep Learning, Matrix Factorization