
The Application Of Multi-label Classification In Text Correlation Mining

Posted on: 2019-03-09  Degree: Master  Type: Thesis
Country: China  Candidate: N Ke
GTID: 2348330542498881  Subject: Computer technology
Abstract/Summary:
The arrival of the Web 2.0 era has opened new ways of generating Internet data, and the volume of network data is growing ever faster. Faced with massive amounts of data, users cannot easily find the data that matches their preferences. How to provide targeted data resources to users has therefore become an urgent problem for Internet companies. Recommendation system technology emerged in this context, and correlation mining, a hot topic within recommendation systems, has great research value. Text is the most widely distributed data type on the network and accounts for the largest proportion of user demand. This thesis therefore takes text data as its research object and proposes a text correlation mining method based on multi-label classification. Traditional text correlation mining methods model text data with word vectors and use the similarity of those vectors to calculate the correlation between texts. Their relevance measure is relatively simple, and they suffer from vector sparsity. As text categories become more complex and hierarchical, a single evaluation standard is not sufficient to measure the relevance between texts accurately, and the way in which texts are associated differs across perspectives and fields. Traditional methods have great difficulty handling correlation analysis from such multiple dimensions. To address these problems, this thesis proposes an improved multi-label classification algorithm and applies it to text correlation mining. By mapping texts into multi-label vectors, it achieves multi-dimensional correlation analysis, which compensates for the shortcomings of traditional algorithms in capturing the full range of relations between texts. To verify the accuracy and feasibility of the algorithm, we ran experiments on the Zhihu open-source data set and verified its effectiveness in both text multi-label classification and text correlation mining.

Chapters 1 and 2 describe the research background, introduce the concepts and technologies related to recommendation systems and text correlation mining, and then propose a research route based on multi-label text classification. Chapter 3 introduces multi-label classification, the core technology of this thesis, in detail and analyzes the technical difficulties involved in the algorithm. Chapters 4 and 5 cover the design of the algorithm and the experiments. The key content is the improvement of the text multi-label classification algorithm; the correlation calculation method based on multi-label vectors is also introduced. Finally, performance experiments on the Zhihu open-source data set show that the proposed method improves results in text correlation mining.
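
The idea of comparing texts through multi-label vectors can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the correlation measure, so cosine similarity is used here as a stand-in, and the label names, scores, and the helper `shared_labels` are hypothetical rather than taken from the thesis.

```python
import math

# Hypothetical label set; a real system would use the classifier's own labels.
LABELS = ["machine learning", "programming", "cooking"]

# Hypothetical per-label confidence scores, as a multi-label classifier
# might output for three texts (one score per entry in LABELS).
doc_a = [0.9, 0.6, 0.0]
doc_b = [0.8, 0.1, 0.0]
doc_c = [0.0, 0.1, 0.9]


def cosine(u, v):
    """Cosine similarity of two equal-length label vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a text with no labels is treated as unrelated
    return dot / (norm_u * norm_v)


def shared_labels(u, v, labels, threshold=0.5):
    """Labels on which both texts score at least `threshold`.

    This exposes along which dimension two texts are related,
    not only how strongly.
    """
    return [name for name, a, b in zip(labels, u, v)
            if a >= threshold and b >= threshold]


if __name__ == "__main__":
    print(cosine(doc_a, doc_b))   # texts sharing a topic score high
    print(cosine(doc_a, doc_c))   # texts on different topics score low
    print(shared_labels(doc_a, doc_b, LABELS))
```

Because each vector has one dimension per label rather than one per vocabulary word, the vectors are short and dense, and the per-label view shows which topic links two texts.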
Keywords: text correlation mining, text multi-label classification, correlation calculation, recommendation system