Research on Online-Review-Oriented Keyword Extraction and Knowledge Association

Posted on: 2018-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: J B Han
GTID: 2348330536961091
Subject: Systems analysis and integration
Abstract/Summary:
E-commerce generates large volumes of valuable consumer data and word-of-mouth information. This information conveys user experience in the most direct way and helps consumers with different preferences choose among products of differing quality. Such short texts are numerous, and the topic each one expresses is usually clear. Mining reputation information from online reviews is an active research topic. The core of this thesis is to extract representative keywords from large volumes of review text and to associate those keywords semantically, so that the results are interpretable and readable. The difficulty lies in proposing an effective keyword extraction method and in establishing semantic relations between keywords. The thesis is divided into the following three parts.

First, the preprocessing step aims to preserve as much of the review text as possible and to avoid the information loss that occurs when a segmentation tool mis-segments unknown words. Chinese word segmentation produces a large number of word fragments, so a rule-based word recognition model is added to fragment processing to identify unknown words, mainly those composed of single-character fragments. This improves the utilization of the review text and the accuracy of segmentation, and provides richer linguistic information for the subsequent keyword extraction and keyword-based semantic association.

Second, to extract keywords from online reviews, a keyword extraction algorithm is proposed that combines the LDA topic model and the Word2vec word vector model with TextRank. Topic influence is incorporated into the transfer between nodes: a transfer toward a node with greater topic influence is treated as more probable than a transfer toward a node with less. The LDA topic model is used to compute each candidate word's influence within the document through the intermediate layer of latent topics. The algorithm also assumes that if a word is highly important to the document, other words that are semantically similar to it are more important as well. Semantic correlation is used to offset the effect of identifying keywords by word frequency alone: word vectors are used to compute the similarity between words in the document, and global and local information are fused with the adjacency relations of the candidate keywords to capture their semantic structure. Finally, a graph-based ranking algorithm scores each word node, yielding a ranked list of extracted keywords.

Third, using the LDA topic model and distributed Sentence2vec representations, keywords are grouped by part of speech and the degree of association between them is computed from their semantic similarity. Keywords are then ranked by degree of association within each type of semantic relation, which gives the keyword association results for online review text.

For the keyword knowledge contained in online reviews, this thesis proposes a semantic association mining method and establishes semantic relations over product review information with keywords at the core. Experimental evaluation of the combined keyword extraction algorithm and keyword association method indicates that the algorithm extracts keywords well and that the part-of-speech-based semantic association of keywords is interpretable and relevant. This study advances short-text information processing and can give users a concise representation of massive volumes of review text.
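
The second part of the abstract, a TextRank variant whose node transfers are weighted by topic influence and word-vector similarity, can be illustrated with a minimal Python sketch. The abstract does not give the thesis's exact weighting formula, so everything here is an assumption made for illustration: the function names `cosine` and `weighted_textrank`, the edge weight (similarity multiplied by the target word's topic influence), the `topic_weight` dictionary standing in for LDA-derived influence, and the toy two-dimensional vectors standing in for Word2vec embeddings.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def weighted_textrank(
    tokens: List[str],
    vectors: Dict[str, Sequence[float]],   # stand-in for Word2vec embeddings
    topic_weight: Dict[str, float],        # stand-in for LDA topic influence, must be > 0
    window: int = 2,
    damping: float = 0.85,
    iterations: int = 50,
) -> List[Tuple[str, float]]:
    """Rank words with a topic-biased, similarity-weighted random walk."""
    words = sorted(set(tokens))

    # Link words that co-occur within `window` positions (undirected graph).
    neighbors = {w: set() for w in words}
    for i, w in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != w:
                neighbors[w].add(other)
                neighbors[other].add(w)

    def edge(u: str, v: str) -> float:
        # Assumed weighting: semantic similarity times the target's topic
        # influence, so a walk prefers similar, topically influential words.
        sim = max(cosine(vectors[u], vectors[v]), 0.0)
        return (sim + 1e-6) * topic_weight[v]

    out_sum = {u: sum(edge(u, v) for v in neighbors[u]) for u in words}
    total_topic = sum(topic_weight[w] for w in words)
    score = {w: 1.0 / len(words) for w in words}

    for _ in range(iterations):
        # Random jumps are also biased toward topically influential words.
        score = {
            v: (1 - damping) * topic_weight[v] / total_topic
            + damping * sum(
                score[u] * edge(u, v) / out_sum[u]
                for u in neighbors[v]
                if out_sum[u] > 0
            )
            for v in words
        }
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy review tokens, vectors, and topic weights (all invented for the demo).
    demo_tokens = ["battery", "life", "good", "battery", "charge", "fast"]
    demo_vectors = {
        "battery": [1.0, 0.1], "life": [0.8, 0.3], "charge": [0.9, 0.2],
        "good": [0.1, 1.0], "fast": [0.2, 0.9],
    }
    demo_topic = {"battery": 3.0, "life": 1.5, "charge": 2.0, "good": 1.0, "fast": 1.0}
    for word, value in weighted_textrank(demo_tokens, demo_vectors, demo_topic):
        print(f"{word}\t{value:.4f}")
```

In a fuller pipeline the topic weights would come from a trained LDA model and the vectors from a trained Word2vec model. The sketch keeps both as plain dictionaries so that the ranking step can be read on its own.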
Keywords/Search Tags: online review, keyword extraction, knowledge association, LDA topic model, TextRank