Research On Chinese Commodity Attribute Word Extraction And Clustering Based On Opinion Mining

Posted on: 2021-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: X F Wu
GTID: 2518306521989309
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology, e-commerce platforms and social media have quickly become part of people's life and work, and a large volume of varied, content-rich opinion information now confronts users. Mining and analyzing this opinion information in depth, so that it helps users make decisions, has become an urgent industry need. Commodity review mining is an applied extension of opinion mining. It matters both for consumers, who need to make sound purchasing decisions quickly, and for merchants, who need to adjust or draw up promotion plans and sales decisions in time. However, Chinese commodity review mining is not yet as mature as its English counterpart. This thesis addresses two problems: the low accuracy of Chinese commodity attribute extraction and inaccurate clustering results.

First, review data for Chinese commodities is collected from a shopping website with a crawler and then preprocessed. An attribute word selection algorithm, AELSA, is proposed based on the positional relationship between candidate attribute words and emotion words. Experiments on more than 60,000 Chinese mobile phone reviews identify the distance between attribute words and emotion words at which the extraction result is optimal. The extraction results of AELSA are compared with those of the OPEN framework on performance metrics, which verifies the validity of the AELSA attribute word filtering algorithm.

Second, two clustering methods, SMC and WMSA, are proposed for attribute words and phrases of different lengths. For attribute words, the SMC method is based on the HowNet semantic dictionary and on must-link and cannot-link relationships between specific attributes. For long attribute phrases, the WMSA method computes a truncated similarity based on position weights. The two methods are combined into the SMC-WMSA algorithm for hierarchical clustering of commodity attribute data. The algorithm is evaluated on mobile phone attribute data, with clustering purity as the metric, and the results verify its validity.

Finally, the HowNet dictionary and the NTUSD sentiment polarity dictionary from National Taiwan University are used to calculate the emotional intensity of each attribute cluster. A commodity entity-attribute-emotion tree is then generated, in which each attribute cluster is paired with its emotional intensity value. Mobile phone data is used as an example to generate and display such a tree, so that users can see at a glance the specific emotion words and emotional intensity values of each commodity attribute cluster.
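The abstract names clustering purity as the evaluation metric but does not give its definition. The sketch below uses the standard formulation, in which each cluster is credited with its most frequent gold label and the credits are summed and divided by the number of items. The function name `purity` and the toy attribute labels are illustrative assumptions, not taken from the thesis, and the thesis may compute the metric with different details.

```python
from collections import Counter


def purity(cluster_ids, gold_labels):
    """Standard clustering purity: share of items that carry the
    majority gold label of the cluster they were assigned to."""
    if len(cluster_ids) != len(gold_labels):
        raise ValueError("cluster_ids and gold_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count gold labels inside each predicted cluster.
    label_counts = {}
    for cluster, label in zip(cluster_ids, gold_labels):
        label_counts.setdefault(cluster, Counter())[label] += 1

    # Each cluster contributes the size of its largest gold-label group.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: six attribute words assigned to three clusters.
clusters = [0, 0, 0, 1, 1, 2]
gold = ["screen", "screen", "battery", "battery", "battery", "price"]
print(round(purity(clusters, gold), 4))
```

Purity rises toward 1.0 as the number of clusters grows, so it is usually read alongside the cluster count rather than on its own.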
Keywords/Search Tags: commodity comment mining, commodity attribute words, emotional words, extraction algorithm, clustering algorithm