Research On The Key Technology Of The Price Comparison System Based On Semantic Similarity

Posted on: 2016-12-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Xia
Full Text: PDF
GTID: 2308330470960216
Subject: Computer technology

Abstract/Summary:
As the Internet has matured, it has changed nearly every aspect of people's lives. Under its influence, the traditional retail industry has sought new marketing models, and e-commerce has become increasingly popular. Shopping websites such as Taobao, Jingdong, Vipshop, and Amazon are now mature. To highlight their own advantages, however, these sites generally do not offer a function for comparing products across different e-commerce websites. They also carry so many types of commodities that it is hard to tell which kinds each site specializes in, and prices for the same commodity differ from site to site. As a result, users find it difficult to pick a preferred commodity within a specific category. Merchants on these platforms, in turn, find it hard to price their products appropriately because they do not know the online market price of the commodities they upload, which leaves them without a competitive advantage on price. To address these problems, this thesis proposes a price comparison system to improve the user experience of e-commerce platforms. The main work is as follows.

First, this thesis develops a flexible and efficient crawling strategy based on the Heritrix crawler framework to collect commodity data. To handle the specific structure of commodity web pages, it combines HTMLParser with regular expressions to extract commodity data from the pages and builds the commodity database for the system. Experimental results show that this approach is effective.

Second, although different people describe the same product differently, their descriptions are semantically very similar. To improve the accuracy of matching a commodity description against the commodity information in the database during price comparison, this thesis proposes a semantics-based Chinese text similarity algorithm. The algorithm computes the semantic similarity of words using the lexical semantic information in HowNet, analyzes word weights at the word level to extract the keywords of each text, and then measures the similarity between texts by computing the similarity between their keywords.

Finally, to further improve the accuracy of product search, this thesis uses commodity label information such as brand and benefits. Based on the similarity of these labels, it further determines which commodities are most similar or related to the one being compared, yielding more accurate price comparison results. The weight of each label is trained from the frequency with which that label appears in the commodity information, and label similarity is then calculated from the degree to which each label matches. Experimental results show that the proposed price comparison system achieves better results and more accurate price comparisons.
Keywords/Search Tags: price comparison system, web crawlers, web information extraction, semantic similarity
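
The abstract describes the matching steps only in outline, so the following Python sketch is an illustration under stated assumptions, not the thesis's implementation. It assumes the texts are already tokenized, uses plain term frequency as a stand-in for the thesis's word-weight analysis, and replaces the HowNet-based word similarity with a small made-up lookup table (`TOY_WORD_SIM`). All function names, the table values, and the sample products are hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for a HowNet-based word similarity lookup.
# The scores below are invented for illustration only.
TOY_WORD_SIM = {
    frozenset(["phone", "smartphone"]): 0.9,
    frozenset(["case", "cover"]): 0.8,
}


def word_sim(a, b):
    """Return a similarity in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_WORD_SIM.get(frozenset([a, b]), 0.0)


def top_keywords(tokens, k=3):
    """Pick the k highest-weighted words (assumption: weight = term frequency)."""
    return [word for word, _ in Counter(tokens).most_common(k)]


def text_sim(keys_a, keys_b):
    """Average, in both directions, of each keyword's best match in the other text."""
    if not keys_a or not keys_b:
        return 0.0
    a_to_b = sum(max(word_sim(a, b) for b in keys_b) for a in keys_a) / len(keys_a)
    b_to_a = sum(max(word_sim(b, a) for a in keys_a) for b in keys_b) / len(keys_b)
    return (a_to_b + b_to_a) / 2


def label_weights(products):
    """Weight each label by how often it appears across the product records."""
    counts = Counter(label for product in products for label in product)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def label_sim(p, q, weights):
    """Weighted share of labels on which two products carry the same value."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(
        w for label, w in weights.items()
        if label in p and label in q and p[label] == q[label]
    )
    return matched / total


if __name__ == "__main__":
    # Made-up sample data: each product is a dict of label -> value.
    products = [
        {"brand": "Acme", "color": "black"},
        {"brand": "Acme"},
        {"brand": "Zed", "color": "black"},
    ]
    weights = label_weights(products)
    print(text_sim(["phone", "case"], ["smartphone", "cover"]))
    print(label_sim(products[0], products[2], weights))
```

In this sketch the keyword score and the label score are kept separate; how the thesis combines them into a final ranking is not stated in the abstract, so no combination rule is assumed here.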