
Semantic Annotation Method Of Commodity Based On Classification Tree And Crowdsourcing In Electronic Commerce

Posted on: 2018-10-10
Degree: Master
Type: Thesis
Country: China
Candidate: H T Zhu
GTID: 2348330512987262
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of electronic commerce and Internet technology, the T20 business model, which combines video with e-commerce, came into being. An image matching algorithm can match images from a video against the commodity images in a commodity database and so provide the user with a purchase link. If more semantic tags are added to commodity resources while the commodity database is being built, users can spend less time browsing and can be recommended commodities based on different tags. In this paper, we mainly study the semantic annotation of commodity text resources.

Existing research on the semantic annotation of text resources targets resources such as documents and web pages, which are structured text or long text, and relies on a domain ontology, a knowledge base, or another knowledge organization system. In the field of e-commerce, however, commodity description text is fragmented, lacks contextual semantic information, and has no shared domain ontology. To deal with these challenges, we propose a semantic annotation method based on Word2vec and an e-commerce classification tree.

The main contents of this paper are as follows. Firstly, we construct the e-commerce classification tree, including the commodity concepts, concept relations and concept properties, from the catalogue of an online commodity database and the properties found in large-scale commodity description resources. Secondly, we generate semantic features for the commodity description text by training Word2vec in the e-commerce domain. Thirdly, we regard the commodity concepts as known category labels and the commodity description text as unclassified data, train a commodity classifier on the Word2vec features, and use the classifier to annotate each commodity with a category. Fourthly, according to the commodity concept, we obtain the concept properties from the classification tree and calculate the similarity between the attributes in the commodity description text and the concept properties in terms of both form and semantics. Finally, we combine Crowdsourcing and Active Learning to improve the quality of commodity semantic annotation by improving the accuracy of the commodity classifier.

The main contributions of this paper are as follows:

1. We propose a method of commodity semantic annotation based on the classification tree and Word2vec in the field of e-commerce. We regard the classification tree as a knowledge organization system, which can express domain knowledge as a domain ontology does. We generate semantic features with Word2vec, which gives the commodity description text a richer semantic representation. By combining the classification tree with the Word2vec features, we can add semantic tags such as categories and attributes to a commodity when building the commodity database.

2. We propose a method to improve the quality of commodity semantic annotation with Crowdsourcing and Active Learning. We combine the high accuracy of Crowdsourcing with the high speed of machine classification: using the sampling strategy of Active Learning, we select the low-reliability results of machine classification and submit them to Crowdsourcing for labeling. This makes it possible to train a more accurate commodity classifier from a small amount of commodity data with known classification labels and a large amount with unknown labels. Improving the accuracy of the commodity classifier improves the quality of commodity semantic annotation, so quality rises and cost falls at the same time.
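The selection of low-reliability machine results for crowd labeling can be illustrated with a minimal sketch. It assumes least-confidence sampling (ranking items by their highest predicted class probability) and simple majority voting over crowd answers; the abstract does not say which sampling strategy or aggregation rule the thesis uses, and the function names `select_for_crowd` and `majority_label`, the item ids, and all probabilities are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def select_for_crowd(class_probs: Dict[str, Sequence[float]], budget: int) -> List[str]:
    """Pick the `budget` items the classifier is least sure about.

    Confidence is taken as the highest class probability (least-confidence
    sampling). This is an assumed strategy, not necessarily the thesis's.
    """
    ranked = sorted(class_probs, key=lambda item_id: max(class_probs[item_id]))
    return ranked[:budget]


def majority_label(crowd_answers: Sequence[str]) -> str:
    """Aggregate several crowd answers for one item by simple majority vote."""
    return Counter(crowd_answers).most_common(1)[0][0]


# Hypothetical classifier output: item id -> probabilities over three categories.
predictions = {
    "item-1": [0.95, 0.03, 0.02],
    "item-2": [0.40, 0.35, 0.25],
    "item-3": [0.55, 0.30, 0.15],
    "item-4": [0.88, 0.10, 0.02],
}

# The two least confident predictions are sent to the crowd.
to_label = select_for_crowd(predictions, budget=2)
print(to_label)  # ['item-2', 'item-3']

# Hypothetical crowd answers for one selected item, merged into a single label.
print(majority_label(["dress", "dress", "skirt"]))  # dress
```

In a full loop, the crowd-labeled items would presumably be added to the training set and the classifier retrained, so that only a small share of the data needs paid human labels.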
Keywords/Search Tags: E-commerce Classification Tree, Semantic Annotation, Word2vec, Crowdsourcing, Active Learning