
Research On Object Extraction Of Automobile Product Based On Sequence Labeling

Posted on: 2021-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2428330629452433
Subject: Systems Engineering
Abstract/Summary:
Car product reviews are subjective evaluations of a car's price, performance, power, and other aspects that users post on platforms such as Weibo, forums, and WeChat public accounts. In these reviews, users often direct their evaluations at explicit objects, for example a certain part or function of a specific product. Mining car names and attributes from car product reviews therefore has important commercial value for car manufacturers and consumers. In this thesis, automobile names and attributes are collectively referred to as product objects. Extracting product objects from product reviews is a basic task of product review analysis and an important research problem in fine-grained sentiment analysis. Most existing studies extract only product names, without considering the specific attributes of those objects. To address product object extraction in car reviews, and with the aim of enabling fine-grained sentiment analysis of product reviews, this thesis studies methods for extracting the names and attributes of car products. The main work consists of three parts:

(1) Relevant technology and data annotation specification. The technical basis of the work is reviewed, covering distributed representation of Chinese text and existing methods for product object extraction, and the annotation specification for the object data is introduced. First, Chinese text representation methods such as the Word2vec and cw2vec models are introduced. Then, by analyzing the characteristics of the review data, corresponding data annotation specifications are formulated, which provide a standard for annotating the experimental data.

(2) Product object extraction based on multi-feature fusion. The extraction of product names is treated as a sequence labeling problem, and a product object extraction method based on word vectors and conditional random fields (CRF) is proposed. In addition to statistical features such as word features, part of speech, word length, left and right information entropy, and the mutual information of words, the model adds as an extra feature the word-embedding similarity between each word and the words in a domain vocabulary. The multi-feature fusion extraction method is built on the CRF model and achieves good results on the product object dataset.

(3) Product name and attribute identification based on cw2vec-BiLSTM-CRF. To meet users' need for finer-grained evaluations of product names and attributes in product review data, a product name and attribute identification method based on cw2vec-BiLSTM-CRF is designed. First, the cw2vec model is used to encode the semantics of the Chinese text. On this basis, a bidirectional long short-term memory (BiLSTM) model is combined with a CRF model to identify the names and attributes in product reviews. The BiLSTM module encodes the context information, while the CRF model serves as the label inference layer and handles the dependencies between sequence labels. Experiments on car review data show that the cw2vec-BiLSTM-CRF model is effective for product name and product attribute identification.
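The role of the CRF as a label inference layer can be illustrated with a minimal sketch of Viterbi decoding over BIO-style labels. Everything specific here is an assumption made for illustration and is not taken from the thesis: the tag set (`B-NAME`, `I-NAME`, `B-ATTR`, `I-ATTR`, `O`), the hand-set transition scores (a trained CRF would learn them), the toy emission scores (which stand in for BiLSTM outputs), and the English tokens.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO tag set for product names and attributes (an assumption).
LABELS = ["O", "B-NAME", "I-NAME", "B-ATTR", "I-ATTR"]
NEG_INF = float("-inf")


def transition_score(prev: str, cur: str) -> float:
    """Hand-set scores: I-X is only allowed directly after B-X or I-X."""
    if cur.startswith("I-"):
        kind = cur[2:]
        if prev not in (f"B-{kind}", f"I-{kind}"):
            return NEG_INF
    return 0.0


def viterbi(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the label sequence with the highest total score."""
    if not emissions:
        return []
    # Treat the position before the first token as label "O".
    score = {lab: transition_score("O", lab) + emissions[0].get(lab, 0.0)
             for lab in LABELS}
    backpointers: List[Dict[str, str]] = []
    for em in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointer: Dict[str, str] = {}
        for cur in LABELS:
            best_prev = max(LABELS,
                            key=lambda p: score[p] + transition_score(p, cur))
            new_score[cur] = (score[best_prev]
                              + transition_score(best_prev, cur)
                              + em.get(cur, 0.0))
            pointer[cur] = best_prev
        backpointers.append(pointer)
        score = new_score
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda lab: score[lab])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


def extract_spans(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group B-/I- labels into (kind, text) spans; stray I- labels are dropped."""
    spans: List[Tuple[str, str]] = []
    kind, buf = None, []
    for tok, lab in zip(tokens, labels):
        if lab.startswith("I-") and kind == lab[2:]:
            buf.append(tok)
            continue
        if buf:
            spans.append((kind, " ".join(buf)))
        if lab.startswith("B-"):
            kind, buf = lab[2:], [tok]
        else:
            kind, buf = None, []
    if buf:
        spans.append((kind, " ".join(buf)))
    return spans


# Toy emission scores standing in for BiLSTM outputs (made-up numbers).
tokens = ["Civic", "fuel", "consumption", "is", "low"]
emissions = [
    {"B-NAME": 2.0},
    {"O": 1.0, "B-ATTR": 0.8},   # per-token argmax would pick "O" here
    {"I-ATTR": 2.0, "O": 0.5},
    {"O": 1.0},
    {"O": 1.0},
]
path = viterbi(emissions)
spans = extract_spans(tokens, path)
```

In this toy input, choosing the best label for each token independently would label "fuel" as `O` and "consumption" as `I-ATTR`, which is not a valid BIO sequence. Decoding the whole sequence jointly under the transition constraints yields `B-ATTR` followed by `I-ATTR` instead, which is the kind of label dependency the CRF layer is meant to handle.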
Keywords/Search Tags:Product name and attribute extraction, Sequence labeling, Car review, Conditional random field model