Research On Online Retrieval Techniques For Massive Data Based On Entity

Posted on: 2015-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: X D Zhang
Full Text: PDF
GTID: 2298330422990912
Subject: Computer Science and Technology
Abstract/Summary:
As the volume of web information grows, integrating web search results to improve users' retrieval efficiency has become an important research problem. Web information is typically semi-structured, as with blogs and product listings, and it comes from authors around the world with different writing habits, which makes it hard to recognize records in the retrieved results that refer to the same entity. Most existing entity resolution methods target structured or relational data. Algorithms designed for web data such as products have several shortcomings: first, they cannot meet time-efficiency requirements; second, they rely heavily on domain knowledge and so are not domain-independent; third, they do not achieve results as effective as those obtained on relational data.

To address these shortcomings, we propose an online entity resolution algorithm based on e-commerce data. We use attribute extraction to give products structure, compute similarity from the attributes that products have in common, and finally cluster products by the entities they refer to. On this basis, we propose two optimizations that leverage web information and users' actions. For attribute extraction, we propose two algorithms, one rule-based and one distance-based; both are domain-independent and unsupervised. For the extracted attributes, we perform synonym discovery to eliminate clerical errors and variant expressions. For optimization, we first leverage web information to discover more attributes for ambiguous products, and then collect three kinds of user actions to update the weights of words and attributes in the local database.

Finally, we evaluate the effectiveness of our algorithms on real-life shopping datasets and, building on the theoretical research, implement an online product retrieval system that demonstrates the effectiveness of our algorithms in an applied project.
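
The abstract does not specify the similarity measure or the clustering method, so the following is only a minimal sketch of the general idea (compare products on their common attributes, then group those that appear to refer to the same entity), not the thesis's algorithm. The function names `attribute_similarity` and `cluster_products`, the exact-match comparison, the `weights` dictionary, the 0.8 threshold, and the union-find grouping are all assumptions made for illustration.

```python
from itertools import combinations


def attribute_similarity(a, b, weights=None):
    """Weighted share of common attributes whose values agree.

    Hypothetical measure: values are compared case-insensitively for
    exact equality; attributes missing on either side are ignored.
    """
    common = set(a) & set(b)
    if not common:
        return 0.0
    weights = weights or {}
    total = sum(weights.get(k, 1.0) for k in common)
    if total == 0:
        return 0.0
    agree = sum(
        weights.get(k, 1.0)
        for k in common
        if a[k].strip().lower() == b[k].strip().lower()
    )
    return agree / total


def cluster_products(products, threshold=0.8, weights=None):
    """Group product indices whose similarity reaches the threshold.

    Uses union-find, so products linked through a chain of similar
    pairs end up in the same cluster (an assumed design choice).
    """
    parent = list(range(len(products)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(products)), 2):
        if attribute_similarity(products[i], products[j], weights) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(products)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Made-up example records, already reduced to attribute/value pairs.
products = [
    {"brand": "Acme", "model": "X1"},
    {"brand": "acme", "model": "x1", "color": "black"},
    {"brand": "Acme", "model": "Z9"},
]
clusters = cluster_products(products)
```

In this sketch the `weights` argument is where per-attribute weights, such as those the abstract says are updated from user actions, could be plugged in; how the thesis actually computes and applies those weights is not stated here.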
Keywords/Search Tags: entity resolution, e-commerce, user feedback, information retrieval