Cumulative Citation Recommendation For Online Knowledge Base

Posted on:2016-02-05

Degree:Doctor

Type:Dissertation

Country:China

Candidate:J G Wang

Full Text:PDF

GTID:1108330503955327

Subject:Computer application technology

Abstract/Summary:

With the rapid growth of the Internet, knowledge management and acquisition have moved from offline to online. Online knowledge bases (KBs), such as Wikipedia and Freebase, have become vital data sources for various web applications. These KBs are usually organized around entities such as persons, organizations, and locations. Currently, the maintenance of a KB relies mainly on human editors. However, with the explosion of information, large-scale KBs are hard to keep up to date by human editors alone. Less popular entities in particular are not updated in time because they receive less attention than popular ones. An outdated KB severely limits the effectiveness of the applications that depend on it. This gap could be bridged if documents relevant to KB entities were detected automatically as soon as they emerge online and then recommended to the editors with graded levels of relevance. This task is called Cumulative Citation Recommendation (CCR). The contributions of the thesis can be summarized as follows.

First, the thesis introduces the background and related areas of CCR in detail. The mainstream approaches to CCR, including unsupervised, semi-supervised, and supervised learning, are discussed broadly, and their pros and cons are presented.

Second, the thesis focuses on supervised learning methods for CCR, including entity-centric query expansion, classification, and learning to rank. It also proposes semantic and temporal features for these methods. Experiments on the TREC-KBA-2013 dataset evaluate the effectiveness of these novel features.

Third, to address the missing-data problem of less popular entities in CCR, a global discriminative model is built as a baseline by training a single global classifier (or ranker) on all training data, regardless of the relationships among entities. However, the global model cannot guarantee satisfactory performance for every entity. The thesis therefore proposes an entity class-dependent discriminative mixture model, which introduces a latent class layer to model the correlations between target entities and latent classes. The model adapts better to different types of entities and achieves better performance when dealing with a broad range of entities.

Fourth, both the global model and the entity class-dependent mixture model ignore the prior knowledge embedded in documents, so the quality of the recommended documents cannot be guaranteed. A document class-dependent discriminative model is proposed, which introduces a latent layer to capture the correlations between documents and their underlying classes. The model adapts better to different types of documents and performs flexibly across a broad range of documents. Experimental results show that the document class-dependent mixture model improves the precision and accuracy of CCR.

Fifth, the thesis studies cold-start CCR, in which target entities are selected from document streams instead of a reference KB. Since there is no KB profile from which to extract semantic features, the feature space becomes too sparse to build a satisfactory relevance model. To resolve this problem, the thesis proposes an event-based sentence clustering method and extracts sentence-level features for document ranking. These novel features prove effective in cold-start CCR.
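
As a rough illustration of the entity class-dependent mixture idea from the third contribution, the sketch below combines a latent class layer with per-class relevance scorers. The abstract gives no formulas, so everything here is an assumption: the softmax gate over entity features, the logistic scorers, and all names (`mixture_relevance`, `class_weights`, `expert_weights`) are hypothetical and are not taken from the thesis.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mixture_relevance(entity_feats, pair_feats, class_weights, expert_weights):
    """Assumed form: P(rel | e, d) = sum_z P(z | e) * P(rel | e, d, z).

    entity_feats   -- features of the target entity (drive the latent class gate)
    pair_feats     -- features of the entity-document pair
    class_weights  -- one weight vector per latent class, for the gate
    expert_weights -- one weight vector per latent class, for the scorer
    """
    # Latent class layer: soft assignment of the entity to each class.
    class_probs = softmax([dot(w, entity_feats) for w in class_weights])
    # One logistic relevance scorer per latent class.
    expert_probs = [sigmoid(dot(w, pair_feats)) for w in expert_weights]
    # Marginalize over the latent classes.
    return sum(p_z * p_r for p_z, p_r in zip(class_probs, expert_probs))


if __name__ == "__main__":
    score = mixture_relevance(
        entity_feats=[1.0, 0.0],
        pair_feats=[0.3, 1.2, -0.5],
        class_weights=[[2.0, 0.0], [-1.0, 0.5]],
        expert_weights=[[1.0, 0.5, 0.0], [-0.5, 0.2, 1.0]],
    )
    print(f"estimated relevance: {score:.3f}")
```

With a single latent class the gate is always 1, so the sketch reduces to one logistic scorer shared by all entities, which corresponds to the global baseline described above.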

Keywords/Search Tags:

Knowledge Base Acceleration, Cumulative Citation Recommendation, Information Filtering, Mixture Model, Cold Start

Related items

1	Research On Educational Resource Recommendation Method Based On Knowledge Graph
2	Research And Implementation Of Cold Start Problem In Commodity Recommendation System
3	Research On Recommendation Algorithms Against Cold-Start Problem
4	Research On Encyclopedic Knowledge Bases Oriented Entity-document Relevance Classification
5	The Research Of Distributed Collaborative Filtering Recommendation System For Cold-Start
6	Research On User Cold Start Based On Collaborative Filtering Recommendation System
7	Research On Cold-start Problem Of Collaborative Filtering Algorithm
8	Recommendation Technique Research On Personalized Microblog Stream
9	Research On Cold Start Recommendation Algorithm Based On Attribute-Fused Matrix Factorization
10	Recommendation Cold Start Method Based On Multi-armed Bandits