Content Linking Method And Application Based On Word Embedding Model

Posted on: 2017-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Gao
GTID: 2348330518494809
Subject: Computer technology
Abstract/Summary:
Texts commonly link to other texts. The links may be references between passages in papers, or links between readers' comments and the original articles in an online forum. Such links not only provide a good communication channel between users but also make the content more objective and comprehensive. However, because corpora grow daily, this task cannot rely on manual work alone. There is therefore an urgent need for automated content linking methods, which can also support related applications such as information retrieval, summarization, and content management. To date, most content linking methods have focused on computing similarity from traditional grammatical and semantic features. Their main weakness is that they deal mostly with the surface features of texts and words. Recently, the word embedding model has performed well in Natural Language Processing (NLP), especially in mining deep semantic information. In this thesis, we propose a new approach to the content linking task based on features derived from word embeddings. First, we describe the structure of the model in detail. Next, we assess the model with different word-vector parameters. Then we conduct experiments on a corpus of English-language biological papers as well as on Chinese and English online forums, including Tianya-free and The Guardian. Finally, we verify the validity of the proposed method through experiments and comparison with traditional methods.
Keywords/Search Tags: content linking, word embedding model, online forum, citation sentence, NLP
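
The abstract does not spell out the model, so the following is only a minimal sketch of one common way to use word embeddings for content linking, not necessarily the thesis's method. It assumes each sentence is represented by the average of its word vectors and that a comment (or citing sentence) is linked to the candidate sentence with the highest cosine similarity. The toy vectors, the names `TOY_VECTORS`, `sentence_vector`, `cosine`, and `link`, and the whitespace tokenization are all hypothetical choices made for illustration.

```python
import math

# Toy 3-dimensional word vectors. These values are invented for illustration;
# a real system would load vectors trained on a large corpus.
TOY_VECTORS = {
    "protein": [0.9, 0.1, 0.0],
    "gene": [0.8, 0.2, 0.1],
    "expression": [0.7, 0.3, 0.0],
    "football": [0.0, 0.9, 0.2],
    "match": [0.1, 0.8, 0.3],
    "score": [0.1, 0.7, 0.4],
}


def sentence_vector(sentence, vectors):
    """Average the vectors of the known words; return None if none are known."""
    known = [vectors[w] for w in sentence.lower().split() if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link(query, candidates, vectors):
    """Return (index, score) of the most similar candidate, or (None, 0.0)."""
    query_vec = sentence_vector(query, vectors)
    if query_vec is None:
        return None, 0.0
    best_index, best_score = None, -1.0
    for i, candidate in enumerate(candidates):
        cand_vec = sentence_vector(candidate, vectors)
        if cand_vec is None:
            continue  # skip candidates with no known words
        score = cosine(query_vec, cand_vec)
        if score > best_score:
            best_index, best_score = i, score
    if best_index is None:
        return None, 0.0
    return best_index, best_score
```

With these toy vectors, `link("protein", ["gene expression", "football match"], TOY_VECTORS)` selects the first candidate, even though the query shares no surface word with it. This is the kind of deeper semantic match that surface-feature methods would miss.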