A Keyword Extraction Algorithm Based On Background Knowledge

Posted on: 2015-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: S D Qiu
GTID: 2298330452951423
Subject: Computer technology
Abstract/Summary:
The rapid development of the Internet has significantly increased the number of electronic documents. Readers therefore need to read selectively in order to identify valuable content in a large collection of documents, which makes it vitally important to extract key information (such as keywords) based on the context of a given document. This dissertation aims to develop an algorithm that automatically extracts the keywords of a document in order to support selective reading.

State-of-the-art information extraction algorithms leave little room for further performance improvement, because they can extract information only from the content of the document itself. Targeting patent documents whose context is provided in the form of an external XML dataset, this dissertation identifies a set of novel features that effectively reflect the background of these patents, and then presents a support vector machine (SVM) based classification algorithm for automatic keyword selection.

Experiments on the dataset suggest that the proposed algorithm outperforms its counterparts.
Keywords/Search Tags: background knowledge, patent document
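
The abstract describes the approach only at a high level: features that reflect a patent's background, fed to an SVM-based classifier that decides which candidate terms are keywords. The following is a minimal sketch of that general idea, not the thesis's implementation. The two features (in-document term frequency and rarity in a background collection), the function names term_features, train_linear_svm and is_keyword, and the Pegasos-style training loop are all assumptions made for illustration; the feature set actually used in the thesis is not given in the abstract.

```python
import math
import random


def term_features(term, doc_tokens, background_docs):
    """Return [in-document frequency, background rarity, bias] for one candidate term.

    Both features are illustrative assumptions, not the thesis's feature set.
    """
    tf = doc_tokens.count(term) / len(doc_tokens)
    # Number of background documents (token lists) that contain the term.
    df = sum(1 for doc in background_docs if term in doc)
    rarity = math.log((1 + len(background_docs)) / (1 + df))
    return [tf, rarity, 1.0]  # trailing 1.0 acts as a bias input


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style stochastic sub-gradient descent on the regularized hinge loss.

    X is a list of feature vectors and y a list of labels in {+1, -1}.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularization)
            if margin < 1.0:  # example violates the margin: move toward it
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def is_keyword(w, x):
    """Classify a feature vector: True if it falls on the positive side."""
    return sum(wj * xj for wj, xj in zip(w, x)) > 0.0
```

In use, each candidate term of a document would be turned into a feature vector with term_features, labeled +1 (keyword) or -1 (not a keyword) for training, and then classified with is_keyword. In practice an established SVM library would likely replace the hand-written training loop; it is written out here only to keep the sketch self-contained.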