Exploiting Topic-based Adversarial Neural Network For Cross-domain Keyphrase Extraction

Posted on:2021-02-09

Degree:Master

Type:Thesis

Country:China

Candidate:Y N Wang

GTID:2428330602499098

Subject:Computer application technology

Abstract/Summary:

In today's era of data explosion, concepts such as data, information, and knowledge touch everyone and every industry. However, raw data of any form conveys no information unless it is processed in some intelligent way. Knowing the most important phrases of a textual document provides a condensed representation that considerably eases its processing. Keyphrases provide high-level descriptions of a document's content, summarising its core topics, concepts, ideas, or arguments. These descriptive phrases enable algorithms to retrieve relevant information more quickly and effectively, which plays an important role in many areas of document processing, such as document indexing, classification, clustering, and summarization.

However, most documents lack author-provided keyphrases, and manually determining the important phrases for every document in a large collection is tedious, expensive, and often requires expert knowledge. Fortunately, natural language processing techniques can help generate keyphrases automatically. At present, solutions for automatic keyphrase extraction mainly rely on manually selected features, such as frequencies and relative occurrence positions. Such solutions are dataset-dependent and often need to be purposely modified to work for documents of different lengths, discourse modes, and disciplines. This is because the performance of such algorithms relies heavily on the selection of features, which turns the development of automatic keyphrase extraction algorithms into a time-consuming and labor-intensive exercise.

Existing methods have two further limitations. First, although supervised methods perform well on this task, they require a large amount of labeled data, which is extremely expensive and time-consuming to collect in many application scenarios. Second, most existing methods focus on single-domain keyphrase extraction and do not fully utilize the data in resource-rich domains. To address these problems, we investigate the under-explored problem of cross-domain keyphrase extraction. The major work and contributions are as follows:

1. We investigate an under-explored problem of cross-domain keyphrase extraction. We show that it is possible to use both labeled data from resource-rich domains and unlabeled data in the source and target domains to improve the performance of keyphrase extraction in the unlabeled target domain.
2. We propose a novel topic-based adversarial neural network that learns transferable knowledge across domains efficiently through adversarial training. To the best of our knowledge, we are the first to exploit adversarial learning for keyphrase extraction.
3. We design a topic correlation layer to incorporate the topic-based representation of the document. We also propose reconstructing the document in the target domain from both forward and backward directions to learn domain-private features.
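
To make concrete the kind of manually selected features the abstract refers to (frequency and relative occurrence position), here is a minimal Python sketch. It illustrates the traditional feature-based approach, not the thesis's proposed network. The function name `candidate_features`, the `max_len` parameter, and the use of plain word n-grams as candidate phrases are assumptions made for this example.

```python
import re
from collections import Counter


def candidate_features(text, max_len=3):
    """Map each candidate phrase to (frequency, relative first position).

    Candidates are simply all word n-grams up to max_len tokens; this is
    an illustrative assumption, and n-grams may cross sentence boundaries.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    total = len(tokens)
    freq = Counter()
    first_pos = {}
    for n in range(1, max_len + 1):
        for i in range(total - n + 1):
            phrase = " ".join(tokens[i:i + n])
            freq[phrase] += 1
            # Record only the first occurrence, normalized by document length.
            first_pos.setdefault(phrase, i / total)
    return {phrase: (freq[phrase], first_pos[phrase]) for phrase in freq}


if __name__ == "__main__":
    doc = "Keyphrase extraction is hard. Keyphrase extraction needs labeled data."
    feats = candidate_features(doc)
    # Rank by frequency (descending), then by how early the phrase appears.
    ranked = sorted(feats.items(), key=lambda kv: (-kv[1][0], kv[1][1]))
    for phrase, (count, position) in ranked[:5]:
        print(phrase, count, round(position, 2))
```

Features like these depend on document length and writing style, which is one reason the abstract describes such solutions as dataset-dependent.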

Keywords/Search Tags:

Adversarial Network, Transfer Learning, Keyphrase Extraction

Related items

1	Research On Keyphrase Extraction Algorithm Based On Multi-scalable Learning
2	Research On Graph-based Keyphrase Extraction Integrating Multiple Attributes
3	Micro-blog Feature Discovery And Topic Keyphrase Extraction Based On Language Network
4	Research On Deep Transfer Learning Method Based On Adversarial Network
5	Research On Keyphrase Extraction Algorithm Based On Word Embeddings Learning
6	The Research On Keyphrase Extraction Method Of Scientific Literature Based On Feature Representation
7	Chinese Keyphrases Extraction Technique
8	Research And Implementation Of Cross-domain Relation Extraction Based On Adversarial Network
9	Research On Automatic Keyphrase Technology In Academic Corpus
10	Research On The Keyphrase Extraction And Relevant Technology