Research On Keyword Extraction Algorithms Based On Semantic Features

Posted on:2020-06-19

Degree:Master

Type:Thesis

Country:China

Candidate:J Z Zhou

GTID:2428330590976550

Subject:Software engineering

Abstract/Summary:

Keyword extraction is a widely used technology. In its early stage, keywords were extracted manually. Scholars later proposed automatic methods, and the exponential growth of information now calls for more effective ones. Traditional algorithms are mainly based on statistical methods, and there is no agreed standard for what counts as a keyword. Deep learning methods can automatically learn the characteristics of data and produce good results, so this thesis uses deep learning to learn the semantic features between keywords and documents and obtain better algorithms. The thesis makes the following main contributions:

1. Use word vectors to improve TextRank. fastText is used to represent the document set as word vectors. The approach builds on the idea of a latent topic distribution, which holds that a document is composed of words belonging to different topics and that the semantic difference is greatest between the central words of the topics. The semantic differences between words are therefore used to improve the transition probability matrix of TextRank: more weight is transferred to words with large semantic differences, which raises the weight of the topic center words and improves the results of the original algorithm.

2. Construct document-keyword pairs and transform keyword extraction into a binary classification (two-category) task. Keyword extraction usually focuses only on the document itself and does not make good use of annotated training data. This thesis assumes that there is a certain distribution between a document and its keywords, and that the keywords are obtained by sampling from it. By constructing document-keyword pairs and learning this distribution with a model, keyword extraction becomes a binary classification task, and the model learns the semantic features between documents and keywords.

3. Extract keywords with generative adversarial networks. Generative adversarial networks can learn the true distribution of data well, so they can realize the hypothesis in point 2. The generator uses a Seq2Seq model with an attention mechanism to learn the semantic features of words and raise the probability that keywords are extracted. In addition, because keywords are discrete data, the network is trained with gradient updates computed by the policy gradient method from reinforcement learning.
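
The idea behind the first contribution can be illustrated with a minimal sketch. The abstract does not give the exact formula for the improved transition matrix, so everything specific here is an assumption made for illustration: the function names, the use of cosine distance as the "semantic difference", the co-occurrence window, the small smoothing constant, and the damping factor of 0.85. Word vectors are passed in as a plain dictionary; in the thesis they would come from fastText.

```python
import math
from typing import Dict, List, Sequence, Set


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 - cosine similarity (assumed measure of semantic difference)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def semantic_textrank(
    tokens: List[str],
    vectors: Dict[str, Sequence[float]],
    window: int = 2,        # assumed co-occurrence window
    damping: float = 0.85,  # assumed damping factor
    iterations: int = 50,
    eps: float = 1e-6,      # keeps every edge weight strictly positive
) -> Dict[str, float]:
    """Rank words; each word passes more weight to semantically distant neighbours."""
    vocab = sorted(set(tokens))

    # Undirected co-occurrence graph over a sliding window.
    neighbours: Dict[str, Set[str]] = {w: set() for w in vocab}
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != w:
                neighbours[w].add(other)
                neighbours[other].add(w)

    # Transition probability from w to n is proportional to their semantic distance.
    transition: Dict[str, Dict[str, float]] = {}
    for w in vocab:
        weights = {
            n: cosine_distance(vectors[w], vectors[n]) + eps for n in neighbours[w]
        }
        total = sum(weights.values())
        transition[w] = {n: x / total for n, x in weights.items()}

    # Standard damped power iteration over the modified transition matrix.
    scores = {w: 1.0 / len(vocab) for w in vocab}
    for _ in range(iterations):
        updated = {}
        for w in vocab:
            incoming = sum(scores[n] * transition[n][w] for n in neighbours[w])
            updated[w] = (1.0 - damping) / len(vocab) + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy vectors: "b" is identical to "a", "c" is orthogonal to "a".
    toy_vectors = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
    ranking = semantic_textrank(["b", "a", "c"], toy_vectors, window=1)
    for word, score in sorted(ranking.items(), key=lambda kv: -kv[1]):
        print(word, round(score, 4))
```

In the toy example, "a" is linked to both "b" and "c", but almost all of the weight leaving "a" goes to "c" because "c" is the semantically more distant neighbour. Plain TextRank would split that weight evenly between the two.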

Keywords/Search Tags:

Keyword extraction, Semantic features, word vector, two-category, Generative Adversarial Networks

Related items

1	Research On Chinese Semantic Keyword Extraction Method Based On Multiple Features
2	Multiple Documents Automatically Summary Based On Semantic Word Vector
3	Research And Application Of Audio Keyword Recognition Technology Based On Generative Adversarial Network
4	Topic Modeling Research Based On Word Embedding And Generative Neural Networks
5	Weakly Supervised Learning Of A Generative Adversarial Nets For Semantic Segmentation
6	Keyword Extraction From News Web Pages
7	Research On Image Semantic Segmentation Method Based On Generative Adversarial Network
8	Research And Implementation Of News Keyword Extraction Method Based On Semantic Clustering And Weighted TextRank
9	Automatic Keyword Extraction Algorithms Based On Word Embedding And Multiple Features Fusion
10	Research On Keyword Extraction Method Based On Semantics Features