Research On Keyword Extraction Method Based On Semantics Features

Posted on:2022-07-21

Degree:Master

Type:Thesis

Country:China

Candidate:N Su

Full Text:PDF

GTID:2518306575967099

Subject:Computer technology

Abstract/Summary:

Today's Internet environment makes daily life very convenient, thanks to the rapid development of the Internet and, in recent years, of the mobile Internet in particular. As technology has advanced, the speed and scale at which information is generated and disseminated have reached an unprecedented level. We are confronted with massive amounts of data, a large part of which is text, so it is important to be able to grasp the main content of that text quickly. Keywords are one of the main means of doing so: they are easy to understand and cover the topic of a text well. From a few short keywords, a reader can quickly understand what a text is about. Extracting keywords manually, however, is time-consuming and laborious and cannot scale to massive data. Automatic keyword extraction addresses this problem and is one of the important ways of handling such data. Traditional keyword extraction methods rely only on statistical information; they ignore the semantic information in the text and cannot cover the topic information of a document. To address this, this thesis studies keyword extraction and designs and implements algorithms that combine statistical information, semantic information obtained through deep learning, and topic information. The main work of this thesis is as follows:

1. To address the lack of semantics in traditional keyword extraction methods, the thesis uses a pre-trained language model based on deep learning to obtain vector representations of text as semantic information. Combining these with statistical information, it proposes an automatic keyword extraction algorithm that incorporates semantic features. Extensive experiments show that the algorithm achieves good results.

2. To address the problem that statistical information alone is not enough to fully cover the topic content of the target document, topic-model knowledge is introduced on top of the combination of semantic and statistical information, and a keyword extraction method combining semantic features with a topic model is proposed.

3. To verify the practical value and effectiveness of the proposed algorithms, a prototype keyword extraction system is designed and implemented from a software engineering perspective, with the proposed algorithms at its core.
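
As an illustration of the general idea in the first contribution (scoring candidate words by both statistical and semantic evidence), here is a minimal Python sketch. It is not the thesis's algorithm: the abstract does not give the scoring formula, so the linear weighting `alpha`, the function name `rank_keywords`, the use of normalized term frequency as the statistical feature, and the frequency-weighted mean vector as the document representation are all assumptions made for illustration. The `embeddings` argument stands in for vectors that would, in practice, come from a pre-trained language model.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_keywords(tokens, embeddings, alpha=0.5, top_k=3):
    """Rank candidate words by a weighted sum of two scores (hypothetical scheme).

    tokens     -- list of words in the document
    embeddings -- dict mapping word -> vector (assumed to come from a
                  pre-trained model; toy values are fine for a demo)
    alpha      -- weight of the statistical score; (1 - alpha) goes to
                  the semantic score
    """
    # Only words that have a vector can be scored semantically.
    counts = Counter(t for t in tokens if t in embeddings)
    if not counts:
        return []

    total = sum(counts.values())
    max_count = max(counts.values())
    dim = len(next(iter(embeddings.values())))

    # Document vector: frequency-weighted mean of the word vectors.
    doc_vec = [
        sum(embeddings[w][i] * c for w, c in counts.items()) / total
        for i in range(dim)
    ]

    scores = {}
    for word, count in counts.items():
        statistical = count / max_count               # normalized term frequency
        semantic = cosine(embeddings[word], doc_vec)  # closeness to the document
        scores[word] = alpha * statistical + (1 - alpha) * semantic

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

Calling `rank_keywords(tokens, embeddings, top_k=5)` returns up to five words ordered by the combined score. A word that is both frequent and semantically close to the document as a whole ranks highest, while an off-topic word is pushed down even if it appears as often as on-topic ones.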

Keywords/Search Tags:

Keyphrase extraction, Pre-trained language model, Word vector, Topic model

Related items

1	Research On Text Keyphrase Generation Method Based On Pre-trained Language Model
2	Micro-blog Feature Discovery And Topic Keyphrase Extraction Based On Language Network
3	Relation Extraction Based On Dualchannel Attention And Pre-Trained Language Model
4	Research On Keyphrase Extraction Algorithm Based On Word Embeddings Learning
5	Keyphrase Extraction Using LDA Topic Models
6	Study On Feature Extraction And Text Representation Technology In Topic Tracking
7	Research On News Text Summarization Algorithm Based On Pre-trained Language Model
8	Automatic Topic Labelling Based On Word Vectors
9	Text Classification Based On Word Vector And Topic Vector
10	Research On Semantic Reinforcement Based On Topic And Word Features For RNN Language Model