Research On Keyword Extraction Based On News Corpus

Posted on:2022-05-07

Degree:Master

Type:Thesis

Country:China

Candidate:S X You

Full Text:PDF

GTID:2518306494971089

Subject:Computer technology

Abstract/Summary:

With the development of the Internet, the volume of data such as web pages and new media texts keeps growing, and full-text information retrieval is no longer efficient enough to support retrieval over massive data. Keyword extraction technology is therefore widely used in search engines (such as Baidu search), new media services, and other fields (such as news retrieval). Traditional keyword extraction methods judge the importance of a word from its context information and grammatical information in the document. These algorithms are simple and effective, but they cannot capture the deep information and features of the document, and their extraction quality does not reach the accuracy of manual extraction. In response to these problems, this thesis proposes the Fusion Model, which combines multiple kinds of feature information and multiple methods, and improves and optimizes the keyword extraction model in two respects.

1. A Fusion Model that combines multiple algorithms and neural network models. The two traditional algorithms, TF-IDF and TextRank, are optimized with normalization and smoothing so that their results can be compared and mixed. A BiLSTM model labels the keywords in the input documents, and its output is optimized with a conditional random field (CRF). To address the insufficient generalization of deep learning models, this thesis uses the results of the traditional keyword extraction models to conduct feedback training on the deep learning model, so as to continuously improve the overall performance of the Fusion Model. Experiments show that the F1 score of keyword extraction based on the Fusion Model is 21.02% higher than that of the traditional model and 5.05% higher than that of the currently popular BiLSTM-CRF sequence labeling model.

2. An algorithm that fuses a variety of hand-crafted features with the BiLSTM-CRF model, together with an "LMRSN" sequence labeling method that is better suited to the Fusion Model. The Fusion Model uses a variety of algorithms to collect features of the document such as part of speech, word frequency, word length, and word position, and encodes these hand-crafted features together with the word embedding layer to obtain a word embedding vector that contains the hand-crafted features. This multi-dimensional feature information helps the model extract the deep features of keywords more comprehensively. For the tagging task, this thesis proposes the "LMRSN" tagging method, which effectively solves the problem of not being able to extract key phrases.

After completing the research on keyword extraction technology, this thesis goes on to study how keywords can be applied. It applies the Fusion Model based keyword extraction technology to the task of news recommendation, and proposes a variety of effective methods for selecting candidate news documents as well as a method for calculating the recommendation index between news documents. Finally, experiments demonstrate the effectiveness of keyword extraction based on the Fusion Model.
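The abstract does not say how the TF-IDF and TextRank scores are normalized, smoothed, or mixed. As a rough illustration of the idea only, the following minimal sketch assumes add-one IDF smoothing, min-max normalization to [0, 1], and a linear mix with a weight alpha. All function names, the co-occurrence window size, and the parameter values are hypothetical choices made for this example, not details taken from the thesis.

```python
import math
from collections import Counter, defaultdict


def tfidf_scores(doc, corpus):
    """TF-IDF per distinct word of `doc`; add-one IDF smoothing is an assumption."""
    counts = Counter(doc)
    n_docs = len(corpus)
    scores = {}
    for word, count in counts.items():
        doc_freq = sum(1 for other in corpus if word in other)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        scores[word] = (count / len(doc)) * idf
    return scores


def textrank_scores(doc, window=2, damping=0.85, iterations=30):
    """Simplified TextRank on an unweighted co-occurrence graph (window is assumed)."""
    neighbors = defaultdict(set)
    for i, word in enumerate(doc):
        for j in range(i + 1, min(i + window + 1, len(doc))):
            if doc[j] != word:
                neighbors[word].add(doc[j])
                neighbors[doc[j]].add(word)
    words = set(doc)
    rank = {w: 1.0 for w in words}
    for _ in range(iterations):
        # Each word receives an equal share of the rank of every neighbor.
        rank = {
            w: (1 - damping)
            + damping * sum(rank[n] / len(neighbors[n]) for n in neighbors[w])
            for w in words
        }
    return rank


def min_max(scores):
    """Rescale scores to [0, 1] so that two scoring methods become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {w: 1.0 for w in scores}
    return {w: (s - lo) / (hi - lo) for w, s in scores.items()}


def fuse_scores(doc, corpus, alpha=0.5):
    """Linear mix of normalized scores; alpha weights TF-IDF against TextRank."""
    tfidf = min_max(tfidf_scores(doc, corpus))
    textrank = min_max(textrank_scores(doc))
    return {w: alpha * tfidf[w] + (1 - alpha) * textrank[w] for w in tfidf}


def top_keywords(doc, corpus, alpha=0.5, top_k=3):
    """Return the top_k words by fused score, ties broken alphabetically."""
    fused = fuse_scores(doc, corpus, alpha)
    return sorted(fused, key=lambda w: (-fused[w], w))[:top_k]


if __name__ == "__main__":
    corpus = [
        ["stock", "market", "falls"],
        ["market", "news", "today"],
        ["keyword", "extraction", "news", "keyword"],
    ]
    print(top_keywords(corpus[2], corpus, top_k=2))
```

Because both score sets are rescaled to the same range before mixing, neither method dominates purely because of its raw scale. In this sketch alpha would be tuned on held-out data, and the input documents are assumed to be already tokenized (for Chinese news text, after word segmentation).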

Keywords/Search Tags:

Keyword extraction, LSTM, news recommendation, deep learning

Related items

1	Research On Chinese Word Segmentation And Keywords Extraction
2	Research On Intelligent Resource Recommendation Method Based On Deep Learning
3	Research On News Recommendation Algorithm Based On Deep Learning
4	Design And Implementation Of Expert Recommendation System Based On Deep Learning And Community Question Answering
5	Research On Question Keywords Extraction Techniques For Question Answering
6	Design And Implementation Of Personalized News Recommendation System Based On Deep Learning
7	Design And Implementation Of News Recommendation System Based On Deep Learning
8	Research On Movie Hybrid Recommendation System Based On Deep Learning
9	Research On The Extraction Of Weibo Character Relations Based On Deep Learning And Relationship Path
10	Research On Personalized News Recommendation Method Based On Deep Learning