Research On Keyword Extraction Method For Public Security News

Posted on: 2021-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: L M Liu
Full Text: PDF
GTID: 2416330614471689
Subject: Software engineering
Abstract/Summary:
Keyword extraction, which aims to extract a set of words related to the topic of a text, is a basic natural language processing task. With the advent of the Internet information age, a considerable amount of unstructured data has been generated in the field of public security, in particular a large amount of text data represented by public security news. Faced with so much text, extracting keywords can not only help public security organs archive and classify the data, but also raise the level of informatization in public security work, so the problem has both research significance and application value. Most existing keyword extraction methods target social media, scientific papers and other text types, and few results on keyword extraction for public security news have been published. This thesis studies keyword extraction methods for public security news. The research work is as follows:

(1) A keyword extraction method based on position features is proposed. Because keywords in public security news often appear near the beginning of the text, this method treats candidate words that appear earlier as more important. The position information of each candidate word is used to design its weight, and an improved TextRank algorithm then extracts the keywords. Experimental results show that this method produces better keyword extraction results than the other position-based variants proposed in this thesis and than existing unsupervised keyword extraction methods.

(2) A keyword extraction method based on word embeddings is proposed. This method can accurately capture the topic information expressed by a public security news item: the closer the semantics of a candidate word are to those of the target document, the more important the candidate word is. First, Word2Vec is used to learn word vectors from the public security news data set. Second, the sigmoid function is applied to the semantic similarity between the candidate word and the target document to calculate the weight of the candidate word. Finally, the improved TextRank algorithm is used to extract keywords. Experimental results show that, compared with existing unsupervised keyword extraction algorithms, using word vector features improves keyword extraction for public security news.

(3) A public security news data set is constructed. This thesis designs a Java-based public security news collection system, which builds the first public security news data set by web crawling. The data set covers different types of documents such as alarms, meetings, activities, and announcements.

In summary, the two keyword extraction methods proposed in this thesis can effectively extract keywords from public security news. Keyword extraction for public security news makes it much easier for police officers to retrieve relevant data and improves their work efficiency.
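The abstract does not give the formulas behind the two weighting schemes, so the following Python sketch is only an illustration of the general idea: candidate-word weights are used as the bias (restart) distribution of a TextRank-style ranking over a word co-occurrence graph. The reciprocal-position weight, the sigmoid applied to cosine similarity, the window size, the damping factor, and all function names are assumptions made for this example, not the thesis's actual definitions. In practice the word vectors would come from a trained Word2Vec model; here they are passed in as plain lists.

```python
import math
from collections import defaultdict


def build_graph(tokens, window=3):
    """Undirected co-occurrence graph: words within `window` tokens are linked."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def position_weights(tokens):
    """Assumed scheme: each occurrence at index i contributes 1 / (i + 1),
    so words that appear earlier receive a larger weight."""
    weights = defaultdict(float)
    for i, word in enumerate(tokens):
        weights[word] += 1.0 / (i + 1)
    return dict(weights)


def sigmoid_similarity_weights(word_vecs, doc_vec):
    """Assumed scheme: sigmoid of the cosine similarity between each
    word vector and a document vector."""
    def cosine(a, b):
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        if norm == 0.0:
            return 0.0
        return sum(x * y for x, y in zip(a, b)) / norm

    return {w: 1.0 / (1.0 + math.exp(-cosine(v, doc_vec)))
            for w, v in word_vecs.items()}


def biased_textrank(graph, weights, damping=0.85, iterations=50):
    """TextRank where the restart probability of each word is proportional
    to its weight instead of being uniform."""
    nodes = list(graph)
    total = sum(weights.get(w, 0.0) for w in nodes)
    if total > 0:
        bias = {w: weights.get(w, 0.0) / total for w in nodes}
    else:
        bias = {w: 1.0 / len(nodes) for w in nodes}  # fall back to uniform
    scores = {w: 1.0 / len(nodes) for w in nodes}
    for _ in range(iterations):
        scores = {
            w: (1 - damping) * bias[w]
            + damping * sum(scores[u] / len(graph[u]) for u in graph[w])
            for w in nodes
        }
    return scores


# Usage with a toy, already-segmented sentence (hypothetical data).
tokens = ["police", "arrest", "suspect", "theft",
          "police", "report", "theft", "case"]
scores = biased_textrank(build_graph(tokens), position_weights(tokens))
top_keywords = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_keywords)
```

Passing the output of `sigmoid_similarity_weights` instead of `position_weights` to `biased_textrank` gives the embedding-based variant under the same assumptions; the ranking step itself is unchanged.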
Keywords/Search Tags: Keyword extraction, Public security news, Position, Word embedding, Word2Vec, Graph model