English Short Text Measurement Method Based On Part Of Speech And Keyword

Posted on:2019-11-28

Degree:Master

Type:Thesis

Country:China

Candidate:M Y Zhao

GTID:2428330548963456

Subject:Computer application technology

Abstract/Summary:

With the development of information technology and increasingly intelligent mobile terminals, social media has grown rapidly. A large number of users now use social media every day, and the amount of information transmitted through it is growing just as quickly. Capturing the information that spreads through social networks and understanding how it propagates and develops has important research value for hot-topic mining, commercial marketing, and public opinion control. A key step in mining such data is computing the similarity between documents, and the text similarity problem has attracted growing attention from researchers. Early work on text similarity focused mainly on long texts. In recent years, because of character limits on social media, people tend to express their opinions in short texts, so measuring short text similarity has become even more important. However, short texts contain much less information than long texts, so traditional methods for measuring long text similarity are not very effective on short texts. This thesis therefore proposes a short text similarity measurement method based on part of speech and keywords and applies it to popularity prediction. The main tasks are as follows:

1. Improving the Word Mover's Distance (WMD) algorithm for short text measurement. The WMD algorithm first uses word2vec to represent the words of a text in a vector space, and then computes the distance between two short texts from the similarity between their individual words. WMD has achieved good results on a variety of data sets. However, the method gives equal weight to all words in a sentence, without considering the differences between parts of speech or the importance of keywords. This thesis therefore takes part of speech and keywords into account, assigns different weights to different words when computing text similarity, and proposes an algorithm for optimizing these weights. Experiments on Weibo sentiment classification indicate that the improved WMD algorithm achieves better performance.

2. Applying the improved WMD algorithm to Weibo popularity prediction. Similarity features are extracted with both the improved algorithm and the original WMD algorithm, and SVM and logistic regression models are used to predict the popularity of Weibo posts. Comparative experiments show that the improved WMD algorithm yields higher accuracy in Weibo popularity prediction.
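To make the weighting idea concrete, the following is a minimal sketch and not the thesis's actual algorithm. It assumes hypothetical two-dimensional embeddings in place of real word2vec vectors, hypothetical part-of-speech weights (`POS_WEIGHT`), and a hypothetical `keyword_boost` factor. To stay dependency-free it computes the relaxed WMD lower bound, in which each word sends all of its mass to the nearest word of the other text, rather than solving the full transport problem that exact WMD requires.

```python
import math

# Hypothetical 2-d "embeddings"; a real system would load word2vec vectors.
EMBED = {
    "good": (1.0, 0.0),
    "great": (0.9, 0.1),
    "movie": (0.0, 1.0),
    "film": (0.1, 0.9),
    "the": (0.5, 0.5),
    "a": (0.5, 0.4),
}

# Hypothetical part-of-speech weights (illustrative, not from the thesis).
POS_WEIGHT = {"ADJ": 3.0, "NOUN": 2.0, "DET": 0.5}


def normalized_weights(tagged_doc, keywords=frozenset(), keyword_boost=2.0):
    """Map each word to its share of the document mass (shares sum to 1)."""
    raw = {}
    for word, pos in tagged_doc:
        weight = POS_WEIGHT.get(pos, 1.0)
        if word in keywords:
            weight *= keyword_boost  # keywords count for more
        raw[word] = raw.get(word, 0.0) + weight
    total = sum(raw.values())
    return {word: weight / total for word, weight in raw.items()}


def one_sided_cost(src, dst):
    """Cost when every source word moves all its mass to its nearest target."""
    return sum(
        mass * min(math.dist(EMBED[word], EMBED[other]) for other in dst)
        for word, mass in src.items()
    )


def relaxed_weighted_wmd(doc_a, doc_b, keywords=frozenset()):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound)."""
    a = normalized_weights(doc_a, keywords)
    b = normalized_weights(doc_b, keywords)
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


doc1 = [("the", "DET"), ("good", "ADJ"), ("movie", "NOUN")]
doc2 = [("a", "DET"), ("great", "ADJ"), ("film", "NOUN")]
print(relaxed_weighted_wmd(doc1, doc2, keywords={"good", "great"}))
```

Because the masses are normalized after weighting, raising the weight of adjectives or keywords lowers the relative influence of function words such as determiners on the final distance. In a prediction pipeline like the one described above, distances of this kind would serve as similarity features for the SVM or logistic regression model.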

Related items

1	The Popularity Of Micro-blog Predicting Based On Logistic Regression
2	Research On Prediction Model Of Repoverty Based On Logistic Regression Analysis
3	Media Popularity Prediction Algorithm Based On Multiple Attributions
4	Research On The Prediction Of Insurance Payment Based On Logistic Regression Model
5	Research On The Application Of Math Grade Prediction System Based On Logistic Regression
6	Analysis Of Online News Popularity Based On Ensemble Learning
7	The Study Of A Prediction Method For Search Ad CTR Based On Logistic Regression Model
8	Popularity Prediction Based On Microblog Mining
9	Design And Implementation Of Content Click Through Rate Prediction System Based On Logistic Regression With Elastic Net
10	Research And Implementation Of Search Advertising Click Through Rate Prediction Algorithm