
Research On Prediction Method Of Microblog Information Dissemination Based On User Interest Characteristics

Posted on: 2021-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: J W Zhang
GTID: 2428330632454235
Subject: Computer Science and Technology
Abstract/Summary:
Microblog is a mobile social network platform built on the relationships between registered users and used for sharing, disseminating, and acquiring information. Its mode and speed of information dissemination differ from those of news sites, forums, and similar media: microblog conveys events more comprehensively, quickly, and concisely. As of March 2019, microblog had far more active registered users than other social network platforms in the same quarter. Forwarding is one of the main ways information spreads on microblog, and predicting forwarding behavior plays an important role in information recommendation, precise advertisement delivery, emergency warning, and other applications.

To address text classification based on users' interest features, the TF-IDF algorithm is fused with the LDA topic model. TF-IDF extracts words that appear frequently in a given article or text but rarely across the whole collection of texts under analysis. Unimportant words that occur frequently in a text, such as "de" and "di", would otherwise distort the word analysis; because they also occur throughout the corpus, their IDF is low, which lowers their overall score. The higher the TF value, the more often the word appears in the article and the more important it is taken to be; the smaller the IDF value, the less important the word. Together the two values screen out the genuinely high-frequency words that represent the main idea of the article. In this thesis, TF-IDF is combined with the LDA topic model: the stop words identified by TF-IDF are incorporated into the LDA topic model when extracting the word sequence, so that the topics extracted from the word matrix represent the whole article accurately.

To improve the accuracy of microblog forwarding prediction, an online Passive-Aggressive (PA) algorithm, which learns from data as it arrives, is proposed. Previous research assumed that a user's interest is fixed and unchanging. In fact, a user's interest changes over time. We improve the traditional PA algorithm by incorporating changes in user interest, and analyze whether users forward a microblog. The PA algorithm treats microblog information and user interest as a continuous, changing sequence, and predicts for each item in the sequence whether it will be forwarded. After each prediction, the user's actual behavior is revealed; the algorithm incurs an instantaneous loss that reflects the prediction error, uses the new attributes and data to update its current rule, and applies the updated rule to subsequent items. The prediction algorithm takes Weibo interest features as input to the prediction model, together with user attributes (number of accounts followed, number of followers, and number of Weibo posts) and Weibo attributes (publishing time and post content). The model is initialized and trained with the improved online PA algorithm, which integrates the interest features and adjusts the weights to achieve the highest prediction accuracy.

The experimental data are real-time news posts published by the Xinhua Viewpoint account, collected with Python: the crawl covers the account's microblogs from January 1, 2019 to March 25, 2019 and its 6 million registered followers. Because the volume of microblog data is huge and many users carry no useful signal, we remove users with fewer than 50 followers, who may belong to the "water army" (paid posters). We mainly study the users with the highest interaction rates, whose forwards have a secondary impact: the forwarded microblog is seen, and forwarded again, by the fans of the forwarding blogger.
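The TF-IDF weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula used in the thesis, so the unsmoothed form tf * log(N / df) below is an assumption, and the toy documents and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    docs is a list of token lists. The weighting tf * log(N / df) is an
    assumed textbook form, not necessarily the one used in the thesis.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        doc_scores = {}
        for term, count in counts.items():
            tf = count / total                  # frequency within this document
            idf = math.log(n_docs / df[term])   # zero when the term is in every document
            doc_scores[term] = tf * idf
        scores.append(doc_scores)
    return scores


# Hypothetical tokenized posts; "de" stands in for a frequent function word.
docs = [
    ["de", "flood", "warning", "de", "flood"],
    ["de", "match", "score", "de"],
    ["de", "market", "stock", "de", "market"],
]
scores = tf_idf(docs)
top_term = max(scores[0], key=scores[0].get)
```

In this toy corpus "de" occurs in every document, so its IDF and therefore its score are zero even though its TF is high, while "flood" receives the highest score in the first document.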
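The predict, observe, update cycle described in the abstract can be sketched with the standard online Passive-Aggressive update (the PA-I variant for binary labels). This is not the improved algorithm from the thesis, whose handling of changing user interest is not specified in the abstract; the feature layout, the class name `PassiveAggressive`, and the parameter `C` are assumptions made for illustration.

```python
class PassiveAggressive:
    """Standard PA-I binary classifier (labels +1 = forwarded, -1 = not forwarded)."""

    def __init__(self, n_features, C=1.0):
        self.w = [0.0] * n_features
        self.C = C  # assumed aggressiveness cap on the step size

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def update(self, x, y):
        """Update on one labeled example and return its instantaneous hinge loss."""
        loss = max(0.0, 1.0 - y * self.score(x))
        sq_norm = sum(xi * xi for xi in x)
        if loss > 0.0 and sq_norm > 0.0:
            # Passive when the margin is satisfied, aggressive otherwise.
            tau = min(self.C, loss / sq_norm)
            self.w = [wi + tau * y * xi for wi, xi in zip(self.w, x)]
        return loss


# Hypothetical feature vectors: [interest match, scaled follower count, bias].
stream = [
    ([0.9, 0.2, 1.0], +1),
    ([0.1, 0.8, 1.0], -1),
    ([0.8, 0.5, 1.0], +1),
    ([0.2, 0.3, 1.0], -1),
]

model = PassiveAggressive(n_features=3, C=1.0)
for x, y in stream:
    y_hat = model.predict(x)    # predict before the true label is revealed
    loss = model.update(x, y)   # then learn from the observed behavior
    print(y_hat, y, round(loss, 3))
```

Each item is predicted first and used for learning only afterwards, which matches the online setting in which the forwarding outcome becomes known after the prediction is made.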
Keywords/Search Tags: Microblog information dissemination, prediction, user interest features, Python, PA algorithm