Research On Online User Behavior Similarity Based On Feature Attribute

Posted on: 2022-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: K X Zhang
GTID: 2518306500955939
Subject: Master of Engineering
Abstract/Summary:
The mining and modeling of online user behavior is a hot spot in sociology and complexity science research. Technologies such as big data and smart terminals have provided both technical and data support for this work. Analyzing the similarity of online user behavior can reveal the mechanisms and laws of human-factor cognition behind the data. Because different studies consider different factors, they also calculate user behavior similarity differently, and there is not yet a unified calculation standard. At the same time, relatively little research addresses how much online users' own characteristic attributes contribute to the similarity of their behavior. Therefore, based on online user behavior logs, this paper verifies that user behavior in virtual space shows different degrees of similarity and difference. Combined with users' basic information, it then studies how strongly different characteristic attributes and their combinations influence the similarity of online users' behavior. The main research content covers the following three aspects.

First, similarity calculation of online user behavior. Based on the idea of sequence alignment in biology, and by introducing a time factor, a sequence-alignment-based user behavior similarity calculation algorithm, SA-OUBSC, is proposed. The algorithm first converts each user's click-stream data into a click sequence. It then calculates the distance between different click sequences through operations such as insertion, deletion, and compensation, which yields the similarity of click behavior between users. This similarity is expressed in the form of a matrix, which verifies that different users in the virtual space show different degrees of similarity and difference in their click behavior.

Second, online user group identification. On the basis of traditional hierarchical clustering, the OUBS-GR algorithm is proposed by introducing priority queues. The method first extracts six-dimensional characteristic attributes of users from the behavior logs, then clusters users according to the behavior similarity matrix to identify and distinguish different user groups. An evaluation based on entropy and purity compares OUBS-GR with traditional hierarchical clustering on user group identification and verifies the superiority of the OUBS-GR algorithm. From the results of this evaluation, it is preliminarily concluded that different characteristic attributes influence user behavior to different degrees.

Third, the contribution of characteristic attributes and their combinations to behavioral similarity. This research builds a GraphSAGE-based user click behavior prediction model and analyzes the accuracy of models built on different feature attributes, in order to study how much user feature attributes and their combinations contribute to the similarity of user behavior. The model first builds a user relationship graph that conforms to a power-law distribution based on the user similarity matrix, then predicts users' click behavior through operations such as neighbor sampling and neighbor aggregation. The experimental results show that, among single attributes, the model based on gender has the highest prediction accuracy, at 80%. Among attribute combinations, the model based on gender + education level has a prediction accuracy of 65.3%; this combination has the greater effect on, and contribution to, the similarity of user behavior.
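The click-sequence comparison described in the first aspect can be illustrated with a minimal sketch. The abstract does not specify the time factor, the compensation operation, or the operation costs of SA-OUBSC, so this example swaps in plain Levenshtein edit distance (insertion, deletion, and substitution, each at unit cost) and normalizes by the longer sequence length. The function names and the sample page labels are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two click sequences (unit costs, assumed)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x from a
                            curr[j - 1] + 1,      # insert y into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Map the distance to [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def similarity_matrix(sequences: List[Sequence[str]]) -> List[List[float]]:
    """Pairwise similarity of all users' click sequences as a square matrix."""
    n = len(sequences)
    return [[similarity(sequences[i], sequences[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical click sequences for three users.
clicks = [
    ["home", "search", "item", "cart"],
    ["home", "search", "item", "pay"],
    ["news", "video"],
]
matrix = similarity_matrix(clicks)
```

Here `matrix[0][1]` is 0.75 because the first two users differ in a single click out of four, while `matrix[0][2]` is 0.0 because the third user shares no clicks with the first. A matrix of this shape is the kind of input that the clustering and graph-building steps in the second and third aspects operate on.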
Keywords/Search Tags:Online User Behavior, Sequence Alignment, Group Identification, GraphSAGE Algorithm, Contribution