Research On Micro-blog User Interest Modeling Approach Based On Tag Feature Space

Posted on: 2019-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Wang
Full Text: PDF
GTID: 2428330545982387
Subject: Computer Science and Technology
Abstract/Summary:
According to the "China Internet Network Development State Statistical Report", by June 2016 the number of Chinese netizens had reached 710 million, with an Internet penetration rate of 51.7% and 656 million mobile phone users; by June 2017 the number of netizens had reached 751 million, with a penetration rate of 54.3% and 724 million mobile phone users. Meanwhile, the number of monthly active micro-blog users increased from 297 million in 2016 to 376 million in 2017. Both the number of micro-blog users and the number of micro-blogs posted rose sharply. Given the huge volume of micro-blog content to browse, it is extremely important to develop new techniques for locating the information users want within this overloaded data.

Tags are commonly used to represent the interests and attributes of micro-blog users. By analyzing tag correlation and the limitations of existing micro-blog user interest models, we propose an improved micro-blog user interest modeling approach based on tag feature space and user relationships. First, the co-occurrence frequency of each tag pair is calculated over the micro-blog user collection to obtain the inner correlation between the two tags; a path is constructed from the linking tags of each tag pair, and the outer correlation of the pair is obtained via shared entropy. The two correlations are then combined into a semantic correlation matrix, which is used to update the user tag matrix, yielding a micro-blog user interest model based on multi-tag semantic correlation. Second, a user relationship matrix is constructed by considering the social relationships between users: the static relationship derived from users' backgrounds and attributes, and the dynamic relationship of attention and following. The multi-tag semantic correlation matrix is then integrated with the user relationship matrix to form a micro-blog user interest model based on tag feature space and user relationships.

The main work of this paper is as follows:

1) To alleviate the sparsity of the original user tag matrix, we use the relationships among multiple tags instead of the relationship between two tags. To mine the external connections among multiple tags, we define a path and calculate the shared entropy of the linking tags on that path. The original user tag matrix is then updated through the resulting multi-tag semantic correlation matrix.

2) To reduce the high dimensionality of the original user tag matrix, we cluster the tags according to their similarity to obtain a matrix of representative tags, and then update the user tag matrix again. This effectively reduces the diversity and ambiguity of user tags.

3) To represent user interests more fully, we combine the multi-tag relationships with users' social relationships. A new user social relationship matrix is obtained by calculating the static and dynamic social relationships of users, and the final tag feature space is obtained to represent user interests.

We evaluate our methods through a series of experiments on a data set crawled through the open API and analyze the results. The results show that our method performs better than traditional user interest discovery methods.
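
The first step described in the abstract (co-occurrence based inner correlation, then an update of the user tag matrix) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the Jaccard-style normalization of co-occurrence counts, the rule of taking the maximum correlation when filling in missing tags, the function names `inner_correlation` and `expand_user_tags`, and the toy data. The outer correlation via shared entropy and the user relationship matrix are not shown, since the abstract does not define them precisely enough to reproduce.

```python
from collections import Counter
from itertools import combinations


def inner_correlation(user_tags):
    """Co-occurrence correlation for each tag pair.

    Assumption: counts are normalized Jaccard-style; the thesis
    abstract does not state which normalization is used.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in user_tags.values():
        unique = sorted(set(tags))
        tag_count.update(unique)
        # Each unordered pair is stored once, in sorted order.
        pair_count.update(combinations(unique, 2))
    return {
        (a, b): n / (tag_count[a] + tag_count[b] - n)
        for (a, b), n in pair_count.items()
    }


def expand_user_tags(user_tags, corr):
    """Densify the user-tag weights using the tag correlations.

    Declared tags keep weight 1.0. An undeclared tag receives the
    largest correlation it has with any declared tag (an assumed
    update rule, chosen only to show how sparsity is reduced).
    """
    expanded = {}
    for user, tags in user_tags.items():
        weights = {t: 1.0 for t in tags}
        for (a, b), c in corr.items():
            for own, other in ((a, b), (b, a)):
                if own in tags and other not in tags:
                    weights[other] = max(weights.get(other, 0.0), c)
        expanded[user] = weights
    return expanded


# Hypothetical toy data: three users and their self-declared tags.
users = {
    "u1": ["music", "travel"],
    "u2": ["music", "travel", "food"],
    "u3": ["music"],
}
corr = inner_correlation(users)
profiles = expand_user_tags(users, corr)
```

In this toy example, user `u3` declared only `music` but ends up with nonzero weights for `travel` and `food`, which is the kind of sparsity reduction the abstract attributes to the tag correlation matrix.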
Keywords/Search Tags:Tag correlation relationship, Tag semantic features, User social relationship, User interest model