Research On User Identity Recognition Based On IPTV Big Data

Posted on: 2021-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
GTID: 2518306464982879
Subject: Computer Science and Technology
Abstract/Summary:
User identification is an important problem in data analysis and mining. It aims to verify the identity of the people associated with given resource information. Typically, researchers extract features from user-related data that can serve as user identifiers, apply dimensionality reduction and feature selection, and then match user features with a similarity measure; the candidate with the greatest similarity is output as the recognition result. User identification has wide application and research value in personalized recommendation, information forensics, and privacy protection. Two aspects largely determine recognition performance: the selection and construction of features, and the similarity matching of features. The thesis studies both. The main work is as follows:

Feature set construction: The thesis uses an IPTV user viewing-record data set, focuses on the channel item, orders each user's items by time to form an item sequence, and extracts features from that sequence. For feature extraction and construction, it proposes a hybrid multi-modulus itemset heat-ranking method and a scalable multi-modulus itemset construction method. From two perspectives, a fixed number and a fixed ratio, these methods select the more frequent itemsets to form the feature set. Experiments show that constructing multi-modulus itemsets improves recognition accuracy and that both proposed construction methods are effective.

Similarity measurement: To measure the similarity between user feature sets accurately, the thesis proposes an influence-value-based similarity measure built on the Jaccard coefficient, which achieves recognition accuracy equal to or higher than that of the Jaccard coefficient. By further incorporating KL divergence, the thesis also proposes the SJKL method, which is more effective than either the Jaccard coefficient or KL divergence.

Recognition result decision-making: The thesis proposes an intersection-based fusion decision scheme that combines multiple similarity measures. Although the scheme rejects a certain proportion of cases, it achieves higher identification accuracy than a single similarity measure used alone.

In summary, the thesis focuses on IPTV user data and aims to realize user identification. By studying the related issues, it proposes a feature construction method, a similarity measurement method, and a recognition result decision-making method that improve recognition accuracy. The work provides a reference for research in personalized recommendation and privacy protection. The proposed methods also apply to other, similar user identification scenarios and therefore have a degree of general applicability.
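
The pipeline summarized in the abstract (frequent itemsets as features, set similarity for matching, and a fused decision with rejection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: "multi-modulus itemsets" is read here as contiguous channel runs of several lengths, the fixed-number variant is modeled as a simple top-k cut, the overlap coefficient stands in for a second measure (it is not the influence-value or SJKL method), and the intersection-based fusion is read as "accept only when all measures agree on the same user". All function names and the sample data are hypothetical.

```python
from collections import Counter


def top_itemsets(channels, lengths=(1, 2), k=4):
    """Return the k most frequent contiguous channel runs (assumed reading of itemsets)."""
    counts = Counter()
    for n in lengths:
        for i in range(len(channels) - n + 1):
            counts[tuple(channels[i:i + n])] += 1
    return {itemset for itemset, _ in counts.most_common(k)}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap(a, b):
    """Overlap coefficient |A & B| / min(|A|, |B|); a stand-in second measure."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def identify(query, profiles, measures):
    """Let each measure pick its best match; reject (None) unless all agree."""
    votes = {max(profiles, key=lambda user: m(query, profiles[user])) for m in measures}
    return votes.pop() if len(votes) == 1 else None


# Hypothetical viewing histories, already ordered by time.
histories = {
    "alice": ["news", "sports", "news", "sports", "news", "sports", "movie"],
    "bob": ["kids", "movie", "kids", "movie", "kids", "movie", "news"],
}
profiles = {user: top_itemsets(seq) for user, seq in histories.items()}
query = top_itemsets(["news", "sports", "news", "sports", "kids"])

print(identify(query, profiles, [jaccard, overlap]))
```

Requiring agreement between measures trades coverage for accuracy: a query on which the measures disagree is left unidentified, which corresponds to the rejection rate the abstract mentions.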
Keywords/Search Tags: User Recognition, Feature Extraction, Similarity Measurement, Multi-modulus Itemsets