
Research Of Prediction Algorithm And System For Video Hits

Posted on: 2017-01-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W X Hu
Full Text: PDF
GTID: 1108330485969038
Subject: Computer application technology
Abstract/Summary:
With the development of big data technology, growing dataset sizes, changing algorithm models, and evolving hardware platforms have brought new challenges to the field of quantitative prediction. How to design an effective quantitative prediction algorithm and build a timely, accurate, and durable prediction system has drawn the attention of many researchers. Quantitative prediction algorithms play an important role in supporting system decisions and in steering a system in the right direction. Quantitative prediction technology is gradually penetrating the field of Video-on-Demand (VoD), where stable user demand ensures the healthy development of a website.

Over the past few decades, numerous prediction algorithms have been developed. Data constraints, impact factor mining, and the construction of predictive models remain major problems in quantitative prediction. Meanwhile, predicting user demand in VoD systems remains highly challenging because of the continuity and long life cycle of videos, and research focusing on this particular area is still rare.

Studies show that user demand in a VoD system correlates strongly with microblog comments and search statistics. Accurate short-term and mid-to-long-term prediction based on domain features and multivariate data is the key problem this thesis addresses. To address the data restriction caused by limited historical data in the early phase of the online video life cycle, the thesis proposes a novel algorithm for classifying the emotional tendency of microblog comments. It also presents a VoD user demand prediction algorithm that uses social media data and search statistics, specifically by mining impact factors from external data. In addition, the thesis presents a KNN-regression-based prediction algorithm that finds neighbors by similarity.
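
As a rough illustration of KNN regression that finds neighbors by similarity, the sketch below forecasts the next value of a new video's demand curve from the most similar historical curves. It is not the thesis's algorithm: the use of Pearson correlation as the similarity measure, the mean-ratio scaling, and the names `pearson` and `knn_forecast` are all assumptions made for this example.

```python
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / sqrt(va * vb)


def knn_forecast(prefix, history, k=2, horizon=1):
    """Forecast `horizon` values following `prefix` from the k most similar curves.

    prefix  -- observed demand of the new video so far
    history -- list of complete demand curves of older videos
    Assumption: similarity is the Pearson correlation over the observed days.
    """
    n = len(prefix)
    candidates = [h for h in history if len(h) >= n + horizon]
    if not candidates:
        raise ValueError("no historical curve is long enough")
    # Rank historical curves by the similarity of their first n days.
    ranked = sorted(candidates, key=lambda h: pearson(prefix, h[:n]), reverse=True)
    neighbors = ranked[:k]

    forecast = []
    for step in range(horizon):
        scaled = []
        for h in neighbors:
            base = mean(h[:n])
            # Assumption: rescale each neighbor to the new video's demand level.
            scale = mean(prefix) / base if base != 0 else 1.0
            scaled.append(h[n + step] * scale)
        forecast.append(mean(scaled))
    return forecast


if __name__ == "__main__":
    history = [
        [10, 20, 30, 40, 50],
        [100, 200, 300, 400, 500],
        [50, 40, 30, 20, 10],
    ]
    print(knn_forecast([1, 2, 3], history, k=2, horizon=1))
```

In this toy data the two rising curves are selected as neighbors and the falling one is ignored, so the forecast continues the upward trend at the new video's own scale.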
A mid-to-long-term prediction algorithm is also proposed to resolve the model selection difficulty caused by the long life span of online videos. It uses historical user demand data to detect the time slot in which demand enters its stable period, and then predicts within that stable period. The specific contents are as follows:

(1) First, we propose a microblog comment emotional tendency classification algorithm (OSS-TS). The emotional tendency expressed in microblog comments can affect the amount of user demand. The classification algorithm combines several techniques, including a dynamic emotion dictionary, syntax analysis, emotion feature extraction, and SVM classification. It addresses the problems caused by the low density of valuable comments, the wide diversity of comment objects, positive/negative emotional bias, and emotional tendency reversal, and produces accurate classification of emotional tendency.

(2) A VoD user demand prediction algorithm based on social data and search data (SoStLVL). Studies show that VoD user demand correlates strongly with the number of microblog subscribers, the number of microblog comments, microblog comment sentiment, and search volume. The algorithm uses the period difference between social/search data and VoD user demand for prediction, mines impact factors, and determines emotional paranoia items. The external data is used to improve the timeliness and objectivity of the prediction algorithm.

(3) An early-stage prediction algorithm based on user demand history and KNN regression (KSSSP). To deal with noise in VoD user demand data, differences in peak magnitude, and other issues, we propose a dynamic curve fitting algorithm based on a correlation KNN regression model. The algorithm integrates smoothing, scaling, and shifting operations, predicts user demand in the early phase of the online video life cycle, and improves the timeliness and accuracy of prediction.

(4) A mid-to-long-term user demand prediction algorithm that applies linear regression to historical data (SDLRP). Considering the differences in VoD demand trends and base levels, a platform time slot detection algorithm (SD) and a stable period prediction algorithm (LR) are presented to achieve durable mid-to-long-term prediction with improved accuracy.

(5) Finally, based on the above algorithms, a VoD user demand prediction system has been implemented. It uses both external and internal user demand data. The system consists of data fetching, data storage, analysis and prediction, and presentation modules, and can provide accurate predictions in a timely manner. The implementation shows that the quantitative prediction algorithms developed in this thesis have considerable theoretical and practical value.
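
The two-step idea in item (4), detecting where demand settles and then fitting a line to the stable period, can be sketched as follows. This is a minimal illustration under stated assumptions, not the SD or LR algorithms of the thesis: the stability rule (day-over-day relative change within `tol` for `window` consecutive days) and the function names are hypothetical.

```python
def detect_stable_start(series, window=3, tol=0.1):
    """Return the first index from which `window` consecutive day-over-day
    relative changes are all within `tol`, or None if there is none.

    Assumption: this threshold rule stands in for the thesis's detection step.
    """
    for i in range(len(series) - window):
        stable = True
        for j in range(i, i + window):
            prev, cur = series[j], series[j + 1]
            if prev == 0 or abs(cur - prev) / abs(prev) > tol:
                stable = False
                break
        if stable:
            return i
    return None


def fit_line(ys, start=0):
    """Ordinary least squares fit y = a + b*t for t = start, start+1, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    ts = range(start, start + n)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    b = sxy / sxx
    a = y_mean - b * t_mean
    return a, b


def forecast_stable(series, horizon, window=3, tol=0.1):
    """Fit a line to the detected stable period and extrapolate `horizon` steps."""
    start = detect_stable_start(series, window, tol)
    if start is None:
        raise ValueError("no stable period detected")
    a, b = fit_line(series[start:], start)
    n = len(series)
    return [a + b * t for t in range(n, n + horizon)]


if __name__ == "__main__":
    # Hypothetical daily demand: a sharp early decay, then a slow linear decline.
    demand = [1000, 600, 300, 200, 196, 192, 188, 184]
    print(detect_stable_start(demand))
    print(forecast_stable(demand, horizon=2))
```

Restricting the fit to the stable period keeps the steep early decay from distorting the slope, which is one plausible reason to separate detection from regression.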
Keywords/Search Tags:Quantitative Prediction, Emotional Tendency, Impact Factor Mining, Data Preprocessing, Multiple Linear Regression, K-Nearest Neighbor Regression