Recommended Model Q Recommendation Based On Machine Learning System Problems

Posted on: 2014-08-10    Degree: Master    Type: Thesis
Country: China    Candidate: Z F Ma    Full Text: PDF
GTID: 2268330392962844    Subject: Software engineering
Abstract/Summary:
The recommendation system described in this work is built on one of the world's largest interactive Chinese Q&A platforms. Registered users of the platform post questions arising from their own needs and encourage other users to answer them through reward mechanisms. At the same time, each question and its answers are served as search engine results to users who face similar questions, so that knowledge is shared. Currently, there are a large number of unsolved questions on the platform. In this context, a personalized recommendation system recommends related questions to users based on their behavior, such as logging in, browsing, and answering, in order to reduce the cost of solving a question and to increase the number of solved questions, thereby improving both the quantity and the quality of knowledge sharing.

The original recommendation model of this system uses machine learning techniques to build a content-based recommendation algorithm. It also borrows from precisely targeted advertising systems the idea of optimizing the click-through rate (CTR), here the CTR of the recommended questions. By combining Chinese word segmentation [76,77,78,79], keyword extraction, named entity recognition (NER) [81,82,83,84], and other techniques, the original recommendation model builds a CTR prediction model to match users with questions. The CTR prediction model computes the conditional probability P(click = true | user = uid, question = qid), which measures the probability that a new question will be clicked by the user, and uses a maximum entropy (MaxEnt) model to fit this conditional probability.
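As a rough illustration of the CTR prediction step, the sketch below fits P(click = true | user, question) on toy data. For a binary outcome, a maximum entropy model has the same form as logistic regression, which is what the code trains with stochastic gradient descent. The feature function (keyword overlap between a user profile and a question), the toy examples, and all names such as `features`, `train`, and `predict` are assumptions made for illustration; the abstract does not specify the thesis's actual features or training procedure.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def features(user_keywords, question_keywords):
    # Hypothetical content-based features: a bias term, one indicator per
    # keyword shared by the user profile and the question, and the overlap size.
    shared = user_keywords & question_keywords
    feats = {"bias": 1.0, "n_overlap": float(len(shared))}
    for kw in shared:
        feats["match:" + kw] = 1.0
    return feats

def predict(weights, feats):
    # Estimated P(click = true | user, question) under the current weights.
    return sigmoid(sum(weights.get(k, 0.0) * v for k, v in feats.items()))

def train(examples, epochs=200, lr=0.5, l2=1e-3):
    # SGD on the L2-regularized log loss; examples are (feature dict, 0 or 1).
    weights = {}
    for _ in range(epochs):
        for feats, clicked in examples:
            err = predict(weights, feats) - clicked
            for k, v in feats.items():
                w = weights.get(k, 0.0)
                weights[k] = w - lr * (err * v + l2 * w)
    return weights

# Toy click log (assumed data): user keywords, question keywords, clicked?
EXAMPLES = [
    (features({"python"}, {"python", "list"}), 1),
    (features({"python"}, {"cooking"}), 0),
    (features({"linux"}, {"linux", "kernel"}), 1),
    (features({"linux"}, {"travel"}), 0),
]

weights = train(EXAMPLES)
p_hit = predict(weights, features({"python"}, {"python", "list"}))
p_miss = predict(weights, features({"python"}, {"cooking"}))
print(p_hit, p_miss)
```

Candidate questions for a user could then be ranked by the predicted probability, with the highest-scoring unsolved questions recommended first.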
The original recommendation model has two shortcomings. First, it uses only a small number of features, which makes it likely to underfit the data. Second, as a static model, it cannot adapt to changes in the distribution of the data.

This work improves the performance of the original recommendation model in the following two respects:

1. By introducing semantic features, combinational features, and bias terms, combined with model selection and regularization techniques, we improve the accuracy of the recommendation model. The improved model uses probabilistic latent semantic analysis (pLSA) to extract semantic features from the text. Working with the text at the semantic level yields better results than working at the lexical level. Together, these improvements raise the recommendation accuracy by 7 percentage points, from 88% to 95%, on the benchmark data sets.

2. We designed and implemented an offline training system for the recommendation model. The system automatically downloads the basic data, extracts features, trains models, and performs model selection, so that a new recommendation model is produced on a regular basis. Experimental results show that the distribution of the data used to train the recommendation model changes over time, and that a static model cannot adapt to these changes.

The improved recommendation model and the offline training system have been deployed online to provide a more accurate personalized recommendation service to the users of the interactive Chinese Q&A platform.
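To make the pLSA-based semantic features described above more concrete, here is a minimal sketch of the standard pLSA EM procedure in plain Python. It models P(w | d) as a mixture over latent topics z, and the resulting per-document vector P(z | d) is the kind of semantic feature that could be fed to a CTR model. The toy documents, the number of topics, and the function name `plsa` are assumptions for illustration; this is not the thesis's implementation.

```python
import random

def _normalize(values):
    # Scale non-negative values to sum to 1; fall back to uniform if all zero.
    if not values:
        return []
    total = sum(values)
    if total <= 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]

def plsa(docs, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM.

    docs: list of {word: count} dicts.
    Returns (p_z_d, p_w_z): p_z_d[d][z] = P(z | d), p_w_z[z][w] = P(w | z).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    # Random positive initialization of both distributions.
    p_z_d = [_normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in docs]
    p_w_z = [dict(zip(vocab, _normalize([rng.random() + 0.1 for _ in vocab])))
             for _ in range(n_topics)]
    for _ in range(n_iter):
        zd_counts = [[0.0] * n_topics for _ in docs]
        wz_counts = [dict.fromkeys(vocab, 0.0) for _ in range(n_topics)]
        for d, doc in enumerate(docs):
            for w, n in doc.items():
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                post = _normalize([p_z_d[d][z] * p_w_z[z][w]
                                   for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    zd_counts[d][z] += n * post[z]
                    wz_counts[z][w] += n * post[z]
        # M-step: renormalize the expected counts.
        p_z_d = [_normalize(row) for row in zd_counts]
        p_w_z = [dict(zip(vocab, _normalize([c[w] for w in vocab])))
                 for c in wz_counts]
    return p_z_d, p_w_z

# Toy bag-of-words questions (assumed data).
DOCS = [
    {"python": 3, "list": 2},
    {"python": 2, "loop": 1},
    {"visa": 2, "flight": 3},
    {"flight": 1, "hotel": 2},
]

p_z_d, p_w_z = plsa(DOCS, n_topics=2)
for d, topic_mix in enumerate(p_z_d):
    print(d, [round(p, 3) for p in topic_mix])
```

Because the topic mixture is computed over latent topics rather than exact words, two questions that share no keywords can still receive similar feature vectors, which is the advantage over purely lexical matching that the abstract points to.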
Keywords/Search Tags: Recommendation System, CTR Prediction, pLSA, Probabilistic Latent Semantic Analysis