| Content-based SMS filtering is a text classification task. With the rapid growth of SMS services, the purpose of SMS filtering has changed substantially. This dissertation addresses several key problems of SMS filtering systems based on topic models, such as personalization and fast convergence of the classifier, and presents the following contributions. (1) Supervised Dual-PLSA for personalized SMS filtering. We propose a novel supervised Dual-PLSA that estimates topics from several kinds of observable data, namely labeled documents, unlabeled documents, and supervised information about topics. Experiments show that Dual-PLSA converges very quickly: within the first 100 gold-standard feedback items, its cumulative error rate drops to 9%. Its total error rate is 6.94%, the lowest among all the filters compared. (2) A Feature-Enhanced smoothing method for the LDA model applied to text classification. We propose a Feature-Enhanced smoothing method based on the idea that words that do not appear in the training corpus can help improve classification performance. The key point is to take full account of the relatedness between a new document and the training corpus, and to enhance the document's class features by taking into account the words that do not appear in the training corpus. Evaluations on 20 Newsgroups show that Feature-Enhanced smoothing significantly improves performance in binary text classification. (3) User feedback based on monitoring the mobile phone's ROM. User feedback is a prerequisite for personalizing an SMS classifier. We propose a novel method for obtaining user feedback by monitoring the ROM of the user's mobile phone. The key idea is that a user's operations on different SMS messages can be recognized by monitoring the capacity of the phone's ROM. |
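
The abstract does not define the cumulative error rate it reports for Dual-PLSA. The sketch below assumes the usual online-filtering definition: the number of misclassified messages so far divided by the number of messages seen so far. The function name `cumulative_error_rates` and the toy labels are illustrative assumptions, not taken from the dissertation.

```python
def cumulative_error_rates(predictions, gold_labels):
    """Return the running error rate after each message.

    Assumed definition: errors so far / messages seen so far.
    """
    rates = []
    errors = 0
    for seen, (pred, gold) in enumerate(zip(predictions, gold_labels), start=1):
        if pred != gold:
            errors += 1
        rates.append(errors / seen)
    return rates


# Hypothetical toy stream of filter decisions and gold-standard feedback.
preds = ["spam", "ham", "ham", "spam"]
gold = ["spam", "spam", "ham", "spam"]
rates = cumulative_error_rates(preds, gold)
```

Under this definition, a filter that converges quickly shows a curve that falls steeply over the first feedback items, which is how the 9% figure within 100 feedback items would be read.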