Research on Short-Text Classification Based on a Domain Knowledge Base

Posted on: 2013-11-16
Degree: Master
Type: Thesis
Country: China
Candidate: J Chen
GTID: 2248330395971615
Subject: Computer software and theory
Abstract/Summary:
Short text refers to text that is short in length, with fewer than fifty characters, and that expresses its content in condensed, summary-like form; news headlines and complaints are typical examples. With the massive growth of information resources, fully manual management is increasingly unable to meet practical needs. Classification helps, to a certain extent, to organize the varied and disordered information on the network: it lets users find information conveniently and accurately, assigns existing information to reasonable categories, and supports the processing and effective reading of large volumes of text. Short-text classification differs from general text classification in many respects, so short text should be classified according to its own characteristics. The experimental data in this thesis are short texts of complaint information. Through in-depth analysis of the regularities specific to short text and of the related domain, the thesis proposes a short-text classification algorithm based on a domain knowledge base. Taking the identification of short-text categories as its application background, it uses word frequency and a short-text classification technique based on the support vector machine (SVM), and it evaluates the similarity between a short text and a category with statistical methods. The main research directions are the short-text classification algorithm based on the domain knowledge base and the improvement of classification accuracy.
Keywords/Search Tags: Text classification, Vector space model, SVM, Short text, Word frequency statistics
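
The abstract mentions word-frequency statistics and a statistical similarity between a short text and a category, but gives no formulas or data. The following is only a minimal sketch of that general idea in the vector space model, under stated assumptions: each category is represented by the summed word counts of its training texts, and a new text is assigned to the category with the highest cosine similarity. The function names, the sample complaint texts, and the labels "billing" and "facilities" are all hypothetical, and the nearest-profile rule is a simple stand-in used here for illustration; it is not the SVM classifier or the domain knowledge base described in the thesis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_category_profiles(labeled_texts):
    """Sum word counts per category to get one frequency profile each."""
    profiles = {}
    for text, label in labeled_texts:
        profiles.setdefault(label, Counter()).update(tokenize(text))
    return profiles


def classify(text, profiles):
    """Return (best_label, scores); best_label is None if nothing matches."""
    vector = Counter(tokenize(text))
    scores = {
        label: cosine_similarity(vector, profile)
        for label, profile in profiles.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        best = None  # no shared vocabulary with any category
    return best, scores


# Hypothetical complaint-style short texts, invented for illustration only.
TRAINING = [
    ("water bill charged twice this month", "billing"),
    ("refund for wrong bill amount", "billing"),
    ("street light broken near the park", "facilities"),
    ("broken pipe flooding the street", "facilities"),
]
PROFILES = build_category_profiles(TRAINING)

if __name__ == "__main__":
    label, scores = classify("bill amount is wrong", PROFILES)
    print(label, scores)
```

Because short texts share few words with any profile, many inputs will score zero against every category in a sketch like this; that sparsity is one reason an approach might supplement raw word counts with domain knowledge, as the thesis title suggests.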