
Illegal Experimental Application Classifier Based On Keywords

Posted on: 2016-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: S H Tu
Full Text: PDF
GTID: 2308330461466609
Subject: Agricultural informatization
Abstract/Summary:
The “National Horticultural Experimental Teaching Demonstration Center” runs 37 teaching experiments and 168 experimental projects, supports 32 scientific research projects, and provides experimental teaching to 120,000 people each year. The center therefore needs a “Horticultural Experiment Teaching Demonstration Center Reservation System” to manage its experimental teaching tasks. Because reservation applications for open experiments are submitted by students, many of them are illegal, that is, they do not meet the requirements, and they must be filtered out.
This paper focuses on designing the “Horticultural Experiment Teaching Demonstration Center Reservation System” and on identifying and classifying illegal experiment applications automatically, so that applications which do not meet the requirements are filtered out and the workload of manual review is reduced. Training samples of illegal applications are scarce, and most of them are unlabeled. The paper therefore uses keywords to expand the training set, with the goal of building an identification and classification system for illegal applications from keywords and unlabeled samples alone. The main work of this paper is as follows:
(1) Obtaining labeled training samples. Only a small number of labeled training samples exist in this setting. This study uses the TF-IDF weighting model to compute the similarity between the keywords and the documents in a Chinese corpus, and labels the matching documents as positive. It then extracts more positive samples from the unlabeled set iteratively, removes stop words, and selects features.
(2) Building the text classifier. To classify samples in large quantities, this study constructs classifiers based on One-Class classification and on several PU (Positive-Unlabeled) learning algorithms. Experiments show that the F1 scores of all the PU algorithms are above 85%, which is better than One-Class SVM, so the best-performing one, Spy-SVM, is chosen to construct the classifier.
(3) Implementing the “Horticultural Experiment Teaching Demonstration Center Reservation System”. The resulting classifier is applied to a real-life scenario. The system is developed on a B/S (browser/server) architecture, provides different functions for different users, and filters illegal experiment applications.
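The keyword-seeding step and the spy idea behind Spy-SVM can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the similarity threshold, the spy ratio, and the toy English tokens (standing in for segmented Chinese text) are all assumptions made for illustration. A centroid-similarity scorer also replaces the probabilistic classifier that the spy technique normally uses, and the final SVM trained on positives versus reliable negatives is omitted.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return ({term: tf-idf weight} per tokenized document, idf table)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf so that every term keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def seed_positives(docs, keywords, threshold=0.1):
    """Label as positive every document similar enough to the keyword query.

    The threshold value is an assumption, not a figure from the thesis.
    """
    vectors, idf = tfidf_vectors(docs)
    query = {k: idf[k] for k in keywords if k in idf}
    return [i for i, v in enumerate(vectors) if cosine(query, v) >= threshold]


def centroid(vectors):
    """Average a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # adds the weights term by term
    return {t: w / len(vectors) for t, w in total.items()}


def reliable_negatives(vectors, pos_idx, spy_ratio=0.2, seed=0):
    """Spy step: hide some positives and use their scores as a cut-off.

    Unlabeled documents scoring below every spy are kept as reliable negatives.
    """
    pos = list(pos_idx)
    if len(pos) < 2:
        raise ValueError("need at least two positives to hold out a spy")
    rng = random.Random(seed)
    spies = set(rng.sample(pos, max(1, int(len(pos) * spy_ratio))))
    kept = [i for i in pos if i not in spies]
    proto = centroid([vectors[i] for i in kept])
    spy_floor = min(cosine(proto, vectors[i]) for i in spies)
    pos_set = set(pos)
    unlabeled = [i for i in range(len(vectors)) if i not in pos_set]
    return [i for i in unlabeled if cosine(proto, vectors[i]) < spy_floor]


# Hypothetical toy corpus: documents 0-2 are off-topic (illegal) applications.
docs = [
    ["game", "party", "lab"],
    ["party", "music", "lab"],
    ["game", "music", "party"],
    ["grafting", "tomato", "seedling"],
    ["pruning", "apple", "tree"],
    ["soil", "tomato", "ph"],
]
keywords = ["game", "party"]

seeds = seed_positives(docs, keywords)
vectors, _ = tfidf_vectors(docs)
negatives = reliable_negatives(vectors, seeds)
print(seeds, negatives)
```

In a fuller pipeline the seeded positives and the reliable negatives would be passed to an SVM trainer, and the seeding could be repeated over the remaining unlabeled documents, as the iterative extraction in step (1) suggests.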
Keywords/Search Tags:text classification, single classification, PU learning, keywords, experiment reservation system