
Research On Incremental Learning Method Based On Bayes Theory

Posted on: 2017-07-26    Degree: Master    Type: Thesis
Country: China    Candidate: N Li    Full Text: PDF
GTID: 2428330482480962    Subject: Communication and Information System
Abstract/Summary:
Training a traditional text classifier requires a large labeled corpus, and such corpora are typically produced by inefficient manual classification based on knowledge engineering. Incremental learning techniques enrich the classifier's knowledge by learning from unlabeled corpora, which addresses the problems of unlabeled training data, large corpus volume, and memory limitations. Because of its efficiency, high classification accuracy, and full use of prior knowledge and sample information, Naive Bayes is a natural choice for incremental learning. Existing methods, based on the 0-1 classification loss, select complete data (documents classified with high confidence) and add them to the original corpus in sequence, but two problems remain: (1) when the new corpus is large, the complete data must be found by iteration before being added, which greatly reduces the efficiency of incremental learning; (2) in its initial stage the classifier lacks knowledge, so incrementally learning misclassified texts can sharply degrade classifier performance.

To address these problems, this paper proposes an incremental learning method with an adjustable confidence level and sequence selection. Building on previous work, the following optimizations and improvements are made: (1) A Counting Bloom Filter is introduced for incremental word-frequency statistics and dimensionality reduction, improving the early performance of the classifier; it also eases the memory limit to a certain extent, increases the number of texts processed in each incremental learning round, and raises the probability of obtaining complete data. (2) A dynamic confidence threshold window is established: in the initial stage, when the classifier's knowledge is incomplete, a high confidence level lets complete data join the original classifier with high probability; in the later stage, the confidence threshold is relaxed to speed up incremental learning, balancing the incremental learning of complete sample data against learning efficiency. (3) Sequence selection chooses the documents with the highest category bias, i.e., the complete data, first.
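The abstract does not give implementation details for improvement (1); the following is a minimal sketch of a Counting Bloom Filter used as an approximate word-frequency counter. The parameters (m counters, k hash functions) and the MD5-slicing scheme are illustrative assumptions, not the thesis's actual design.

```python
import hashlib

class CountingBloomFilter:
    """Approximate word-frequency counter in fixed memory: m shared
    counters addressed through k hash functions. Collisions can make a
    count an overestimate, but never an underestimate."""

    def __init__(self, m=1024, k=3):
        # m and k are assumed tuning parameters, not values from the thesis.
        self.m, self.k = m, k
        self.counters = [0] * m

    def _indexes(self, word):
        # Derive k counter indexes from slices of a single MD5 digest.
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        return [int(digest[8 * i:8 * (i + 1)], 16) % self.m
                for i in range(self.k)]

    def add(self, word):
        # Record one occurrence of the word.
        for i in self._indexes(word):
            self.counters[i] += 1

    def count(self, word):
        # The true frequency is at most the minimum of the k counters.
        return min(self.counters[i] for i in self._indexes(word))
```

Memory stays O(m) no matter how large the vocabulary grows, which is consistent with the abstract's claim that the filter eases the memory limit and lets more texts be handled per incremental learning round.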
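The dynamic confidence threshold window of improvement (2) and the sequence selection of improvement (3) can be sketched as follows. The multinomial Naive Bayes model, the threshold schedule (t0, decay, t_min), the toy corpora, and all function names are assumptions made for illustration, not the paper's exact formulation.

```python
import math
from collections import Counter, defaultdict

class IncrementalNB:
    """Multinomial Naive Bayes with Laplace smoothing that can absorb
    self-labeled documents one at a time."""

    def __init__(self):
        self.class_docs = Counter()               # documents seen per class
        self.word_counts = defaultdict(Counter)   # class -> word -> frequency
        self.class_words = Counter()              # word tokens per class
        self.vocab = set()

    def learn(self, words, label):
        """Incrementally update the counts with one labeled document."""
        self.class_docs[label] += 1
        for w in words:
            self.word_counts[label][w] += 1
            self.class_words[label] += 1
            self.vocab.add(w)

    def posterior(self, words):
        """Return normalized class posteriors P(class | words)."""
        total = sum(self.class_docs.values())
        v = len(self.vocab)
        logp = {}
        for c in self.class_docs:
            lp = math.log(self.class_docs[c] / total)
            for w in words:
                lp += math.log((self.word_counts[c][w] + 1) /
                               (self.class_words[c] + v))
            logp[c] = lp
        m = max(logp.values())
        exp = {c: math.exp(lp - m) for c, lp in logp.items()}
        z = sum(exp.values())
        return {c: p / z for c, p in exp.items()}

def incremental_learn(clf, unlabeled, t0=0.95, decay=0.05, t_min=0.7):
    """Dynamic confidence window + sequence selection: repeatedly absorb
    the most confidently classified document (the "complete data"), and
    relax the threshold as the classifier's knowledge grows.
    Returns the number of documents absorbed."""
    pool = [list(doc) for doc in unlabeled]
    threshold, absorbed = t0, 0
    while pool:
        # Sequence selection: score every document, take the most biased.
        scored = [(max(clf.posterior(d).items(), key=lambda kv: kv[1]), d)
                  for d in pool]
        (label, conf), doc = max(scored, key=lambda s: s[0][1])
        if conf < threshold:
            if threshold > t_min:
                threshold = max(t_min, threshold - decay)  # relax the window
                continue
            break  # no complete data left even at the loosest threshold
        clf.learn(doc, label)
        pool.remove(doc)
        absorbed += 1
    return absorbed

# Illustrative toy corpora (assumed, not from the thesis).
clf = IncrementalNB()
clf.learn(["good", "great"], "pos")
clf.learn(["bad", "awful"], "neg")
absorbed = incremental_learn(clf, [["good", "good"], ["awful", "bad"]])
```

The threshold starts strict so that only strongly biased documents are absorbed while the classifier's knowledge is still sparse, then loosens toward t_min to speed up later rounds, mirroring the trade-off the abstract describes between complete-data quality and incremental learning efficiency.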
Keywords/Search Tags: Bayesian classification, incremental learning, confidence level, sequence selection