Statistical models for classification in data mining are divided into generative models and discriminative models. In recent years, both have become a research focus in data mining and machine learning. A generative model learns the joint probability of the features and the class label, whereas a discriminative model learns the conditional probability of the label given the features. The two differ in many respects: for example, a generative model focuses primarily on how the data of each class are distributed, while a discriminative model focuses on the classification boundaries between classes. Generative and discriminative classifiers also exhibit two distinct regimes of performance, and earlier work has compared them both theoretically and empirically. Given the strengths and weaknesses of each approach, it is natural to try to exploit the best of both worlds, which gave rise to hybrid generative/discriminative models. This thesis first describes several hybrid model frameworks, for each of which earlier researchers have given concrete hybrid methods. It then examines the concepts, learning methods, and statistical properties of generative and discriminative models, compares generative and discriminative classifiers, and analyzes their applications. Finally, we propose two efficient generative/discriminative hybrid classifiers: (1) Based on the AdaBoost ensemble framework, we propose a learning algorithm for a combined generative/discriminative classifier. In each round, the algorithm learns one generative classifier and one discriminative classifier and selects the one with the smaller error rate as that round's individual classifier; all the individual classifiers are then combined in a weighted combination. Experimental results show that the method performs well in both accuracy and convergence speed.
(2) With genetic programming at its core, we propose a generative/discriminative hybrid classification algorithm. The algorithm uses genetic-programming symbolic regression to learn a symbolic expression relating the generative classification probability and the discriminative conditional probability. The method is general and has two main advantages. First, it avoids the problem of choosing weights for the generative part and the discriminative part. Second, the mathematical form of the hybrid model changes with the characteristics of the data set, which makes it more adaptable. Experimental results show that the method outperforms both a single generative model and a single discriminative model, improving classification accuracy to a certain extent.
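The AdaBoost-based scheme in (1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: Gaussian naive Bayes stands in for the generative learner and logistic regression for the discriminative learner, labels are assumed to be in {-1, +1}, and the function names (`fit_gaussian_nb`, `fit_logistic`, `hybrid_adaboost`) and hyperparameters are hypothetical.

```python
import math

def fit_gaussian_nb(X, y, w):
    """Generative stand-in (assumption): weighted Gaussian naive Bayes, models p(x, y)."""
    params = {}
    for c in (-1, 1):
        idx = [i for i, yi in enumerate(y) if yi == c]
        wc = sum(w[i] for i in idx)  # weighted class prior (weights sum to 1)
        stats = []
        for j in range(len(X[0])):
            mean = sum(w[i] * X[i][j] for i in idx) / wc
            var = sum(w[i] * (X[i][j] - mean) ** 2 for i in idx) / wc + 1e-6
            stats.append((mean, var))
        params[c] = (math.log(wc), stats)

    def log_joint(x, c):
        log_prior, stats = params[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
            for xj, (m, v) in zip(x, stats))

    return lambda x: 1 if log_joint(x, 1) >= log_joint(x, -1) else -1

def fit_logistic(X, y, w, epochs=200, lr=0.5):
    """Discriminative stand-in (assumption): weighted logistic regression, models p(y | x)."""
    d = len(X[0])
    theta = [0.0] * (d + 1)  # theta[0] is the bias term

    def linear(x):
        return theta[0] + sum(t * xj for t, xj in zip(theta[1:], x))

    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi, wi in zip(X, y, w):
            # Gradient of the weighted log-loss log(1 + exp(-y * z)); exponent clamped.
            g = -wi * yi / (1.0 + math.exp(min(yi * linear(xi), 50.0)))
            grad[0] += g
            for j in range(d):
                grad[j + 1] += g * xi[j]
        theta[:] = [t - lr * g for t, g in zip(theta, grad)]

    return lambda x: 1 if linear(x) >= 0 else -1

def hybrid_adaboost(X, y, rounds=10):
    """Each round: fit both learners, keep the one with the smaller weighted error."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []  # list of (alpha, classifier, kind)
    for _ in range(rounds):
        scored = []
        for kind, fit in (("generative", fit_gaussian_nb),
                          ("discriminative", fit_logistic)):
            h = fit(X, y, w)
            err = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)
            scored.append((err, kind, h))
        err, kind, h = min(scored, key=lambda s: s[0])
        if err >= 0.5:  # no better than chance: stop boosting
            break
        perfect = err <= 1e-10
        err = max(err, 1e-10)  # avoid division by zero in alpha
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h, kind))
        if perfect:
            break
        # Standard AdaBoost reweighting: emphasize misclassified samples.
        w = [wi * math.exp(-alpha * yi * h(xi)) for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]

    def predict(x):
        score = sum(alpha * h(x) for alpha, h, _ in ensemble)
        return 1 if score >= 0 else -1

    return predict, ensemble
```

Calling `hybrid_adaboost(X, y)` returns the combined predictor together with the list of selected (weight, classifier, kind) triples, so one can inspect which type of model was chosen in each round.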