Large-scale Patent Classification Based On Parallel Machine Learning

Posted on:2012-07-13

Degree:Master

Type:Thesis

Country:China

Candidate:Q Kong

Full Text:PDF

GTID:2178330338984137

Subject:Computer software and theory

Abstract/Summary:

Many practical problems in today's society can be regarded as large-scale pattern recognition problems, such as web data mining and the analysis of passengers in transport systems. However, many conventional classifiers, even efficient algorithms such as SVM, struggle with problems of this scale. On the other hand, more and more computing resources are becoming available. Using abundant large-scale parallel computing resources to solve real-world problems is therefore a feasible approach.

Patent text classification is a large-scale, imbalanced classification problem of high practical significance, for example for analyzing the trend of a field of technology. To solve practical problems such as patent classification, we use an algorithm with a parallel structure that exploits abundant computing resources to obtain an effective classification model for the original problem. Bao-liang Lu and his collaborators have proposed a parallel network, called the Min-Max modular network (M3), which uses a "divide and conquer" approach to solve large-scale problems.

In M3, a single large-scale problem is decomposed into a large number of small-scale problems in order to achieve parallelism. These small modules are simple, easy to solve, and independent of each other, and each yields a sub-solution of the problem. The modules are then merged by rules to obtain the solution of the original problem.

Precision is the most important consideration in a classification problem. To address it, we used the asymmetric selection algorithm, the symmetric selection algorithm, and the decision tree selection algorithm. Based on them, we proposed the assistant classifier module selection strategy (ACMSS). Experiments show that ACMSS can effectively improve classification performance.

We use a variety of decomposition strategies and combination methods. Compared with the conventional support vector machine, the ACMSS algorithm combined with the prior-knowledge decomposition strategy provides much better performance. The assistant classifier module selection strategy has generalization ability and strong adaptability. It can compute the weights of sub-classifiers automatically, which has been confirmed by a large number of experiments.
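The decompose-and-merge scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the merging rules, so the sketch assumes the MIN/MAX combination commonly described for M3 networks: modules that share a positive subset are merged with MIN, and the results are merged with MAX. The nearest-centroid base classifier is only a stand-in for the SVM modules, the interleaved split is a stand-in for the thesis's decomposition strategies, and all function names are hypothetical.

```python
def split(items, k):
    """Split a list into k interleaved chunks (stand-in for a decomposition strategy)."""
    return [items[i::k] for i in range(k)]

def centroid(points):
    """Mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_module(pos, neg):
    """Toy base classifier (assumption): score > 0 means closer to the positive centroid."""
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: sq_dist(x, cn) - sq_dist(x, cp)

def train_m3(pos, neg, n_pos, n_neg):
    """Train one independent module per (positive subset, negative subset) pair.

    Each module sees only a small sub-problem, so the modules could be
    trained in parallel; here they are trained sequentially for simplicity.
    """
    pos_parts, neg_parts = split(pos, n_pos), split(neg, n_neg)
    return [[train_module(p, n) for n in neg_parts] for p in pos_parts]

def predict_m3(modules, x):
    """Merge module outputs: MIN within each positive subset, then MAX across them."""
    score = max(min(module(x) for module in row) for row in modules)
    return 1 if score > 0 else 0

# Small deterministic two-class example.
pos = [(2 + 0.1 * i, 2 - 0.1 * i) for i in range(6)]
neg = [(-2 - 0.1 * i, -2 + 0.1 * i) for i in range(6)]
modules = train_m3(pos, neg, n_pos=2, n_neg=3)
print(predict_m3(modules, (3, 3)), predict_m3(modules, (-3, -3)))
```

Because every module depends only on its own pair of subsets, the nested list comprehension in `train_m3` is the part that would be distributed across workers in a parallel setting.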

Keywords/Search Tags:

Min-Max modular network (M3), large-scale text classification, parallel machine learning, patent classification, assistant classifier module selection strategy (ACMSS)

Related items

1	Parallel Min-Max Modular Support Vector Machine With Application To Patent Classification
2	Machine Learning Based Patent Categorization
3	Research And Parallel Application Of Supervised Learning Algorithms For Large-scale Data Classification Problems
4	Research On Patent Value Classification Prediction Model Based On Machine Learning
5	Research On Ensemble Learning
6	Research On New Hierarchical Fuzzy Classification Learning Method
7	Research On Identification Method Of Large-scale Network Traffic Based On Machine Learning
8	Large-Scale Machine Learning for Classification and Search
9	On Network Optimization Technology For MXNet-based Large-scale Distributed Machine Learning
10	Research On Automatic Categorization Technology For Chinese Patent Documentation