The Study Of Automatic Function Information Extraction And Classification Approach For Chinese Patent

Posted on: 2017-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Zhao
GTID: 2428330596957445
Subject: Computer Science and Technology
Abstract/Summary:
Patents are an important source of technical knowledge. Effectively extracting the important knowledge in patents and representing it in a reasonable form can play an important supporting role in product innovation. Functional innovation is the basis of product innovation, so extracting functional information from patents has become an active research topic. Common information extraction methods include statistics-based, ontology-based, and rule-based extraction. These methods are effective to some degree for extracting information from text, but they place high demands on the training corpus and do not port well.

After analyzing the organizational structure and knowledge distribution of a large number of patent documents, this paper takes the abstract text of patent documents as the source of functional knowledge. First, a template set for functional knowledge extraction is obtained; then functional information is extracted using that template set; finally, the patent documents are classified according to their functions, so as to achieve a uniform expression of patent function.

This paper proposes a Bootstrapping-based method, built on natural language processing technology, for extracting functional information from patents. Starting from the initial seed template, the method continuously acquires new functional knowledge and new templates through incremental iteration. It uses natural language processing methods combined with part-of-speech (POS) features to obtain candidate templates, filters the candidates by calculating the correlation between function words and templates, and extracts functional information by pattern matching. On this basis, it uses the vector space model to calculate document similarity for functional classification.

A comparison between an information extraction method based on a statistical model and the template-set-based method proposed in this paper shows that the proposed method is reasonably effective at automatically acquiring the template set and further improves the accuracy of information extraction. Document classification by similarity is commonly computed on the documents themselves; this paper instead computes similarity on the collected function information to classify patent documents by function, which achieves a higher accuracy rate.
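The similarity-based functional classification described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the raw term-frequency weighting, the whitespace tokenization (Chinese text would first need a word segmenter), the nearest-example decision rule, and all names (`to_vector`, `cosine_similarity`, `classify`, `CLASS_EXAMPLES`) are assumptions made for illustration. The one point taken from the abstract is that vectors are built from the extracted function phrases rather than from the full document text.

```python
import math
from collections import Counter


def to_vector(function_phrases):
    """Build a bag-of-words term-frequency vector from extracted function phrases.

    Assumption: phrases are already tokenizable by whitespace.
    """
    counts = Counter()
    for phrase in function_phrases:
        counts.update(phrase.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def classify(function_phrases, class_examples):
    """Return the label whose example function phrases are most similar."""
    query = to_vector(function_phrases)
    return max(
        class_examples,
        key=lambda label: cosine_similarity(query, to_vector(class_examples[label])),
    )


# Hypothetical function classes with example function phrases.
CLASS_EXAMPLES = {
    "cooling": ["reduce temperature", "dissipate heat"],
    "sealing": ["prevent leakage", "seal gap"],
}

label = classify(["dissipate heat from chip"], CLASS_EXAMPLES)
```

Here `label` is `"cooling"`, because the query shares two terms with the cooling examples and none with the sealing examples. A TF-IDF weighting could replace raw counts without changing the structure of the sketch.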
Keywords/Search Tags: Patent, Template, Function Information, Natural Language Processing, Vector Space Model