
Research On Chinese Patent Infringement Retrieval Model

Posted on: 2013-02-21  Degree: Master  Type: Thesis
Country: China  Candidate: W S Ma  Full Text: PDF
GTID: 2248330362968705  Subject: Management Science and Engineering
Abstract/Summary:
As society develops, people are paying increasing attention to intellectual property. The number of patent applications has surged, and cases of patent infringement and patent invalidation have grown rapidly along with it. These problems stem largely from the current level of information retrieval: recall and precision are both low. Retrieval fails to surface all the documents relevant to a topic from the flood of patents and other information, and it returns a large amount of irrelevant information, which greatly disrupts users. Based on the current state of information retrieval and patent infringement, this thesis uses text mining to build a Chinese patent infringement retrieval model.

Patent infringement retrieval is divided into two types: infringement-avoidance retrieval and active infringement retrieval. Infringement-avoidance retrieval finds the patents that may be infringed by the user's own patent, by the necessary technical features of a product (whether or not a patent has been applied for), or by the technical features of a research direction. Active infringement retrieval starts from the user's own patent (already granted) and searches for whether an identical patent has been granted again.

The main content of this thesis covers data acquisition and text preprocessing, construction of the patent infringement retrieval model, system implementation, assessment of the experimental results, and a summary and outlook for the research. The experimental data are patents published by the China State Intellectual Property Office, including invention patents and utility model patents. A series of processing operations on the independent claims of these patents reveals the suspected infringing patents.

The data acquisition and text preprocessing work includes the following steps. First, convert the patent claims from image format to plain text using tools such as OCR. Second, summarize the character recognition errors and format errors introduced by the conversion and correct them. Third, improve the ICTCLAS word segmentation system of the Chinese Academy of Sciences, propose a segmentation algorithm for Chinese patent claims, and segment the experimental data. Finally, extract the bibliographic entries, patent text, and segmentation results as needed and save them as XML text. All of the processed data make up the XML database.

In building the patent infringement retrieval model, the thesis analyzes the principles of patent infringement and the characteristics of patent claims, and proposes a coverage calculation over the set of a patent's necessary technical features in place of the traditional cosine similarity of text vectors. Experimental results show that the method is feasible. The thesis also describes the construction of the ontology and of the inverted index. The part on system implementation and experimental results describes the development environment, the main techniques, part of the core code, and the experimental performance of the algorithm.

The innovations of this thesis are: first, converting PDF files to text files using OCR and correcting the resulting errors; second, word segmentation and feature extraction tailored to the characteristics of Chinese patent claims; third, a necessary-technical-feature coverage algorithm for judging patent infringement.
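The abstract does not give the coverage formula, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes coverage is the fraction of a claim's necessary technical features that also appear in a candidate patent, and that candidates are gathered through a simple in-memory inverted index. The function names, the threshold, and the sample features are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

def coverage(claim_features, candidate_features):
    """Fraction of the claim's features that appear in the candidate (assumed definition)."""
    claim = set(claim_features)
    if not claim:
        return 0.0
    return len(claim & set(candidate_features)) / len(claim)

def cosine(tokens_a, tokens_b):
    """Traditional cosine similarity over term-frequency vectors, for comparison."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_inverted_index(docs):
    """Map each feature to the set of patent ids that contain it."""
    index = {}
    for doc_id, features in docs.items():
        for feature in set(features):
            index.setdefault(feature, set()).add(doc_id)
    return index

def retrieve(claim_features, docs, index, threshold=1.0):
    """Return (patent id, coverage) pairs at or above the threshold, best first."""
    candidates = set()
    for feature in set(claim_features):
        candidates |= index.get(feature, set())
    scored = [(doc_id, coverage(claim_features, docs[doc_id])) for doc_id in candidates]
    hits = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical sample data: feature lists as they might look after segmentation.
claim = ["housing", "motor", "blade"]
docs = {
    "P1": ["housing", "motor", "blade", "timer", "handle"],
    "P2": ["housing", "motor"],
    "P3": ["lens", "sensor"],
}
index = build_inverted_index(docs)
suspects = retrieve(claim, docs, index)
```

In this sketch, patent P1 contains every feature of the claim, so its coverage is 1.0 even though its extra features pull its cosine similarity with the claim below 1. That asymmetry is one plausible reason to prefer a coverage measure over a symmetric similarity measure when judging infringement.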
Keywords/Search Tags: Chinese patent claims, word segmentation, text mining, infringement retrieval