Research On Key Technologies Of Generating Patent Claims

Posted on: 2021-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: E B Zhao
Full Text: PDF
GTID: 2428330611998151
Subject: Computer technology
Abstract/Summary:
The emergence of Artificial Intelligence (AI) has promoted rapid development in many fields. Among them, natural language processing has produced numerous research results in machine translation and automatic writing through the computational understanding and application of human language. The fast development of modern science has also expedited the establishment of the patent system, which encourages invention and protects inventors' patent rights. When applying for a patent, the applicant is required to submit a patent specification and claims. The former can be regarded as a guide that explains the invention; the latter clarify the content of the invention and limit the scope of its patent right. Since the patent claims are composed on the basis of the specification, they can be generated automatically by applying natural language processing technology. The purpose of this thesis is to explore automatic patent claim generation from three aspects: 1) the identification of patent claims in the patent specification; 2) the features of patent claims; 3) the automatic generation of dependent claims (i.e. the limiting part and the claim reference part). It contributes to the theoretical research on automatic patent claim generation in the following ways.

Firstly, it explores claim identification in the patent specification. The task is formalized as machine reading comprehension over the patent specification, because it involves in-depth understanding and reasoning. Pseudo-annotated data is generated by cleaning the data and designing an alignment algorithm that aligns the patent claims to the content of the specification. The effect of four pre-designed question forms on identification performance is then examined, and training is carried out on the BERT model. Lastly, a BiDAF network is added to the BERT model to enhance performance by capturing the connections between question and context. The model achieves an EM value of 35.54 and an F1 value of 38.08 on the test set.

Secondly, it explores the automatic generation of the claim feature and limiting parts by formalizing the problem as an automatic text summarization task. This exploration has three steps: the first implements a text summarization model based on seq2seq; the second introduces an attention mechanism; the third verifies the effect of three forms of attention (dot, general, and concat). In addition, a copy mechanism is introduced to combine extractive and generative summarization and improve summarization performance, while a coverage mechanism is introduced to alleviate the problem of repeated output. The resulting model achieves an F1 value of 82.47 on ROUGE-L.

Thirdly, it explores the generation of the claim reference part. The reference part is formalized as a text classification task, because its primary function is to state the cited claims. The claim numbers in the reference part are first extracted by regular expression matching to construct a data set. The impact of different sample structures on model performance is then explored by balancing the proportion of categories in the training set, and training is carried out on the BERT model. Lastly, LSTM, CNN, RCNN, and DPCNN structures are introduced on top of the BERT model to capture the hidden relationships between sentence pairs. The final model reaches an F1 value of 90.32 on the test set.
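
The regular-expression step described for the reference part can be illustrated with a minimal sketch. The abstract does not give the actual expression or the language of the claims, so the pattern below, the English phrasing it targets ("claim 1", "claims 1 or 2", "claims 2 to 4"), and the function name `extract_referenced_claims` are all assumptions made for illustration only.

```python
import re

# Hypothetical pattern (not taken from the thesis): matches "claim 1",
# "claims 1 or 2", "claims 1, 2 and 3", "claims 2 to 4", "claims 2-4".
CLAIM_REF = re.compile(
    r"\bclaims?\s+(\d+(?:\s*(?:,|or|and|to|-)\s*\d+)*)",
    re.IGNORECASE,
)
# Matches an inclusive range such as "2 to 4" or "2-4".
RANGE = re.compile(r"(\d+)\s*(?:to|-)\s*(\d+)")


def extract_referenced_claims(text):
    """Return the sorted claim numbers that a dependent claim refers to."""
    numbers = set()
    for match in CLAIM_REF.finditer(text):
        span = match.group(1)
        # Expand ranges such as "2 to 4" into 2, 3, 4.
        for low, high in RANGE.findall(span):
            numbers.update(range(int(low), int(high) + 1))
        # Collect every number that appears individually in the span.
        numbers.update(int(n) for n in re.findall(r"\d+", span))
    return sorted(numbers)


if __name__ == "__main__":
    claim = "The device according to claims 2 to 4, wherein the housing is steel."
    print(extract_referenced_claims(claim))
```

Numbers extracted this way could then serve as labels when assembling a classification data set of the kind the abstract describes; how the thesis actually encodes its labels is not stated here.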
Keywords/Search Tags: patent, machine reading comprehension, automatic text summarization, text classification