
Computer Aided Patent Processing: Natural Language Processing, Machine Learning, and Information Retrieval

Posted on: 2018-04-30  Degree: Ph.D.  Type: Dissertation
University: Drexel University  Candidate: Hu, Mengke  Full Text: PDF
GTID: 1448390002998018  Subject: Electrical engineering
Abstract/Summary:
The intellectual property economy, and, more narrowly, the patent economy, forms a wide-reaching and important part of the economic activity of the United States and of the broader world. The patent ecosystem involves a diverse collection of players and interests. Inventors conceive of an idea; patent agents and attorneys then help them author, defend, and refine a patent application, interacting with the patent examiners at the patent offices who review it for novelty, prior art, and usefulness (or industrial applicability). Once the patent is issued, and sometimes while it is still in the application phase, middleman companies may buy and sell it. A company owning a patent may then request that other companies license its use. Standards organizations, such as ETSI, 3GPP, or IEEE, are also important participants in the lifecycle of many patents: certain patents are often required in order to implement a standard, so license agreement structures, such as the FRAND (fair, reasonable, and non-discriminatory) agreement, are set up to coordinate their licensing. Finally, if a company decides that another company is likely infringing on its patent, it may bring a case, which the federal courts must then hear.
Today, each of these tasks is performed by humans, with minimal, nearly non-existent use of custom software or algorithmic tools designed specifically to help with it. The work carried out in this dissertation is part of a broader intellectual agenda that aims to address this deficiency. To maximize novelty, this dissertation selects two problems that are especially overlooked. The first is extracting deep meaning from patent claims. The second is using that deep meaning in recommender systems that map between patents and standards.
Furthermore, to maximize impact and to kickstart further research on these problems, the work in this dissertation focuses on the creation and curation of datasets germane to the two problems, as well as the creation and evaluation of prototype baseline systems to solve them. In particular, a dataset of grammar annotations of patent claims is first curated via Amazon Mechanical Turk. A baseline natural language processing system for extracting deep meaning from patent claims is then trained and evaluated on this dataset, showing that substantial improvements over existing techniques are possible by leveraging the new dataset. Next, two ground truth datasets associating patent claims with sections of standards are extracted and curated from the information provided in intellectual property rights (IPR) disclosures to the European Telecommunications Standards Institute (ETSI). Following that, a new machine learning based retrieval system is designed for mapping between patent claims and standards. Finally, this retrieval system is evaluated on both datasets and their subsets, where it shows substantial improvements compared with SVM and retrieval baseline systems.
Keywords/Search Tags:Patent, Machine learning, Processing, Deep meaning