Research On Retrieval Method Of Science And Technology Resources Based On Machine Reading Comprehension

Posted on: 2022-02-09  Degree: Master  Type: Thesis
Country: China  Candidate: G L Ye  Full Text: PDF
GTID: 2518306524478224  Subject: Mechanical engineering
Abstract/Summary:
Retrieval is an important part of sci-tech resource services and a key factor in whether sci-tech resources can be used effectively. However, the retrieval process suffers from many problems, such as low precision, heavy manual effort, and unsatisfactory results. Existing retrieval methods for scientific and technological resources mainly return a series of related results for the keywords a user enters, and the user then finds suitable results through manual screening and judgment. This makes the results redundant and inaccurate and the process time-consuming. The main weakness of the existing methods is that they neither match the user's actual requirements nor understand the semantics of scientific and technological text resources. Such resources differ from those of everyday life and production: they are highly specialized, knowledge-intensive, and complex. Keyword search cannot capture these characteristics and therefore struggles to meet the demands of sci-tech resource services. Understanding the semantics of sci-tech text resources and matching the actual demands of sci-tech resource services is thus an urgent need and a major task for sci-tech resource retrieval.

To this end, this thesis takes as its research background two projects of the National Key Research and Development Program: "Development and Application Demonstration of Enterprise Cloud ERP Platform in Support of Open and Ecological Development" (project number: 2019YFB1704104) and "Distributed Resource Giant System and Resource Cooperation Theory" (project number: 2017YFB1400301). The goal of the underlying topic, building the resource system and resource-sharing model of the science and technology service industry, is to support the search, analysis, matching, evaluation, and optimization of cross-industry distributed science and technology resources. Following the project task requirements, the work uses unstructured scientific and technological text resources from the Wanfang service platform and from the science and technology public service platform of the Ningbo Science and Technology Information Institute as supporting data. It addresses the problems of requirement matching and semantic understanding in sci-tech text retrieval, focuses on retrieval methods that support text-mining applications of sci-tech resources, and proposes a retrieval scheme for scientific and technological resources based on machine reading comprehension. The main research contents are as follows:

(1) To address the redundant and imprecise results returned by existing sci-tech resource retrieval, the characteristics of sci-tech text resources and the problems of existing retrieval methods are analyzed, and an overall implementation scheme for sci-tech text retrieval based on machine reading comprehension is proposed. The scheme consists of two parts, a text matching model and a machine reading comprehension model, which address the requirements of sci-tech text matching and text understanding respectively.

(2) Sci-tech text data sources are noisy and contain a large amount of specialized vocabulary, and Chinese text raises further problems such as stop words. The sci-tech text is therefore preprocessed: noisy text is removed, words are segmented, stop words are removed, and word vectors are trained. This provides data and format support for the subsequent sci-tech text retrieval work.

(3) To address the low matching accuracy of scientific and technological texts during retrieval, a TF-IDF text matching method based on N-grams is proposed. By introducing N-grams, the method not only obtains the term frequency (TF) and inverse document frequency (IDF) of the words in a scientific text but also takes word order into account, improving the matching accuracy for relevant texts. The effectiveness of the proposed algorithm is verified by experiments on Chinese and English datasets.

(4) To address the lack of comprehension ability in existing retrieval models, a machine reading comprehension algorithm is proposed. Given a query as input, the model passes it through its encoding module, matching module, and prediction module in turn and outputs the corresponding text result. The encoding module extracts features from the question and the text, the matching module strengthens the interaction between the query words and the text, and the prediction module produces the query result. Experiments on Chinese and English datasets demonstrate the effectiveness of the proposed machine reading comprehension algorithm.

(5) To address the retrieval model's lack of multi-document reasoning ability, a reasoning method based on a hierarchical attention pointer network is proposed and applied to the inference module of the machine reading comprehension model. The hierarchical attention mechanism computes attention at the word level and at the sentence level, and the pointer network performs sentence-level inference. The effectiveness of the proposed method is verified by a number of comparative experiments on Chinese and English datasets.
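The N-gram-based TF-IDF matching of item (3) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formulation, so everything here is an assumption made for illustration: n-grams of length 1 to `n_max` are treated as terms, the IDF is smoothed, texts are compared by cosine similarity, and the function names (`ngrams`, `build_idf`, `tfidf_vector`, `cosine`, `rank`) are hypothetical. Tokens are obtained by whitespace splitting, which only suits English; Chinese text would first need word segmentation, as described in item (2).

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all contiguous n-grams of length 1..n_max as tuples."""
    return [
        tuple(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_idf(docs, n_max=2):
    """Smoothed inverse document frequency for every n-gram in the corpus."""
    df = Counter()
    for tokens in docs:
        df.update(set(ngrams(tokens, n_max)))  # count each n-gram once per doc
    n_docs = len(docs)
    return {g: math.log((1 + n_docs) / (1 + c)) + 1.0 for g, c in df.items()}


def tfidf_vector(tokens, idf, n_max=2):
    """Sparse TF-IDF vector (a dict) over the n-grams of one text."""
    grams = ngrams(tokens, n_max)
    counts = Counter(grams)
    total = len(grams) or 1
    # n-grams that never occur in the corpus have no IDF and are ignored
    return {g: (c / total) * idf[g] for g, c in counts.items() if g in idf}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v[g] for g, w in u.items() if g in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs, n_max=2):
    """Return (document indices sorted by score, per-document scores)."""
    idf = build_idf(docs, n_max)
    q = tfidf_vector(query, idf, n_max)
    scores = [cosine(q, tfidf_vector(d, idf, n_max)) for d in docs]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Two toy documents with the same words in a different order (illustrative data).
docs = [
    "machine reading comprehension model".split(),
    "comprehension reading machine model".split(),
]
query = "machine reading comprehension".split()

order_uni, scores_uni = rank(query, docs, n_max=1)  # plain bag-of-words TF-IDF
order_bi, scores_bi = rank(query, docs, n_max=2)    # unigrams plus bigrams
```

With `n_max=1` the two toy documents receive the same score, because they contain exactly the same words. With `n_max=2` the first document ranks higher, since it shares the bigrams "machine reading" and "reading comprehension" with the query. This is the sense in which adding N-grams lets a TF-IDF matcher account for word order.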
Keywords/Search Tags: Scientific text resources, Text matching, Reading comprehension, Text reasoning, Hierarchical Attention Pointer Networks, Retrieval