
Design And Implementation Of Semantic Model Extraction For Academic Literature Retrieval Results

Posted on: 2022-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Cong
Full Text: PDF
GTID: 2518306740983139
Subject: Software engineering
Abstract/Summary:
Keyword-based document retrieval is easy to use, but the original result set carries no semantics and no relationships between documents, so scholars still have to perform an inefficient secondary screening of results that are ranked linearly by PR (Page Ranking). This thesis constructs fine-grained semantics for the academic retrieval result set and uses those semantics to define the relationships between documents, with the aim of improving the efficiency of that secondary screening. Building the fine-grained semantic model of the literature is cast as two natural language processing tasks: entity recognition and relation extraction. Specifically, starting from two key information elements contained in a paper, the problem to be solved and the method used, the thesis establishes a problem-method semantic model and, based on it, a semantic model of the relationships between documents. The document semantic model is then mapped to a display model for the document collection, which presents the document knowledge of the retrieved collection, the semantic relationships between documents, and the PR ranking structure in multiple dimensions, and so improves the efficiency with which scholars screen retrieval results. The specific work includes:

1) Document semantic model and display model design: The fine-grained semantic information of a document is described in two dimensions, problem and method, and the semantic knowledge within documents and the relationships between documents are expressed through problem-method, problem-problem, and method-method semantic relations. The document semantic model can therefore provide a multi-granularity, multi-dimensional semantic representation of the document collection. The three-dimensional document semantic model is mapped to the display model, which provides a multi-dimensional, multi-granularity interactive graphical interface for displaying documents.

2) Document semantic model construction: Constructing the problem-method model of a document is transformed into recognizing problem and method entities and extracting the relations between them. For document entity recognition, an IDCNN-BiLSTM-CRF model combined with multi-head self-attention is proposed. A character-level long-sequence feature extraction method is introduced to address the lack of character-level information extraction in mainstream methods. IDCNN and BiLSTM work in series, combined with the self-attention mechanism, to extract the local context features and dependency features of the sequence, which addresses the insufficient extraction of long-distance and local features in mainstream methods. For document entity relation extraction, a dual-channel GCN-CNN-BiLSTM-ATT model that fuses syntactic structure information is proposed. GCN is introduced to extract the syntactic structure features of the sequence, which traditional methods ignore. CNN and BiLSTM are used in parallel to extract local features and long-distance dependency features of the general input. On the ScienceIE dataset, the F1 values of the two models are 0.8% and 0.4% higher, respectively, than those of mainstream methods.

3) System prototype design and implementation: Based on the above work, the thesis builds an academic literature proxy retrieval system that queries other academic search engines on the user's behalf and displays the retrieval results in multiple dimensions. The retrieval proxy module, knowledge extraction module, and visualization module of the system were each tested, and an integration test was then run on the entire system. The test results show that the system can meet the retrieval and visualization needs of scholars.
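
The problem-method document model described in the abstract can be illustrated with a small data structure. The following is a minimal sketch, not the thesis's implementation: the names `Document` and `relate`, the sample entities, and the use of exact string matching to decide that two documents share a problem or a method are all assumptions made for illustration. In the thesis the entities would come from the entity recognition and relation extraction models.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Document:
    """One retrieved paper with its extracted entities (hypothetical structure)."""
    doc_id: str
    rank: int  # position in the original linear ranking
    problems: frozenset = field(default_factory=frozenset)
    methods: frozenset = field(default_factory=frozenset)


def relate(docs):
    """Link each pair of documents that shares a problem or a method.

    Assumption: two entities are "the same" only if their strings are equal.
    """
    links = []
    for a, b in combinations(sorted(docs, key=lambda d: d.rank), 2):
        shared_problems = a.problems & b.problems
        shared_methods = a.methods & b.methods
        if shared_problems:
            links.append((a.doc_id, b.doc_id, "problem-problem", sorted(shared_problems)))
        if shared_methods:
            links.append((a.doc_id, b.doc_id, "method-method", sorted(shared_methods)))
    return links


# Illustrative data only; these are not results from the thesis.
docs = [
    Document("d1", 1, frozenset({"entity recognition"}), frozenset({"BiLSTM-CRF"})),
    Document("d2", 2, frozenset({"relation extraction"}), frozenset({"GCN", "BiLSTM-CRF"})),
    Document("d3", 3, frozenset({"entity recognition"}), frozenset({"IDCNN"})),
]
links = relate(docs)
for link in links:
    print(link)
```

Each `Document` holds the problem-method relation for a single paper, while `relate` derives the problem-problem and method-method links between papers. Keeping `rank` on the record lets a display layer show the original ranking alongside the semantic links.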
Keywords/Search Tags: academic search results, entity recognition, relation extraction, document semantic model, display model