| With the development of information technology, the volume of scientific literature is increasing rapidly. Searching for scientific literature is a complex task. Although large-scale training data can often be obtained through crowdsourcing, annotated scientific literature data is difficult and costly to collect because high-quality annotation requires expertise. In addition, many different terms can describe the same concept, which makes it hard to formulate precise queries. Based on deep learning technologies such as feature extraction, feature fusion, the graph attention mechanism, and graph neural networks (GNNs), this thesis studies multi-view feature extraction and fusion of scientific literature, semantic representation learning of scientific literature, and search services for scientific literature. Based on microservice technology, a semantic representation learning and search service component for scientific literature is constructed to support scientific literature search. The component allows users to explore the search space of scientific literature more effectively than traditional search components based solely on queries. The main work of this thesis is as follows: (1) A feature extraction and fusion method for scientific literature based on multi-view learning is proposed. For the feature representation of scientific literature, we propose an adaptive feature extraction and fusion method, in which the title view and abstract view of scientific literature are extracted using the attention mechanism and BiLSTM, and the keyword view is extracted using the continuous bag-of-words (CBOW) model. The adaptive feature fusion method then fuses the features of the three views (title, abstract, and keyword) to obtain the final feature vector of scientific literature. In this way, the relationship and coordination among the title, abstract, and keyword views are fully considered, and the information contained in the feature vector of
scientific literature is enriched. (2) An unsupervised semantic representation learning method for scientific literature based on the graph attention mechanism and mutual information maximization is proposed. The graph attention mechanism aggregates the features of scientific literature linked by citation relationships and assigns each item a different feature weight, so as to better express the correlation between the features of different scientific literature. An unsupervised graph contrastive learning strategy is used to address the lack of labels and the scalability problem on large-scale scientific literature graph networks. By contrasting, in the latent space, the mutual information between positive and negative local semantic representations of scientific literature and the global graph semantic representation, the GNN can extract both local and global information, which improves its ability to learn semantic representations of scientific literature. (3) A new approach to scientific literature search using a GNN is proposed. This approach builds on a constructed scientific literature graph network and treats search as a relevance matching problem. The text sequence of search words is transformed into a word embedding sequence, which is added to the scientific literature graph network as an additional node, thus enriching and updating the graph. Based on the correlation between scientific literature and the embedded search term sequences, corpora of related and unrelated scientific literature are constructed. We use the contrast perception layer to calculate the matching score between search words and scientific literature, increasing the score for related scientific literature and reducing it for unrelated scientific literature. The real ranking results obtained from the click ranking layer are used to minimize ranking errors to optimize the
inverse logarithm and further improve the accuracy of search services. (4) The semantic representation learning and search service component for scientific literature is implemented. It integrates the data processing and feature extraction and fusion components, the semantic representation component, and the search service component for scientific literature. The component is designed on the SpringBoot architecture and integrates the ElasticSearch search engine and the Redis cache database to improve the performance of each component. It provides functions such as scientific literature search, field selection, display of literature-related information, and recommendation of semantically similar scientific literature. The component is functionally complete and has a friendly user interface. |
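
To make the citation-based aggregation in contribution (2) more concrete, the following is a minimal sketch of attention-weighted feature aggregation over a paper and the papers it cites. It is an illustration under stated assumptions, not the thesis's implementation: the function names, the toy feature vectors, and the paper identifiers are hypothetical, and the attention score is a plain dot product, whereas a graph attention layer would normally use learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(features, citations, paper):
    """Aggregate the features of `paper` and the papers it cites.

    Assumption (not from the thesis): the attention score between two
    papers is the dot product of their feature vectors. Returns the
    aggregated feature vector and the attention weight per paper.
    """
    # The paper attends to itself as well as to its cited neighbors.
    neighbors = [paper] + citations.get(paper, [])
    target = features[paper]

    # Unnormalized attention scores, one per neighbor.
    scores = [
        sum(a * b for a, b in zip(target, features[n])) for n in neighbors
    ]
    weights = softmax(scores)

    # Weighted sum of neighbor features, dimension by dimension.
    aggregated = [
        sum(w * features[n][i] for w, n in zip(weights, neighbors))
        for i in range(len(target))
    ]
    return aggregated, dict(zip(neighbors, weights))


# Hypothetical toy data: three papers with 2-dimensional features.
features = {
    "p1": [1.0, 0.0],
    "p2": [0.9, 0.1],
    "p3": [0.0, 1.0],
}
citations = {"p1": ["p2", "p3"]}

fused, weights = attention_aggregate(features, citations, "p1")
```

In this toy setting, the cited paper whose features are closer to those of `p1` receives a larger weight than the dissimilar one, which is the behavior the abstract describes as giving each scientific literature a different feature weight.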