
Research On Efficient Retrieval And Query For Multi Domain And Cross Media Big Data Of Science And Technology

Posted on: 2022-03-01
Degree: Master
Type: Thesis
Country: China
Candidate: W J Guo
GTID: 2518306332967669
Subject: Computer Science and Technology
Abstract/Summary:
With the sustained efforts of scholars and researchers and the continuous development and spread of information technology, massive amounts of scientific and technological big data have rapidly accumulated on the Internet. Unlike news, social media and other information that is growing explosively online, big data of science and technology has distinctive characteristics. It consists mainly of academic resources such as papers and scholar information. It is large in volume but contains little redundant information, and it is highly specialized, with large differences between fields. In the retrieval of science and technology resources, on the one hand, these data characteristics make it difficult for traditional retrieval algorithms to meet the needs of scholars and users. On the other hand, the multi-modal and heterogeneous nature of big data places higher requirements on the acquisition and processing of scientific and technological resources. Against this background, it is of great significance to study accurate, efficient and personalized retrieval and query of multi-domain and cross-media science and technology big data. The main work of this thesis is as follows:

(1) This thesis proposes a deep feature extraction and representation method for multi-modal science and technology big data. For text, it proposes a feature representation algorithm based on dense convolution attention (FR-DCA), which combines a dense convolution structure with a bidirectional LSTM recurrent neural network to extract deep text features. For image resources, starting from the problem of non-uniform input sizes, it proposes a convolutional neural network with a spatial pyramid pooling layer. The experimental results indicate that the two methods outperform the comparison algorithms in precision, recall and F1 value.

(2) This thesis proposes a semantic space learning and analysis method for multi-domain and cross-media science and technology big data. Building on the proposed dense convolutional attention model and the convolutional neural network model with spatial pyramid pooling, it proposes a multi-modal adversarial learning algorithm for science and technology resources based on semantic constraints (MASC). By segmenting the text and images, the MASC algorithm extracts fine-grained context information, introduces a semantic constraint function for adversarial learning, and models a common semantic space that preserves cross-modal semantic relevance. The experimental results show that, compared with the comparison methods, the retrieval evaluation metrics of cross-media retrieval based on the MASC algorithm are significantly improved.

(3) This thesis proposes retrieval, query, prediction and visualization methods for multi-domain and cross-media science and technology big data. First, to address the problem of calculating scholar influence in scientific research, it proposes an expert discovery algorithm based on the fusion of cooperation and citation influence (CF-Rank). The CF-Rank algorithm calculates scholar influence by fusing the scholar cooperation graph with the paper citation graph. The experimental results show that the algorithm outperforms the comparison methods in coverage and in manual evaluation metrics. Second, it proposes a time window attention based scholar interest extraction (IE-TWA) algorithm. The IE-TWA algorithm considers the effect of time on researchers' interests and combines a time-window attention mechanism to represent scholars' short-term interests. The experimental results indicate that the algorithm is promising for predicting scholars' research interests. Finally, combining the two algorithms, the thesis designs a retrieval and query mechanism for scientific and technological resources based on secondary re-ranking, which realizes efficient, accurate and personalized retrieval of scientific and technological big data.

(4) This thesis designs and implements an efficient retrieval and query system for multi-domain and cross-media science and technology big data. The system includes three modules: data acquisition and feature representation; semantic learning; and retrieval, query, prediction and visualization. The system verifies the effectiveness and feasibility of the proposed algorithms.
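The abstract notes that the image network uses a spatial pyramid pooling layer to cope with non-uniform input sizes. The following is a minimal sketch of that general technique, not the thesis's implementation: it max-pools a single-channel 2D feature map over 1x1, 2x2 and 4x4 grids so that maps of any size yield a vector of the same length. The function name `spatial_pyramid_pool`, the pyramid levels and the use of max pooling are all assumptions made for illustration.

```python
from typing import List, Sequence


def spatial_pyramid_pool(
    feature_map: List[List[float]],
    levels: Sequence[int] = (1, 2, 4),  # assumed pyramid levels, not from the thesis
) -> List[float]:
    """Max-pool a 2D feature map into a fixed-length vector.

    For each level n the map is split into an n x n grid of bins and the
    maximum of each bin is kept, so the output length is sum(n * n) over
    all levels, independent of the input height and width.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled: List[float] = []
    for n in levels:
        for i in range(n):
            # Row range of bin i: floor for the start, ceiling for the end.
            # max(...) guarantees at least one row even when n > height.
            r0 = (i * height) // n
            r1 = max(r0 + 1, ((i + 1) * height + n - 1) // n)
            for j in range(n):
                c0 = (j * width) // n
                c1 = max(c0 + 1, ((j + 1) * width + n - 1) // n)
                pooled.append(
                    max(
                        feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1)
                    )
                )
    return pooled


# Two feature maps of different sizes give vectors of the same length.
map_a = [[float(r * 7 + c) for c in range(7)] for r in range(5)]    # 5 x 7
map_b = [[float(r * 9 + c) for c in range(9)] for r in range(13)]   # 13 x 9
vec_a = spatial_pyramid_pool(map_a)
vec_b = spatial_pyramid_pool(map_b)
print(len(vec_a), len(vec_b))
```

In a real convolutional network the same pooling would be applied to every channel of the last convolutional layer, and the concatenated result would feed the fully connected layers; this sketch covers a single channel only.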
Keywords/Search Tags: big data of science and technology, cross media, semantic learning, retrieval and query