Research On Similarity Detection Method Of Science And Technology Project Application Text Based On Deep Learning

Posted on: 2021-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Yang
Full Text: PDF
GTID: 2428330602971277
Subject: Computer technology
Abstract/Summary:
Since 2012, with the establishment of the core development strategies of "Rejuvenating the Country through Science and Education", "Innovation-driven" and "Strengthening the Country through Talent", China's support for scientific research has kept increasing, and the number and funding of science and technology projects achieved historic breakthroughs for five consecutive years. The growing number of projects has undoubtedly promoted the rapid development of science and technology innovation in China. However, the huge volume of projects and the rapid growth in research subjects make reasonable and effective review and auditing very difficult, and the resulting "cross declaration" and "multiple declaration" phenomena have become a deadlock for the effective development of China's science and technology plans. Therefore, how to establish an efficient project management mechanism, avoid the disorderly development and low-level duplication of science and technology projects, and ensure the innovation and advancement of scientific research has become one of the key problems that China's science and technology planning management departments urgently need to solve.

To address the duplication of science and technology projects, this thesis studies methods for evaluating their similarity. Current duplicate detection for science and technology projects suffers from four problems: named entities are difficult to identify, entity relationships are difficult to extract, semantic mining ability is limited, and similarity evaluation precision is poor. In response, the thesis proposes a deep-learning-based similarity detection method for science and technology project application texts.

Firstly, a named entity recognition model based on transfer learning is proposed to solve the problem that named entities are difficult to identify effectively in the text of science and technology projects. The model uses a learning rate restart mechanism to optimize the BERT model and performs secondary pre-training of the baseline model on a large-scale scientific and technological corpus, so as to learn the rich semantic relations and grammatical logic of scientific and technological texts. The secondary pre-trained model then undergoes transfer training on the target corpus to further learn the deep semantics of project application texts, which yields accurate recognition of named entities and effectively improves word segmentation.

Secondly, an entity relationship extraction algorithm based on entity group co-occurrence rate is proposed to solve the problem that entity relationships are difficult to extract effectively. Using an adaptive window length and the entity group co-occurrence rate, the algorithm selects entity relations so that the extracted relations are strongly semantically related to the theme of the science and technology project. On this basis, to address the weak semantic mining of existing duplicate evaluation methods for science and technology projects, the thesis proposes a text matching model based on a Siamese network. The model embeds the context information, position information, and entity relationship information of terms, and uses a weight-sharing BiLSTM to extract deep semantic features of the text, thereby evaluating the semantic equivalence of the different components of the texts.

Finally, to address the poor results of text similarity detection for science and technology projects, a semi-structured text similarity evaluation method is designed that combines graph structure similarity with text matching degree. The method uses multiple linear regression to learn jointly over the different parts of the text and thus train the importance of each part to the overall similarity assessment. This enables systematic detection of the similarity of project application texts and provides technical support for similarity detection of science and technology projects.

The method in this thesis can improve the accuracy of similarity assessment for science and technology projects, alleviate the problem of repeated project approval, and ensure the effective use of scientific research funds. It can also assist reviewers in making reasonable decisions, promoting the intelligent management of China's science and technology project review. The research in this thesis therefore has both theoretical and practical value.
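The final step of the abstract, learning how much each part of the text contributes to the overall similarity through multiple linear regression, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two features (a graph structure similarity score and a text matching score), the toy training pairs, the function names, and the ordinary least-squares fit are not taken from the thesis, whose exact features and training procedure are not given in the abstract.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation applies to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: component scores are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_weights(scores: Sequence[Sequence[float]],
                labels: Sequence[float]) -> List[float]:
    """Least-squares fit of label ~ w0 + w1*s1 + ... + wk*sk.

    Returns [w0, w1, ..., wk], where w0 is the intercept.
    """
    rows = [[1.0] + list(s) for s in scores]  # prepend the intercept term
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], component_scores: Sequence[float]) -> float:
    """Combine component similarities into one overall similarity score."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], component_scores))


# Hypothetical training pairs: (graph_structure_similarity, text_matching_score).
toy_scores = [(0.9, 0.8), (0.2, 0.1), (0.5, 0.7), (0.3, 0.9), (0.8, 0.2)]
# Toy labels built from known weights (0.1, 0.6, 0.3) so the fit can be checked.
toy_labels = [0.1 + 0.6 * g + 0.3 * t for g, t in toy_scores]

weights = fit_weights(toy_scores, toy_labels)
overall = predict(weights, (0.7, 0.4))
```

Because the toy labels are constructed from fixed weights, the fit recovers those weights up to floating-point error. In a real setting the labels would come from human similarity judgments, and the learned coefficients would indicate how strongly each component influences the final score.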
Keywords/Search Tags: Deep learning, Transfer learning, Named entity, Siamese network, Text similarity