Document Similarity Analysis Based On Knowledge

Posted on: 2021-02-08  Degree: Master  Type: Thesis
Country: China  Candidate: Y W Su  Full Text: PDF
GTID: 2428330647457216  Subject: Computer software and theory
Abstract/Summary:
To promote scientific research, the state invests heavily in research funding every year, and the number of research projects across different fields is growing rapidly. This makes it harder to detect duplicated research projects and to assess project results. Accurately retrieving the set of documents related to a given document makes it possible, on the one hand, to quantify how far a research project duplicates earlier work and, on the other hand, to identify the relevant technology and level of research, which assists in evaluating how advanced a project is. However, current technology still faces major challenges in understanding project documents accurately, including weak semantic understanding and insufficient search accuracy. This thesis therefore addresses the need for similarity analysis of project-related documents such as project proposals. The research work covers the following two aspects:
1. Document correlation analysis technology: because each part of a document contains a good deal of redundant information, this part uses a knowledge portrait to describe the document as multi-dimensional, multi-granularity knowledge items, assigns reasonable weights to the different dimensions, and replaces natural-language full-text matching with feature matching and key-point knowledge analysis, improving search precision and recall.
2. Similarity detection for the creative part: the core content of a project proposal is its creative part. To detect similarity in this part, the thesis studies a multi-source combined document weight analysis technology. The analysis is performed at the granularity of individual items, and a variety of features are combined in the calculation.
This research provides key technical support for a project. Tests on real data show that the system can quickly and accurately retrieve relevant documents and identifies the creative parts across multiple documents well.
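The idea of weighting different dimensions of a knowledge portrait can be illustrated with a minimal sketch. The abstract does not specify the dimensions, the weights, or the matching function, so everything below is an assumption made for illustration: the dimension names (`objectives`, `methods`, `innovations`), the weight values, and the use of Jaccard overlap as the per-dimension feature match are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, Set

# Hypothetical dimensions and weights; the abstract does not list them.
DIMENSION_WEIGHTS: Dict[str, float] = {
    "objectives": 0.2,
    "methods": 0.3,
    "innovations": 0.5,
}

Portrait = Dict[str, Set[str]]  # dimension name -> set of knowledge items


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two item sets; two empty sets count as no evidence (0.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def portrait_similarity(
    p: Portrait, q: Portrait, weights: Dict[str, float] = DIMENSION_WEIGHTS
) -> float:
    """Weighted average of per-dimension Jaccard scores, in the range [0, 1]."""
    total = sum(weights.values())
    score = sum(
        w * jaccard(p.get(dim, set()), q.get(dim, set()))
        for dim, w in weights.items()
    )
    return score / total


if __name__ == "__main__":
    proposal_a: Portrait = {
        "objectives": {"duplicate detection", "project evaluation"},
        "methods": {"feature matching"},
        "innovations": {"knowledge portrait", "multi-granularity model"},
    }
    proposal_b: Portrait = {
        "objectives": {"duplicate detection"},
        "methods": {"full-text matching"},
        "innovations": {"knowledge portrait", "multi-granularity model"},
    }
    print(round(portrait_similarity(proposal_a, proposal_b), 3))
```

In this sketch, giving the `innovations` dimension the largest weight reflects the abstract's statement that the creative part is the core content of a proposal; the specific numbers are placeholders only.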
Keywords/Search Tags: Knowledge portrait, similarity analysis, multi-dimensional and multi-granularity computing model, Multiple sources combined class weight determination algorithm