
The Design And Implementation Of Data Management And Checking Repeat System For Scientific Research Text

Posted on: 2018-07-14  Degree: Master  Type: Thesis
Country: China  Candidate: B Y Li  Full Text: PDF
GTID: 2348330536981622  Subject: Software engineering
Abstract/Summary:
In the 21st century, with the advance of science and technology and the explosion of networked information, large document collections contain a great deal of similar content. The need to manage these documents accurately, and to find similar document content quickly and precisely, is what gave rise to document retrieval technology and drove its rapid development. Duplicate checking determines whether the content of a text is highly similar to part of one or more texts in a document library. This thesis takes a scientific research data management system as its background. For an environment holding a large volume of research data, it detects similarity relationships between research documents accurately and quickly, which provides a basis for guiding later project approval and research focus, and it manages the research data documents themselves. The result is a research data management subsystem and a duplicate-checking subsystem.

The thesis covers the requirements analysis, design, and implementation of both subsystems. The management subsystem supports the information management process of research projects and speeds up tasks such as data entry and assessment. It saves labor costs and addresses the problems of low efficiency and low processing rate. The duplicate-checking subsystem provides quantifiable information for deciding whether a future research project would repeat earlier development or research: it displays the duplicated text between projects or between documents and reveals the degree of repetition between texts.

The research data management subsystem manages the research data. Its functions include uploading lists, adding data, querying data by page, downloading query results as Excel workbooks, uploading and downloading requirements, guides, contracts, applications, and other documents, and issuing guide files and review forms. The resource management subsystem is implemented with the SSM framework, and it serves as the data source for the duplicate-checking subsystem.

When a document is entered, the duplicate-checking subsystem lists the documents already imported into the document library that are highly similar to it and shows in detail which parts are similar. For a specified document under each of two specified topics, it compares the two documents and shows their similar paragraphs and sentences. Documents within any two topics can also be selected for checking. Sentence similarity is calculated by combining the simhash fingerprint distance with the degree of keyword overlap, and paragraph similarity is then calculated using the sentence-similarity method. Synonym conversion removes the mismatches caused by synonyms, and Redis in-memory storage is used so that data can be read quickly during calculation.

Both subsystems have been officially launched. They function and perform well, making research project management more convenient and greatly improving the capacity to process duplicate-checking tasks.
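
The abstract does not give the formula used to combine the two signals, so the following is only a minimal sketch of the general idea: a simhash fingerprint distance blended with a keyword-overlap score. The regex tokenizer, the MD5-based token hash, the Jaccard form of the overlap, and the weight `alpha` are all assumptions made for illustration, not details taken from the thesis. Synonym conversion and Redis caching are left out.

```python
import hashlib
import re

BITS = 64  # fingerprint width (assumed; the abstract does not state one)


def tokenize(text):
    """Lowercase word tokenizer (assumption; Chinese text would need a segmenter)."""
    return re.findall(r"\w+", text.lower())


def simhash(tokens):
    """Build a 64-bit simhash fingerprint from a list of tokens."""
    weights = [0] * BITS
    for tok in tokens:
        # Take the first 8 bytes of the MD5 digest as a 64-bit token hash.
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest()[:8], "big")
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def keyword_overlap(tokens_a, tokens_b):
    """Jaccard overlap of the two token sets (one possible 'coincidence degree')."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def sentence_similarity(a, b, alpha=0.5):
    """Blend fingerprint similarity and keyword overlap; alpha is a hypothetical weight."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    fp_sim = 1 - hamming(simhash(tokens_a), simhash(tokens_b)) / BITS
    return alpha * fp_sim + (1 - alpha) * keyword_overlap(tokens_a, tokens_b)
```

Under these assumptions, a paragraph-level score could be obtained by applying `sentence_similarity` to sentence pairs and aggregating the results, for example by averaging each sentence's best match. The abstract does not specify the aggregation the system actually uses.
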
Keywords/Search Tags: duplicate checking, multiple sentence features, simhash, SSM