Study On Two Key Techniques In Archive Digitalization

Posted on: 2008-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhang
Full Text: PDF
GTID: 2178360272967820
Subject: Systems Engineering
Abstract/Summary:
In recent years, archive processing technology has moved rapidly toward digitization, informatization, and networking. Traditional paper-based archive processing limits file sharing and information inquiry to some extent, and the large quantity of archives poses a new challenge to this trend. This thesis focuses on two key techniques in archive information processing, symbol recognition and search mapping, and discusses them in depth from three aspects: theoretical foundation, application methods, and simulation analysis.

Symbol recognition is the base and core of the entire process. In traditional barcode-based recognition, the massive number of archive files burdens archive workers, attaching barcodes is complex and error-prone work, and the barcodes spoil the original appearance of the files. The symbol recognition technique uses pattern classification and neural networks as its core, builds on image processing of scanned files, uses a symbol as the separator between two files, and relies on manual preprocessing to guarantee correspondence. It successfully replaces the original barcode, reduces redundancy in the archive database, improves query efficiency, and brings considerable convenience to the following step, search mapping.

Search mapping is the goal and end result of the whole process. Traditional paper-based archive retrieval is undoubtedly inefficient. Like Web information retrieval on the Internet, archive information retrieval is increasingly challenged by large amounts of unrelated data. The thesis applies modern Internet retrieval techniques to archive information retrieval: with text mining as the basis, it puts forward the concept of a correlation degree between archives, which makes subsequent automatic clustering of archives possible. It also draws on the PageRank algorithm of the Google search engine to present results to users in different ranks of priority, moving the retrieval goal of "full, accurate, fast" closer to realization; this is also a successful application of network search techniques to archive information retrieval.

The applied techniques were modeled and simulated: real archive data were used as training samples, test results were produced, and integrated evaluation indicators for the system were established, which facilitated optimizing the system. Finally, the key techniques are summarized, areas needing improvement are identified, and a solid foundation is laid for the next step, establishing a distributed, shared archive information platform.
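
The abstract does not define the correlation degree or say how PageRank is adapted, so the following is only a minimal sketch under stated assumptions, not the thesis's method. It assumes the correlation degree is the cosine similarity of bag-of-words term counts, and that ranking runs a PageRank-style iteration over a graph whose edge weights are those correlations. The names `term_vector`, `correlation`, and `rank_archives` are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for one archive (lowercased, whitespace split)."""
    return Counter(text.lower().split())


def correlation(a, b):
    """Assumed correlation degree: cosine similarity of two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_archives(texts, damping=0.85, iterations=50):
    """PageRank-style scores on a graph weighted by pairwise correlation."""
    n = len(texts)
    if n == 0:
        return []
    vectors = [term_vector(t) for t in texts]
    # weights[i][j] is the correlation between archives i and j (no self-links).
    weights = [[correlation(vectors[i], vectors[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Score held by archives with no correlated neighbour is spread evenly.
        dangling = sum(scores[i] for i in range(n) if out_sums[i] == 0)
        new_scores = []
        for j in range(n):
            inflow = sum(scores[i] * weights[i][j] / out_sums[i]
                         for i in range(n) if out_sums[i] > 0)
            new_scores.append((1 - damping) / n
                              + damping * (inflow + dangling / n))
        scores = new_scores
    return scores


# Made-up example texts, for illustration only.
texts = [
    "land deed transfer record",
    "land deed survey record",
    "personnel payroll ledger",
]
scores = rank_archives(texts)
```

In this sketch the two land-deed archives correlate with each other and so receive equal, higher scores than the unrelated payroll archive. The same pairwise correlations could also feed a clustering step.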
Keywords/Search Tags: symbol recognition, search mapping, pattern classifying, text mining