Research And Achieve On Cloud-based Electronic Document ECM System

Posted on: 2014-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: K Yan
GTID: 2268330422454255
Subject: Software engineering
Abstract/Summary:
Electronic document management is an inevitable trend in today's industrial environment, where information infrastructure is becoming increasingly important. Compared with traditional paper document management, it has clear advantages in cost, management, security, and transmission. Against this background, the ECM (Enterprise Content Management) system, one kind of electronic document management system, has become popular.

However, traditional ECM systems still have three key problems. The first is hardware restriction: as the number of electronic documents grows, the system runs short of hard disk storage, the backup process is inadequate, and so on, and because of the limits of the existing hardware it is difficult to upgrade flexibly. The second is the lack of a way to classify documents automatically; content management still stops at the search level. The third is processing speed when a large number of documents need OCR (Optical Character Recognition). The main feature of an ECM system is that it manages documents based on their content, so OCR is very important. However, content recognition occupies a large share of CPU resources, so speed remains a concern in ECM systems.

The emergence of cloud computing provides a new platform for electronic documents. As a business computing model, it distributes computing tasks to a resource pool composed of many computers, so that users can obtain computing power, storage, and information services on demand. This makes it possible to solve the three problems of traditional ECM systems. Based on an actual project, this thesis analyzes how to build an ECM system on a cloud computing platform and provides a practical architecture that is built on the Amazon cloud platform and can connect with other SaaS applications. The thesis describes how to build the ECM system on the cloud platform, how to store data in the cloud, and how servers in the cloud communicate with one another.
Further, this thesis discusses document classification through OCR and operating efficiency. Starting from a keyword list that was determined by experience and used in the actual project, it applies the information gain algorithm to increase document classification accuracy, and it gives a test and analysis to evaluate the effect. It then introduces a load balancing algorithm and discusses how to increase operating efficiency.

Finally, this thesis summarizes the work and looks ahead to the future of the current system. The system should deepen its collaboration with the cloud and, at the same time, more research should be done on the security of documents in the cloud, in order to achieve a system that can be used anytime and anywhere, safely and easily. For document classification, the focus should be on the special documents used in the company's workflow. By combining expert experience with machine learning algorithms, classification accuracy can be increased so that content-based document classification becomes genuinely usable for users.
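
The abstract does not say how information gain is computed in the thesis. As a rough illustration only, the sketch below scores candidate keywords by the standard information gain of a binary "keyword present / absent" feature with respect to the document class. It assumes each document has already been reduced by OCR to a set of tokens; the function names (`entropy`, `information_gain`, `rank_keywords`) and the sample data are hypothetical and do not come from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, keyword):
    """Information gain of the feature 'keyword occurs in the document'.

    docs   -- list of token sets, one per document (assumed OCR output)
    labels -- list of class labels, aligned with docs
    """
    with_kw = [lab for doc, lab in zip(docs, labels) if keyword in doc]
    without_kw = [lab for doc, lab in zip(docs, labels) if keyword not in doc]
    total = len(labels)
    # Entropy of the classes after splitting on presence of the keyword.
    conditional = (
        (len(with_kw) / total) * entropy(with_kw)
        + (len(without_kw) / total) * entropy(without_kw)
    )
    return entropy(labels) - conditional


def rank_keywords(docs, labels, keywords):
    """Return the candidate keywords sorted by descending information gain."""
    return sorted(
        keywords,
        key=lambda kw: information_gain(docs, labels, kw),
        reverse=True,
    )


# Hypothetical sample: four OCR'd documents in two classes.
sample_docs = [
    {"invoice", "total"},
    {"invoice", "tax"},
    {"contract", "party"},
    {"contract", "total"},
]
sample_labels = ["finance", "finance", "legal", "legal"]
ranked = rank_keywords(sample_docs, sample_labels, ["total", "invoice"])
```

In this toy sample, "invoice" separates the two classes perfectly while "total" appears equally often in both, so ranking by information gain places "invoice" first. An experience-based keyword list could be pruned or reordered in the same way before it is used for classification.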
Keywords/Search Tags: Cloud, Electronic document, OCR, Information Gain, Load Balancing