
Research And Implementation Of The Evidence-protection System Based On Hadoop

Posted on: 2015-08-20 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Liu | Full Text: PDF
GTID: 2308330473451698 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of the internet and mobile internet, data volumes have grown exponentially. Facing the challenges brought by massive data, internet companies at home and abroad have applied cloud computing to commercial services and now operate their own cloud services. A cloud service delivers various computing resources and commercial applications over the internet. With such services, data processing moves from the personal computer or private server to the internet data center, which reduces investment in hardware, software, and professional skills. Cloud services are now widely used in many business scenarios and have developed into mature business service models. Based on Hadoop, this thesis completes the following main work:

1. Design and implement an evidence-protection system for cloud services. The system works as follows. First, it builds a gateway between the cloud service provider and its users. The gateway captures all users' HTTP requests to the cloud service API that match the filter conditions specified by the provider, and extracts user feature information from them. This information mainly includes the user name, the request time, the requested cloud service, and the parameters of that service. Next, the gateway forwards the user feature information to the data analysis system, which analyzes it according to the data specified by the cloud service provider and presents the results as reports. Finally, the data analysis system archives the user information according to the cloud service's requirements and stores it permanently.
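
The gateway's extraction step can be illustrated with a minimal Python sketch. The abstract does not describe the actual implementation, so everything here is an assumption made for illustration: the `UserFeature` record, the `extract_feature` function, the idea that the first URL path segment names the requested cloud service, and the provider's filter being a simple predicate on the service name.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class UserFeature:
    """Hypothetical record holding the four fields named in the abstract."""
    user_name: str
    request_time: datetime
    service: str
    params: Dict[str, List[str]]


def extract_feature(
    user_name: str,
    url: str,
    request_time: datetime,
    matches_filter: Callable[[str], bool],
) -> Optional[UserFeature]:
    """Return the feature record for one request, or None if it is filtered out."""
    parts = urlsplit(url)
    # Assumption: the first non-empty path segment names the cloud service.
    segments = [s for s in parts.path.split("/") if s]
    service = segments[0] if segments else ""
    # Assumption: the provider's filter condition is a predicate on the service name.
    if not matches_filter(service):
        return None
    # parse_qs maps each query parameter name to a list of its values.
    return UserFeature(user_name, request_time, service, parse_qs(parts.query))


# Example: keep only requests to a hypothetical "storage" service.
when = datetime(2015, 8, 20, tzinfo=timezone.utc)
feature = extract_feature(
    "alice",
    "https://api.example.com/storage/upload?bucket=b1&size=10",
    when,
    lambda service: service == "storage",
)
```

Records of this shape are what the gateway would hand to the data analysis system; a real gateway would also need to authenticate the user and handle request bodies, which this sketch leaves out.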
For cloud service providers with a large user base, the evidence-protection system must handle data at the petabyte (PB) level, so Hadoop is used as the foundation of the data analysis system and the storage system.

2. The evidence-protection system periodically stores user feature information in HDFS according to the archiving requirements. The archiving operation splits the user feature information into many files, including many GB-level large files as well as small files of only a few KB. HDFS is designed for storing large files, and storing too many small files degrades the performance of the whole cluster. After a careful reading of the Hadoop source code, the thesis analyzes why HDFS performance drops when it stores large numbers of small files. On that basis, it proposes an aggregation strategy on the HDFS client side and builds an index to optimize small-file storage.
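
The idea of client-side aggregation with an index can be sketched in a few lines of Python. This is not the thesis's strategy, which the abstract does not detail; it is a generic illustration in which small files are concatenated into one container and an index maps each name to its offset and length. The names `aggregate` and `read_file` are hypothetical, and an in-memory buffer stands in for the large file that a real client would write to HDFS. A commonly cited motivation is that the HDFS NameNode keeps metadata for every file and block in memory, so one large container costs far less metadata than thousands of tiny files.

```python
import io
from typing import BinaryIO, Dict, Tuple

# Index entry: file name -> (byte offset in container, length in bytes).
Index = Dict[str, Tuple[int, int]]


def aggregate(small_files: Dict[str, bytes], container: BinaryIO) -> Index:
    """Append each small file to the container and record where it landed."""
    index: Index = {}
    for name, data in small_files.items():
        offset = container.tell()  # current write position in the container
        container.write(data)
        index[name] = (offset, len(data))
    return index


def read_file(container: BinaryIO, index: Index, name: str) -> bytes:
    """Fetch one small file back out of the container using the index."""
    offset, length = index[name]
    container.seek(offset)
    return container.read(length)


# Example: two small log files packed into one in-memory container.
container = io.BytesIO()
index = aggregate({"a.log": b"hello", "b.log": b"world!"}, container)
payload = read_file(container, index, "b.log")
```

In this sketch the index lives in memory; a practical design would persist it alongside the container so that lookups survive a client restart.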
Keywords/Search Tags: cloud service, Hadoop, small file storage, evidence-protection system