Research On Data Cleaning Algorithm Based On RFID Middleware

Posted on: 2016-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: G K Wang
Full Text: PDF
GTID: 2348330476955747
Subject: Software engineering
Abstract/Summary:
Because of its contactless identification capability, RFID is used more and more widely. Each application has the potential to form a huge market, and RFID technology is widely considered to have become an important engine of economic growth. As the core of RFID systems, RFID middleware directly determines the effectiveness of applications built on RFID technology, and it has attracted interest from both academia and industry. Data cleaning is a core RFID function, and its efficiency and quality directly affect the service level of an RFID application system. Because real environments are complex and diverse, RFID readers produce a large amount of unreliable data, which seriously hampers the accuracy and stability of RFID applications. Therefore, this thesis focuses on data cleaning algorithms based on RFID middleware.

Firstly, this thesis designs a lightweight RFID middleware that extends the EPCglobal standards and gives a specific design for its data cleaning module. Existing RFID middleware is not suited to combining the two data cleaning algorithms studied here, so to obtain a better cleaning effect we design a middleware architecture suited to both algorithms. The middleware is divided into three modules: the device adaptation module, the data processing module and the transaction processing module. The data processing module is designed in detail and is further divided into three parts: the data pre-cleaning module, the data buffer queue and the data cleaning module. The two algorithms are embedded in the improved data processing module to clean dirty data at different levels.

Secondly, we improve an algorithm for cleaning missing data and redundant data in the single-tag environment. Existing single-tag data cleaning algorithms cannot clean several kinds of dirty data, and their cleaning effect is not ideal.
To clean missing data and redundant data in the single-tag environment, this thesis analyzes the SCBD algorithm, which is based on a binomial distribution model, and designs GSCBD, an improved SCBD algorithm for single-tag cleaning. Through comparative experiments on the tag-level data cleaning algorithms SMURF (Statistical Smoothing for Unreliable RFID Data), SCBD and GSCBD, and an analysis of the results, we show that GSCBD outperforms SMURF and SCBD in cleaning tag-level redundant data and missed reads.

Finally, this thesis improves an efficient data cleaning algorithm for reader-level redundant data in the RFID data stream. Most algorithms for removing duplicate data are based on the sort/merge idea, and they are not well suited to cleaning reader-level RFID data. Based on the characteristics of RFID data, this thesis improves SNM (Sorted Neighborhood Method) to make it better suited to data cleaning in the RFID environment. Through simulation and result analysis, we show that the improved algorithm, SNM-TS, obtains high-quality RFID data and effectively improves the efficiency of data cleaning.
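
The sort/merge idea behind SNM can be illustrated with a minimal Python sketch. This is the classic Sorted Neighborhood Method applied to RFID readings, not the SNM-TS algorithm of the thesis, whose details the abstract does not give. The `Reading` fields, the sort key, the `window` size and the `max_gap` threshold are all assumptions chosen for illustration.

```python
from typing import List, NamedTuple


class Reading(NamedTuple):
    # Hypothetical record layout for one RFID read event.
    tag_id: str
    reader_id: str
    timestamp: float  # seconds


def is_duplicate(a: Reading, b: Reading, max_gap: float) -> bool:
    # Assumed rule: same tag seen again within max_gap seconds,
    # regardless of which reader reported it.
    return a.tag_id == b.tag_id and abs(a.timestamp - b.timestamp) <= max_gap


def snm_deduplicate(readings: List[Reading], window: int = 3,
                    max_gap: float = 1.0) -> List[Reading]:
    # Step 1 (sort): order records by a key that brings likely
    # duplicates next to each other.
    ordered = sorted(readings, key=lambda r: (r.tag_id, r.timestamp))
    kept = []
    # Step 2 (merge): slide a fixed-size window over the sorted list and
    # compare each record only with the window - 1 records before it.
    for i, current in enumerate(ordered):
        start = max(0, i - (window - 1))
        neighbours = ordered[start:i]
        if not any(is_duplicate(current, n, max_gap) for n in neighbours):
            kept.append(current)
    return kept
```

Because each record is compared only with its nearest neighbours in sorted order, the cost after sorting grows with the window size rather than with the square of the number of readings. The trade-off is that duplicates falling outside the window are missed.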
Keywords/Search Tags: RFID, RFID Middleware, Redundant Data, Missing Data