Research On Internet Of Things Oriented Cleaning And Storage For Unreliable RFID Data Set

Posted on:2014-12-05

Degree:Doctor

Type:Dissertation

Country:China

Candidate:H Fan


GTID:1268330422974311

Subject:Computer Science and Technology

Abstract/Summary:

The Internet of Things (IoT) aims to integrate human society with the physical world, so that people can perceive the world in a more refined and dynamic way and manage and control it to raise the overall level of informatization. RFID is one of the most important information technologies in the IoT field and is widely used in logistics and warehousing, supply chain management, asset management, personnel monitoring, indoor positioning and tracking, and so on. RFID is a contactless radio-frequency identification technology. By scanning RFID tags, readers obtain the location and time of each tag in real time, which makes it possible to track and locate the tags and the items they are attached to; the resulting data is usually expressed as tuples of the form (tag_ID, loc, time). However, because missed readings are common (approximately 30% of RFID readings are missed), the raw data collected by RFID readers is seriously unreliable and incomplete. How to clean this unreliable data and store it efficiently is key to RFID-based IoT applications and is the focus of this dissertation.

Based on an in-depth analysis of the unreliability, high redundancy, and massive volume of RFID data, this dissertation aims to improve the precision and efficiency of information queries in RFID-based IoT applications while reducing RFID data storage overhead. It presents models and solutions for data cleaning at the physical layer, data filling at the logical layer, and massive data storage with query optimization. The main contributions are as follows:

1. We propose a method for cleaning unreliable RFID data based on a probability model of tag motion. At the physical layer, to address the RFID data loss caused by missed readings, we model the RFID data stream with a Bernoulli (binomial) distribution and introduce a probability model of the motion state of RFID tags. We then establish a conversion between the raw RFID data and the motion state of tags (speed, direction, and displacement), so that missed data can be filled in according to that motion state. Finally, a reverse filtering mechanism over the data sequence is proposed to further ensure that the motion state of tags is captured. Experimental results show that the cleaning method is more accurate than the classic sliding-window smoothing technique.

2. We propose a Hidden Markov Model (HMM) based method for cleaning RFID trajectory data. At the logical layer, to address the incompleteness of trajectory information in RFID-based indoor tracking and positioning systems, we map the sequence of reader readings to the observable state sequence of an HMM and the corresponding sequence of tag positions to its hidden state sequence, so that trajectory cleaning becomes a classic HMM decoding problem. Building on the classical Viterbi decoding algorithm, we present an efficient path-decoding algorithm. Experimental results show that the proposed algorithm fills in missed trajectories efficiently and accurately, which supports accurate information queries, and that both the accuracy and the processing performance of data cleaning improve substantially over traditional methods.

3. We propose a Bayesian-inference-based approach to cleaning unreliable RFID data. Missed readings can cause supply-chain companies to respond incorrectly to market demand and incur large economic losses. To obtain accurate real-time receiving and shipping information for tracked items, we first present a path-matching algorithm based on a path-code schema, which efficiently obtains the distribution of tags that share the same historical path as the current tag. On this basis, we propose a differentiated decision model driven by path information, which makes different decisions for missed tags with different historical paths and thereby makes the cleaning results more accurate. Finally, we introduce a sliding time window model that reduces the computational overhead, and use a maximum entropy model to adjust the window size dynamically so that efficiency and accuracy are better balanced. Experimental results show that the proposed method not only improves data cleaning accuracy in supply chain applications but also scales better.

4. For massive RFID data storage and query optimization, we propose an RFID data storage model based on a split-path schema. Storing and processing such large volumes of RFID data takes more and more space and time, and it is increasingly recognized that existing approaches cannot meet the requirements of RFID data management. First, building on storage solutions based on the path framework, we propose a tree-structure-based path-splitting approach that splits product movement paths automatically according to user requirements. We then present the split-path-schema-based RFID data storage model; with its data separation mechanism, the massive RFID data produced in supply chain management systems can be clustered, stored, and processed more efficiently. Finally, on top of the new storage model, we design a relational schema for the path and time information of tags and define typical query templates and SQL statements. Experimental results show that, compared with the storage model based on a path-encoding schema, the proposed model significantly improves path-oriented query performance. Moreover, its storage overhead is only about 12% of that of the raw RFID data.
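
The HMM formulation in contribution 2 (reader readings as observations, tag positions as hidden states) can be illustrated with a minimal sketch of textbook Viterbi decoding. This is not the dissertation's own path-decoding algorithm, which the abstract does not specify. The three-zone layout, the MISS symbol, and all probabilities below are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical indoor layout: three zones, each covered by one reader.
STATES = ["A", "B", "C"]
MISS = "-"  # assumed observation symbol for an epoch with no reading

# Assumed model parameters (illustrative only, not from the dissertation).
START = {"A": 0.8, "B": 0.1, "C": 0.1}
TRANS = {
    "A": {"A": 0.6, "B": 0.4, "C": 0.0},
    "B": {"A": 0.1, "B": 0.5, "C": 0.4},
    "C": {"A": 0.0, "B": 0.2, "C": 0.8},
}
READ_RATE = 0.7  # assumed probability that the reader in the tag's zone detects it


def emit(state, obs):
    """P(obs | tag is in state); in this toy model a reader never reports a wrong zone."""
    if obs == MISS:
        return 1.0 - READ_RATE
    return READ_RATE if obs == state else 0.0


def safe_log(p):
    """Logarithm that maps probability 0 to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations):
    """Return the most likely zone sequence for a list of reader observations."""
    # best[s] = log-probability of the best path so far that ends in state s
    best = {s: safe_log(START[s]) + safe_log(emit(s, observations[0])) for s in STATES}
    back = []  # one dict per later step: best predecessor of each state
    for obs in observations[1:]:
        prev, best, pointers = best, {}, {}
        for s in STATES:
            p = max(STATES, key=lambda r: prev[r] + safe_log(TRANS[r][s]))
            best[s] = prev[p] + safe_log(TRANS[p][s]) + safe_log(emit(s, obs))
            pointers[s] = p
        back.append(pointers)
    # Trace back from the most likely final state.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


readings = ["A", MISS, "B", MISS, MISS, "C"]
print(viterbi(readings))  # ['A', 'A', 'B', 'C', 'C', 'C']
```

Each missed epoch is filled with the zone on the single most probable path, so the decoded sequence is a complete trajectory. Working in log space avoids numerical underflow on long reading sequences.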

Keywords/Search Tags:

Internet of Things, RFID Technology, Unreliable Data, Data Cleaning, Data Storage, Bayesian Inference, Split-Path Schema

Related items

1	Compressed Storage And Query Method Of RFID Data In The Internet Of Things
2	The Key Technology Research Of Data Collection For Workshop Based On Internet Of Things
3	Research Of Manufacturing Data Management Technology For Discrete Workshop Based On Internet Of Things
4	Study On Big Data Cluster Analysis Method And Technology Of Internet Of Things
5	Application Of RFID Technology Based On Internet Of Things In Material Storage Management System
6	The Technology Research Of Massive Data Fusion For Workshop Based On Internet Of Things
7	Research And Design Of Data Services Technology For The Internet Of Things
8	Research On The Technology Of Data Cleaning In Manufacturing Of Internet Of Things
9	Research And Implementation Of Data Stream Technology Based On RFID
10	Research Of Intelligent Medical Terminal Data Integration Technology Based On Internet Of Things