Research on Data Clustering Methods in Wireless Sensor Networks

Posted on: 2015-02-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J H Huang
Full Text: PDF
GTID: 1268330431462445
Subject: Computer application technology
Abstract/Summary:
With the rapid development of wireless communication, embedded computing, and microelectronics, wireless sensor networks (WSNs) are now widely used in fields such as military defense, environmental monitoring, and transportation. How to process the huge volume of sensor data in these networks efficiently, and how to extract useful knowledge from it, has become a new challenge. Cluster analysis, a data mining technique, is one way to address these problems. However, traditional data clustering methods are difficult to apply directly in sensor networks, because sensor nodes have limited resources and sensor data are temporally and spatially correlated. This thesis proposes several new methods tailored to the characteristics of sensor data in wireless sensor networks. The main contributions are as follows:

1. An efficient grid-based distributed dual clustering algorithm is proposed to address the limited resources of sensor nodes and the dual attributes (location information and sensor readings) of their data. The algorithm has two levels of clustering: local and global. First, the data space is divided into hyper-rectangular grid cells according to the locations of the sensor nodes and their sensor data. Second, adjacent grid cells are merged when the sensor nodes in them are connected in location and similar in their readings, and the features of the resulting local clusters are extracted. These local features are then sent to the sink, where the global clustering is obtained from them. The algorithm reduces the energy consumption of the network by reducing the single-hop communication distance and the data structures that are transmitted. Experimental results show that the algorithm achieves a better clustering effect on sensor data, scales well with the size of the data set, handles large-scale data sets, and finds clusters of arbitrary shape.

2. An efficient dual clustering algorithm based on fuzzy c-means is proposed for the dual attributes (location information and sensor readings) of sensor data. The algorithm incorporates the position information of the sensor nodes into the conventional fuzzy c-means algorithm and modifies its membership function, which improves performance. Because the number of classes is difficult to determine in advance, a subtractive clustering algorithm is used to determine the number of classes and the initial cluster centers, which speeds up convergence and helps avoid local optima. Because sensor nodes have limited resources, the clustering is distributed, which reduces the single-hop communication distance of the sensor nodes and the data structures that are transmitted, and thereby reduces network energy consumption. Experimental results show that the algorithm achieves a better clustering effect on sensor data and reduces network energy consumption.

3. An efficient fuzzy c-means clustering algorithm based on spatial constraints is proposed to exploit the strong correlation between the sensor data of adjacent nodes. Borrowing an idea from image segmentation, the algorithm incorporates the spatial information of adjacent nodes and their sensor data into the conventional fuzzy c-means algorithm in a novel fuzzy way. The clustering partitions the input sensor data set into several groups such that each group forms a compact region in the geographic domain while being similar in the non-geographic domain. The algorithm overcomes the disadvantages of known fuzzy c-means algorithms and enhances clustering performance. Experimental results show that the algorithm achieves a better clustering effect on sensor data.

4. A new rough fuzzy c-means clustering algorithm based on spatial constraints is proposed, because the fuzzy c-means algorithm does not handle overlapping clusters and the uncertainty in class boundaries well. The algorithm alters the distribution of the fuzzy membership function by combining the lower and upper approximations. The computation of the cluster centroids and fuzzy memberships is modified accordingly. The algorithm overcomes the disadvantages of known fuzzy c-means and rough c-means algorithms, reduces computational complexity, and resolves overlapping boundaries more sharply. Experimental results show a marked performance improvement over the fuzzy c-means clustering algorithm based on spatial constraints.

5. The Gaussian mixture model is popular in density estimation and clustering because of its expressiveness and flexibility. However, applying it to sensor data clustering faces some difficulties. First, estimating the number of components is still an open question. Second, mixture-based data clustering does not consider the spatial information of the sensor nodes, which is important for obtaining smooth regions in the clustering results. A Gaussian mixture model based on spatial information is therefore proposed. The spatial information is used as prior knowledge of the number of components. An expectation maximization (EM) based algorithm is developed that uses this prior knowledge to estimate the parameters of the proposed model and to determine the number of components automatically. Experimental results show that this EM-based algorithm estimates the number of components accurately and achieves a better clustering effect on sensor data.
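As an illustration of the dual-attribute idea in contribution 2, the following minimal Python sketch runs fuzzy c-means over sensor samples that carry both a location and a reading. It is not the algorithm from the thesis: the abstract does not give the modified membership function, so the sketch assumes a dissimilarity that blends squared location distance and squared reading distance with a hypothetical weight `alpha`, and it starts from random memberships where the thesis uses subtractive clustering. The names `dual_fcm`, `hard_labels`, `alpha`, and the sample data are all assumptions made for illustration.

```python
import random


def dual_fcm(locations, readings, c, m=2.0, alpha=0.5, iters=100, tol=1e-6, seed=0):
    """Fuzzy c-means over (location, reading) pairs.

    Assumption (not from the thesis): the dissimilarity between a sample and a
    cluster center is alpha * ||loc - loc_c||^2 + (1 - alpha) * ||val - val_c||^2.
    Returns (memberships, centers); each membership row sums to 1.
    """
    rng = random.Random(seed)
    n = len(locations)
    loc_dim = len(locations[0])
    val_dim = len(readings[0])

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Random initial memberships, normalized per sample.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: means weighted by membership^m, for both attributes.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            loc_c = [sum(w[i] * locations[i][d] for i in range(n)) / total
                     for d in range(loc_dim)]
            val_c = [sum(w[i] * readings[i][d] for i in range(n)) / total
                     for d in range(val_dim)]
            centers.append((loc_c, val_c))

        # Membership update from the blended squared distance.
        shift = 0.0
        for i in range(n):
            dist = [max(alpha * sqdist(locations[i], loc_c)
                        + (1.0 - alpha) * sqdist(readings[i], val_c), 1e-12)
                    for loc_c, val_c in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return u, centers


def hard_labels(u):
    """Assign each sample to the cluster with its largest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in u]


if __name__ == "__main__":
    # Hypothetical data: two groups of nodes, far apart and with different readings.
    locations = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0),
                 (10.0, 10.0), (9.0, 10.5), (10.5, 9.0)]
    readings = [(20.0,), (20.5,), (19.5,), (30.0,), (30.5,), (29.5,)]
    u, centers = dual_fcm(locations, readings, c=2)
    print(hard_labels(u))
    print(centers)
```

Setting `alpha` closer to 1 makes clusters follow geography, while values closer to 0 make them follow the readings. In practice the two attributes would usually be rescaled to comparable ranges first, since the blend is otherwise dominated by whichever attribute has the larger numeric spread.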
Keywords/Search Tags: wireless sensor networks, double clustering, fuzzy c-means, Gaussian mixture model, rough c-means