WLAN-based indoor positioning has become one of the most popular techniques in both commercial and research settings. In terms of construction cost, range of user applications, and technical complexity, it has advantages over other indoor positioning techniques, and WLAN coverage in public service facilities continues to grow in many major cities. A WLAN-based indoor positioning system can therefore reuse existing infrastructure to save hardware costs, treat indoor hotspot coverage as the positioning area, and take advantage of convenient mobile interfaces and application development. It undoubtedly has broad market potential in civilian and commercial areas. Moreover, in terms of scientific research, many innovative technologies have emerged in recent years, and WLAN indoor positioning systems have developed further by incorporating these improved techniques, making WLAN the most promising approach. Based on an extensive analysis of the technical background and a comparison of indoor positioning methods, and supported by research project funding, this paper further studies WLAN-based fingerprinting indoor positioning systems.

Firstly, conventional clustering algorithms classify the radio map by the received signal strength (RSS) of the reference points, without taking the continuity of the actual position coordinates into account. To solve this problem, this paper presents a space division clustering algorithm to partition the radio map, eliminating the singular points that occur frequently with traditional clustering algorithms. In addition, once the radio map has been divided into several sub-regions, each reference point automatically possesses a category label. This means that the unsupervised clustering process has been converted into a semi-supervised learning process.
Therefore, drawing on machine learning theory, we propose a random forest method combined with a genetic algorithm for parameter optimization as the solution for sub-region classification. Experimental results show that, compared with classical clustering algorithms, the proposed space division method achieves higher positioning accuracy. The coarse positioning stage, which combines the space division method with the genetic algorithm and random forest, achieves an overall classification accuracy of 98.9%, and the classification rate is balanced across sub-regions, indicating relatively high classification performance and stability.

Secondly, in large indoor environments, the volume of fingerprint data can overload the positioning system and lead to the curse of dimensionality. To solve this problem, we propose a low-dimensional feature extraction algorithm that integrates maximum likelihood estimation with kernel principal component analysis to reduce the data dimension, remove redundant radio map data, and improve stability in noisy environments. Maximum likelihood estimation is used to determine the optimal target dimensionality, which saves much of the time otherwise spent on simulations or experiments to find it. According to the experimental results, kernel principal component analysis is less affected by RSS fluctuation in noisy environments, and the confidence probability can be maintained at around 80% within a 2-meter error range, which demonstrates better anti-noise performance.
Meanwhile, it reduces the size of the radio map by 74%, thereby saving a large amount of storage on the mobile terminal.

Thirdly, because human traffic can greatly affect signal propagation in an indoor environment, we observed patterns of human traffic on weekdays and weekends and set mobility factors to compensate the positioning system by adjusting RSS values according to the time period, i.e., by building a time-dependent radio map. Experimental results show that, during weekday peak hours of human traffic, the positioning accuracy of traditional methods falls by nearly 10% compared with an idle period, while the improved system, which incorporates the mobility factor, loses only around 4%. The mobility factor thus effectively compensates for the noise interference caused by human traffic.

Finally, to accommodate the limited computing power and storage space of mobile devices, we propose an indoor positioning model based on radio map partitioning and dimension reduction, which enables the terminal to run the positioning application independently, output location coordinates in a timely manner, and scale well. Based on this model, we further propose the SDK indoor positioning system, which combines the Space Division method and the Kernel principal component analysis method described above to partition the indoor region reasonably, reduce positioning errors, save storage space, and enhance anti-noise capacity. Experimental results demonstrate that the system's positioning accuracy within a sub-region can reach 85%, showing a clear advantage at low dimensionality in noisy environments. In addition, we introduce the number of APs and the sampling interval as two key environment variables to verify their impact on positioning performance. The results show that, once a certain level of coverage is reached, increasing the number of APs provides limited benefit to positioning and can sometimes be harmful.
A criterion such as maximum mean value or information entropy should therefore be used to choose the optimal APs for positioning. As for the sampling interval, dense reference point sampling may introduce more noise interference at a high labor cost, while sparse sampling clearly lowers accuracy. In the given environment, where the corridor is 2.5 m wide, a reference point interval of 1 m offers the best performance.
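
As an illustration of the two AP selection criteria named above (maximum mean value and information entropy), the following is a minimal Python sketch, not the implementation used in this work. The function names, the 5 dB quantization bin, and the toy radio map are all assumptions introduced here. The entropy variant is a simplified proxy that ranks APs by the entropy of their quantized RSS values across reference points, on the assumption that an AP whose RSS varies more between locations is more discriminative.

```python
import math
from collections import Counter


def select_aps_by_max_mean(radio_map, k):
    """Return the indices of the k APs with the highest mean RSS, strongest first.

    radio_map: list of fingerprints, one per reference point; each fingerprint
    is a list of RSS values in dBm, one per AP (hypothetical layout).
    """
    n_aps = len(radio_map[0])
    means = [sum(fp[j] for fp in radio_map) / len(radio_map) for j in range(n_aps)]
    return sorted(range(n_aps), key=lambda j: means[j], reverse=True)[:k]


def rss_entropy(values, bin_width=5):
    """Shannon entropy (in bits) of RSS values quantized into bin_width-dB bins."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


def select_aps_by_entropy(radio_map, k, bin_width=5):
    """Return the indices of the k APs whose RSS entropy across reference points is highest."""
    n_aps = len(radio_map[0])
    entropies = [
        rss_entropy([fp[j] for fp in radio_map], bin_width) for j in range(n_aps)
    ]
    return sorted(range(n_aps), key=lambda j: entropies[j], reverse=True)[:k]


# Toy radio map (assumed values): 4 reference points, 3 APs.
radio_map = [
    [-40, -70, -90],
    [-45, -70, -60],
    [-50, -70, -75],
    [-55, -70, -88],
]

print(select_aps_by_max_mean(radio_map, 2))  # [0, 1]
print(select_aps_by_entropy(radio_map, 2))   # [0, 2]
```

In this toy data the second AP is strong but constant across all reference points, so the maximum mean criterion keeps it while the entropy criterion discards it. This shows how the two criteria can disagree on which APs are useful for positioning.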