Research On Intelligent Analysis And Optimization For Three Kinds Of Application Data

Posted on: 2017-03-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Y Ke
GTID: 1108330491960001
Subject: Computer application technology
Abstract/Summary:
The big data era has arrived. Data has become an important factor of production and penetrates every area of today's industry and business. Analyzing and optimizing massive amounts of data paves the way for productivity growth and greater consumer surplus. The data come from everywhere: smartphones, tablets, PCs, the mobile Internet, the IoT, cloud computing, vehicular networks, and sensors widely deployed in every corner of the earth. Based on the collected data, big data applications have sprung up, such as personalized commodity recommendation, prediction of football matches, and applications in the power system. In recent years, research on big data has focused on how to analyze data effectively and how to improve key technologies so as to enhance the experience of existing applications. However, faced with different kinds of data and increasingly complex scenarios, data analysis and optimization face many challenges: data acquisition, processing, and storage are difficult; some data analysis methods have relatively high complexity and are hard to deploy in real scenarios; and differences between application scenarios and inconsistent data distributions cause the same algorithm to perform differently. To address these challenges, this thesis carries out intelligent analysis and optimization for three kinds of application data. The major contributions are as follows:

1. Power generation optimal dispatching algorithm based on power data

Traditional power dispatching algorithms are slow to respond to newly generated data from generating units and are therefore inflexible. Scheduling is done by human experts, and historical data are not fully exploited. In early 2014, Anhui province completed the real-time collection of emission data from generating sets. Based on these data, we design an optimized, lightweight power dispatching scheme. In this thesis, we use regression analysis to fit a model of the relationship between power and emission levels. Based on this model, we design an optimal dispatching algorithm that reduces total pollutant emissions.

2. Demographic prediction model based on E-commerce data

In applications such as personalized search and recommendation, complete demographic information is a precondition for good performance, but such an ideal dataset rarely exists in practice. Worse, the absence of key attributes (e.g., age and gender) makes these applications struggle. In this thesis, we design a prediction model to solve the problem of time-dependent demographic prediction. The key insight behind our approach is to leverage a time-back-propagation method that takes the internal time correlation of historical behaviours into account, and to collect all available data to train a classifier that maps a user's historical behaviours to demographic information.

3. Character recognition method based on spatial magnetic-sensing data

We propose Magemite, a fine-grained input system that exploits the space around the device as an extension of the limited input area. The key insight underlying Magemite is that the magnetic sensor integrated in smart devices can sense the strength of nearby magnetic fields. Using a permanent magnet, users can "write" in the space around the device to communicate with matched devices. Unlike previous magnetic-sensing schemes that recognize only coarse-grained gestures, Magemite can recognize fine-grained input such as characters. However, individuals' diverse writing patterns affect recognition accuracy. To address this challenge, we preprocess the input trajectories and extract features that uniquely identify the user's input, then use these feature vectors to train several pattern recognition models for character recognition.

Finally, the experimental results show the following. For power generation optimal dispatching, the regression model between power and emission levels achieves an average accuracy of 97.02%. Experiments on 10 generating sets show that our optimal dispatching algorithm reduces total pollutant emissions by 4%, achieving the goal of energy conservation and emission reduction. We demonstrate the effectiveness of the demographic prediction model through experiments on baby age prediction. Our algorithm performs in a more balanced way across age groups and predicts a baby's age with 78.2% accuracy on a real-world dataset from a major E-commerce site. In the final work, Magemite recognizes users' fine-grained input and achieves an average recognition accuracy of over 85% in various scenarios.
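
The first contribution (a regression fit of emissions against power, followed by dispatching) can be illustrated with a minimal sketch. The abstract does not state the form of the regression model or the optimization method, so everything below is an assumption made for illustration: a per-unit linear model emission = a + b * power fitted by ordinary least squares, and a greedy dispatch that loads the units with the smallest fitted slope first. The function names `fit_line` and `dispatch` and all numbers are hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple


def fit_line(power: List[float], emission: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of emission = a + b * power (assumed model form)."""
    n = len(power)
    mean_p = sum(power) / n
    mean_e = sum(emission) / n
    sxx = sum((p - mean_p) ** 2 for p in power)
    sxy = sum((p - mean_p) * (e - mean_e) for p, e in zip(power, emission))
    b = sxy / sxx            # fitted emission increase per unit of power
    a = mean_e - b * mean_p  # fitted emission at zero power
    return a, b


def dispatch(slopes: List[float], p_min: List[float], p_max: List[float],
             demand: float) -> List[float]:
    """Greedy dispatch under the assumed linear emission models.

    Every unit starts at its minimum output; the remaining demand is then
    assigned to units in order of increasing slope. With linear models and
    simple min/max limits, this minimizes the total fitted emissions.
    """
    if not sum(p_min) <= demand <= sum(p_max):
        raise ValueError("demand is outside the feasible range of the units")
    output = list(p_min)
    remaining = demand - sum(p_min)
    for i in sorted(range(len(slopes)), key=lambda k: slopes[k]):
        extra = min(remaining, p_max[i] - p_min[i])
        output[i] += extra
        remaining -= extra
    return output


# Hypothetical data: three units, each limited to 50..100 MW, 220 MW demand.
a, b = fit_line([100, 200, 300], [12, 22, 32])
plan = dispatch([0.3, 0.1, 0.2], [50, 50, 50], [100, 100, 100], 220)
```

A nonlinear fitted model (for example a quadratic one) would need a different optimizer; the greedy rule is only valid under the linear assumption stated above.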
Keywords/Search Tags:Big data, Power generation dispatching, Demographic prediction, Inputting system, Trajectory recognition