Research On Discovering WDMS In Massive Spectra Data

Posted on: 2016-01-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Y Wang
GTID: 1220330461485524
Subject: Computer software and theory
Abstract/Summary:
With telescopes being built and operated all over the world, the volume of astronomical data is growing rapidly. Processing these data, including photometric data, spectral data and astronomical images, has therefore become an urgent problem. This thesis focuses on preprocessing and automatically classifying spectral data, because massive spectral data sets contain many variable and rare celestial objects, and even some unknown ones. Discovering such special objects is meaningful for research on the origin of life and the laws of cosmic evolution. However, even when the spectra of these objects have already been obtained, it is hard to determine their types, and finding new objects by astronomical observation alone is difficult and time-consuming. As a result, many computer science researchers have turned their attention to automatic spectral classification and to methods for mining special objects.

Many researchers now use automatic classification to classify stellar spectra, and the search for special objects is an important part of this field. Based on the massive spectral data from SDSS and the characteristics of the spectra of a particular kind of star, this thesis analyzes the high-dimensional features of spectral data, selects a suitable dimension reduction algorithm and determines the features in the best dimension. By comparing and optimizing algorithms, the thesis builds different classification models and, through experimental comparison, selects a model with a high accuracy rate to find new special objects of the WDMS (White Dwarf + Main Sequence) type and to supplement the results of earlier searches. This provides a better basis for studying the evolution, density distribution and structure of celestial objects.
Meanwhile, it is of great significance for probing the formation and evolution of the Milky Way.

This thesis investigates effective feature extraction from high-dimensional spectral data in order to determine the best dimension for WDMS. Linear and nonlinear feature extraction methods are both used to reduce the dimension of the data. For linear feature extraction, the thesis mainly uses PCA (Principal Component Analysis) to extract the main features and build a spectral feature matrix. PCA yields a linear transformation P from the sample set, which represents each sample in a lower dimension while preserving as much of the variance as possible, thereby reducing the redundancy of the data. For nonlinear feature extraction, the thesis mainly uses manifold learning such as ISOMAP (Isometric Feature Mapping), as well as SAE (Stacked Autoencoders). ISOMAP uses geodesic distance instead of Euclidean distance. SAE can extract features from newly input spectral data, obtaining the spectral features by linearly combining the weights learned in training with the spectrum. Finally, the thesis compares the linear and nonlinear feature extraction methods and determines the most suitable one according to efficiency and accuracy. Combined with a classification algorithm, it determines the best dimension for WDMS.

The innovations of this thesis are as follows:
1. Using deep learning to reduce the dimension of spectral data with a low signal-to-noise ratio. At present, most researchers work with spectra that have a high signal-to-noise ratio, because linear feature extraction achieves a higher accuracy rate on them. However, classification results for spectra with a low signal-to-noise ratio are not satisfactory, and research on such spectra is difficult because their spectral features are not obvious.
But the thesis shows experimentally that using deep learning to extract features from spectra with a low signal-to-noise ratio also gives good results.
2. Building a classification model for WDMS. Based on the chosen dimension reduction algorithm, different classification models are built and their results compared. Finally, a classification model for all the data from SDSS (Sloan Digital Sky Survey) DR10 is determined. In this model, classification and clustering algorithms are compared in terms of accuracy rate and then combined: the clustering algorithm first removes a large number of objects that are not WDMS, and classification methods are then applied to the remaining spectra and optimized. The result is a model based on clustering and classification that finds WDMS with a high accuracy rate. With this model, 4986 candidates are found, 4240 of which are WDMS; after confirmation, 22 of them had not been found previously. The experiment shows that using effective data mining methods to search automatically for special objects is fast, accurate and productive, and could be applied to other telescope data.
3. Building a color feature model for the WDMS that have already been found. The photometric criterion proposed by Szkody provides an effective and feasible basis for this study. The color features of WDMS are investigated, and a color feature model with better classification results is built using a neural network with polynomial features and RBF. The model is then applied to the photometric data of SDSS to sift out the useful data, which can greatly increase the efficiency of data mining. The model also serves as the data preprocessing step for the classification described above, screening the data before massive data mining and thereby improving the efficiency and accuracy rate of classification.
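The PCA step described in the abstract, in which a linear transformation P maps each spectrum to a lower dimension while keeping as much variance as possible, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it relies on NumPy, uses synthetic data in place of SDSS spectra, and the function names `pca_fit` and `pca_transform`, the 50 flux bins and the choice of 3 components are all hypothetical.

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA on X (rows = samples, columns = flux bins).

    Returns the per-bin mean, a projection matrix P whose columns are the
    leading principal directions, and the variance fraction each one explains.
    """
    mean = X.mean(axis=0)
    centered = X - mean
    # SVD of the centered data: rows of vt are the principal directions,
    # ordered by decreasing singular value (i.e. decreasing variance).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    P = vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return mean, P, explained[:n_components]


def pca_transform(X, mean, P):
    """Project samples onto the principal directions stored in P."""
    return (X - mean) @ P


# Synthetic stand-in for spectra (an assumption, not real SDSS data):
# 200 samples, 50 flux bins, driven by 3 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
spectra = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

mean, P, ratio = pca_fit(spectra, n_components=3)
features = pca_transform(spectra, mean, P)  # the "spectral feature matrix"
```

Here `features` has one row per spectrum and one column per retained component, and it could be passed to any classifier. In practice the number of components would be chosen by comparing classification accuracy across dimensions, as the abstract describes.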
Keywords/Search Tags: WDMS, Data Mining, ISOMAP, Support Vector Machine, Neural Network, Spectral Decomposition