The Application Research Of Data Mining Technology In The Discovery Of Special Celestial Bodies

Posted on: 2010-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: L H Zhang
Full Text: PDF
GTID: 2178360278472681
Subject: Computer application technology
Abstract/Summary:
With the LAMOST project entering large-scale observation, more than 10,000 spectra will be received every night. The spectra of celestial objects carry important information, so spectral analysis plays an important role in the study of celestial bodies. Because human understanding of the universe is still relatively limited, one of LAMOST's missions is to find new, special types of celestial bodies. How to use data mining technology to discover unknown, unusual objects and celestial laws in the massive volume of spectral data is therefore well worth studying.

Data mining technology has been widely applied in many fields. It supports correlation prediction, classification, clustering, outlier (isolated point) detection, and time-series analysis. Mining algorithms for high-dimensional data are currently a research hotspot, and celestial spectra are themselves high-dimensional, so data mining technology can provide good support for the discovery of special celestial bodies.

According to the aims of LAMOST, the classification of spectral data is divided into two stages: rough classification and fine classification. Rough classification first divides celestial spectra into normal objects and emission-line objects. Normal objects are then divided into normal galaxies and stars, while emission-line objects are divided into starburst galaxies and active galactic nuclei.

This thesis focuses on data mining of the stars obtained from rough classification. The main work is summarized as follows:

(1) Because stellar spectra are high-dimensional, this thesis uses PCA to construct spectral principal components. The principal components are taken as axes and each sample point is projected directly onto them, which yields sample feature points on a two-dimensional plane and greatly reduces the dimensionality of the spectral data.

(2) The basic concepts and theory of density-based clustering are studied, and their strengths and weaknesses are analyzed. Since the aim of this thesis is to find special objects, the DBFO algorithm is proposed as an improvement on DBSCAN. DBFO sorts all objects by distance, based on the shortest distance between clusters, in order to detect outliers.

(3) Following the general steps of data mining, a special-celestial-body mining system is built for the stars obtained from rough classification, and its mining flow and system modules are introduced. The system mainly comprises data pre-processing, dimensionality-reduction projection, and clustering modules. Matlab is then used to display the special celestial bodies that have been mined. Finally, the thesis compares the advantages and disadvantages of the DBFO algorithm and the clustering-tree method through analysis of experimental results, and illustrates by example the difference between a special spectrum and a general spectrum.
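The two core steps described in the abstract, projecting spectra onto two principal components and then ranking points by how isolated they are, can be illustrated with a minimal sketch. This is not the thesis's DBFO algorithm, whose details are not given here. The outlier score below (mean distance to the k nearest neighbours) is a stand-in chosen for illustration, and the function names, the value of k, and the synthetic data are all assumptions.

```python
import numpy as np


def pca_project_2d(spectra):
    """Project each row of `spectra` (n_samples x n_features) onto the
    first two principal components of the centered data."""
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def rank_by_knn_distance(points, k=5):
    """Score each point by its mean distance to its k nearest neighbours
    and return (indices sorted from most to least isolated, scores).
    Stand-in outlier score, not the DBFO ordering from the thesis."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the distance to the point itself.
    nearest = np.sort(dists, axis=1)[:, 1:k + 1]
    scores = nearest.mean(axis=1)
    return np.argsort(scores)[::-1], scores


# Synthetic stand-in for stellar spectra: 200 ordinary rows plus one
# deliberately shifted row (index 200) playing the "special" object.
rng = np.random.default_rng(0)
ordinary = rng.normal(0.0, 1.0, size=(200, 50))
special = rng.normal(0.0, 1.0, size=(1, 50)) + 15.0
spectra = np.vstack([ordinary, special])

plane = pca_project_2d(spectra)            # shape (201, 2)
order, scores = rank_by_knn_distance(plane, k=5)
print("most isolated sample index:", order[0])
```

On real spectra, the candidates at the head of `order` would still need to be inspected by eye, since a high isolation score may reflect noise or a bad reduction rather than a genuinely unusual object.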
Keywords/Search Tags:Spectrum, PCA, Clustering, Outlier, Data Mining