Multinomial Distribution Model And Its Application In Correlation Analysis Of Phenotypic Traits In Soybean

Posted on: 2018-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: G Q Sun
GTID: 2348330536471419
Subject: Computer application technology
Abstract/Summary:
Data mining is an important research method in computer science that extracts significant information from large volumes of data. To solve practical problems, sample data are summarized, analyzed, and used for inference at both the macro and the micro level, and the potentially useful information obtained supports decision making in scientific practice. Mathematical statistics is indispensable to data mining: its methods are applied when collecting, collating, analyzing, and reasoning about data, they extract the important information contained in the sample, and they provide a theoretical basis for decisions and actions.

Among the classical statistical models, the multinomial distribution is a typical representative; its probability distribution is determined by a parameter vector. This thesis gives a method for estimating these parameters and derives the asymptotic normality of the multinomial distribution. Studying the multinomial distribution through its normal approximation simplifies the otherwise tedious handling of multinomial data. The data cover 40 soybean varieties from Jilin Province and are consistent with the multinomial distribution; 14 phenotypic traits are studied, and their effect on yield is examined by correlation analysis. Because the normal distribution is more widely applicable, applying the asymptotic normality of the multinomial distribution together with the χ² goodness-of-fit test to the agricultural data achieves the intended purpose, improves the operability of the system, and reduces the complexity of the computation.

For data collation and analysis, principal component analysis and cluster analysis are combined to mine the original data on the 14 phenotypic traits of the 40 soybean varieties, using the statistical analysis software MATLAB. Cluster analysis is performed first on the original traits, but its results are not clear. The data are then screened by principal component analysis: five principal components are extracted, and cluster analysis is repeated on the resulting new data. The results show that the soybean varieties can be divided into four categories, and the classification maps directly onto the varieties. Clustering on the principal components loses no important influencing factor, improves efficiency, and provides a theoretical basis for the selection and breeding of soybean.

The main contents of this thesis are:
(1) The soybean data are analyzed by simulation and experimental testing in MATLAB. The χ² goodness-of-fit test is used to mine and organize the data and to verify that the soybean phenotype data follow the multinomial distribution and that the multinomial distribution is asymptotically normal.
(2) Traditional cluster analysis is combined with principal component analysis, which improves on clustering the varieties directly: no important influencing factor is lost, efficiency improves, and a theoretical basis is provided for soybean breeding.
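The χ² goodness-of-fit check mentioned in item (1) can be illustrated with a minimal sketch. This is not the thesis's MATLAB code. The function name `pearson_chi_square`, the category counts, and the hypothesized probabilities are all assumptions made for illustration, and the counts are not soybean data. The sketch computes Pearson's statistic, which is approximately χ²-distributed with k - 1 degrees of freedom for large samples because multinomial counts are asymptotically normal.

```python
def pearson_chi_square(observed, probs):
    """Return (statistic, degrees_of_freedom) for a multinomial goodness-of-fit test.

    observed: category counts from one sample
    probs:    hypothesized category probabilities (must sum to 1)
    """
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")
    n = sum(observed)
    expected = [n * p for p in probs]
    # Pearson's statistic: sum over categories of (O - E)^2 / E
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical counts for a trait with four categories (illustrative only).
observed = [18, 22, 31, 29]
probs = [0.25, 0.25, 0.25, 0.25]

statistic, dof = pearson_chi_square(observed, probs)

# Upper 5% critical value of the chi-square distribution with 3 degrees of freedom.
CRITICAL_5PCT_DOF3 = 7.815
reject = statistic > CRITICAL_5PCT_DOF3
print(f"chi2 = {statistic:.3f}, dof = {dof}, reject H0 at 5%: {reject}")
```

With these illustrative counts the statistic stays below the critical value, so the hypothesized multinomial probabilities are not rejected at the 5% level. When the probabilities are estimated from the same data, the degrees of freedom would need to be reduced by the number of estimated parameters.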
Keywords/Search Tags:Data mining, Multinomial distribution, Principal Component Analysis, Cluster analysis, Soybean, Phenotypic traits