
A New Multiway Data Analysis Method-Multiple Bilinear Decompose And Its Initial Application

Posted on: 2011-03-28 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: H W Jiang | Full Text: PDF
GTID: 1114360305475539 | Subject: Epidemiology and Health Statistics
Abstract/Summary:
Multiway data are commonly encountered in many fields of scientific research, such as chemometrics, engineering, medicine, psychology, and pharmaceutics. Typically, multiway data are assembled from sets of subjects and variables measured under several conditions (or on several occasions). Such data are characterized by a multidimensional, rich, and complex structure.

In the literature, two main types of multiway data analysis methods are used most often in applications. Across many fields, the three main array decomposition methods are two-way singular value decomposition (SVD), CANDECOMP-PARAFAC decomposition (PARAFAC), and Tucker decomposition (TUCKER). Each of the three has disadvantages. If multiway data are indeed multilinear, PARAFAC and TUCKER can provide more robust and interpretable models than two-way SVD. However, PARAFAC is sometimes numerically unstable, and TUCKER cannot guarantee the uniqueness of an approximate solution.

Recently, some researchers have suggested several important properties of an "ideal" extension of two-way SVD: a unique and interpretable model, a closed-form decomposition, a successive decomposition algorithm not based on alternating least squares (ALS), and computation that is robust to noise. We also consider the assumption underlying multiway methods (i.e., that multiway data are multilinear) too strict for a comprehensive exploration of the rich and complex structure of multiway data. Motivated by these suggestions and this consideration, this paper proposes a new array decomposition model with a multiple bilinear structure. Using this model, a new method, called multiple bilinear decomposition (MBD), is proposed as a generalization of two-way SVD.

By introducing several basic definitions (e.g., MBD polyad, reshaped Kronecker product), MBD extends two-way SVD to higher-way arrays.
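For reference, the three classical decompositions named above are commonly written as follows for a matrix X and a three-way array with elements x_ijk. The symbols (sigma_r, u_r, v_r, a, b, c, g, e, and the component counts R, P, Q) are standard textbook notation assumed here for illustration; the abstract itself does not fix a notation, and the MBD model is not shown because its exact form is not given in this text.

```latex
% Two-way SVD of a matrix X (rank-R truncation)
X \approx \sum_{r=1}^{R} \sigma_r \, u_r v_r^{\top}

% PARAFAC (trilinear) model of a three-way array
x_{ijk} = \sum_{r=1}^{R} a_{ir} \, b_{jr} \, c_{kr} + e_{ijk}

% Tucker model with core array g_{pqr}
x_{ijk} = \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R}
          g_{pqr} \, a_{ip} \, b_{jq} \, c_{kr} + e_{ijk}
```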
Mathematically, an MBD model is assumed to possess a multiple bilinear structure in which an array is written as a sum of sets of Kronecker products of loading vectors and score arrays in every mode. Based on this model, a method is established to estimate the parameters. Theoretically, the proposed method has an advantage over PARAFAC and TUCKER in three important properties: orthonormality of the loading vectors, closed-form decomposition, and successive decomposition of variation.

An algorithm that is not based on alternating least squares is established to decompose an array successively, without a full decomposition: each component can be determined in turn without computing the entire decomposition. This contributes to the numerical stability of the MBD algorithm, although that stability still requires further proof.

In this paper, simulation results based on orthogonal PARAFAC models show that the proposed method outperforms PARAFAC in the accuracy and robustness of the loading estimates and in model fit, even though MBD does not use prior information about the multilinear structure. In particular, in the noise-free simulation, the equivalence of the loading estimates indicates that, as a successive decomposition, MBD is a superior alternative to PARAFAC. An application to the health surveillance of adolescent schoolgirls shows that MBD is more interpretable than PARAFAC.

The proposed method is illustrated via an analysis of a subset of data from the health surveillance of adolescent schoolgirls. The purpose of the study was to identify and evaluate blood biochemical parameters related to nutritional anemia in adolescent girls, a major public health problem. Four blood biochemical parameters, namely blood hemoglobin (Hb), red blood cell count (RBC), mean corpuscular volume (MCV), and hematocrit (HCT), were measured at four time points.
The analysis of this case study was intended to illustrate MBD, with emphasis on investigating what changed over time. The comparison between MBD and PARAFAC rests mainly on interpreting the results from a medical point of view. The loading vectors estimated by MBD fit better and are easier to interpret than those estimated by PARAFAC. From the application point of view, MBD outperforms PARAFAC.

In summary, in terms of interpretability and model fit, MBD is an alternative method for exploring the latent structures and interrelations underlying multiway data, and it is superior to the classical multiway array decompositions, including PARAFAC and TUCKER. The proposed method can therefore be applied widely, in many fields, to analyse multiway data.
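The noise-free simulation setting mentioned in the abstract (an orthogonal PARAFAC model) can be illustrated with a small sketch. This is not the MBD algorithm, whose details are not given in this text. It only shows, under assumed sizes and weights, that a successive two-way SVD with deflation, applied to the mode-1 unfolding of a noise-free orthogonal PARAFAC array, recovers the mode-1 loadings up to sign without any alternating least squares. The array sizes, the weights, and the helper `random_orthonormal` are hypothetical choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_orthonormal(n, r, rng):
    """Return an n x r matrix with orthonormal columns (reduced QR)."""
    q, _ = np.linalg.qr(rng.standard_normal((n, r)))
    return q


# Hypothetical sizes: 20 subjects x 4 variables x 4 occasions, 3 components.
I, J, K, R = 20, 4, 4, 3
A = random_orthonormal(I, R, rng)  # mode-1 loadings
B = random_orthonormal(J, R, rng)  # mode-2 loadings
C = random_orthonormal(K, R, rng)  # mode-3 loadings
weights = np.array([5.0, 3.0, 1.0])  # assumed distinct component weights

# Noise-free orthogonal PARAFAC array: x_ijk = sum_r w_r * a_ir * b_jr * c_kr
X = np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

# Mode-1 unfolding: rows index subjects, columns index (variable, occasion).
X1 = X.reshape(I, J * K)

# Successive extraction: take the leading singular triple, deflate, repeat.
residual = X1.copy()
found_loadings = []
found_weights = []
for _ in range(R):
    u, s, vt = np.linalg.svd(residual, full_matrices=False)
    found_loadings.append(u[:, 0])
    found_weights.append(s[0])
    residual = residual - s[0] * np.outer(u[:, 0], vt[0])

found_loadings = np.column_stack(found_loadings)
found_weights = np.array(found_weights)

# Absolute inner products with the true loadings (1.0 means equal up to sign).
agreement = np.abs(np.sum(found_loadings * A, axis=0))
print("recovered weights:", np.round(found_weights, 6))
print("loading agreement:", np.round(agreement, 6))
```

The recovery works here because, when B and C have orthonormal columns, the Kronecker products of their matching columns are also orthonormal, so the unfolding is already in SVD form. With noise, or with non-orthogonal loadings, this equivalence no longer holds exactly.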
Keywords/Search Tags: Multiway array, Low-rank decomposition, Singular value decomposition, PARAFAC, TUCKER