
Research On The Applications Of Trilinear Decomposition Algorithms To Dynamic And Multi-State Protein Systems

Posted on: 2015-03-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S R Zhang
GTID: 1261330431450334
Subject: Analytical Chemistry
Abstract/Summary:
In recent decades, analytical scientists have gained access to more and more instruments for in-depth study of their subjects of interest. An analyst is no longer limited to a single data point per experiment; on the contrary, the volume of data from a single experiment keeps growing. Tools for data analysis are therefore urgently needed in chemistry and its interdisciplinary fields, which is the objective reason for the rapid rise of chemometrics. In analytical chemistry, researchers want quantitative information in addition to qualitative information with chemical/physical meaning. Zero-order calibration methods cannot handle systems with interference, and first-order calibration methods work well for white systems but not for gray or black systems. Second-order calibration based on trilinear decomposition offers a unique decomposition and the "second-order advantage", which lets researchers obtain chemically/physically meaningful quantitative and qualitative information even when unknown interferences are present in the system. The studies presented in this thesis address the application of trilinear decomposition to dynamic and multi-state systems; the basic theory of trilinear decomposition is a further focus.

(1) In Chapter 2, an original analysis tool named the error transmission structure (ETS) is proposed for iterative trilinear decomposition algorithms. Although somewhat complicated, ETS can elucidate almost all the characteristics of ATLD (Alternating Trilinear Decomposition) and PARAFAC (Parallel Factor Analysis), such as convergence rate, sensitivity to an excessive component number, and even the quality of the resolved profiles. By contrast, it is hard to trace the roots of these characteristics by studying the fitting residual, because the residual always converges whether or not the component number is overestimated.
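The point about the fitting residual can be illustrated with a minimal sketch. It assumes the usual trilinear model x_ijk = sum_n a_in * b_jn * c_kn and a plain textbook PARAFAC-ALS loop with random initialization; the function names `khatri_rao` and `parafac_als`, the synthetic data, and all sizes are hypothetical choices for illustration and are not taken from the thesis. ETS itself is not implemented here.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; row index runs as u * V.shape[0] + v."""
    n = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, n)

def parafac_als(X, n_comp, n_iter=300, seed=0):
    """Plain PARAFAC-ALS (illustrative). Returns the factor matrices and
    the relative fitting residual recorded after every sweep."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((J, n_comp))
    C = rng.standard_normal((K, n_comp))
    X1 = X.reshape(I, J * K)                     # mode-1 unfolding
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)  # mode-2 unfolding
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)  # mode-3 unfolding
    history = []
    for _ in range(n_iter):
        # Each update is the least-squares solution for one factor matrix
        # with the other two held fixed.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = X2 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = X3 @ np.linalg.pinv(khatri_rao(A, B)).T
        fit = (A @ khatri_rao(B, C).T).reshape(I, J, K)
        history.append(np.linalg.norm(X - fit) / np.linalg.norm(X))
    return (A, B, C), history

# Synthetic, noise-free data with exactly two trilinear components.
rng = np.random.default_rng(42)
A0, B0, C0 = (rng.standard_normal((d, 2)) for d in (8, 7, 6))
X = np.einsum("in,jn,kn->ijk", A0, B0, C0)

_, res_correct = parafac_als(X, n_comp=2)  # correct component number
_, res_over = parafac_als(X, n_comp=3)     # overestimated component number
print(f"relative residual, 2 components: {res_correct[-1]:.2e}")
print(f"relative residual, 3 components: {res_over[-1]:.2e}")
```

Because every ALS update is a least-squares minimizer, the residual cannot increase from one sweep to the next in either run, so the residual curve alone says little about whether the resolved profiles are meaningful when the component number is overestimated.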
It was pointed out that the sensitivity to the component number is caused only by the middle terms of ETS, which originate from the deviations of the component matrices. The middle terms generate an extremely high perturbation, like a wall, that blocks the PARAFAC algorithm from reaching the objective solution. The last term indicates that, when the component number is estimated correctly, the residuals of the trilinear model enlarge the deviations of the component matrices as the algorithm approaches the objective solution. Although the main phenomena in the dynamic trajectories of the algorithms differ when the component number is overestimated, some analogous behavior was found, which could serve as a hint for designing a new method for estimating the component number.

(2) In Chapter 3, we introduce differential analysis, which comes from the fields of statistics and econometrics. Differential analysis can be used to investigate the optimization strategy and the solution space of a specific objective function. The results of the differential analysis for several algorithms shed light on the design of high-performance algorithms. If the shape of the objective function for a specific trilinear decomposition algorithm is "convex", the algorithm will be sensitive to an excessive component number; if the shape is "strictly convex", the algorithm will not be sensitive to it. The concept of a solution set discussed in this research complements the theory of the uniqueness/non-uniqueness of trilinear decomposition. The objective solution remains one of the feasible solutions (even an absolute minimum) of the objective function when the component number is overestimated.
ETS and the differential analysis together explain why some of the algorithms can reach the objective solution while others cannot.

(3) In Chapter 4, a quantitative investigation of the multi-state fluorescence of 3-methylindole (MI) is discussed. The multi-state fluorescence of MI makes its spectra rich in chemical information and their interpretation rather challenging. The trilinear decomposition method is well suited to this task and provides deeper insight into hydrogen bonding to MI. Combining the excitation fluorescence spectra with their emission counterparts into a three-way data array and resolving that array with the Alternating Trilinear Decomposition (ATLD) algorithm benefits the study of hydrogen bonding to MI in several respects. Firstly, making full use of the excitation spectra ensures that the experimental data contain enough information to investigate signals that originate from weak interactions buried in a background of strong interactions. Secondly, resolving a three-way data array theoretically guarantees the uniqueness of the resolved component spectra, which carry actual physical meaning. Thirdly, the ATLD algorithm resolves the spectrum of a complex mixture into the spectra of the individual components in their different states without disturbing the complex chemical equilibria involved. The hydrogen bonding interaction of MI with other molecules was studied using the ATLD algorithm. A detailed investigation was undertaken of the ¹La and ¹Lb states, the lowest excited singlet states, which dominate the fluorescence emission of MI depending on the effect of other molecules and the surrounding microenvironment.
The hydrogen bonding between indole derivatives and other molecules was examined, and several association constants for hydrogen bond formation were estimated and compared with theoretical simulation results or experimental observations of previous researchers.

(4) In Chapter 5, the investigation attempts to separate the time-domain variation from steady-state fluorescence and to discuss quantitatively the state switching of α-chymotrypsin (CHT). ANS (1-anilinonaphthalene-8-sulfonate) has two fluorescence states, corresponding to different excitation and emission processes. Excitation-emission matrix (EEM) fluorescence can record all the excitation and emission signals of the ANS-CHT complex system, although it is a steady-state technique. After trilinear decomposition, the three-way data set constructed from the EEM data of different samples can provide excitation spectra indicating specific excitation processes, emission spectra indicating specific emission processes, and a quantitative description of the time-domain processes. Besides a detailed discussion of the excitation-emission processes of ANS, a quantitative investigation of the state switching of CHT became possible, because the S1,ct fluorescence of ANS is sensitive to the solvation environment, which is one of the indicators of CHT activity. Finally, the switch output curve of the ANS-CHT system over a wide pH range was obtained. This work proposes a convenient and economical protocol for investigating the state switching of proteins.

(5) In Chapter 6, the applicability of different trilinear decomposition algorithms to LC-MS data measured from multiple samples is discussed. An actual LC-MS data set containing a low-abundance peptide was used to test these algorithms. The bilinear method was unable to handle this low-abundance situation and achieve the expected mathematical separation.
It was found that the well-known trilinear decomposition algorithms cannot be applied to LC-MS data directly. The most probable reason is the sparsity of pure mass spectra: they have positive response values at the m/z coordinates where ions appear and zero values elsewhere. A novel algorithm named NNATLD (Non-Negative Alternating Trilinear Decomposition) was designed by the present authors to perform an effective trilinear decomposition of the three-way data set constructed from LC-MS data. The new algorithm is adapted to the properties of mass spectra, saves computing resources, and converges quickly.

(6) In Chapter 7, the research further elucidates the internal relationship within LC-MS data, which has lacked systematic understanding in traditional shotgun proteomics work. This relationship, called multi-linearity, can be handled efficiently by tri- or quadri-linear (and even multi-linear) decomposition. To make the trilinear decomposition method fully adaptable to LC-MS data, a novel trilinear decomposition algorithm was developed by the present authors. The trilinear decomposition method achieves quantitative and qualitative results simultaneously. Its resolving power may significantly enhance the peak capacity of LC systems, which would make 1D-LC more efficient than conventional single-dimension LC methods that do not use "mathematical separation" but spend hours on physical/chemical separation. The trilinear decomposition algorithm can obtain pure mass spectra because it collects information according to its chemical/physical meaning, which is starkly different from conventional methods. The protocol works efficiently for both high-abundance and low-abundance peptides. Another notable feature of the new protocol is that it yields quantitative information at the same time as qualitative information.
The rich information mined by the new protocol can support further in-depth investigation of the proteomic research object. An online analysis of the familiar interaction of HSA (human serum albumin) with trypsin was implemented in this work. The difference in activity between different regions of HSA was demonstrated clearly, a kind of information that is very difficult to obtain with traditional shotgun proteomics methods.
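The role of non-negativity for sparse mass spectra, described under (5), can be sketched with a generic stand-in. The code below is not NNATLD, whose details are not given in this abstract; it uses Lee-Seung style multiplicative updates for a non-negative trilinear model, and the function name `nonneg_trilinear_mu`, the simulated elution profiles, the sparse "mass spectra", and all sizes are hypothetical.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; row index runs as u * V.shape[0] + v."""
    n = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, n)

def nonneg_trilinear_mu(X, n_comp, n_iter=500, seed=0, eps=1e-12):
    """Non-negative trilinear fit by multiplicative updates.
    A generic stand-in for illustration, NOT the thesis's NNATLD."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    # Strictly positive starting values, so every update keeps factors >= 0.
    A = rng.random((I, n_comp)) + 0.1
    B = rng.random((J, n_comp)) + 0.1
    C = rng.random((K, n_comp)) + 0.1
    X1 = X.reshape(I, J * K)                     # samples x (time, m/z)
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)  # time x (samples, m/z)
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)  # m/z x (samples, time)
    history = []
    for _ in range(n_iter):
        A *= (X1 @ khatri_rao(B, C)) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= (X2 @ khatri_rao(A, C)) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= (X3 @ khatri_rao(A, B)) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        fit = (A @ khatri_rao(B, C).T).reshape(I, J, K)
        history.append(np.linalg.norm(X - fit) / np.linalg.norm(X))
    return (A, B, C), history

# Simulated LC-MS-like data: 5 samples, 30 time points, 40 m/z channels.
rng = np.random.default_rng(1)
t = np.arange(30)
B0 = np.stack([np.exp(-0.5 * ((t - 12) / 3.0) ** 2),
               np.exp(-0.5 * ((t - 17) / 3.0) ** 2)], axis=1)  # overlapping peaks
C0 = np.zeros((40, 2))                    # sparse spectra: a few ions each
C0[[3, 11, 25], 0] = [1.0, 0.6, 0.3]
C0[[7, 11, 33], 1] = [0.8, 0.4, 1.0]
A0 = rng.random((5, 2)) + 0.2             # relative amounts per sample
X = np.einsum("in,jn,kn->ijk", A0, B0, C0)

factors, hist = nonneg_trilinear_mu(X, n_comp=2)
print(f"relative residual after first sweep: {hist[0]:.3f}")
print(f"relative residual after last sweep:  {hist[-1]:.3f}")
```

Multiplicative updates are chosen here only because they keep every factor entry non-negative by construction, which matches the physical constraint that ion intensities cannot be negative; they are typically slower than alternating least-squares schemes.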
Keywords/Search Tags: Chemometrics, Trilinear decomposition, Dynamic system analysis, Multi-state analysis, Proteomics, LC-MS, Error transmission structure, Differential analysis