| In nanoparticle research, environmental samples contain nanoparticles in large numbers and of many types, so computational analysis of large numbers of nanoparticles is needed to determine a sample’s composition and thereby support environmental monitoring. However, existing work in this field commonly either extracts elements unique to the nanoparticles in a sample experimentally or applies unsupervised methods directly to cluster the nanoparticles in order to trace the sample’s source. These methods cannot quantify the traceability results, and they do not fully exploit the information contained in the nanoparticle data. To address these problems, this paper analyzes sample particle data obtained by single-particle inductively coupled plasma time-of-flight mass spectrometry (SP-ICP-TOF-MS) and proposes the concept of a "nanoparticle fingerprint" to represent the co-occurring isotope combinations in a sample’s nanoparticles. Nanoparticle fingerprints capture characteristic isotope combinations in a sample; classifying the substance to which each fingerprint belongs reveals the types of substances the sample contains. By matching nanoparticles to their fingerprints, the isotopic distribution of each fingerprint is computed. Based on the class and isotopic distribution of the fingerprints, the traceability results for environmental samples can be analyzed quantitatively. Building on this concept, the main contributions of this paper are as follows: (1) The "iterative method" for nanoparticle screening was improved, and the number of isotopes that the improved iterative method can process simultaneously was
increased from 7 to 34. A new nanoparticle screening method, the "Poisson method", was implemented, taking it beyond the purely theoretical stage at which it had previously remained. (2) The concept of a "nanoparticle fingerprint" is proposed to express the co-occurring isotope combinations in a sample’s nanoparticles. Fingerprint extraction is recast as a frequent itemset extraction task, and a nanoparticle fingerprint extraction algorithm is built on frequent itemset mining and the TF-IDF weighting scheme from natural language processing. The algorithm performs well on pure samples, configured samples, and real samples. (3) A sample traceability algorithm based on supervised learning is established. Starting from the isotopic distribution data of a sample’s nanoparticle fingerprints, features for each fingerprint are constructed using the external Magpie data set, which contains atomic properties of the elements (electronegativity, atomic mass, valence, electronic properties, etc.). The first supervised nanoparticle fingerprint classification model is built on these features, and the substance distribution of the sample is calculated from the fingerprint classification results and the number of nanoparticles each fingerprint covers, thereby achieving sample traceability. Based on the nanoparticle fingerprints and the mass distribution of a configured sample, a label generation algorithm for configured samples is designed to calculate the sample’s substance content and use it as a label for supervised sample traceability. (4) The nanoparticle analysis system MVNAS is designed and implemented to provide a complete analysis of the nanoparticles in a sample, from screening to traceability. This paper defines the analysis process corresponding to different levels
of support as a version, and nanoparticle fingerprint extraction, data set construction, model training, and sample traceability are carried out per version at the corresponding support level. MVNAS supports analyzing and processing the same batch of experimental data under different versions, as well as reusing models across versions. Based on the data lake concept, MVNAS manages the whole life cycle of experimental data; it is the first management system for sample nanoparticle data, and it supports backtracking to intermediate data in the processing pipeline and visualization of nanoparticle fingerprinting results. |
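
The fingerprint extraction idea in contribution (2) can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes each nanoparticle is represented as the set of isotopes detected in it, enumerates isotope combinations by brute force instead of using a dedicated frequent itemset miner, takes the support of a combination within a sample as its term frequency, and uses the plain `log(N / df)` form of inverse document frequency. The function names, the `max_size` cap, and the isotope labels in the example are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def frequent_combinations(particles, min_support, max_size=3):
    """Return {isotope combination: support} for one sample.

    particles: list of sets, each holding the isotopes detected in one particle.
    A combination is kept if it occurs in at least `min_support` (a fraction)
    of the particles. Brute-force enumeration, assumed small enough here.
    """
    counts = Counter()
    for isotopes in particles:
        items = sorted(set(isotopes))
        for size in range(1, min(max_size, len(items)) + 1):
            counts.update(combinations(items, size))
    total = len(particles)
    return {combo: n / total for combo, n in counts.items()
            if n / total >= min_support}


def tfidf_scores(samples, min_support, max_size=3):
    """Weight each frequent combination by support * log(N / df).

    samples: {sample name: list of particle isotope sets}.
    df is the number of samples in which the combination is frequent, so
    combinations that are frequent in every sample receive a score of 0.
    """
    per_sample = {name: frequent_combinations(p, min_support, max_size)
                  for name, p in samples.items()}
    n_samples = len(samples)
    doc_freq = Counter(combo for combos in per_sample.values()
                       for combo in combos)
    return {
        name: {combo: support * math.log(n_samples / doc_freq[combo])
               for combo, support in combos.items()}
        for name, combos in per_sample.items()
    }


# Hypothetical toy data: two samples with four particles each.
samples = {
    "A": [{"Ti48", "Fe56"}, {"Ti48", "Fe56"}, {"Ti48"}, {"Ce140"}],
    "B": [{"Ce140", "La139"}, {"Ce140", "La139"}, {"Ti48"}, {"Ce140"}],
}
scores = tfidf_scores(samples, min_support=0.25)
```

In this toy data, combinations that are frequent in both samples (such as the single isotope `Ti48`) are down-weighted to zero, while a combination frequent in only one sample (such as `Fe56` with `Ti48` in sample A) keeps a positive score and is a candidate fingerprint under these assumptions.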