Research On Bio Brick Quality Assessment With Machine Learning

Posted on: 2022-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: S Liu
GTID: 2480306560955119
Subject: Computer application technology

Abstract/Summary:
A Bio Brick is a gene fragment that meets the assembly standard of synthetic biology. The rapid growth in the number of Bio Bricks has led to complex and contested data quality problems. This dissertation compares a variety of machine learning methods and builds a data-driven quality assessment method for Bio Bricks, relieving researchers in the field of the burden of judging the quality of each Bio Brick one by one. Quality is assessed along two dimensions: accuracy, the ability of a classification model to recognize the Bio Brick correctly, and consistency, the degree to which the Bio Brick conforms to the shape of other data of the same type.

(1) Accuracy-based Bio Brick data quality assessment: a two-step feature transformation converts each Bio Brick into a numerical representation, and under-sampling is used to address class imbalance. Statistical process control is applied as a filter for preliminary quality screening. Using the biological labels provided in the standard Bio Brick database, machine-learning-based sequence recognition is used to compare the sensitivity of several classification methods, and a classifier is constructed as the data-driven quality assessment model. The accuracy score is calculated from the Bio Brick sequence recognition results.

(2) Consistency-based Bio Brick data quality assessment: the average shape of each type of Bio Brick is obtained by calculating the sequence shape similarity among Bio Bricks, which provides a quantitative description of Bio Brick shape. The similarity between a Bio Brick and each center is then calculated to generate the consistency score.

The two assessment dimensions are combined to construct the Bio Brick quality assessment model. Finally, the quality distribution of the Bio Bricks in the standard database is assessed and visualized, and 2899 high-quality Bio Bricks are obtained.

The machine-learning-based quality assessment methods proposed in this dissertation are objective and effective with respect to accuracy and consistency. They reduce the variation in assessment results caused by differences in researchers' professional expertise or in detection equipment, and they allow Bio Brick quality to be assessed quickly without quality labels being provided in advance.
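
The two scores described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. The abstract does not name the feature transformation, the classifier, the shape similarity measure, or the rule for combining the two scores, so every concrete choice here is an assumption made for illustration: 2-mer frequencies as the numerical representation, random under-sampling to the smallest class, a nearest-center rule standing in for the trained classifier, cosine similarity standing in for shape similarity, and a plain average as the combined score. The function names and the toy sequences are hypothetical, and the statistical process control screening step is not covered.

```python
import math
import random
from collections import Counter, defaultdict
from itertools import product

K = 2  # assumed k-mer length; the abstract does not specify the encoding
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_vector(seq):
    """Encode a DNA sequence as normalized k-mer frequencies (assumed encoding)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[k] for k in KMERS) or 1  # avoid division by zero
    return [counts[k] / total for k in KMERS]


def undersample(records, seed=0):
    """Randomly trim every class to the size of the smallest class."""
    by_label = defaultdict(list)
    for seq, label in records:
        by_label[label].append((seq, label))
    n_min = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n_min))
    return balanced


def centroids(records):
    """Average feature vector (the "center") of each part type."""
    sums, counts = {}, Counter()
    for seq, label in records:
        vec = kmer_vector(seq)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def cosine(a, b):
    """Cosine similarity, used here as a stand-in for shape similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assess(seq, label, centers):
    """Return (accuracy, consistency, combined) for one labelled part."""
    vec = kmer_vector(seq)
    sims = {name: cosine(vec, center) for name, center in centers.items()}
    predicted = max(sims, key=sims.get)
    accuracy = 1.0 if predicted == label else 0.0  # stand-in for classifier output
    consistency = sims[label]                      # similarity to own-type center
    return accuracy, consistency, (accuracy + consistency) / 2


# Toy, made-up sequences and labels; not taken from any Bio Brick database.
toy = [
    ("ATATATATAT", "promoter"),
    ("TATATATATA", "promoter"),
    ("ATATTATAAT", "promoter"),
    ("GCGCGCGCGC", "terminator"),
    ("CGCGCGCGCG", "terminator"),
]
centers = centroids(undersample(toy))
print(assess("ATATATATTA", "promoter", centers))
```

A part whose database label matches the nearest center and whose vector lies close to that center scores high on both dimensions. A part labelled with one type but resembling another scores low on both, and that kind of disagreement is what a data-driven assessment is meant to surface.
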
Keywords/Search Tags: Bio Brick, Machine learning, Quality assessment, Sequence recognition, Shape similarity