With the rapid development of China's economy and society, the application of bridge structural health monitoring systems in bridge operation and maintenance has received growing attention. However, environmental factors and system failures frequently distort the monitoring data, which undermines structural performance analysis and early warning. The main goal of this paper is to combine big data technology with deep learning to build a standardized, automated, and precise bridge big data cleaning platform, so that the data from bridge structural health monitoring systems can better serve assessment and early-warning analysis. The main content and contributions of this paper are as follows:

(1) Based on multi-source monitoring samples recorded on different bridges, a unified classification standard for distortion types is formulated, with five categories: missing data, outliers, drift, abnormal trend, and noise. For each type of distortion, a feature description and examples are given, and the detection and repair algorithms suited to mass data processing on a big data platform are summarized.

(2) Monitoring data contain distortions such as large-area missing data, gradual drift, and abnormal trends that are difficult to repair with general algorithms. To address these, this paper introduces the Long Short-Term Memory (LSTM) neural network and establishes a "single point to single point" data association model and a "multi point to single point" data association model. The data mapping performance of the models is analyzed and their robustness is verified. Combined with the generalized 3σ criterion, they yield a high-precision mapping model that remains effective over the long term and is used to locate and repair distorted data precisely.

(3) A bridge monitoring data association model based on the Conditional Generative Adversarial Network (CGAN) is proposed. Measured data are used to examine how the model's data completion performance depends on sample length, data quality, and the number of data sources. The results show that the model is suitable for identifying correlation characteristics in low-quality data and for completing repair tasks in small batches, which makes it a supplement to the LSTM data association model.

(4) The relevant big data platform technologies are studied, and the "HDFS+Spark" and "Spark Streaming+Kafka" architectures are adopted for offline and online data analysis, respectively. The specific platform architecture and the related algorithm flow are then designed. The system integrates the statistics-based data cleaning algorithm, the LSTM data association model algorithm, and the CGAN data association model algorithm. The experimental results show that the false positive rate, accuracy, sensitivity, and other indicators of the algorithm flow meet the expected requirements.
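
The abstract does not spell out how the 3σ criterion is combined with a data association model, so the following is only a minimal sketch under stated assumptions. It applies the ordinary (not generalized) 3σ rule to the residuals between measured values and the values mapped by some association model, and replaces flagged points with the mapped values. The function names `flag_outliers_3sigma` and `repair`, the threshold parameter `k`, and the use of a plain list of mapped values in place of an LSTM or CGAN output are all illustrative assumptions, not the paper's implementation.

```python
from statistics import mean, stdev


def flag_outliers_3sigma(measured, predicted, k=3.0):
    """Return indices whose residual lies more than k standard deviations
    from the mean residual (plain 3-sigma rule; k is a hypothetical knob)."""
    residuals = [m - p for m, p in zip(measured, predicted)]
    mu = mean(residuals)
    sigma = stdev(residuals)  # sample standard deviation, needs >= 2 points
    if sigma == 0:
        return []  # all residuals identical, nothing to flag
    return [i for i, r in enumerate(residuals) if abs(r - mu) > k * sigma]


def repair(measured, predicted, flagged):
    """Replace flagged samples with the model-mapped values (assumed strategy)."""
    repaired = list(measured)
    for i in flagged:
        repaired[i] = predicted[i]
    return repaired


# Synthetic data for illustration only: a flat mapped signal, small
# alternating measurement error, and one injected spike at index 12.
predicted = [10.0] * 30
measured = [10.0 + (0.1 if i % 2 == 0 else -0.1) for i in range(30)]
measured[12] = 15.0

flagged = flag_outliers_3sigma(measured, predicted)
cleaned = repair(measured, predicted, flagged)
```

One caveat with this simple form: the mean and standard deviation are computed from the same residuals that contain the distortion, so in short windows a single large spike can inflate σ enough to hide itself. A robust estimate of spread, or statistics taken from a known-clean reference period, would be a reasonable variation.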