
Research On IIoT Time Series Similarity Search Technologies

Posted on: 2022-06-10  Degree: Doctor  Type: Dissertation
Country: China  Candidate: R Kang  Full Text: PDF
GTID: 1488306746456304  Subject: Software engineering
Abstract/Summary:
With the rapid development of the Industrial Internet of Things (IIoT), big data, and artificial intelligence, traditional industries are becoming increasingly digitized and intelligent. The new generation of intelligent industrial architecture generates large amounts of data, and intelligent decision-making based on data analysis plays an important role in fault diagnosis and predictive maintenance. Time series are a very important data type in IIoT, and time series similarity search is the foundation of many analysis tasks. However, the information available (such as domain knowledge and data with semantic labels) differs considerably across similarity search scenarios. Completing the IIoT time series similarity search task with whatever information is available in each industrial scenario has therefore become a challenge. This dissertation focuses on similarity search for IIoT time series. The novel contributions are as follows:

In the scenario with domain knowledge, the traditional single-threshold pattern cannot effectively describe multi-stage industrial processes. To address this, a compound pattern with multiple segments and multiple thresholds is summarized, and an index based on equal-length block representation is proposed for subsequence matching. The proposed method reduces time complexity while guaranteeing no false dismissals.
In the scenario without domain knowledge but with semantic labels, traditional data-independent distance measures cannot effectively characterize the similarity of IIoT time series. To address this, Maximum-Margin Hamming Hashing (MMHH) is proposed for whole matching with semantic information. By introducing industrial similarity labels, the model improves its ability to characterize the semantic similarity between IIoT time series. By introducing the maximum margin and semi-batch optimization, the model becomes more robust to IIoT data with unbalanced and low-quality similarity labels. The proposed method outperforms traditional distance functions as a similarity measure on IIoT time series.

In the scenario where both domain knowledge and semantic labels are insufficient, the experience of domain experts is necessary. Traditional similarity search technologies cannot achieve sufficient recall within a response-time constraint. To address this, whole matching using tree-based indexes is studied, and two optimization methods, based on a neural network and on quantization, are proposed to optimize the data access strategy of tree-based indexes. The proposed methods significantly improve recall within the response-time constraint.

To address the insufficient integration of similarity indexing methods in time-series database management systems, an extensible indexing framework is designed and implemented in Apache IoTDB, a time-series database management system for IIoT. The indexing framework enables IoTDB to support similarity search, including whole matching and subsequence matching, and provides a platform for index developers to integrate more indexing methods into IoTDB.
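The whole-matching-by-hashing idea in the second contribution can be illustrated with a minimal sketch. It is not MMHH: the learned hash function is replaced here by a random sign projection, an assumption made purely for illustration, and the function names `hash_codes` and `hamming_search` are hypothetical. The sketch only shows how, once each series has a binary code, whole matching reduces to ranking codes by Hamming distance.

```python
import numpy as np

def hash_codes(series, projection):
    """Map series to binary codes via the sign of a linear projection.

    Assumption: a random projection stands in for a learned hash function.
    """
    return (np.asarray(series) @ projection > 0).astype(np.uint8)

def hamming_search(query_code, db_codes, k):
    """Return (indices, distances) of the k codes closest to query_code."""
    # Hamming distance = number of bit positions where the codes differ.
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    # Stable sort keeps the lower index first when distances tie.
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]

# Synthetic data only; no real IIoT series are used here.
rng = np.random.default_rng(0)
db = rng.standard_normal((100, 64))    # 100 series of length 64
proj = rng.standard_normal((64, 16))   # yields 16-bit codes
db_codes = hash_codes(db, proj)

query_code = hash_codes(db[7], proj)   # query with a series from the database
idx, dist = hamming_search(query_code, db_codes, k=3)
```

Comparing short binary codes is much cheaper than computing a distance over the raw series, which is the usual motivation for hashing-based whole matching; how well the codes preserve semantic similarity depends entirely on the hash function, which this sketch does not learn.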
Keywords/Search Tags: Industrial Internet of Things, time series, similarity search, indexing, machine learning