
Query Processing Techniques Based On Time Series Analysis

Posted on: 2020-02-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P P Wang
Full Text: PDF
GTID: 1488306353964119
Subject: Computer software and theory
Abstract/Summary:
With the development of science and the progress of technology, the era of "big data" has arrived. People have more and more ways to obtain data, and data volume is growing at an unprecedented speed. Among all kinds of data, time series are one of the most important and most widespread types. A time series is a numerical sequence obtained by observing a physical quantity at equal time intervals, and it reflects the state and condition of the monitored object. The sampling points of a time series are continuous and numerical, and the whole series can be regarded as a single data object. Time series are widely used in many fields, such as multimedia, medicine, speech recognition, economic and financial analysis, physics and astronomy. Time series data are large in scale, high-dimensional and dynamic, and they also exhibit complex structure, noise and similarity changes. Because of these inherent characteristics, research on time series data mining faces many challenges.

Time series data querying is similar to general data querying: it focuses on mining and extracting knowledge representations for analysis and application. Human beings can intuitively and naturally learn the potential knowledge in a time series from the "shape" of the data, but a computer can only carry out basic mechanical computing tasks. No computer can directly acquire human-like abilities of perception, understanding and recognition. Therefore, in data mining and machine learning, the fundamental task is to design models and algorithms that let the computer acquire such intelligent abilities through programs. In the past ten years, a large number of scientists have devoted themselves to time series data mining, and many research results have been achieved in feature-based time series query technology. However, because time series data are complex, time series data mining still faces many new problems, and many key problems in practical applications remain to be solved urgently.

This thesis explores approximate representation, similarity measurement and time series clustering in time series data query technology. It analyzes four different kinds of time series data: one-dimensional music data, high-dimensional music data, medical data streams and speech data. It studies the key problems faced by music humming query technology, music Top-k retrieval technology, wearable sensor anomaly detection technology and speech emotion fuzzy recognition technology. For these different problems, a variety of methods are proposed that use efficient indexing and fast query matching. The content of this thesis is divided into four parts:

(1) One-dimensional Music Data Retrieval Technology Oriented to Time Series Characteristics

A fast music humming retrieval technology is proposed. First, according to the characteristics of hummed music, the music database and the user-provided humming fragments are divided into music sentences at the natural pauses. The K-means clustering algorithm is then used to calculate the pitch similarity of the music sentence fragments, and a location-specific score matrix is extracted from the clustering result. Based on the scoring matrix, three algorithms are proposed: the NA matching algorithm, the sequential forward scoring algorithm SLS and the permutation matrix forward scoring algorithm PLA. Experimental results on real data sets show that the SLS and PLA algorithms achieve fast and effective humming music retrieval.

(2) Multi-dimensional Music Data Retrieval Technology Oriented to Time Series Characteristics

A distance function, MDTW, for multi-dimensional sequence matching and a subsequence matching method, MDTWsub, are proposed. The results show that this method has high application value in music retrieval. In this thesis, music is represented by a two-dimensional time series in which each dimension holds information about the pitch or the duration of each note. To improve efficiency, an inverted index and the q-gram technique are used to process the music database, and the q-chunk technique is used to process the humming tracks. The TopK-Brute and TopK-LB algorithms are proposed to search for the Top-k songs. The experimental results show that the TopK-LB algorithm is effective and efficient.

(3) Wearable Sensor Anomaly Detection Technology Oriented to Time Series Characteristics

An anomaly detection algorithm for medical wireless sensors is proposed. The field of medicine produces a great deal of time series data. Wearable health devices use sensors to collect and monitor biological and physiological parameters as streams of human motion data, and the data are arranged in chronological order according to the medical record of each observation. Three algorithms are proposed to detect anomalies: BF, ET and PF. The proposed method can effectively detect patients' anomalies while maintaining reasonable alarm accuracy and recall rate. Experimental results on real patient data sets show that the proposed method detects abnormalities quickly, with high accuracy and recall rate.

(4) Speech Emotional Fuzzy Recognition Oriented to Time Series Characteristics

The theory of fuzzy clustering is studied, and an adaptive fuzzy clustering algorithm for speech emotion recognition is proposed that uses the clustering radius of different data sets. The speech emotion recognition process mainly includes speech signal preprocessing, speech emotion feature extraction and analysis, and speech emotion classification and recognition. Because the semantic variables of emotional information are ambiguous and uncertain, it is difficult to identify emotional states accurately. When the dimension of the affective feature parameters is very high, recognition becomes more difficult and the recognition rate drops. Therefore, building on fuzzy theory, this thesis studies SEAF, an adaptive fuzzy C-means algorithm for speech emotion recognition. The experimental results show that the performance of the SEAF algorithm is affected by the fuzzy weighting index. SEAF achieves a better recognition effect than FCM, and its anti-noise performance is also better than that of FCM. The proposed algorithm has good speech emotion recognition performance, a simple design, and good applicability and scalability.
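
The abstract names the MDTW distance function but does not define it. As a rough illustration of the kind of matching involved, the following is a minimal sketch of classic dynamic time warping applied to two-dimensional note sequences, where each point is a (pitch, duration) pair. The function name `dtw_distance`, the Euclidean local cost and the sample notes are assumptions made for illustration; this is not the thesis's MDTW or MDTWsub.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping between two sequences of points.

    Each point is a tuple of the same dimension, e.g. (pitch, duration).
    The local cost is the Euclidean distance between two points
    (an assumption; the thesis may use a different cost).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")

    inf = math.inf
    # prev[j] holds the best cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing

    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr

    return prev[m]


# Hypothetical sample data: (MIDI pitch, duration in beats).
song = [(60, 1), (62, 1), (64, 2)]
hum_stretched = [(60, 1), (60, 1), (62, 1), (64, 2)]  # first note repeated
hum_off_key = [(61, 1), (62, 1), (64, 2)]             # first note one semitone sharp

print(dtw_distance(song, hum_stretched))
print(dtw_distance(song, hum_off_key))
```

Because warping lets one point align with several consecutive points, a hummed query that lingers on a note can still match the stored melody closely, while a wrong pitch adds to the distance. In practice the pitch and duration dimensions would likely need to be scaled before being combined in a single Euclidean cost.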
Keywords/Search Tags: time series, music humming query, data anomaly detection, speech fuzzy recognition, similarity query