Research On Deep Computation Model For Heterogeneous Data

Posted on: 2019-12-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Y Bu
GTID: 1368330545469098
Subject: Computer application technology
Abstract/Summary:
Feature learning plays an important role in heterogeneous data analytics and mining. It aims to learn features from data in order to extract information useful for classification and prediction. Recently, the deep computation model was proposed for heterogeneous data feature learning; it is based on a tensor representation of the data and generalizes the traditional deep learning model to the tensor space. However, the deep computation model still has several drawbacks. First, it is prone to over-fitting, which reduces classification accuracy. Second, it ignores the interactive correlations among the multi-modal features lying in different subspaces, which makes it less effective at learning features for heterogeneous data. Third, it has a high computational complexity, which makes heterogeneous data feature learning inefficient. To tackle these issues, this dissertation studies the deep computation model further for heterogeneous data feature learning. The major contributions are the following four points:

(1) An adaptive dropout deep computation model. Current deep computation models are prone to over-fitting, which reduces classification accuracy. To address this problem, this dissertation proposes an adaptive dropout deep computation model. Specifically, it designs an adaptive distribution function that sets the dropout rate of each hidden layer in the deep computation model to prevent over-fitting. Furthermore, we improve the high-order back-propagation algorithm to train the parameters of the adaptive dropout deep computation model. Experimental results demonstrate that the proposed model can prevent over-fitting and improve classification accuracy for heterogeneous data.

(2) A deep computation model based on crowdsourcing. The lack of publicly available labeled samples makes it difficult to fine-tune the parameters of the deep computation model. To tackle this issue, this dissertation develops a deep computation model based on crowdsourcing. First, we design an outsourcing sample selection algorithm based on maximum entropy and fuzzy clustering to choose the samples to outsource. Then, to find the ground-truth labels of the samples, we present an answer aggregation scheme based on expectation maximization that estimates the annotators' level of expertise and updates the parameters simultaneously. Experimental results indicate that the proposed scheme can fine-tune the parameters effectively using the answers given by the annotators.

(3) A double-projection deep computation model. Current deep computation models ignore the interactive correlations among the multi-modal features lying in different subspaces, which makes them less effective at learning features for heterogeneous data. To address this problem, this dissertation presents a double-projection deep computation model. The proposed model replaces the hidden layers of the deep computation model with double-projection layers, which project the input into two separate subspaces in order to learn interacting features from the underlying factors of heterogeneous data. Furthermore, we devise learning algorithms to train the double-projection deep computation model. Experimental results indicate that the proposed model can capture the complex correlations among multiple modalities and thereby improve classification for heterogeneous data.

(4) A high-order CFS (Clustering by Fast Search and Find of Density Peaks) clustering algorithm based on deep computation. The current CFS algorithm has difficulty clustering heterogeneous data. Therefore, this dissertation proposes a high-order CFS algorithm based on deep computation for heterogeneous data clustering. We first use deep computation to learn the features of each modality and then use the tensor outer product to fuse the learned features into a tensor feature for each object. Finally, we extend the current CFS algorithm from the vector space to the tensor space for heterogeneous data clustering. Experimental results demonstrate that the proposed algorithm can cluster heterogeneous data effectively.
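
As an illustration of point (4), the following is a minimal Python sketch of fusing per-modality features with the tensor outer product and then computing the two quantities that density-peak (CFS) clustering relies on: the local density of each object and its distance to the nearest object of higher density. This is not the dissertation's algorithm. The function names, the entrywise (Frobenius) distance between fused tensors, the cutoff value, and the toy feature vectors are all assumptions made for the example, and the deep feature learning step is replaced by hand-written vectors.

```python
import math


def outer(u, v):
    """Outer product of two feature vectors, giving a second-order tensor (matrix)."""
    return [[a * b for b in v] for a in u]


def frobenius_distance(s, t):
    """Entrywise Euclidean distance between two equally shaped matrices (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2
                         for row_s, row_t in zip(s, t)
                         for a, b in zip(row_s, row_t)))


def density_peak_scores(tensors, cutoff):
    """Return (rho, delta) for each object, as in density-peak clustering.

    rho[i]   = number of other objects closer than `cutoff` to object i.
    delta[i] = distance to the nearest object with strictly higher rho;
               objects with no denser neighbour get their largest distance.
    """
    n = len(tensors)
    dist = [[frobenius_distance(tensors[i], tensors[j]) for j in range(n)]
            for i in range(n)]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical learned features for five objects in two modalities.
modality_a = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 1.0]]
modality_b = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.0, 1.0]]

# One fused tensor feature per object.
fused = [outer(a, b) for a, b in zip(modality_a, modality_b)]
rho, delta = density_peak_scores(fused, cutoff=0.5)

# Objects with both high rho and high delta are candidate cluster centres.
for i, (r, d) in enumerate(zip(rho, delta)):
    print(i, r, round(d, 3))
```

In this sketch, objects with equal density are not ordered relative to each other, so tied objects can each be treated as having no denser neighbour; a practical implementation would need a tie-breaking rule.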
Keywords/Search Tags: heterogeneous data feature learning, deep computation model, crowdsourcing, subspace projection, CFS clustering