Font Size: a A A

Convolutional Neural Network Based Latent Dirichlet Allocation Video Retrieval

Posted on:2018-10-14Degree:MasterType:Thesis
Country:ChinaCandidate:P X SunFull Text:PDF
GTID:2348330518996527Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
In image and video analysis, Convolutional Neural Network (CNN)achieves higher precision and faster processing speed, so that CNN is an important trend. Latent Dirichlet Allocation (LDA) document generated topic model is not only widely applied in the field of text mining, but also applied in the field of image in recent years. The task of video retrieval can make use of LDA model, to form visual words and visual topics. This approach can reduce video feature dimension and fasten video retrieval.This paper designed the convolutional neural network based lantent dirichlet allocation video retrieval system. This paper did research in video feature generation, video representation, video retrieval. The proposed framework was realized on the desktop. This paper did different experiments on different datasets, also compare and analysis their results.In this paper, the main content of the key work and research are as follows:(1) Video has the characteristics of large amount of data, and it needs a large number of data preprocessing to reduce its dimension. Extracting key frame of video, can simplify the content of the video. Block partition method is used to select the key block, which has more information of the target object in the key frame. This method can further simplify the structure of video.(2) After simplifying the structure of video, extracting the underlying characteristics of the video is needed. The current mainstream video underlying characteristic such as color histogram has no the information like shape and texture, and is more sensitive to color changes. Local characteristics can collect the target object information, but directly computing the similarity between video more time-consuming. Using convolution neural network method to extract the underlying video feature can retain more visual information and improve the performance of video retrieval.(3) CNN video features always has high dimension. Directly using CNN features to retrieval is time consuming. There is a clear need for data dimension reduction. The model Bag of Words is a kind of video feature. This model uses the clustering algorithm to map visual features to the words space. Using these visual words to represent the video. On the basis of BoW, this paper proposes LDA visual topic model to form more compact video representation.(4) This paper designed the convolution neural network based LDA video retrieval system, and compares the proposed method with the front video retrieval method BoW. According to the result of the experiment,this paper analyzed their performance, and proves this paper's method is practicable, generative, and effective.
Keywords/Search Tags:video retrieval, Convolutional Neural Network, Lantent Dirichlet Allocation, Bag of Words K-means
PDF Full Text Request
Related items