
The Application Of Grassmann Manifold In Video Analysis

Posted on: 2018-10-30  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Feng  Full Text: PDF
GTID: 2348330542991378  Subject: Information and Communication Engineering
Abstract/Summary:
With the continuing advance of the Internet and related technology, the demand for video applications is no longer confined to watching videos. People now need to analyze what a video means and to find correlations between and within videos, so analysis of video content has become a priority. The essence of analyzing video content is interpreting its semantic meaning, which requires an accurate description of the video context.

An image set is a collection of images of the same category that differ from one another in some respects. Image set representation can be used not only to analyze a single frame, but also to analyze a video through the common information shared by the image set composed of its frames. Because it is accurate and robust to within-class changes such as variations in lighting and viewing angle, image set representation has become a trend in feature representation. A video frame sequence changes slowly, yet its frames still differ within a class. We therefore introduce image sets into video analysis, which effectively avoids the inaccurate representation caused by within-class changes. We represent each video as an image set and verify the performance of this representation through video classification.

A Grassmann manifold is a topological space that is locally similar to Euclidean space at each of its points. A convenient way of handling image sets is to represent them as points on a Grassmann manifold. A video is then represented as a point on the Grassmann manifold, which transforms video analysis into the analysis of points on the Grassmann manifold.

The main focus and contributions of this work are as follows. First, the idea of the image set is introduced into video analysis. We use a feature set instead of the traditional feature vector to represent each sample; that is, we represent a video as a matrix instead of a vector. Second, the CNN is introduced into the image set. The CNN is so powerful at feature extraction and expression that it has become a very common feature extractor, so we extract CNN features to describe the video image set. Third, we propose a Grassmann kernel and a combination of the Grassmann kernel with kernel alignment. Based on the relationship between the error rate and the principal angles between two points on the Grassmann manifold, an improved Grassmann kernel is proposed. Combining it with kernel alignment, we further propose Grassmann kernel learning based on kernel alignment.

YouTube, UCF50 and Penn sports action are three action datasets. We verified the effectiveness of the above methods on these three datasets and, in addition, verified the applicability of the method on an "in the wild" scene database.
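The abstract does not give the form of the improved Grassmann kernel, so the following is only a minimal sketch of the standard building blocks it refers to, not the thesis's method. It assumes each video is a D x N matrix of per-frame features (random data stands in for CNN features here), maps it to a point on the Grassmann manifold by taking an orthonormal basis of its dominant subspace, and then computes principal angles, the widely used projection kernel, and a kernel alignment score. The function names, the subspace dimension, and the +1/-1 target kernel for alignment are all illustrative assumptions.

```python
import numpy as np

def subspace_basis(features, dim):
    """Orthonormal D x dim basis of the dominant subspace of a D x N feature matrix."""
    u, _, _ = np.linalg.svd(features, full_matrices=False)
    return u[:, :dim]

def principal_angles(y1, y2):
    """Principal angles (radians) between the subspaces spanned by y1 and y2."""
    cosines = np.linalg.svd(y1.T @ y2, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def projection_kernel(y1, y2):
    """Projection kernel ||y1^T y2||_F^2, equal to the sum of squared cosines
    of the principal angles."""
    return np.linalg.norm(y1.T @ y2, "fro") ** 2

def kernel_alignment(gram, labels):
    """Alignment between a Gram matrix and an assumed target kernel that is
    +1 for same-class pairs and -1 otherwise."""
    y = np.asarray(labels)
    target = np.where(y[:, None] == y[None, :], 1.0, -1.0)
    return np.sum(gram * target) / (np.linalg.norm(gram) * np.linalg.norm(target))

# Hypothetical data: four "videos", each with 20 frames of 64-dimensional features.
rng = np.random.default_rng(0)
videos = [rng.standard_normal((64, 20)) for _ in range(4)]
labels = [0, 0, 1, 1]

bases = [subspace_basis(v, dim=5) for v in videos]
gram = np.array([[projection_kernel(a, b) for b in bases] for a in bases])
print("alignment:", kernel_alignment(gram, labels))
```

The resulting Gram matrix could be passed to any kernel classifier that accepts a precomputed kernel. An alignment score of this kind is one plausible criterion for weighting or selecting among several Grassmann kernels.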
Keywords/Search Tags: Video analysis, CNN, Image set, Grassmann manifold, Kernel combination