Multi-Modal Video Information Retrieval

Posted on: 2009-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: H Yu
Full Text: PDF
GTID: 2178360272458931
Subject: Computer application technology
Abstract/Summary:
With the rapid development of multimedia technology and Internet applications, the volume of multimedia data, and of video data in particular, has exploded. Efficient video retrieval is therefore becoming more and more important. Efficient video retrieval technology helps people find entertainment on the Internet and improves their quality of life.

Text retrieval already performs well: people can retrieve relevant textual content on the Internet using Baidu or Google. Compared with textual data, video data has a more complicated structure, organized hierarchically into scene groups, scenes, shots, and frames. Video data also contains many kinds of feature information, such as text, images, and sound, which makes video processing more difficult and makes efficient video retrieval a challenge. At the same time, these features are exactly what allows people to retrieve relevant video data.

Many video retrieval methods have been proposed so far. In the early days, retrieval relied on either textual or image information alone. Text-based retrieval can deliver high recall, while image-based retrieval performs well when the query topic relates to a visual scene. In general, retrieval based on a single feature does not perform well, so researchers have begun to take all kinds of features into account. Different kinds of feature information have different strengths in video retrieval, and machine learning techniques can be used to exploit them and improve retrieval performance. A separate sub-retrieval module can be built for each kind of feature, and much current research focuses on how to apply suitable machine learning techniques to train and fuse these sub-retrieval modules.

Various machine learning methods have been applied to video retrieval, but their performance has not been satisfactory. The main reason is that the information contained in the video data has not been fully extracted. If effort goes only into machine learning and the video data itself is neglected, retrieval performance cannot improve. Our algorithm focuses on fully extracting the information contained in video data and on exploiting the relations among different kinds of features.

We propose a new video retrieval model based on multimodal information. We evaluate it on the manual search and interactive search tasks of the TREC Video Retrieval Evaluation (TRECVID), which over the years has provided a benchmark for the video search task. Our model performed well in TRECVID.

This thesis describes multimodal video information retrieval in detail. The experimental data demonstrate the effectiveness of our video retrieval model. The thesis closes with conclusions and an outlook on future work.
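
The idea of fusing per-feature sub-retrieval modules can be illustrated with a minimal sketch. This is not the model proposed in the thesis, which the abstract does not specify; it is a generic weighted late fusion, and the function names, module names, shot identifiers, and weights are all hypothetical. Each module is assumed to return a relevance score per shot, scores are min-max normalized per module, and the normalized scores are combined with fixed weights.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one module's scores to [0, 1] so modules are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: the module gives no ranking information.
        return {shot: 0.0 for shot in scores}
    return {shot: (s - lo) / (hi - lo) for shot, s in scores.items()}


def fuse(
    module_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Weighted linear late fusion (an assumed, illustrative scheme).

    module_scores maps a module name (e.g. "text", "image") to that
    module's {shot_id: score}. Shots missing from a module contribute 0.
    Returns (shot_id, fused_score) pairs, best first.
    """
    fused: Dict[str, float] = {}
    for module, scores in module_scores.items():
        w = weights.get(module, 0.0)
        for shot, s in min_max_normalize(scores).items():
            fused[shot] = fused.get(shot, 0.0) + w * s
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical scores from a text module and an image module.
    results = fuse(
        {
            "text": {"shot1": 2.0, "shot2": 1.0, "shot3": 0.0},
            "image": {"shot2": 0.9, "shot3": 0.1},
        },
        {"text": 0.5, "image": 0.5},
    )
    for shot, score in results:
        print(shot, round(score, 3))
```

In practice the weights would be learned or tuned per query type rather than fixed by hand; the fixed values here only keep the sketch short.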
Keywords/Search Tags: video processing, content-based video retrieval, multimodal feature information, TRECVID, sub-retrieval module, manual search, interactive search