| With the widespread deployment of video surveillance systems, cameras throughout the city produce massive amounts of video data day and night. By analyzing these data, we can build many intelligent services, such as escape path finding, traffic flow detection, and traffic violation detection. Surveillance video structured analysis methods extract video metadata from traffic surveillance video. This metadata, which contains the license plate number, vehicle entry time, vehicle departure time, vehicle color, and other information, is an important foundation for building such services. Existing surveillance video structured analysis methods usually rely on computer vision algorithms that are not accurate enough in real-world scenes, so they generate a large amount of wrong video metadata, and they do not handle big surveillance data efficiently. We design and implement a surveillance video structured analysis method in the cloud. The method consists mainly of metadata extraction and metadata correction. First, we define the structure of surveillance video metadata. Based on this structure, we employ a four-phase method to extract video metadata: object detection, object classification, object metadata extraction, and object tracking. Second, we propose a spatiotemporal graph-based metadata correction approach. It fuses the massive video metadata of the whole camera network, automatically detects suspicious metadata, and corrects them based on the spatiotemporal relationships among the metadata and on image similarity. Our method is implemented in the cloud using Hadoop and HBase. Moreover, we release the source code of a sub-module of our method, the Hadoop Video Processing Interface (HVPI), which allows users to quickly build large-scale video analysis applications based on Hadoop. Finally, we conduct extensive experiments on real traffic surveillance videos.
The experimental results show that our method significantly improves both the accuracy of the extracted metadata and the efficiency of processing massive surveillance video data. |
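
To make the idea of detecting suspicious metadata from spatiotemporal relationships more concrete, the following is a minimal illustrative sketch and not the method described above. It assumes a hypothetical record type `VehicleRecord` holding the metadata fields named in the abstract, and a hypothetical table `min_travel_seconds` giving the minimum plausible travel time between pairs of cameras. A pair of records with the same plate is flagged when the vehicle would have had to move between two cameras faster than that minimum. The image-similarity step and the correction step are not modeled here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VehicleRecord:
    """Hypothetical metadata record; field names are assumptions."""
    camera_id: str
    plate: str
    entry_time: float      # seconds on a shared clock
    departure_time: float  # seconds on a shared clock
    color: str


def find_suspicious(records, min_travel_seconds):
    """Return consecutive same-plate record pairs that are implausibly close in time.

    min_travel_seconds maps frozenset({camera_a, camera_b}) to the assumed
    minimum travel time between the two cameras. Pairs of cameras that are
    missing from the table are not checked.
    """
    by_plate = defaultdict(list)
    for record in records:
        by_plate[record.plate].append(record)

    suspicious = []
    for plate_records in by_plate.values():
        ordered = sorted(plate_records, key=lambda r: r.entry_time)
        for prev, curr in zip(ordered, ordered[1:]):
            if prev.camera_id == curr.camera_id:
                continue  # same camera: no travel constraint to check
            required = min_travel_seconds.get(
                frozenset((prev.camera_id, curr.camera_id))
            )
            if required is None:
                continue  # no assumed travel time for this camera pair
            if curr.entry_time - prev.departure_time < required:
                suspicious.append((prev, curr))
    return suspicious
```

In this sketch a flagged pair only indicates that at least one of the two records is likely wrong (for example, a misread license plate); deciding which one to correct would need further evidence, such as the image similarity mentioned above.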