Video surveillance systems operated by public security organs have already brought a degree of informatization to police work and improved working efficiency. However, as these systems expand and the demands of public security work grow, existing video surveillance systems can no longer meet application requirements. Faced with massive volumes of image data, an important and urgent problem in public security research is how to make computers automatically understand images, classify them into semantic categories consistent with human cognition, and thereby classify and manage huge image collections quickly and effectively.

Structural video description technology converts video content into text information that is intelligible to both computers and people, by means of spatiotemporal segmentation, feature extraction and object recognition. Through semantic analysis of the video content, the image information is organized into textual semantic information that accurately describes the video. This textual semantic information is convenient to retrieve and compress, and can therefore serve as public security information. Based on the requirements that police work places on surveillance video, this paper studies video structural description technology oriented to public security applications. The technology draws on image processing, pattern recognition and semantic extraction. The hardware and software implementation of video information processing equipment is also studied.

In computer vision and intelligent video surveillance, higher-level tasks such as moving object classification, tracking and behavior understanding depend heavily on the results of moving object detection. This paper explores background modeling and moving object extraction in complex scenes in depth.
An improved Gaussian mixture background modeling algorithm is presented to reduce the computational cost of traditional Gaussian mixture algorithms. The improved background model updates the parameters of the Gaussians according to how frequently a pixel's value changes. Experimental results show that the algorithm greatly improves processing speed while detecting moving objects accurately.

Video semantic information extraction automatically annotates video data according to a specific set of feature categories, and can support content-based video classification and retrieval. The main research objects of this paper are the moving targets in video, which are expressed by local semantic concepts. An extraction method for key video semantic information based on the probabilistic Latent Semantic Analysis (pLSA) model is studied. First, images are decomposed into multiple scales and diverse visual details are extracted from the different scale layers. Second, a density-based adaptive selection method is used to choose the best number of topics for the pLSA model. Then, the pLSA model and a Markov random field are combined to mine the contextual semantic co-occurrence information of image patches and thus construct more accurate visual words. Finally, the frequencies of visual words in each scale layer are counted and linearly combined to form a multi-scale histogram as the image representation, which is then used for scene classification with an SVM classifier. The experimental results demonstrate that the proposed algorithm effectively exploits the multi-scale and contextual semantic information of images and improves classification performance.

When a video stream is described by semantic information extracted from the video, the compression ratio can be increased considerably, and functions such as access, indexing, retrieval and manipulation of video can be enhanced.
Hierarchical description and extraction technologies for video semantic information are studied. A structural description scheme for video semantic information is designed: relationships between objects at different layers are described by an object hierarchy, and relationships between objects at the same layer are described by an entity relation graph, forming a multi-level abstraction of video semantic information. Based on the proposed model, a video query system is designed with which users can efficiently browse and search a video database by different characteristics at different layers.

Finally, the hardware and software design of a video structural description processing system based on an FPGA+DSP architecture is studied. The DSP serves as the main processor for the core image processing algorithms. The FPGA serves as an auxiliary processor for image acquisition, image preprocessing and data communication with the DSP. Building on the research results above, we have constructed an embedded system for the structural description of video information that performs real-time analysis and transmission of vehicle information collected on the road. The traffic police department can query and search the description database as required, which has improved the efficiency with which traffic video is used. The system has been deployed in many public security departments, such as those in Nanchang and Taichang, and preliminary results have been achieved.
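
The hierarchical description scheme outlined above (an object hierarchy across layers, an entity relation graph within a layer, and queries by characteristics at a chosen layer) might be sketched roughly as follows. This is only an illustrative sketch, not the system's actual data model: the class names `SemanticObject` and `VideoDescription`, the three layer numbers, and the attribute keys and values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticObject:
    """One node of the object hierarchy (hypothetical structure)."""
    name: str
    layer: int  # assumed layers: 0 = scene, 1 = object, 2 = object part
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # objects one layer below

    def add_child(self, child):
        self.children.append(child)
        return child


@dataclass
class VideoDescription:
    root: SemanticObject
    # Entity relation graph stored as (subject, relation, object) name triples.
    relations: list = field(default_factory=list)

    def walk(self):
        """Yield every object in the hierarchy, depth first."""
        stack = [self.root]
        while stack:
            node = stack.pop()
            yield node
            stack.extend(reversed(node.children))

    def relate(self, subject, relation, obj):
        """Record a relation between two objects on the same layer."""
        if subject.layer != obj.layer:
            raise ValueError("entity relations link objects on the same layer")
        self.relations.append((subject.name, relation, obj.name))

    def query(self, layer=None, **attrs):
        """Return objects on the given layer whose attributes all match."""
        return [o for o in self.walk()
                if (layer is None or o.layer == layer)
                and all(o.attributes.get(k) == v for k, v in attrs.items())]


# Illustrative data only: a road scene with two vehicles and one plate.
scene = SemanticObject("road_scene", 0, {"location": "intersection"})
car = scene.add_child(
    SemanticObject("vehicle_1", 1, {"type": "car", "color": "white"}))
truck = scene.add_child(
    SemanticObject("vehicle_2", 1, {"type": "truck", "color": "blue"}))
car.add_child(SemanticObject("plate_1", 2, {"type": "license_plate"}))

desc = VideoDescription(scene)
desc.relate(car, "ahead_of", truck)

white_vehicles = desc.query(layer=1, color="white")
print([o.name for o in white_vehicles])
```

Keeping the cross-layer hierarchy (`children`) separate from the same-layer relation triples mirrors the two kinds of relationship the scheme distinguishes, so a query can filter by layer first and by characteristics second.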