| There is an urgent need for computer vision technology in the fields of video surveillance and assisted driving. Video-based vehicle and pedestrian tracking and recognition is one of the key research problems. In video surveillance, vehicle tracking and recognition based on a single static camera is widely used in intelligent transportation systems. Analyzing a vehicle's driving trajectory in the video helps the traffic department accurately penalize illegal drivers or determine their responsibility. In pedestrian tracking and identification based on multiple static cameras, analyzing the abnormal behavior of pedestrian groups or single targets in the video can help public security organs give early warning of criminal behavior, or track criminal suspects and collect evidence against them. In assisted driving, pedestrian and vehicle tracking and recognition based on moving dash cameras is also an important research task: analyzing the distance between the driving vehicle and the targets in the field of view of the moving camera helps to avoid collisions and supports safe driving. After nearly two decades of development, existing target tracking and recognition algorithms can accurately track and recognize targets in some simple video scenes, but their accuracy still has to be improved. This thesis analyzes and discusses the development of video-based vehicle and pedestrian tracking and recognition. Taking the occlusion problem during target tracking as the starting point, a multi-kernel constrained tracking method in 3D space is proposed. The main innovations and contributions of this thesis can be summarized as follows: (1) A vehicle tracking and recognition system based on 3D multi-kernel constraints of a vehicle model under a single static camera is proposed, which solves the problem that traditional vehicle tracking and recognition algorithms cannot continuously track and recognize different types of
vehicles with the same color when occlusion occurs during tracking. The system first uses an Estimation of Distribution Algorithm (EDA) to calibrate the parameters of the static camera, and builds a 3D vehicle model for the target vehicle by repeatedly fitting a general vehicle model to the vehicles detected in the image. Each surface of the constructed 3D vehicle model is regarded as a kernel. Constrained multi-kernel (CMK) tracking is performed in 3D space with a constraint on each kernel, and the position of the moving vehicle is then continuously updated within a Kalman filter framework. When the vehicle is partially occluded, the method achieves continuous tracking by adaptively adjusting the weights of the unoccluded kernels in the 3D vehicle model. When the vehicle is completely occluded and then reappears in the scene, the vehicle type, color, license plate area, and other features obtained from the vehicle model are used, and a self-similar descriptor (SSD) is further applied to the license plate area, to re-identify the vehicle and resume tracking after the occlusion. The proposed method is evaluated on the NVIDIA AI City dataset and a self-recorded high-resolution video. The experimental results show good tracking performance: the method not only tracks the vehicle continuously under occlusion, but also keeps the geometry of the 3D vehicle model unchanged. The proposed method won first place in the tracking task in the 2017 NVIDIA AI City Competition. (2) A 3D multi-kernel constrained pedestrian tracking system based on deep learning for pedestrian re-identification under multiple static cameras is proposed. Pedestrian tracking under multiple static cameras means that, given one or more photos of a pedestrian, the system determines whether that pedestrian appears in other static cameras and finally associates pedestrians appearing in different cameras to achieve
continuous tracking across cameras. Within this task, pedestrian re-identification across different cameras is the main research topic. In this thesis, a pedestrian re-identification method combining multi-scale local regions and multi-layer attention mechanisms is proposed. By adding information about the pedestrian's body structure to the visual feature representation, it addresses the problem that region-based deep learning methods, which divide pedestrians into sub-regions at a single scale, cannot extract pedestrian body structure information and therefore lack robustness for pedestrian re-identification in complex scenes. The proposed MLMANet model first builds a multi-scale local branch, obtains local region features of pedestrians at different scales with a vertical segmentation strategy, and then obtains global and local feature information of pedestrians through a pooling operation. In addition, attention features from different convolutional layers are obtained through the multi-layer attention branch built into the network model, which enables the model to extract deeper pedestrian semantic information. The pedestrian features from the multi-scale local branch and the attention branch are then combined through a cascade strategy to obtain more discriminative pedestrian visual features. Finally, pedestrian re-identification is performed by comparing the similarity of the extracted pedestrian features, and the pedestrians under different cameras are associated to achieve continuous tracking across cameras. The proposed method is tested on three large-scale pedestrian re-identification datasets. The experimental results show that the proposed MLMANet model achieves better results in the pedestrian re-identification task. Combined with 3D multi-kernel constrained tracking, it enables continuous tracking
of pedestrians across cameras under multiple static cameras. (3) A 3D multi-kernel constrained vehicle and pedestrian tracking system based on an adaptively estimated ground plane is proposed for a moving dash camera. It addresses the problem that traditional ground plane estimation is unreliable because of the unpredictability of the driving environment and the noise caused by camera shake, which in turn degrades vehicle and pedestrian tracking. The proposed tracking system integrates Visual Simultaneous Localization and Mapping (V-SLAM), Structure from Motion (SfM), an adaptive ground plane estimation algorithm, and multi-kernel constrained tracking. First, the system obtains the vehicle and pedestrian targets to be tracked in the field of view of the dash camera from an object detector, and infers the 3D position of each tracked target relative to the moving camera through V-SLAM. Combined with SfM, the attitude parameters of the moving camera are obtained, and the horizontal yaw angle is used as feedback to estimate the ground plane adaptively, which addresses the inaccuracy of traditional ground plane estimation algorithms when the vehicle travels on a curved road. At the same time, to handle occlusion of the target during tracking, the system tracks the target in 3D space by combining the 3D position of the target on the estimated ground plane with multi-kernel constrained tracking. The proposed vehicle and pedestrian tracking system is evaluated on multiple datasets such as KITTI and ETHMS. The experimental results show that the proposed system has good tracking performance: it not only effectively tracks the vehicle and pedestrian targets in the field of view of the vehicle-mounted moving camera, but also handles target occlusion during tracking very well. |
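
As an illustration of the occlusion handling summarized above, the following Python sketch fuses per-kernel 3D position estimates with weights that drop to zero for occluded kernels. It is a minimal sketch under stated assumptions, not the thesis implementation: the function name `fuse_kernel_estimates`, the per-kernel similarity scores, and the boolean visibility flags are all hypothetical, and the actual CMK weighting and Kalman filter update are not specified in this abstract.

```python
def fuse_kernel_estimates(estimates, similarities, visible):
    """Fuse per-kernel 3D position estimates into one target position.

    estimates    -- list of (x, y, z) tuples, one per kernel (model surface)
    similarities -- list of non-negative appearance scores, one per kernel
                    (hypothetical; stands in for the kernel matching score)
    visible      -- list of booleans, False where a kernel is occluded

    Returns the weighted mean position, or None if every kernel is occluded.
    """
    # Occluded kernels get zero weight; visible kernels keep their score.
    weights = [s if v else 0.0 for s, v in zip(similarities, visible)]
    total = sum(weights)
    if total == 0.0:
        # Full occlusion: the caller would have to rely on prediction
        # and re-identification instead of a measurement.
        return None
    # Renormalize so the remaining (unoccluded) weights sum to one.
    weights = [w / total for w in weights]
    return tuple(
        sum(w * e[axis] for w, e in zip(weights, estimates))
        for axis in range(3)
    )


if __name__ == "__main__":
    positions = [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0), (10.0, 10.0, 10.0)]
    scores = [0.5, 0.5, 0.9]
    # The third kernel is occluded, so only the first two contribute.
    print(fuse_kernel_estimates(positions, scores, [True, True, False]))
```

Renormalizing over the visible kernels keeps the fused estimate from being pulled toward an occluded surface. In a full tracker, a result like this would presumably serve as the measurement passed to the Kalman filter.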