Research On Hardware-accelerated Architecture For Object Detection And Recognition Applications

Posted on: 2010-08-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J B Xu
GTID: 1118360305482691
Subject: Computer Science and Technology
Abstract/Summary:
Object detection and recognition technology has both significant theoretical value and wide potential applications in military and civil domains. Many algorithms and methods have been presented to improve accuracy. However, little research focuses on accelerating the processing speed.

In fact, when designing a practical object detection and recognition system, processing speed is also very important in addition to accuracy. At the same time, system size, implementation cost, power consumption and adaptability to different application domains are problems that require more attention, too.

FPGA-based (Field Programmable Gate Array) hardware acceleration technology can optimize the mapping strategy from algorithm to computing engine (compared with a general-purpose processor), and it can also customize the computing and memory resources (compared with an ASIC (Application Specific Integrated Circuit)). Therefore, FPGA-based hardware acceleration technology finds an appropriate balance between flexibility and high performance. In addition, an FPGA-based hardware accelerator also has the advantages of small size and low power compared with a general-purpose processor.

FPGA-based hardware acceleration technology is of remarkable value for designing practical object detection and recognition systems. Some key issues of hardware-accelerated computing for object detection and recognition applications are studied in depth in this thesis. Four types of applications are researched: stationary rigid object recognition, moving object detection and extraction, pedestrian detection and recognition, and facial detection and recognition. The research goal is to bring forward solutions for efficiently mapping algorithms to limited hardware resources, so as to find an appropriate tradeoff among hardware implementation cost, detection accuracy and speed.
Based on the research results, hardware-accelerated prototype systems are built on FPGA for each type of application studied in this thesis. At the same time, for image applications with irregular data access patterns, such as pedestrian and facial recognition, a general conflict-free multi-access memory scheme is proposed.

Firstly, the hardware acceleration techniques for stationary rigid object recognition based on Hausdorff distance and template matching are studied. The data access patterns of this kind of application are basically regular, but the computational costs are quite large. In order to balance the data consumption speed of the processing units and the data production speed of the memory system, a parallel computing scheme for large-sliding-window applications is proposed. The proposed scheme combines a conflict-free parallel access memory model with a multi-PE (Processing Element) computing structure based on a divide-and-conquer strategy. Consequently, the requirements on memory modules and capacities are reduced. Performance analysis and experimental results show that the proposed parallel computing scheme achieves significant parallelism.

Secondly, the hardware acceleration techniques for moving object detection and extraction are studied, and the memory optimization techniques are also researched. In the real world, people are more interested in moving objects than stationary ones. The hardware structure for classifying different moving objects in images is designed. Then, based on the characteristic that the number, positions and sizes of moving objects in image sequences vary frequently, the problem of maintaining a variable data set is introduced.
Aiming at this problem, a general hardware structure for linked lists is brought forward to optimize accesses to the variable data set.

Following moving object detection and extraction, the successive procedure is usually moving object recognition. This thesis chooses pedestrian recognition and facial recognition as the research emphasis, both of which require complex computation and are widely used in practice. Pedestrians and faces belong to "non-rigid objects". Different from rigid ones, the contours of non-rigid objects are irregular and change constantly. Consequently, the computational costs are increased, and the data access patterns become irregular.

The hardware acceleration techniques for pedestrian detection and recognition based on the Active Shape Model (ASM) are researched. To deal with the problem of resource constraints caused by large computational costs, a resource mapping strategy is proposed, which combines a resource-sharing scheme with hardware pipelining to balance hardware costs and processing speed. For tasks that are not critical but occupy too many resources, the resource-sharing scheme is applied, which maps multiple operations of the same type onto a single processing unit for time sharing. The inputs of these operations are selected by a multiplexer. The initiation intervals between different operations are minimized by adopting a scheduling algorithm. For critical tasks that are time-consuming, more resources can be deployed to apply hardware pipelining and other parallel techniques to improve the processing speed. A prototype system is constructed on FPGA, which achieves pedestrian detection, recognition and tracking.
The experimental results suggest significant speedups compared with related works.

For facial objects, a fine-classified method for rotation-invariant multi-view face detection is presented, which is able to detect faces across the full ±90-degree rotation-out-of-plane and 360-degree rotation-in-plane pose changes quickly and accurately. Each detector node in the tree-structured detector hierarchy is trained by using a novel two-stage boosting (TS-Boosting) method. The primary idea is that, when deciding whether a sample belongs to a pose range, the detector considers not only the probability that the sample belongs to the pose range, but also the probability that the sample does not belong to other pose ranges. Based on the proposed method, a hardware accelerator structure is proposed, and a design space exploration algorithm is presented to achieve the reconfiguration of the hardware resources. Experiments on FPGA show that high accuracy and high speed are achieved compared with previous related works.

Lastly, a general conflict-free multi-access memory scheme is proposed for image applications with irregular data access patterns, such as pedestrian and facial recognition. A multi-module memory structure is presented between the main memory and the processing units, which achieves conflict-free parallel access of randomly aligned rectangular data blocks constrained in some regions of interest (ROIs).
Performance analysis and experimental results show that the proposed memory scheme is more suitable for image applications with multiple regions of interest than related works, and transfer speedups of up to hundreds of times are achieved when compared with the scheme that accesses main memory directly.

In summary, this thesis studies the key issues of hardware acceleration techniques for object detection and recognition applications. Solutions for several key problems are presented: parallel characteristics analysis for algorithms, hardware architecture design, reconfiguration of computation and memory resources, and a parallel memory scheme for irregular data access patterns. The contribution has significant value for advancing the theory and practicability of object detection and recognition technology.
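
To make the notion of conflict-free parallel access to rectangular data blocks more concrete, the following Python sketch may help. It is not the memory scheme proposed in the thesis, which the abstract does not specify; it uses a textbook linear-skewing module assignment as a stand-in. The function names, the block size p x q and the choice of p*q memory modules are all assumptions made for this illustration.

```python
def skewed_module(row, col, p, q):
    """Linear skewing (assumed scheme): map pixel (row, col) to one of p*q modules."""
    return (q * row + col) % (p * q)


def naive_module(row, col, p, q):
    """Baseline for contrast: interleave by column only, ignoring the row."""
    return col % (p * q)


def block_is_conflict_free(assign, top, left, p, q):
    """True if the p x q block whose corner is (top, left) hits each module at most once."""
    modules = [assign(top + a, left + b, p, q)
               for a in range(p) for b in range(q)]
    return len(set(modules)) == len(modules)


if __name__ == "__main__":
    p, q = 2, 4  # hypothetical block size, served by 8 memory modules
    positions = [(t, l) for t in range(32) for l in range(32)]
    # Check every block position in a 32 x 32 window under both assignments.
    print(all(block_is_conflict_free(skewed_module, t, l, p, q) for t, l in positions))
    print(all(block_is_conflict_free(naive_module, t, l, p, q) for t, l in positions))
```

Under the skewed assignment, the offsets q*a + b inside a block cover 0 to p*q - 1 exactly once, so a block at any position reaches p*q distinct modules and all of its pixels can be fetched in one parallel access. The column-only baseline fails because the rows of a block share column indices and therefore collide in the same module.
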
Keywords/Search Tags: object detection and recognition, hardware acceleration, pedestrian detection and recognition, facial detection and recognition, conflict-free parallel access, FPGA