
Design And Implementation Of A Specific Object Recognition System Based On Voice Control

Posted on: 2024-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: W H Hao
GTID: 2568307136991879
Subject: Electronic information
Abstract:
In recent years, with the rapid development of artificial intelligence and the arrival of an aging society in China, the demand for home care robots has become urgent, especially robots that care for elderly people living alone. When an elderly person has difficulty moving, or an emergency requires the robot to retrieve a specific object, the robot must quickly understand the spoken instruction and pick up the specified object in space. This problem has become a research hotspot for home care robots. This thesis investigates a multimodal fusion system based on speech recognition, specific object detection in video, and visual distance measurement. The main research content is as follows:

(1) The Transformer algorithm is used for speech recognition, and the forward maximum matching method is used for Chinese word segmentation, in order to extract the keywords that name specific objects from voice commands. The extracted keywords are matched against the object classes in the database; relevant keywords are retained and irrelevant ones are filtered out.

(2) An improved version of the YOLOv5 algorithm is used to recognize and detect specific targets in video. In the CSP structure, the convolutional blocks are removed and the original three sets of CBL in the CSP module are converted into four sets of CBL, making the network framework more cohesive as a whole. According to the thesis, this makes the entire recognition process faster and more accurate.

(3) A multimodal fusion method for specific object recognition and distance measurement is proposed, integrating speech with image-based ranging. The method identifies targets and measures their distance in response to voice commands. The aim is to improve overall system efficiency and reduce detection time through tightly integrated algorithms.

(4) An integrated system combining speech, computer vision, and distance calculation is designed to achieve voice-controlled recognition and distance measurement of specific objects. The system first extracts the keywords specified by the speaker using the speech recognition algorithm. It then uses the YOLOv5s algorithm to match the extracted keywords against the target classes in the home environment and identify the object, and finally calculates the distance between the target object and the camera in real time using distance measurement principles.

The experimental results show that the performance of the system designed in this thesis meets the expected design requirements.
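The keyword extraction step in contribution (1) can be illustrated with a minimal sketch of forward maximum matching followed by filtering against a set of object classes. This is not the thesis's implementation: the dictionary `VOCAB`, the class set `OBJECT_CLASSES`, the function names, and the example command are all hypothetical, and the input is assumed to be the text already produced by the speech recognizer.

```python
def forward_max_match(text, vocab, max_len=None):
    """Segment text left to right, always taking the longest dictionary word."""
    if max_len is None:
        # Longest dictionary entry bounds the candidate window.
        max_len = max((len(w) for w in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def extract_object_keywords(tokens, object_classes):
    """Keep only tokens that name a known object class; drop the rest."""
    return [t for t in tokens if t in object_classes]


# Hypothetical dictionary and object classes, for illustration only.
VOCAB = {"帮我", "拿", "一下", "桌子", "上", "的", "水杯", "杯", "遥控器"}
OBJECT_CLASSES = {"水杯", "遥控器"}

# Hypothetical transcribed command: "help me fetch the cup on the table".
tokens = forward_max_match("帮我拿一下桌子上的水杯", VOCAB)
keywords = extract_object_keywords(tokens, OBJECT_CLASSES)
print(tokens)    # ['帮我', '拿', '一下', '桌子', '上', '的', '水杯']
print(keywords)  # ['水杯']
```

Because the longest match is tried first, "水杯" (cup) is kept as one token rather than being split at the shorter dictionary entry "杯". The surviving keyword could then be used to select which detected class the vision stage should report.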
Keywords: speech recognition, object detection, distance measurement, YOLOv5s, Transformer