Research And Implementation Of Chinese Image Caption System Based On Multi-Scale Feature Fusion

Posted on: 2022-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Tang
GTID: 2558306914961419
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet and smart terminal technology, the volume of image data is growing rapidly. Given this volume, how to extract useful semantic information from images and achieve deep image understanding and knowledge acquisition has become a research hotspot in artificial intelligence. Image caption research aims to teach machines to describe the semantic content of images in natural language. It can be applied widely, for example to e-commerce product description, image-text cross-modal information retrieval, and verbal interpretation of images and surroundings for visually impaired people, so it has significant research and application value.

Traditional Chinese image caption models based on the encoder-decoder structure extract image features at only a single scale and ignore features at other scales, so the captions they generate often lose semantic information. In addition, most current image caption research generates English captions; research on generating Chinese captions, and applications that provide this service, are still at an early stage. To address these difficulties in Chinese image caption research and application systems, this thesis carries out the following two pieces of work:

(1) Building on the traditional NIC (Neural Image Caption) model and the NIC model with an attention mechanism, two Chinese image caption models based on multi-scale feature fusion are designed. They fuse features from different convolutional layers so that image features at several scales are obtained simultaneously. Both models were evaluated on the AI Challenger Chinese image caption dataset. The experimental results show that the models designed in this thesis not only score better on evaluation metrics but also generate more fluent and accurate Chinese captions.

(2) A Chinese image caption system based on the REST (Representational State Transfer) architecture and an RBAC (Role-Based Access Control) framework is designed and implemented. First, the business requirements of the system are analyzed, and the core module requirements are described with UML use case diagrams. Next, the non-functional requirements are analyzed. Then, based on the requirements analysis, the overall design of the Chinese image caption system is carried out, covering the system architecture, the front-end page structure, and the back-end database structure. After that, based on the requirements analysis and overall design, the user permission module, system monitoring module, cloud photo album module, and "picture viewing and talking" (caption generation) module are designed in detail and implemented with the help of UML class diagrams, sequence diagrams, and flowcharts. Finally, the core modules, performance, and compatibility of the system are tested. The test results show that the system achieves its expected goals: it can generate Chinese image captions in different ways relatively quickly, and it helps users store images and their descriptions through the cloud album module.
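
The multi-scale feature fusion described in item (1) can be illustrated with a minimal sketch. The abstract does not state which fusion operator the thesis uses, so the version below assumes one common choice: each layer's feature map is reduced by global average pooling and the pooled vectors are concatenated. The function names `global_average_pool` and `fuse_multi_scale`, and the toy feature maps, are hypothetical and written in plain Python for readability; they are not the author's implementation.

```python
from typing import List

# A feature map is modelled as channels x height x width nested lists.
FeatureMap = List[List[List[float]]]


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Collapse each channel's H x W grid to its mean value."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def fuse_multi_scale(feature_maps: List[FeatureMap]) -> List[float]:
    """Concatenate pooled descriptors taken from several convolutional layers.

    Assumption: pooling followed by concatenation stands in for whatever
    fusion operator the thesis actually uses.
    """
    fused: List[float] = []
    for fmap in feature_maps:
        fused.extend(global_average_pool(fmap))
    return fused


# Toy inputs: a shallow layer (2 channels, 4x4) and a deep layer (3 channels, 2x2).
shallow = [[[1.0] * 4 for _ in range(4)], [[2.0] * 4 for _ in range(4)]]
deep = [
    [[3.0, 5.0], [7.0, 9.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]

fused = fuse_multi_scale([shallow, deep])
print(fused)  # [1.0, 2.0, 6.0, 0.0, 1.0]
```

Pooling before concatenation lets layers with different spatial resolutions contribute to a single vector that a caption decoder could consume. An attention-based variant would more likely keep the spatial grid and resize the maps to a common resolution instead.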
Keywords/Search Tags: Image Caption, Attention Mechanism, Multi-Scale, REST, RBAC