| As an important part of traditional Chinese culture, Chinese herbal medicine carries rich cultural symbols and connotations. With the advent of the cultural big data era, Chinese herbal medicine has also accumulated abundant digital resources, and analyzing the types and functions of Chinese herbal medicines through labeling is a technical means of achieving cultural identification, cultural interpretation, and cultural inheritance. This paper focuses on book image data in the field of Chinese herbal medicine and implements digital image and text processing and associated labeling for Chinese herbal medicine book data that mixes images and text. The main research contents are as follows. (1) Construction of a Chinese herbal medicine image-text dataset. Chinese herbal medicine data are collected and organized with digitization equipment, and authoritative references such as "Chinese Materia Medica" and "Chinese Herbal Medicine Encyclopedia" are consulted to construct the image-text dataset required for the experiments. Manual annotation supplements machine processing to produce a Chinese herbal medicine semantic labeling dataset. (2) A basic label extraction algorithm for Chinese herbal medicine data is proposed. Taking the original book images of Chinese herbal medicine as the data source, the algorithm replaces manual work by extracting the key information of the picture part and the text part in a structured way. It combines layout analysis, optical character recognition, keyword extraction, and other techniques to process the source data, and finally clusters the recognized results according to the established Chinese herbal medicine semantic labeling system. (3) A Chinese herbal medicine association labeling algorithm based on semantic consistency constraints is proposed. To address problems such as small inter-class differences and image polymorphism in Chinese herbal medicine images, a feature extraction model, an attention mechanism, and other structures are combined, and the classical annotation model structure is optimized according to the characteristics of Chinese herbal medicine data to accomplish the annotation of Chinese herbal medicine images. (4) Combining the label extraction algorithm and the association labeling algorithm, an association labeling system for Chinese herbal medicine image-text modal data is built. The system involves participants from multiple fields, with machine labeling as the primary mode, supplemented by experts and the public. With the goal of exploring and mining the cultural connotations contained in Chinese herbal medicine image-text data, this paper proposes an improved image annotation pipeline that uses images of Chinese herbal medicine books as the data source, and experiments verify the effectiveness of the method. |