Research On Story Segmentation And Deep Hashing Algorithm For News Broadcast Story Retrieval

Posted on: 2022-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: H L Chen
GTID: 2518306767962369
Subject: Journalism and Media
Abstract/Summary:
News broadcast story retrieval helps people find the information they need in large volumes of broadcast content. Story segmentation and hashing-based indexing are its two essential modules. Story segmentation divides a news broadcast into story segments according to content, and indexing reduces the dimensionality of both the query content and the data in the index database in order to lower resource consumption. Although caption-based story segmentation algorithms and deep hashing algorithms have achieved certain performance gains, some problems remain: (1) The caption detection method used for story segmentation generalizes poorly, and determining story boundaries requires integrating multiple descriptors to perform well. (2) In deep hash coding, on the one hand, the feature vector contains information irrelevant to the hash code, and the fully connected hash layer cannot comprehensively evaluate the contribution of each feature when it processes the feature vector directly, which causes uncontrollable quantization error. On the other hand, the existing central similarity learning strategy is not well suited to multi-label data pairs that share only some of their labels. This thesis aims to solve these problems.

To address the problems of caption-based story segmentation, the thesis treats the detection of primary captions and dialogue captions as a dual-target detection task and trains a YOLOv3 (You Only Look Once) network on a self-built dataset to improve the accuracy of primary caption detection. A story segmentation algorithm for news broadcasts based on the primary caption is then proposed, exploiting the uniqueness of the primary caption. In addition, mean hashing is used to rapidly compare the sub-primary captions detected by YOLOv3. Experimental results show that YOLOv3-based primary caption detection improves the F1 score by about 2.04% on CCTV News and achieves an F1 score of 0.991 on MORNING NEWS. Story segmentation based on the primary caption improves the F1 score by about 0.2%-12.13% on CCTV News and achieves an F1 score of 0.917 on MORNING NEWS. Most importantly, both methods are suitable for all types of news broadcasts that have primary captions.

To address the problem that the fully connected hash layer in deep hashing processes the feature vector directly, together with the shortcomings of central similarity learning, the thesis proposes deep hashing based on a quantization-attention mechanism. The quantization-attention mechanism reduces the quantization error caused by generating binary codes. It normalizes the feature vectors to reduce the negative impact of the original feature values on the results, and selects only the key information in the feature vectors for training. At the same time, pairwise similarity is introduced to improve central similarity learning, so that when the hash code generated for a data point converges to its corresponding semantic hash center, the Hamming distance between the hash codes of data pairs that share only some labels is also reduced. Compared with central similarity quantization (CSQ), the spatial pairwise hashing based on central similarity improves MAP by about 2.5%, 2.7% and 1.9% on the ImageNet, MS COCO and NUS-WIDE datasets, respectively. Compared with other deep hashing methods based on binary or triplet similarity, it improves MAP by about 12.4%, 5.1% and 1.5% on the same three datasets, respectively.
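
The abstract does not describe its mean hashing variant in detail. As an illustration only, the following is a minimal sketch of the common average-hash technique applied to caption comparison, assuming each detected caption has already been cropped to a 2-D grayscale array. The function names (mean_hash, hamming_distance, same_caption), the 8x8 hash size, and the 5-bit threshold are hypothetical choices for this sketch, not values taken from the thesis.

```python
import numpy as np


def mean_hash(gray, hash_size=8):
    """Average hash of a 2-D grayscale array, returned as a flat boolean vector.

    The image is reduced to hash_size x hash_size block averages, and each
    block is set to True if it is brighter than the mean of all blocks.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    if h < hash_size or w < hash_size:
        raise ValueError("image is smaller than the hash grid")
    # Crop so the image divides evenly into hash_size x hash_size blocks.
    bh, bw = h // hash_size, w // hash_size
    cropped = gray[:bh * hash_size, :bw * hash_size]
    # Split into blocks and average the pixels inside each block.
    blocks = cropped.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    return (blocks > blocks.mean()).ravel()


def hamming_distance(hash_a, hash_b):
    """Number of bit positions at which two hashes differ."""
    return int(np.count_nonzero(hash_a != hash_b))


def same_caption(crop_a, crop_b, max_distance=5):
    """Treat two caption crops as the same caption if their hashes are close.

    max_distance is an illustrative threshold, not a value from the thesis.
    """
    return hamming_distance(mean_hash(crop_a), mean_hash(crop_b)) <= max_distance
```

In a pipeline of this kind, consecutive caption crops whose hashes stay within the threshold would be treated as the same caption, and a jump in Hamming distance would mark a candidate caption change. Comparing 64-bit hashes is much cheaper than comparing the crops pixel by pixel, which is what makes the comparison fast.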
Keywords/Search Tags: Story segmentation, Primary caption, Deep hashing, Quantization error, Loss optimization