| The accurate extraction, management, and use of financial information are of great significance to business operations and to personal investment and financial management. With the explosive growth of financial information, traditional manual methods of extracting and using it can no longer meet the needs of enterprises and individuals. In recent years, deep learning has developed rapidly and achieved important breakthroughs, especially in image recognition and natural language processing, and applying it to the extraction and management of financial information has become a clear trend. This article applies deep learning to systematic research and development in three areas: recognition of financial bill information, extraction of news information, and market prediction. The main work and innovations are as follows: (1) The OpenCV library is used to preprocess VAT invoice images through grayscale conversion, denoising, binarization, tilt correction, perspective transformation, region segmentation, and other steps, which significantly improves image quality and lays the foundation for subsequent text localization and recognition; (2) To address the overlapping bounding boxes that arise when text is detected over the whole invoice, the invoice is first divided into four regions according to the VAT invoice layout rules and the prior information available for each region, and the text in each region is then located and recognized separately; building on the CTPN text detection algorithm, the text-line post-processing algorithm is improved, and the new algorithm outperforms CTPN on all three metrics of accuracy, recall, and F-score; (3) A DenseNet+BLSTM+CTC structure is used for text recognition, and a model trained on another dataset is fine-tuned to improve robustness, giving a character recognition accuracy of 98.42%; (4) The tkinter module is used to build the invoice information extraction interface, and the recognition results are matched against a template so that each invoice field corresponds one-to-one with its content, which makes the system more user-friendly; (5) The BeautifulSoup library is used to crawl financial news data and the Dow Jones index dataset, from which the corresponding labels are generated. A comparison of the N-gram model, a sentiment analysis tool, and the Doc2Vec model shows that the Doc2Vec model performs better on long-term datasets, with a prediction accuracy of 55.82%, while the sentiment analysis tool works better on short-term datasets, with an accuracy of 77.58% after an LSTM is added. |
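
To illustrate the CTC stage of the DenseNet+BLSTM+CTC recognizer in item (3), the following minimal Python sketch shows how per-frame network outputs can be turned into a text string by best-path (greedy) decoding: take the highest-scoring class in each frame, merge consecutive repeats, and drop the blank symbol. The article does not state which decoding strategy it uses, so the choice of greedy decoding, the blank index 0, the function name `ctc_greedy_decode`, and the toy alphabet are all assumptions made for illustration only.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      alphabet: Sequence[str],
                      blank: int = BLANK) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    decoded: List[str] = []
    previous = blank
    for scores in frame_scores:
        # Index of the highest-scoring class in this frame.
        best = max(range(len(scores)), key=lambda k: scores[k])
        # Emit a character only when it is not blank and not a repeat
        # of the class chosen in the immediately preceding frame.
        if best != blank and best != previous:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)


if __name__ == "__main__":
    # Toy alphabet: index 0 is the blank, the rest are invoice-like characters.
    toy_alphabet = ["", "9", "8", "."]
    # One-hot scores for the frame-level path 9 9 _ 8 _ 8 .
    path = [1, 1, 0, 2, 0, 2, 3]
    frames = [[1.0 if k == i else 0.0 for k in range(4)] for i in path]
    print(ctc_greedy_decode(frames, toy_alphabet))  # prints 988.
```

The blank between the two `8` frames is what allows a doubled character to survive the merge step; without it, the two frames would collapse into a single `8`.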