
Design And Implementation Of Classification And Extraction System Of Financial Announcement Based On NLP

Posted on: 2020-07-04   Degree: Master   Type: Thesis
Country: China   Candidate: Z Y Zhang   Full Text: PDF
GTID: 2428330578954633   Subject: Software engineering
Abstract/Summary:
As national economic vitality grows, the number of listed companies keeps increasing, and analyzing the data in their financial announcements becomes correspondingly harder. To provide analysis data to VCs and fund trustees more quickly and accurately, the financial announcement classification and extraction platform uses big data and natural language processing technology, so that the announcements issued every day by tens of thousands of listed companies in China can be analyzed and processed instantly. To improve the accuracy of announcement classification and extraction, the system uses an improved two-level text extraction method: text lines or segments containing attribute values (potential labels) are classified first, and the attribute values are then extracted. This greatly improves the core competitiveness of the data platform.

This paper first describes the project background, the current state of related products at home and abroad, and the key technologies and theories, and then carries out functional and non-functional requirements analysis for the whole system. Based on the requirements analysis, the overall architecture of the system is designed, the main function points of each module and the database design are laid out, and the detailed design and implementation of the system are then presented.

The author participated in the whole research and development process, completed the main development work on the announcement classification module, the announcement annotation module and the announcement attribute value extraction module, and took part in the research and implementation of the Naive Bayes, potential label classification and named entity recognition (NER) algorithms. Beyond the basic requirements, the author ran many comparative experiments on different word segmentation methods and text classification models, and developed modules for word segmentation, data preprocessing and parameter tuning to optimize model performance. Text extraction accuracy is improved by combining grammatical rules, regular expressions and NER. Announcement labeling is also improved: the annotation and evaluation functions are visualized, which raises the quantity and quality of the model training data in the system. Role and permission modules are added to manage users in a unified way, and the log information of each module is stored so that every metric of the model can be monitored in one place.

Finally, the whole system undergoes functional and stress testing, which ensures that it runs stably and normally. The system is now in operation online. For important announcement categories, extraction accuracy and recall can reach more than 80%. Every day, a large number of financial announcements are processed intelligently and converted into structured data, and key information is extracted and stored, covering all dimensions of financial market information. Users can use the system for financial information queries, early warning of bond defaults, credit risk monitoring and other functions, which provides an important reference for individuals and investment institutions.
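
The two-level extraction idea in the abstract (classify the lines that are likely to hold an attribute value, then extract the value only from those lines) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the English toy training lines, the `NaiveBayesLineClassifier` class, the `AMOUNT` pattern and the `extract_amounts` helper are all hypothetical. The real system works on Chinese announcements with word segmentation and combines grammatical rules, regular expressions and NER in the second stage; here a single regular expression stands in for that stage.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy training data (not from the thesis): (line, label).
# Label 1 means the line probably contains an amount; label 0 means it does not.
TRAIN = [
    ("the company will pay a cash dividend of 1.5 yuan per share", 1),
    ("total dividend amount is 2,000,000 yuan", 1),
    ("the guarantee amount shall not exceed 500 million yuan", 1),
    ("the board of directors held its meeting on monday", 0),
    ("this announcement is hereby made", 0),
    ("the directors guarantee the content is true and complete", 0),
]


def tokenize(line):
    """Lowercase the line and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", line.lower())


class NaiveBayesLineClassifier:
    """Stage 1: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, samples):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for line, label in samples:
            self.class_counts[label] += 1
            for tok in tokenize(line):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, line):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(line):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Stage 2 stand-in: a regular expression for amounts such as "3,000 yuan".
AMOUNT = re.compile(r"\d[\d,]*(?:\.\d+)?\s*(?:million\s+)?yuan")


def extract_amounts(lines, clf):
    """Run the extractor only on lines that stage 1 marks as candidates."""
    results = []
    for line in lines:
        if clf.predict(line) == 1:
            results.extend(m.group(0) for m in AMOUNT.finditer(line))
    return results


if __name__ == "__main__":
    clf = NaiveBayesLineClassifier().fit(TRAIN)
    document = [
        "the board meeting was held on 12 may",
        "the dividend amount is 3,000 yuan per share",
    ]
    print(extract_amounts(document, clf))
```

Filtering lines before extraction means the extraction rules only see text that already looks relevant, so they can be written more loosely without picking up values from unrelated lines.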
Keywords/Search Tags: financial data, natural language processing, text multi-classification, attribute value extraction