Optimization Of Text Classification Algorithms In The Financial Field

Posted on: 2020-08-28

Degree: Master

Type: Thesis

Country: China

Candidate: X W Wang

Full Text: PDF

GTID: 2428330590950629

Subject: Software engineering

Abstract/Summary:

With the development of the financial industry, demand for finance-related information is growing, and the volume of information texts in the financial field is growing with it. Such texts often help in analyzing the movements of related stocks and company share prices. However, this growing body of financial information is cluttered with a large number of texts that do not belong to the financial field, such as advertisements, advertorials ("soft texts"), and purely technical articles. It is therefore important to determine how relevant a text is to the financial field. The baseline text classification method is limited by the size of its training corpus and models text at the word level, ignoring semantic information, so its accuracy and recall are relatively low. This thesis therefore proposes improvements to the baseline method. First, rules based on keywords and patterns are used to recall texts and generate a training corpus. Second, a method based on active learning and clustering is used to label texts and expand the training corpus. Third, the texts are cleaned along two dimensions, text content and media account, to select a high-quality training corpus. Finally, word vector features carrying semantic information are added to the text classification features, different text classification models are compared experimentally, and the model's predicted probability is adjusted experimentally, so as to determine more accurately whether a text is related to the financial field. In addition, to recall more relevant financial texts, the improved version applies a rule strategy based on financial keyword recognition before the text classification model is applied. The experimental results show that expanding the training corpus, retaining only high-quality training data, adding word vectors with semantic information to the classification features, and applying the rule method based on financial keyword recognition can greatly improve the recall and accuracy of text classification. After this filtering, the information texts most relevant to the financial field are retained more accurately, which greatly reduces the cost of manual filtering and improves the user's reading experience.
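
The two-stage decision described in the abstract (a keyword rule applied first, then a classifier whose predicted probability is compared against a tuned cutoff) might be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the keyword list, the vocabulary, the function names, the `threshold` parameter, and the toy scorer standing in for the trained model are all hypothetical, and reading "adjusting the prediction probability" as tuning a cutoff is an interpretation.

```python
from typing import Callable, Iterable

# Hypothetical keyword rules; the abstract does not list the actual ones.
FINANCE_KEYWORDS = ("stock", "share price", "dividend", "bond yield")

# Hypothetical vocabulary used only by the toy scorer below.
TOY_VOCAB = {"market", "investors", "earnings", "bank"}


def matches_finance_rule(text: str,
                         keywords: Iterable[str] = FINANCE_KEYWORDS) -> bool:
    """Rule stage: true if any keyword occurs as a substring of the text."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def toy_proba(text: str) -> float:
    """Stand-in for a trained classifier: share of tokens in TOY_VOCAB.

    A real system would return the model's predicted probability that
    the text belongs to the financial field.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(token in TOY_VOCAB for token in tokens) / len(tokens)


def is_financial(text: str,
                 predict_proba: Callable[[str], float],
                 threshold: float = 0.5) -> bool:
    """Apply the keyword rule first, then fall back to the classifier.

    `threshold` is the assumed tunable cutoff on the predicted probability.
    """
    if matches_finance_rule(text):
        return True
    return predict_proba(text) >= threshold


if __name__ == "__main__":
    samples = [
        "The stock fell sharply today",
        "Investors expect bank earnings growth",
        "Buy cheap sneakers online now",
    ]
    for sample in samples:
        print(sample, "->", is_financial(sample, toy_proba))
```

Placing the rule ahead of the model favors recall, since a text that matches a keyword is kept regardless of the model's score, while raising `threshold` trades some of that recall back for precision on the texts the rule does not catch. Plain substring matching is deliberately crude here (it would also match "stockings"), so a real rule set would need tighter patterns.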

Keywords/Search Tags:

Financial field, Recall text, Text classification, Semantic information

Related items

1	Researching Text Classification Using Semantic And Sequence Information
2	SEMANTIC MEMORY AND STRUCTURE OF TEXT: EFFECT OF MANIPULATION OF ORGANIZATION OF CONNECTED DISCOURSE ON RECALL OF SEMANTIC UNITS BY ADEQUATE AND POOR READERS AT THE COLLEGE LEVEL
3	Study On Trend Analysis Of The Financial Field Based On Text Mining
4	The Research And Implementation Of Text Classification Based On Meta-Information And Optimization
5	The Research And Implementation Of Text Classification Based On Meta-information And Optimization
6	Research On Financial Text Classification Method Based On Deep Learning
7	Research On Text Classification Of Web Text Mining
8	Automatic Classification Based On The Concept Of The Text
9	Research On Ontology-Based Semantic Text Categorization
10	Research On Text Semantic Orientation Analysis For Areas Of Applied Based On The Web Information