Research On Domain Vocabulary Association For Financial Industry

Posted on: 2014-05-03  Degree: Master  Type: Thesis
Country: China  Candidate: L B Zhi  Full Text: PDF
GTID: 2208330434472097  Subject: Computer software and theory
Abstract/Summary:
The financial industry has a great demand for information acquisition and processing. Although the Internet acts as a huge financial information platform, its explosive volume also makes it hard for people to use. Many technologies have emerged to address this problem, such as text mining, in which word relation mining is an important task. However, existing methods rely heavily on manual work and lack a benchmark dataset for comparison and validation. In addition, most current work targets a specific domain and covers only English.

This paper studies word relation mining for the financial domain and covers two tasks. The first is a novel measure of the relationship between security entities; the second is detecting the relationships between financial terms. First, from the perspective of topic semantics and word co-occurrence, we established a model for measuring the correlation between security entities using a topic model and cosine similarity. The experimental results showed some useful correlations between security entities. Second, we found that some concepts in a corpus split as the number of topics in the topic model changes. Accordingly, we proposed an automatic hierarchical ontology learning algorithm. To test the proposed algorithm, we developed an adaptive query expansion method based on it and on query logs. The experimental results showed that the proposed method performed better than other methods in situations that lack a domain ontology.

The main work of this paper includes:
1) We propose an entity correlation measure model based on topic correlation and co-occurrence. This model evaluates correlation from the perspective of topic semantics and word co-occurrence, discovers hidden correlations between entities, and measures them with an unsupervised method.
2) We apply this model to the securities domain and compare its output with price series, which serve as an objective standard.
3) We propose an automatic domain-specific hierarchical ontology learning method based on a domain corpus.
4) We propose an adaptive hybrid query expansion model that combines a domain ontology and query logs and can enhance the existing domain ontology through query logs.
5) We provide a sample application of topic models based on objective data from the financial industry, which implements topic model evaluation on an objective dataset.

On the one hand, this work provides an application of text mining technology in a specific area. On the other hand, it provides data mining cases for the financial industry.
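The entity correlation measure described above combines topic semantics with word co-occurrence. The following is a minimal sketch of that idea, not the method from the thesis: the abstract does not give the actual formula, so the Jaccard-based co-occurrence score, the linear blend, the weight `alpha`, and all function names are illustrative assumptions. Each entity is assumed to be represented by a topic distribution vector (for example, produced by a topic model) and by the set of document IDs that mention it.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_u * norm_v)


def cooccurrence_score(docs_a, docs_b):
    """Jaccard overlap of the document sets mentioning each entity.

    Assumption: the thesis does not specify its co-occurrence measure.
    """
    union = docs_a | docs_b
    if not union:
        return 0.0
    return len(docs_a & docs_b) / len(union)


def entity_correlation(topics_a, topics_b, docs_a, docs_b, alpha=0.5):
    """Hypothetical blend of topic similarity and co-occurrence.

    alpha is an illustrative weight, not a value from the thesis.
    """
    topic_part = cosine_similarity(topics_a, topics_b)
    cooc_part = cooccurrence_score(docs_a, docs_b)
    return alpha * topic_part + (1 - alpha) * cooc_part


if __name__ == "__main__":
    # Toy topic distributions and document-ID sets for two entities.
    score = entity_correlation(
        [0.7, 0.2, 0.1], [0.6, 0.3, 0.1], {1, 2, 3}, {2, 3, 4}
    )
    print(f"correlation score: {score:.3f}")
```

A score of this kind needs no labeled data, which fits the unsupervised setting the abstract describes. Validating it would still require an external reference such as the price series mentioned in contribution 2.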
Keywords/Search Tags: Word Relations, Topic Model, Financial Industry, Ontology Learning, Query Expansion