Research And Application On Intelligent Search Platform For Business News

Posted on: 2016-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: H C Wang
GTID: 2298330470957805
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the volume of information on the World Wide Web (WWW), including all kinds of business news, is growing explosively. Business news is of great value and plays an important role in helping people make decisions. However, people face a serious problem of information overload, so helping users acquire valuable business news easily and quickly has become a vital problem.

Two kinds of providers are widely used to alleviate this problem: search engines (e.g., Google) and business portals (e.g., Reuters). Most people now use search engines for information retrieval because they are simple and easy to use: users only need to enter keywords to obtain relevant results. However, too many returned results lead to relatively low precision and recall, and users have to spend a large amount of time finding useful information. More importantly, some users care only about news related to a particular field (e.g., currency, real estate), which a general search engine cannot offer. Business portals can provide relatively professional and authoritative news from different business domains, but they have two drawbacks: a) The home pages of these portals display all kinds of news from different fields and look bloated, which may confuse users who are used to getting news through a search engine. b) These portals simply display the news and cannot uncover the hidden relationships among different news stories. For example, news about "housing price" may be related to news about "real estate control policy" or "building material industries".

To address these problems, this thesis focuses on the research and application of an intelligent search platform for business news. The major work and contributions can be summarized as follows:

1. This thesis proposes and implements an intelligent search platform for business news. The platform combines the advantages of search engines and business portals to help users acquire business news easily and quickly. Moreover, it uses data mining and natural language processing techniques to provide users with more intelligent services. Specifically, the platform consists of six parts. First, it is built on a vertical search engine that incorporates automatic classification to organize and retrieve business news in different domains. Users can acquire news through keyword-based queries and can also browse the news of a specific field by category. Second, to provide users with hot news stories as soon as they appear, the platform integrates hot news detection based on TDT (topic detection and tracking) techniques. Third, to guide users quickly to diverse and useful news, it constructs a dynamic knowledge network graph that displays the hidden relationships among news stories.

2. This thesis improves the classification algorithm and the hot news detection algorithm. Because the title is an important part of a news story, the thesis improves the TF-IDF weight formula for feature words so that it takes the title into account. In addition, because time is very important for detecting hot news events, the thesis extends the similarity formula to incorporate temporal information.

3. This thesis conducts a number of experiments on real datasets to validate the effectiveness of the improved methods. By taking the title into account, the new weight formula improves the accuracy of the classification algorithms. By considering temporal factors, the new similarity formula is better than traditional methods at distinguishing news stories that have similar content but do not belong to the same event.
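The abstract does not state the improved formulas, so the following Python sketch is only an illustration of the two ideas in contribution 2, not the thesis's actual method. It assumes that a title occurrence of a term adds a fixed bonus to its term frequency (the hypothetical parameter `title_boost`) and that the similarity between two stories is a cosine similarity multiplied by an exponential decay in the gap between their publication dates (the hypothetical parameter `half_life_days`). All function names and parameter values are assumptions made for illustration.

```python
import math
from collections import Counter


def title_weighted_tfidf(docs, title_boost=3.0):
    """Build TF-IDF vectors in which title terms count extra.

    docs is a list of (title_tokens, body_tokens) pairs.
    title_boost is a hypothetical bonus added to the term frequency
    for each occurrence of a term in the title.
    """
    n_docs = len(docs)
    # Document frequency: number of stories containing the term anywhere.
    df = Counter()
    for title, body in docs:
        df.update(set(title) | set(body))

    vectors = []
    for title, body in docs:
        tf = Counter(body)
        for term in title:
            tf[term] += title_boost
        # Smoothed IDF keeps every weight strictly positive.
        vectors.append({
            term: freq * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, freq in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def time_aware_similarity(u, v, days_apart, half_life_days=7.0):
    """Cosine similarity damped by the time gap between two stories.

    The exponential half-life decay is an assumed form: the similarity
    is halved for every half_life_days between the publication dates.
    """
    decay = 0.5 ** (abs(days_apart) / half_life_days)
    return cosine(u, v) * decay


if __name__ == "__main__":
    docs = [
        (["housing", "price"], ["housing", "market", "rises"]),
        (["bank", "rate"], ["central", "bank", "market"]),
    ]
    vecs = title_weighted_tfidf(docs)
    print(time_aware_similarity(vecs[0], vecs[1], days_apart=0))
    print(time_aware_similarity(vecs[0], vecs[1], days_apart=30))
```

Under these assumptions, two stories with similar wording but publication dates far apart receive a lower score than the same pair published on the same day, which is the behavior the abstract attributes to the time-aware similarity.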
Keywords/Search Tags: Intelligent Search, Search Engine, Classification, Hot News Detection