
Design And Implementation Of A Fund Information Collection System Based On Crawler Technology

Posted on: 2013-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: W L Qi
Full Text: PDF
GTID: 2248330392953346
Subject: Software engineering
Abstract/Summary:
The Internet has brought great changes to people's lives, and fund investors now obtain fund information mainly through online channels. In October 2011, under the Fund Department of the China Securities Regulatory Commission, third-party fund sales finally became a reality, which means that fund subscription and redemption will become more convenient. In this situation, an authoritative website for disclosing fund information can meet the urgent needs of fund investors. To that end, we propose a fund information collection system. Jinchang Investment Consulting Co., Ltd. can collect and display fund information through this system.

Within the vertical domain of funds, the system uses web crawler technology to collect webpages, parse them, and sieve and browse fund information. First, we determine the functions of the system through requirements analysis. The system is then designed and implemented following an object-oriented approach.

The fund information collection system is divided into five modules: collecting webpages, parsing webpages, sieving fund information, maintaining data, and browsing fund information. The webpage collection module addresses how to collect webpages quickly and incrementally using web crawler technology. The parsing module uses the Nokogiri library to parse webpages into structured data stored in a database. The sieving module applies data mining to the basic data produced by the parsing module in order to reveal valuable fund information. The data maintenance module lets administrators maintain fund data. Once the last module is complete, fund information can be browsed.

At present, the system is in its operation and maintenance period. It is a stable, high-performance system that collects and parses websites and sieves fund information every day. When fund information is disclosed, the system acquires and displays it.
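
The abstract does not describe how incremental collection is implemented. As an illustration only, one common approach is to fingerprint each page body and pass a page on to parsing only when its fingerprint changes. The sketch below assumes that approach; the names IncrementalIndex, changed?, and crawl_once are hypothetical and are not taken from the thesis, and the in-memory hash stands in for whatever persistent store a real system would use.

```ruby
require "digest"

# Hypothetical fingerprint store (not from the thesis). A real system
# would persist fingerprints in its database instead of in memory.
class IncrementalIndex
  def initialize
    @fingerprints = {}
  end

  # Returns true when the page is new or its content differs from the
  # last recorded visit, and records the new fingerprint in that case.
  def changed?(url, body)
    digest = Digest::SHA256.hexdigest(body)
    return false if @fingerprints[url] == digest

    @fingerprints[url] = digest
    true
  end
end

# One crawl pass. `fetcher` is any callable that takes a URL and returns
# the page body; it is an assumption standing in for the real downloader.
# Returns the URLs whose content is new or changed since the last pass.
def crawl_once(urls, index, fetcher)
  urls.select { |url| index.changed?(url, fetcher.call(url)) }
end
```

Calling crawl_once repeatedly with the same index returns only the URLs whose content is new or has changed, so an unchanged page is fetched but not re-parsed. Avoiding the download itself would require a different mechanism, such as conditional HTTP requests, which this sketch does not cover.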
Keywords/Search Tags:fund information, collection system, crawler technology, incremental crawling