Research On Algorithms Of Real Estate-Ontology Topical Crawler

Posted on: 2017-10-30 | Degree: Master | Type: Thesis
Country: China | Candidate: D Chen | Full Text: PDF
GTID: 2348330482986809 | Subject: Computer software and theory
Abstract/Summary:
The rapid development of the Internet has changed traditional ways of communicating information. A large amount of real estate-related information has accumulated online, such as prices, estate news, estate enterprises and market dynamics, and vertical search engines for real estate use topic crawler technology to give users comprehensive, professional information. The performance of the topic crawler directly determines the quality of a real estate search engine, so improving that performance is important. We studied existing topic crawler technologies and found several shortcomings: inaccurate description of topic information, low accuracy of extracted information, high complexity of topic correlation algorithms, and poor performance. In response to these problems, this thesis combines ontology and neural network technologies to propose and implement an ontology-oriented focused crawler system for real estate information. The main work and achievements are as follows:
1) Because real estate information on the web is open, diverse and time-sensitive, an ontology-adaptive algorithm based on content learning is proposed. The algorithm extracts real estate information with focused crawler technology and then obtains domain-related concepts through feature extraction, filtering, classification and semantic analysis. Finally, these concepts are learned in order to maintain the ontology dynamically, which improves the ontology's ability to describe the topic.
2) Because existing similarity algorithms for focused crawlers suffer from low precision and lead to poor crawler performance, a topic crawler correlation algorithm based on BFA is constructed. The algorithm offers stronger non-linear neural network learning capability, stability and immunity than other approaches, and it uses the global optimization capability of AFSA to optimize the neural network structure, which further improves the precision of the topic crawler.
3) Based on the technologies above, we design and implement an ontology-oriented focused crawler system for real estate information, built around current real estate information. The system combines the semantic description ability of ontology with the non-linear learning ability of the neural network. Tested with accuracy and recall as indicators, the experimental results show that the overall performance of the system improved significantly, with accuracy up by 12% and recall up by 9%.
Keywords/Search Tags:Topic crawler, BP neural network, Ontology, AFSA, Search Engine
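The general shape of a focused crawler as described in the abstract (score each page for topic relevance, expand only on-topic pages, then evaluate with precision and recall) can be sketched as follows. This is a minimal illustration and not the thesis's method: the weighted keyword overlap in `relevance` is a stand-in for the BP neural network and AFSA correlation model, and `ONTOLOGY_WEIGHTS`, `crawl_order`, `precision_recall`, the threshold and the in-memory page graph are all hypothetical names and values chosen for the example.

```python
import heapq

# Hypothetical concept weights standing in for a real estate ontology.
ONTOLOGY_WEIGHTS = {"house": 1.0, "price": 0.8, "estate": 1.0, "mortgage": 0.6}


def relevance(text, weights=ONTOLOGY_WEIGHTS):
    """Weighted share of ontology concepts that occur in the text (0.0 to 1.0)."""
    tokens = set(text.lower().split())
    total = sum(weights.values())
    hit = sum(w for term, w in weights.items() if term in tokens)
    return hit / total if total else 0.0


def crawl_order(pages, seed, threshold=0.2):
    """Best-first traversal of an in-memory link graph.

    pages maps url -> (text, [outlinks]); no network access is performed.
    """
    frontier = [(-1.0, seed)]  # max-heap via negated priority
    seen = {seed}
    kept = []
    while frontier:
        _, url = heapq.heappop(frontier)
        text, links = pages[url]
        score = relevance(text)
        if score < threshold:
            continue  # off-topic page: neither kept nor expanded
        kept.append(url)
        for link in links:
            if link in pages and link not in seen:
                seen.add(link)
                # A child inherits its parent's score as its priority.
                heapq.heappush(frontier, (-score, link))
    return kept


def precision_recall(retrieved, relevant):
    """Precision and recall of the retrieved set against a labeled relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_pos = len(retrieved & relevant)
    precision = true_pos / len(retrieved) if retrieved else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    return precision, recall


# Toy graph: "d" is on-topic but only reachable through the off-topic page "b".
pages = {
    "a": ("house price estate", ["b", "c"]),
    "b": ("football scores", ["d"]),
    "c": ("estate mortgage news", []),
    "d": ("house price", []),
}
kept = crawl_order(pages, "a")
p, r = precision_recall(kept, ["a", "c", "d"])
```

In this toy graph the crawler never reaches `d`, because its only inbound link sits on an off-topic page. That is the usual trade-off of a hard relevance threshold: precision stays high while recall suffers, which is why the quality of the relevance model matters for both indicators.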