
Research On The Technology Of The Corporate Relations Mining

Posted on: 2011-08-25
Degree: Master
Type: Thesis
Country: China
Candidate: K Guo
Full Text: PDF
GTID: 2178330338479982
Subject: Computer Science and Technology
Abstract/Summary:
With the rise of Internet e-commerce, more and more enterprises publish their information on the Web. As a result, the business market is no longer limited by time and space, and every enterprise, large or small, has access to the same information resources. For an enterprise, however, extracting useful business information from the vast Internet and identifying potential business partners and competitors are important to decision-making, operation, survival, and development. Research on enterprise relations mining arises from this need.

The research work of this paper covers two aspects: enterprise information extraction and enterprise relations mining.

This paper uses a DOM-based method to extract enterprise information. First, a Web crawler collects enterprise Web pages from the Internet. Because pages from the same Website are structurally very similar, the paper analyzes them with a DOM tree. The pages are first converted from HTML into XML format, the information nodes are then located according to rules, and the enterprise information is extracted from those nodes.

This paper proposes two methods for mining enterprise relations: a text-similarity method and an ontology-based method. The text-similarity method uses part of the extracted information as a text that represents each enterprise, and uses the similarity value between enterprises to determine potential competition between them. The paper represents the texts with the vector space model, calculates the similarity values between enterprises, and tests the method experimentally.

The ontology-based method first builds a domain ontology for the product area. The paper uses Protégé, developed at Stanford University, to build an ontology for the computer field, and then uses Jena to parse the ontology file. Finally, predefined rules are used to infer and query the products related to a given product, and the relations between products are used to determine the relationships between enterprises. Experimental results show that the ontology-based method achieves considerably higher recall and precision than the text-similarity method.
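
The text-similarity step can be illustrated with a minimal sketch. The abstract does not specify the term weighting or tokenization, so this example assumes raw term frequencies, whitespace tokenization, and cosine similarity as the similarity value; the function names and the three enterprise descriptions are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to a sparse bag-of-words vector of raw term frequencies.

    Assumption: whitespace tokenization and lowercasing; the thesis may
    use a different tokenizer or weighting (for example TF-IDF).
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical enterprise descriptions, invented for illustration only.
profiles = {
    "A": "laptop computer notebook memory",
    "B": "laptop computer desktop monitor",
    "C": "steel pipe valve fitting",
}
vectors = {name: term_vector(text) for name, text in profiles.items()}

sim_ab = cosine_similarity(vectors["A"], vectors["B"])  # shared product terms
sim_ac = cosine_similarity(vectors["A"], vectors["C"])  # no shared terms
```

Under these assumptions, enterprises whose descriptions share more product terms receive a higher similarity value, which the text-similarity method reads as a sign of potential competition. Any threshold for that decision would have to come from the thesis itself and is not assumed here.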
Keywords/Search Tags: information extraction, ontology, enterprise relation, ontology building, semantic inference