Research On The Realization Of The Employment Information Extraction System Based On Web

Posted on:2011-11-05

Degree:Master

Type:Thesis

Country:China

Candidate:S Q Fang

Full Text:PDF

GTID:2178330332966783

Subject:Computer technology

Abstract/Summary:

With the rapid growth of the Internet, the Web has become an important knowledge base in which people search for information and data. Faced with the "data ocean" formed by the worldwide network, online mining techniques have drawn more and more attention as an effective means of gaining latent and meaningful knowledge. Vocational colleges need to obtain a large amount of information about the demand for talent, which provides guidance for specialty construction and course settings, and information on the Internet is an important part of their data sources. Finding this information on the Web rapidly, accurately and efficiently is therefore valuable for specialty construction and core course settings in vocational colleges. However, Web pages hold a large amount of data, are semi-structured and change dynamically, so Web information extraction suffers from high complexity, low extensibility and poor adaptability. The emergence of XML technology provides a good opportunity to solve the problem of data extraction on the Web. This dissertation studies XML-based Web information extraction, which belongs to the category of Web content mining. The main work is as follows:

1. To address the main difficulty of Web information extraction, namely determining extraction rules effectively, this dissertation presents an information extraction method. It also discusses the learning of paths and the relevant technical issues.

2. Based on a study of the characteristics of Web pages, the features of XML are brought into Web information extraction. JTidy is used to clean and tidy the Web page code, which is then converted into XML documents. The DOM tree of the Web information is built by parsing the XML so that information can be extracted more effectively.

3. Building on a DOM-based data extraction strategy and inductive rule learning, a rule generation strategy and a data extraction algorithm are proposed. Extraction rules (rule sets) are generated by machine learning, and the learned rules serve as templates for extracting information from pages with a similar structure.

4. A general framework is given, consisting of a data acquisition module, a data blocking module and a data extraction module (including rule management and employment information extraction). Using the proposed algorithm, an employment information extraction system for job-hunting Web sites is developed and tested. The extracted data is saved in a database so that it can be used fully and effectively with database technology.
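The idea in points 2 and 3 (learn a DOM path from a few labeled examples, then reuse it on pages with a similar structure) can be illustrated with a minimal sketch. It is not the algorithm of the dissertation, whose details the abstract does not give. It assumes the pages have already been cleaned into well-formed XML (the step JTidy performs in the thesis) and uses only Python's standard `xml.etree.ElementTree`. The sample pages and the names `path_to_text`, `induce_rule` and `extract` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already-cleaned pages (well-formed XML) from a job site.
PAGE_A = (
    "<html><body><div><h1>Jobs</h1></div><div><ul>"
    "<li><span>Java Developer</span><span>Beijing</span></li>"
    "<li><span>Tester</span><span>Shanghai</span></li>"
    "</ul></div></body></html>"
)
PAGE_B = (
    "<html><body><div><h1>Openings</h1></div><div><ul>"
    "<li><span>Web Designer</span><span>Guangzhou</span></li>"
    "<li><span>DBA</span><span>Hangzhou</span></li>"
    "<li><span>Network Engineer</span><span>Shenzhen</span></li>"
    "</ul></div></body></html>"
)


def path_to_text(root, target):
    """Return the path from root to the first element whose text equals
    target, as a list of (tag, position among same-tag siblings)."""
    def walk(node, path):
        if (node.text or "").strip() == target:
            return path
        counts = {}
        for child in node:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            found = walk(child, path + [(child.tag, counts[child.tag])])
            if found is not None:
                return found
        return None
    return walk(root, [])


def induce_rule(paths):
    """Generalize example paths into one ElementTree path expression:
    keep a position where all examples agree, drop it where they differ."""
    first = paths[0]
    tags = [tag for tag, _ in first]
    if any([tag for tag, _ in p] != tags for p in paths):
        raise ValueError("examples do not share a common tag path")
    steps = []
    for i, (tag, pos) in enumerate(first):
        agree = all(p[i][1] == pos for p in paths)
        steps.append(f"{tag}[{pos}]" if agree else tag)
    return "/".join(steps) or "."


def extract(root, rule):
    """Apply a learned rule to a page and return the matched texts."""
    return [(e.text or "").strip() for e in root.findall(rule)]


if __name__ == "__main__":
    train = ET.fromstring(PAGE_A)
    # Two labeled job titles act as the training examples.
    examples = [path_to_text(train, t) for t in ("Java Developer", "Tester")]
    rule = induce_rule(examples)
    print(rule)
    print(extract(ET.fromstring(PAGE_B), rule))
```

The two labeled titles differ only in the position of their `li` ancestor, so the induced rule drops that position and keeps the rest, which is what lets it match every job entry on a second page built from the same template.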

Keywords/Search Tags:

Web mining, inductive learning, rule generation, information extraction

Related items

1	Web Information Extraction Based On Inductive Study
2	The Study On Multi-Relational Data Mining
3	The Research And Application Of Data Mining In Mining Rules Of Medical Diagnosis
4	Research On Language And Key Techniques For Accurate Information Extraction Rules Towards Complex Web
5	Research On Methods Of Fuzzy Rules' Extraction Based On Concept Learning
6	Discretization Of Continuous Attributes In Information Systems And Rules Extraction
7	Research On Application Of Data Mining Technology In College Teaching Study Based On Inductive Learning
8	The Study For Mining Classification Rules Based On Genetic Algorithms
9	A Study On Association Rules Mining Algorithm And Its Application On Web Mining
10	Design And Implementation Of Web Information Extraction Rules