Design And Implementation Of Resume Information Extraction System Based On Domain Knowledge Base

Posted on:2019-01-02

Degree:Master

Type:Thesis

Country:China

Candidate:B Zhang

Full Text:PDF

GTID:2348330545958407

Subject:Computer technology

Abstract/Summary:


A resume is a job seeker's written description of their own background. Although resumes share certain structural characteristics and some of their content follows conventions, they come in a wide variety of forms. For recruiters, manually reading, recording and filtering resumes therefore costs a tremendous amount of work. It is thus necessary to use information extraction technology to extract structured, valuable information from free-form resume text. This greatly simplifies resume analysis and makes it possible to build an effective talent pool around the entity and event information in resumes, which in turn facilitates talent matching, searching and filtering. After a brief introduction to the related information extraction technology, this thesis clarifies the requirements and functional design of resume extraction according to actual needs, studies in depth the core technical solutions for resume information extraction, and implements a complete resume information extraction system. The work covers the following aspects:

(1) Information is collected and collated from Internet resources such as Wikipedia and recruitment websites to build an enterprise name knowledge base, an equivalent name knowledge base, and others.
(2) A trigger word matching algorithm, combined with a thesaurus expanded using Word2vec word vectors, segments the resume into information blocks according to its structural characteristics. Resumes that contain no trigger words are represented as feature vectors, and an SVM classifier segments them based on content features.
(3) The principles and practical performance of the Hidden Markov Model (HMM), the Maximum Entropy Model (ME) and the Conditional Random Field model (CRF), each incorporating domain knowledge, are compared for named entity recognition in resumes, and the best-performing statistical model is selected to extract entity information from each category of resume block.
(4) A backtracking strategy for resume information extraction is proposed. A rule matching method based on the knowledge base is used to complete the results of statistical entity recognition, and some event information is identified from the sequence of entities.
(5) The Elasticsearch distributed search engine is used to filter and search the extraction results. In addition, the Zend framework, Echarts and other web technologies are used to implement data visualization and other business-layer functions, giving the system more practical value and enabling corporate recruiters to handle resumes efficiently.

Based on the above work, a series of functional and performance tests were carried out on the system. The results show that the system can automatically extract structured information from resume texts and build a job seeker database, and that it achieves the expected results for most entities, which demonstrates the effectiveness of the block segmentation scheme and the entity extraction scheme proposed in this thesis. The system also provides users with resume management, filtering and retrieval capabilities that improve the efficiency of resume processing.
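
The trigger-word segmentation step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `TRIGGERS` thesaurus, the block labels, the `basic_info` default block and the sample resume are all hypothetical, and the Word2vec expansion is stood in for by synonyms written out by hand.

```python
# Hypothetical trigger-word thesaurus: block label -> trigger words.
# The abstract describes expanding the thesaurus with Word2vec; here the
# "expanded" synonyms are simply listed by hand for illustration.
TRIGGERS = {
    "education": {"education", "academic background", "schooling"},
    "work_experience": {"work experience", "employment history", "career"},
    "skills": {"skills", "technical skills"},
}


def match_trigger(line):
    """Return the block label if the line is a trigger heading, else None."""
    text = line.strip().lower().rstrip(":")
    for label, words in TRIGGERS.items():
        if text in words:
            return label
    return None


def segment_resume(text):
    """Split resume text into labelled blocks at trigger-word headings."""
    blocks = {}
    current = "basic_info"  # assumed label for lines before the first heading
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        label = match_trigger(line)
        if label is not None:
            # A trigger heading opens a new block; the heading itself is dropped.
            current = label
            blocks.setdefault(current, [])
        else:
            blocks.setdefault(current, []).append(line.strip())
    return blocks


sample = """Zhang San
Email: zhang@example.com
Education:
2012-2016 Example University, B.Sc. Computer Science
Employment History
2016-2018 Example Corp, Software Engineer
"""

blocks = segment_resume(sample)
for label, lines in blocks.items():
    print(label, lines)
```

A resume without any matching heading would end up entirely in the default block here; the abstract handles that case separately, with an SVM classifier over content features.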

Keywords/Search Tags:

resume information extraction, domain knowledge base, text categorization, named entity recognition, backtracking strategies


Related items

1	Research On Named Entity Recognition And Relation Extraction Facing To Domain-oriented Knowledge Base Construction
2	Phone Domain Knowledge Base Question Answering System Based On Named Entity Recognition
3	Research On Named Entity Recognition And Entity Link Method For Short Text Questions
4	Research Of Entity Knowledge Base System Based On Information Extraction
5	Research And Application Of Domain Oriented Entities And Inter-entity Relations
6	Open Domain Event Extraction From Microblogs
7	The Design And Implementation Of Knowledge Extraction Service For Constructing The Knowledge Graph Of The Financial Domain
8	Design And Realization Of Domain Specific Knowledge Base Extraction System
9	The Design And Implementation Of Domain Knowledge Base Management System Based On Knowledge Graph
10	Research On The Construction Of Knowledge Graph Of Segmentation Domain Under The Guidance Of Small-scale Knowledge Base