Recognition And Analysis Of Skilled Words In Chinese Online Recruitment Corpus

Posted on:2022-02-09

Degree:Master

Type:Thesis

Country:China

Candidate:Q Mao

GTID:2518306554470314

Subject:Master of Applied Statistics

Abstract/Summary:

In recent years, a structural mismatch between supply and demand has appeared in China's talent market. Better aligning higher-education talent training with labor market demand has become an important way to address this mismatch. Online recruitment is now the main channel through which companies recruit talent. Extracting, identifying and analyzing the skill words contained in online recruitment corpora gives a direct and effective view of what companies require of the people they hire, so that universities can make talent training more targeted and thereby ease the mismatch between supply and demand. This thesis first uses a long short-term memory (LSTM) deep neural network to identify skill words. It then applies the LDA (Latent Dirichlet Allocation) topic model to the identified skill words to understand how the demand for skills varies across enterprises and regions, and how skill mastery relates to salary. Identifying and analyzing the skill words in online recruitment advertisements is therefore of great practical significance.

At present, the mainstream approach to named entity recognition and term extraction is to use deep neural networks. These methods focus on supervised learning in specialized domains and require a large amount of labeled data. In Chinese recruitment corpora, however, the semantics vary widely, the sentences are not standardized, the context is complicated, and sufficient annotated data is lacking. Building a semi-supervised learning model that relies on a small amount of labeled data and a large amount of unlabeled data, and that still recognizes skill words effectively, is a major challenge. In addition, using the LDA topic model to identify the latent topic information contained in skill words, so as to achieve in-depth analysis and visual presentation of online recruitment corpora, is also very challenging.

In response to these challenges, this thesis presents two studies:

(1) To address the lack of annotated data, this thesis proposes a skill word recognition method based on a semi-supervised learning model. The method builds on Bi-LSTM (Bidirectional Long Short-Term Memory), the classic model for sequence labeling, and introduces the MMNN (Max-Margin Neural Network) model. It combines the prediction results of Bi-LSTM with the dependencies learned by MMNN to establish a semi-supervised learning model based on sentence confidence, which is jointly trained on a small number of labeled samples and a large number of unlabeled samples. The experimental results support the soundness of this approach, and show that introducing a semi-supervised learning model can effectively alleviate the scarcity of manually labeled data.

(2) To address the latent topic information contained in skill words, this thesis builds an IT skill word dictionary from the identified skill words, combining machine learning methods with expert judgment. It then uses the LDA topic model to extract topic information from the skill words, partitions the skill words into sets, and constructs a relational topic matrix that makes the abstract information concrete. The extracted topic information is compared and statistically analyzed from the perspectives of employer, work area, salary and so on, and the results are presented visually with word clouds and Sankey diagrams.
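
The abstract does not say how sentence confidence is computed or how unlabeled sentences enter training. The sketch below shows one common way such a step could work, and it is not the thesis's actual method. It assumes that a tagger outputs a probability per tag for each token, that sentence confidence is the geometric mean of the top tag probability per token, and that sentences above a threshold are kept as pseudo-labeled training data. The function names, the tag set (`O`, `B-SKILL`), the threshold of 0.9 and the sample predictions are all hypothetical, and the MMNN dependency scores are not modeled.

```python
import math


def sentence_confidence(token_probs):
    """Geometric mean of the highest tag probability at each token.

    token_probs: list of dicts mapping tag -> probability, one per token.
    This formula is an illustrative assumption, not taken from the thesis.
    """
    if not token_probs:
        return 0.0
    log_sum = 0.0
    for probs in token_probs:
        best = max(probs.values())
        if best <= 0.0:
            return 0.0  # avoid log(0); treat as zero confidence
        log_sum += math.log(best)
    return math.exp(log_sum / len(token_probs))


def select_pseudo_labels(predictions, threshold=0.9):
    """Keep unlabeled sentences whose confidence reaches the threshold.

    predictions: list of (tokens, token_probs) pairs from a tagger.
    Returns (tokens, tags) pairs, using the most probable tag per token.
    """
    selected = []
    for tokens, token_probs in predictions:
        if sentence_confidence(token_probs) >= threshold:
            tags = [max(probs, key=probs.get) for probs in token_probs]
            selected.append((tokens, tags))
    return selected


# Hypothetical tagger output for two unlabeled sentences.
predictions = [
    (
        ["proficient", "in", "Python"],
        [
            {"O": 0.97, "B-SKILL": 0.03},
            {"O": 0.99, "B-SKILL": 0.01},
            {"O": 0.02, "B-SKILL": 0.98},
        ],
    ),
    (
        ["good", "communication"],
        [
            {"O": 0.55, "B-SKILL": 0.45},
            {"O": 0.50, "B-SKILL": 0.50},
        ],
    ),
]

pseudo_labeled = select_pseudo_labels(predictions, threshold=0.9)
```

In a self-training loop of this kind, the selected sentences would be added to the labeled set and the tagger retrained, repeating until few new sentences pass the threshold. With the sample data, only the first sentence is confident enough to be kept.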

Keywords/Search Tags:

online recruitment, semi-supervised learning model, deep learning, LDA topic model, visual analysis

Related items

1	A Semi-supervised Element Sensitive Saliency Model With Position Bias Learning For Web Pages
2	Online Semi-Supervised Learning Theory,Algorithms And Applications
3	Research And Implementation Of Science And Technology Policy Classification Method Combining Topic Model And Deep Learning
4	Visual Tracking Algorithm Based On Deep Learning And Semi-supervised Learning
5	Research Of Reliable Semi-supervised Classification
6	Research On Semi-supervised Topic Model For Text Classification
7	Research On Semi-supervised Classification Algorithm Based On Integrated Neural Network
8	Research On Deep Learning Text Classification Based On Fusion Topic Features
9	Research And Implementation Of 3D Model Recognition Method Based On Semi-supervised Deep Learning
10	Visual Object Tracking Via Discriminative Metric Learning