
Scholar Interest Labels Mining Based On Network And Text Information

Posted on: 2022-05-31; Degree: Master; Type: Thesis
Country: China; Candidate: Z H Tian; Full Text: PDF
GTID: 2518306509954569; Subject: Computer technology
Abstract/Summary:
Scholars' interest labels show the content and direction of their academic research as well as their focus on one or more research fields. Most existing interest label classification methods rely on text attributes and network attributes extracted from the scholars' papers. For text attributes, probabilistic topic models are widely used to construct features, which can lead to coarse granularity. For network attributes, heterogeneous networks are decomposed into multiple homogeneous networks to obtain the node vectors, which can cause the rich semantic information in heterogeneous information networks to be lost. Building on previous research, this thesis focuses on the scholar interest label classification task based on text attributes and network attributes, and makes the following contributions: (1) A classification method for interest labels based on text attributes is proposed. Word embeddings represent the text attributes, and a BiLSTM captures context information from the text. After the vectors are obtained from the BiLSTM, an attention mechanism aggregates the information from different positions of the text. The aggregated vector is then reduced in dimension through a fully connected layer and an output layer to produce a score for each interest label. (2) A classification method for interest labels based on network attributes is proposed. To preserve the semantic information in the heterogeneous information network, a GAN is used to obtain heterogeneous node embeddings from the network structure information. The similarity between scholars is computed from these embeddings to obtain the interest label scores of the scholars to be predicted. (3) The classification result of the whole model is obtained by a weighted vote that combines the text-based and network-based predictions according to their prediction accuracy, giving the overall score of each interest label. Compared with the solution to the scholars' interest mining task in the 2017 Open Academic Data Challenge, the method presented in this thesis improves the accuracy score by 2.08%. In addition, a Web-based display application is built to query scholars' interest labels and other relevant information based on the mining results.
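The similarity-based scoring in (2) and the weighted vote in (3) can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure, the weighting formula, or any names, so everything here is an assumption made for illustration: cosine similarity, weights proportional to each model's accuracy, the function names, the toy embeddings, and the example labels are all hypothetical and are not taken from the thesis.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def network_label_scores(target_vec, labeled):
    """Score labels for one scholar from similarity to labeled scholars.

    labeled: list of (embedding, set_of_labels) pairs (hypothetical format).
    """
    scores = {}
    for vec, labels in labeled:
        sim = max(cosine(target_vec, vec), 0.0)  # ignore negative similarity
        for label in labels:
            scores[label] = scores.get(label, 0.0) + sim
    return scores


def normalize(scores):
    """Rescale scores so they sum to 1 (left unchanged if all are zero)."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total else dict(scores)


def weighted_vote(text_scores, net_scores, text_acc, net_acc):
    """Fuse two score dicts with weights proportional to model accuracy.

    Accuracy-proportional weights are an assumption, not the thesis formula.
    """
    w_text = text_acc / (text_acc + net_acc)
    w_net = 1.0 - w_text
    labels = set(text_scores) | set(net_scores)
    return {
        lab: w_text * text_scores.get(lab, 0.0) + w_net * net_scores.get(lab, 0.0)
        for lab in labels
    }


def top_k(scores, k):
    """Return the k highest-scoring labels, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [lab for lab, _ in ranked[:k]]


# Toy data: two labeled scholars with 2-d embeddings (illustrative only).
labeled = [([1.0, 0.0], {"data mining"}), ([0.0, 1.0], {"computer vision"})]
net = normalize(network_label_scores([0.9, 0.1], labeled))
# Hypothetical output of the text-based classifier for the same scholar.
text = {"data mining": 0.6, "computer vision": 0.1, "machine learning": 0.3}
fused = weighted_vote(text, net, text_acc=0.6, net_acc=0.4)
print(top_k(fused, 2))
```

In this toy run the label supported by both views ranks first, and a label seen only by the text model still contributes in proportion to the text model's weight. A real system would replace the toy vectors with the learned heterogeneous node embeddings and the toy text scores with the BiLSTM-attention output.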
Keywords/Search Tags:heterogeneous network embedding, Recurrent Neural Network, scholar network, interest label mining