Research On Association Rules Mining Based On Multi-topic Classification And Named Entity Recognition

Posted on: 2022-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: J R Zhang
GTID: 2518306758450354
Subject: Master of Engineering
Abstract/Summary:
With the rapid development of Internet-based social media in recent years, data is growing exponentially. Faced with such a large amount of data, how to mine the value behind it is a constant focus. Data mining is the process of discovering hidden information in large amounts of data through algorithms. Common data mining techniques include association rules, classification, clustering, and prediction. Traditional association rule mining algorithms suffer from low data value density and low novelty of frequent itemsets, which leads to poor mining results. In recent years, researchers have found that entities in text data can be identified efficiently with natural language processing techniques, and satisfactory results have been achieved in addressing the low value density of text data. In addition, given the breakthroughs of deep learning in many fields, applying deep learning techniques to association rule mining can alleviate the low novelty of frequent itemsets and thus improve the effectiveness of data mining.

This thesis therefore focuses on deep learning and natural language processing techniques for association rule mining. First, named entity recognition, a natural language processing technique, is used to process the text data and obtain the transaction data required for association rule mining, i.e., the entity sets. Then, multi-topic classification, a deep learning technique, is used for deeper mining to address the low novelty of the mined rules. Finally, on this basis, an association rule mining system is designed and implemented using multi-topic classification and named entity recognition. The specific research contents are as follows.

(1) Chinese named entity recognition based on deep learning. The same word may carry different meanings in different contexts, especially in Chinese text, which lowers the accuracy of Chinese named entity recognition. To address this problem, this thesis first uses a Text Convolutional Neural Network (TextCNN) to recognize the meaning of each word and a Bi-directional Long Short-Term Memory network (BiLSTM) to represent the contextual information. It then introduces a text classification method to classify the text data and thereby constrain the semantics. Finally, it identifies different entities according to the classification results. The experimental results show that adding text classification to Chinese named entity recognition improves recognition accuracy.

(2) Association rule mining based on multi-topic classification and named entity recognition. The patterns obtained from data mining are not always novel, and the novelty of the mined patterns should be improved as much as possible. This thesis uses named entity recognition to extract the entities in the text, which form the transaction data; it then uses a deep learning-based multi-topic classification method to classify the text; finally, it combines the different categories of data for association rule mining. The experiments show that the novelty of the frequent itemsets mined by association rules based on multi-topic classification and named entity recognition is improved.

(3) Design and implementation of an association rule mining system based on multi-topic classification and named entity recognition. The data mining system is developed with an object-oriented programming approach, using PyCharm as the development environment and the association rule mining method based on multi-topic classification and named entity recognition. The system first preprocesses the data, then performs Chinese named entity recognition with the deep learning model, and finally mines the data using the combination of multi-topic classification and named entity recognition.
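
The idea of treating each text's entity set as a transaction and mining rules within topic categories can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy documents, topic labels, thresholds, and the function names `mine_pair_rules` and `rules_by_topic` are all hypothetical, the entity sets are assumed to come from an upstream named entity recognition step, and mining each topic separately is only one simple reading of how the categories might be combined. Only rules between pairs of entities are considered.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (topic label, entity set) pairs. In the pipeline
# described above these would come from a topic classifier and an NER model.
documents = [
    ("finance", {"Bank of China", "Beijing", "yuan"}),
    ("finance", {"Bank of China", "yuan"}),
    ("finance", {"Bank of China", "Beijing"}),
    ("sports", {"Beijing", "Yao Ming"}),
    ("sports", {"Yao Ming", "NBA"}),
]


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence) between pairs of items."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)  # sort so each pair has one canonical order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


def rules_by_topic(docs, **thresholds):
    """Group entity sets by topic label, then mine rules within each topic."""
    grouped = {}
    for topic, entities in docs:
        grouped.setdefault(topic, []).append(entities)
    return {topic: mine_pair_rules(ts, **thresholds) for topic, ts in grouped.items()}


if __name__ == "__main__":
    for topic, rules in rules_by_topic(documents).items():
        for lhs, rhs, support, confidence in rules:
            print(f"[{topic}] {lhs} -> {rhs} "
                  f"(support={support:.2f}, confidence={confidence:.2f})")
```

Grouping by topic before counting means support and confidence are computed relative to each category, so an entity pair that is rare in the whole corpus can still surface as a rule within its own topic.
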
Keywords/Search Tags: Chinese Named Entity Recognition, Neural Network, Text Classification, Data Mining