Research And Implementation Of Text Categorization Algorithm In Knowledge Management System

Posted on:2019-02-07

Degree:Master

Type:Thesis

Country:China

Candidate:W Zhang

Full Text:PDF

GTID:2348330542998634

Subject:Software engineering

Abstract/Summary:

With the rapid development of the mobile Internet, knowledge is updated ever faster, and the Internet is full of all kinds of knowledge data. In addition, with advances in information retrieval and search technology, the Internet has become the most common way for people to obtain information. Most knowledge on the Internet exists in the form of text. To manage this text effectively and retrieve the target information more easily, text-based information retrieval and data mining have attracted great interest in recent years. Text classification is the basis of both. Its main task is to learn the mapping between text content and text category from labeled examples, train a classification model on that data, and then use the model to predict the category of previously unseen text. Text categorization is widely used in natural language processing (NLP), information retrieval, information filtering, digital libraries, news classification, and other fields.

A knowledge management system is a distributed system for storing and managing large-scale collections of knowledge documents. To improve its efficiency and usability, the system needs to classify documents by field, which improves the speed and accuracy of retrieval. Because the number of texts is huge, manual classification is not feasible, so an efficient and highly accurate method is needed to classify the texts automatically.

This thesis studies the characteristics of the texts in a knowledge management system. Feature extraction is difficult because these texts are high-dimensional, sparse, and polysemous. Two problems follow. On the one hand, some traditional machine learning methods use only keywords as features, which loses contextual semantic information and leads to low classification accuracy. On the other hand, improving accuracy requires complex feature engineering designed for a specific classification domain, which increases the design cost of the model. In short, current text classification methods still need improvement. To address these two problems, this thesis proposes representing text with word embeddings, using deep learning algorithms to extract text features and perform classification automatically, and using multi-model fusion to improve the classification model. Experimental results show that the deep-learning-based text classification method achieves good classification performance and that fusing multiple models does improve accuracy.
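The abstract does not say how the multiple models are fused. As a rough illustration only, the sketch below assumes one common approach, soft voting, in which each model outputs a probability per category and the fused prediction is the weighted average of those probabilities. The function names (`fuse_soft_voting`, `predict_label`), the category labels, and the probability values are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Optional


def fuse_soft_voting(
    model_probs: List[Dict[str, float]],
    weights: Optional[List[float]] = None,
) -> Dict[str, float]:
    """Return the weighted average of per-model class probabilities.

    Assumption: soft voting stands in for the unspecified fusion method.
    """
    if not model_probs:
        raise ValueError("need at least one model's output")
    if weights is None:
        weights = [1.0] * len(model_probs)  # equal weight per model
    if len(weights) != len(model_probs):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Collect every label that any model reports; a missing label counts as 0.
    labels = sorted(set().union(*model_probs))
    return {
        label: sum(w * p.get(label, 0.0) for w, p in zip(weights, model_probs)) / total
        for label in labels
    }


def predict_label(fused: Dict[str, float]) -> str:
    """Pick the category with the highest fused probability."""
    return max(fused, key=fused.get)


# Hypothetical outputs of three classifiers for a single document.
model_a = {"finance": 0.6, "law": 0.3, "medicine": 0.1}
model_b = {"finance": 0.3, "law": 0.6, "medicine": 0.1}
model_c = {"finance": 0.5, "law": 0.4, "medicine": 0.1}

fused = fuse_soft_voting([model_a, model_b, model_c])
print(predict_label(fused))
```

Here the individual models disagree (the first and third favor one category, the second another), and averaging resolves the disagreement in favor of the category with the most combined support. Unequal `weights` could be used to trust a stronger model more; whether the thesis does so is not stated.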

Keywords/Search Tags:

knowledge management system, natural language processing, text categorization, deep learning

Related items

1	Research On Text Categorization Technology Based On Deep Learning
2	Intelligent Device Text Classification Method Based On Natural Language Processing
3	Design And Implementation Of Knowledge Extraction Algorithm Based On Natural Language Processing
4	Research And Application Of Automatic Text Summarization Technology Based On Deep Learning
5	Research On Models Of Generating SQL Statement Through Natural Language Based On Knowledge Graph
6	Natural Language Processing Of Ancient Books Of Chinese Traditional Medicine Based On Deep Learning
7	Research And Implementation Of Short Text Classification Model Based On Course Knowledge Points
8	Research On Machine Learning For Natural Language Processing And Transmission
9	Research And Application On Method Of Generating SQL Through Natural Language Based On Interactive Information Editing
10	Design And Implementation Of Intelligent Text Circulation System For Government And Enterprise Based On Deep Learning