Massive Academic Resources Classification Research For Personalized Recommender

Posted on:2018-10-08

Degree:Master

Type:Thesis

Country:China

Candidate:Y Gao

Full Text:PDF

GTID:2348330536986033

Subject:Engineering

Abstract/Summary:

With the advent of the big data era, information resources have grown explosively. Hundreds of millions of academic resources are produced every year. These resources can be of great help to students, teachers and researchers, but their mass production also creates problems of resource organization and retrieval, so that a large number of high-value resources are buried. An effective way to solve this problem is to build a personalized recommendation system for academic resources based on big data technology, helping users efficiently find the academic resources they really need by letting "resources take the initiative to match the user". An outstanding recommendation system needs quick access to resources, accurate classification and organization, and a personalized model built on user behavior.

This thesis studies in depth the automatic classification of massive academic resources for personalized recommendation. Taking into account the characteristics of different types of academic resources, we design different classification models, including single-classifier and multiple-classifier models, and introduce a keyword extension method based on collaborative filtering to solve the problem of an inadequate corpus. Classification accuracy is improved by analyzing each type of resource on its own terms. The main work of this thesis is as follows:

(1) We analyze the characteristics of thesis and patent data, select the Bayes model as the target classifier, and describe the keyword extraction algorithm in detail.
(2) We propose a related-keyword extension method based on collaborative filtering to address the lack of learning samples for news and blogs. The method increases the amount of information available to the classifier and thereby improves classification accuracy.
(3) We adopt ensemble learning for the conference title classification task. The method modifies the random forest classification model by replacing the underlying decision trees with Bayes classifiers. This retains the stability and generalization ability of random forest while avoiding the data sparseness problem that the vector space model causes for decision trees.
(4) We design and implement, on the Hadoop platform, the word segmentation task for massive academic resources, the extraction of TF-IDF parameters, and the training of the classification models, which addresses the low efficiency of processing massive text data in traditional standalone mode.
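
The ensemble idea in item (3) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not specify the Bayes variant, the sampling scheme, or the voting rule, so the sketch assumes multinomial naive Bayes with add-one smoothing, bootstrap sampling of documents, a random word subset per base learner, and majority voting. The names `MultinomialNB`, `BaggedNaiveBayes`, `n_estimators`, `feature_fraction` and `seed` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial naive Bayes restricted to a given subset of words."""

    def __init__(self, features):
        self.features = set(features)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            # Count only the words this base learner is allowed to see.
            self.word_counts[label].update(t for t in tokens if t in self.features)
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            # Add-one (Laplace) smoothing over the feature subset.
            denom = sum(counts.values()) + len(self.features)
            score = math.log(self.class_counts[label] / total)
            for t in tokens:
                if t in self.features:
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class BaggedNaiveBayes:
    """Random-forest-style ensemble whose base learners are naive Bayes models.

    Assumption: each learner is trained on a bootstrap sample of the documents
    and a random subset of the vocabulary; predictions are combined by vote.
    """

    def __init__(self, n_estimators=25, feature_fraction=0.6, seed=0):
        self.n_estimators = n_estimators
        self.feature_fraction = feature_fraction
        self.seed = seed

    def fit(self, docs, labels):
        rng = random.Random(self.seed)
        vocab = sorted({t for tokens in docs for t in tokens})
        k = max(1, int(len(vocab) * self.feature_fraction))
        n = len(docs)
        self.models = []
        for _ in range(self.n_estimators):
            rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
            feats = rng.sample(vocab, k)                 # random word subset
            model = MultinomialNB(feats).fit(
                [docs[i] for i in rows], [labels[i] for i in rows]
            )
            self.models.append(model)
        return self

    def predict(self, tokens):
        # Majority vote over the base learners.
        votes = Counter(m.predict(tokens) for m in self.models)
        return votes.most_common(1)[0][0]
```

To use the sketch, pass tokenized titles (lists of words) and their category labels to `BaggedNaiveBayes().fit(docs, labels)`, then call `predict(tokens)` on a new tokenized title. Because naive Bayes scores each word independently, a base learner can still vote when most of the vocabulary is absent from a short title, which is one plausible reading of why the abstract prefers it to decision trees on sparse vectors.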

Keywords/Search Tags:

text categorization, Bayes model, feature expansion, ensemble learning, Hadoop platform

Related items

1	Text Categorization Research Based On TAN Model
2	The Study Of Chinese Text Categorization Based On Naïve Bayes
3	A Study On Text Categorization Based On Machine Learning
4	Text Categorization Based On Naive Bayes Method
5	Text Categorization On Machine Learning Algorithm
6	Research On Feature Selection And Classification Methods For Text Categorization
7	On Bayesian Text Classification Learning Under MapReduce Framework
8	An Automatic Chinese Text Categorization System Based On Statistical Language Model
9	Design And Realization Of Automated Text Categorization System For Chinese Documents Based On Relevancy
10	Chinese Text Data Classification