Keyword Extraction Using A Graph-Based Approach

Posted on:2020-08-17

Degree:Master

Type:Thesis

Country:China

Candidate:R K Tarique Khan

Full Text:PDF

GTID:2428330575956326

Subject:Electronics and Communications Engineering

Abstract/Summary:

The volume of text on the Internet is growing so quickly that retrieving the information relevant to a given user has become difficult. This problem has motivated a large body of research in information retrieval and text analytics, and keyword extraction is currently an active research topic within it. Data for observation and analysis comes in many forms, including graph data. Data can also be generated by users, for example on social media, Wikipedia, or other resources. Many people generate their own data on Twitter, a social media service regarded as one of the most popular platforms for crawling short text because each tweet is limited to 140 characters.

Keyword extraction is a process in which a text is given to the computer and the computer returns a set of keywords: topical words and phrases drawn from the content of the documents. Keywords help the reader grasp the summary, or at least the core idea, of a document without reading all of it, so prospective readers do not waste valuable time reading irrelevant documents in full. By searching for keywords, users can generally find posts related to an event. Keyword extraction methods are applied in many areas and are of particular interest in information retrieval, because people retrieve significant information based on keywords.

In this thesis, we apply a graph-based keyword extraction algorithm to four datasets collected from Twitter for different terms. The datasets are preprocessed with NLTK to obtain cleaner data, and a co-occurrence graph is generated from the preprocessed data. We also examine whether studying co-occurrences makes it possible to keep track of the structure of each text; co-occurrence graphs are, however, more tedious to handle and often lead to messy visualizations. Many libraries are available for visualization, and Python is a dependable choice for plotting because it offers many ready-made libraries. TextRank is a graph-based keyword extraction algorithm that follows Google's PageRank algorithm but differs in that it operates on words and the links between them. TextRank computes a score for every relevant word, these scores identify the most important words in the corpus, and the precision of those words is then measured. Word clouds are also gaining popularity as a visualization, and many word clouds in different styles can be found on the Internet. The experimental evaluation of the proposed work uses real data crawled from Twitter.
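The pipeline described in the abstract (preprocess, build a co-occurrence graph, score words with TextRank, measure precision) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the thesis implementation: the stopword list, the window size, the damping factor, the sample tweets, and all function names are assumptions made for this example. The regex tokenizer stands in for the NLTK preprocessing the abstract mentions, and the abstract does not say how precision is computed, so `precision_at_k` is only one plausible reading.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (assumption; the thesis preprocesses with NLTK).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and",
             "for", "on", "it", "this", "that", "with", "by", "from"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def cooccurrence_graph(docs, window=2):
    """Build an undirected weighted graph in which the weight of an edge is
    the number of times two distinct words occur within `window` tokens of
    each other in the same document."""
    graph = defaultdict(lambda: defaultdict(float))
    for doc in docs:
        tokens = tokenize(doc)
        for i, w in enumerate(tokens):
            for v in tokens[i + 1 : i + window + 1]:
                if v != w:
                    graph[w][v] += 1.0
                    graph[v][w] += 1.0
    return graph

def textrank(graph, damping=0.85, iterations=100, tol=1e-6):
    """Weighted PageRank-style iteration over the word graph."""
    if not graph:
        return {}
    scores = {w: 1.0 for w in graph}
    # Total edge weight leaving each node, used to normalise its votes.
    strength = {w: sum(nbrs.values()) for w, nbrs in graph.items()}
    for _ in range(iterations):
        new = {}
        for w in graph:
            rank = sum(graph[v][w] / strength[v] * scores[v] for v in graph[w])
            new[w] = (1.0 - damping) + damping * rank
        delta = max(abs(new[w] - scores[w]) for w in graph)
        scores = new
        if delta < tol:
            break
    return scores

def precision_at_k(ranked, gold, k):
    """Fraction of the top-k ranked words that appear in a reference set."""
    if k <= 0:
        return 0.0
    return sum(1 for w in ranked[:k] if w in gold) / k

# Hypothetical sample tweets, for illustration only.
tweets = [
    "Graph based keyword extraction with TextRank",
    "TextRank builds a word graph from co-occurrence",
    "Keyword extraction ranks words in the graph",
]
scores = textrank(cooccurrence_graph(tweets))
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here `ranked` lists the words from highest to lowest score, and `precision_at_k(ranked, gold, k)` compares the top `k` of them against a manually chosen reference set `gold`. In practice a graph library would likely replace the hand-written iteration; it is spelled out here only to show how each word's score depends on the scores of its neighbours.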

Keywords/Search Tags:

Graph-Based

Related items

1	Graph-based Pattern Recognition And Its Applications In Computer Vision
2	Reach On Graph Matching Algorithms Based On The Graph Embedded
3	Research And Implementation Of Scene Graph Retrieval Method Based On Graph Theory
4	A Research On Graph Computation Platform In A Single PC By Application Of Graph Algorithm Tuned Graph Representation Format
5	The Study Of Structure-based Graph Classification Algorithm
6	Multilevel Graph Partitioning Based On Weighted Laplacian Method
7	Hybrid Graph Query And Graph Computing Engine For Distributed Graph Database
8	The Design And Implementation Of Graph Processing Middleware On Infosphere Streams
9	Research And Implementation Of Parallel Analysis And Mining System Based On Graph OLAM
10	Research On Large Graph Aggregation Algorithm Based On Finite Memory