Keyword Extraction Based On Statistic And Syntactic Parsing

Posted on:2013-11-03

Degree:Master

Type:Thesis

Country:China

Candidate:Q Wu

Full Text:PDF

GTID:2268330401982979

Subject:Computer software and theory

Abstract/Summary:

With the continuous development of the Internet, vast amounts of information appear every day. This explosive growth is a difficult and important problem for natural language processing. How to manage massive data effectively, and how to accurately identify which information people actually need, has become a problem that must be solved. Keyword extraction addresses this: if high-quality keywords can be extracted from an article, they help people identify and distinguish the relevant information. Automatic keyword extraction can be widely applied in many fields, such as text classification, information feedback systems, network information filtering, information retrieval, digital libraries, and automatic summarization.

This thesis uses a keyword extraction algorithm based on term frequency (TF) statistics and syntactic parsing. The work involves Chinese word segmentation, parsing, syntactic analysis, and keyword extraction. The main contents are as follows:

1. It elaborates on the theoretical approaches to automatic Chinese keyword extraction and their experimental analysis, and proposes a keyword extraction algorithm based on TF statistics and parsing.
2. It introduces Chinese word segmentation in detail and summarizes segmentation ambiguities. It then describes and compares several of the more mature word segmentation algorithms. Based on the experimental data, the Chinese Academy of Sciences segmentation system, whose results were significantly better than those of the other algorithms tested, was selected as the tool for the preliminary work. A statistical method based on practical application is then proposed to further divide the initial segmentation produced by the Chinese Academy of Sciences system.
3. It describes in detail the two most popular approaches to syntactic analysis: rule-based and statistics-based. After comparing the two and drawing on the research and analysis of other scholars, it adopts a combination of both approaches to build the treebank.
4. For parsing algorithms, it briefly introduces the more popular methods and describes the widely recognized Chart algorithm in detail.
5. In syntactic analysis, sentence structure information is extracted using the University of Pennsylvania’s Penn corpus. According to the practical application of Chinese grammar, sentence elements are assigned different levels of value.
6. Finally, through statistical and grammatical analysis, six kinds of feature values are obtained as weight parameters, and each is explained and analyzed in detail.
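As a rough illustration of how TF statistics and sentence-element weights might be combined, the following is a minimal Python sketch. It is not the thesis's algorithm: it assumes the text has already been segmented and parsed into (word, role) pairs, it uses only two features (term frequency and a role weight) rather than the six the abstract mentions, and the role names, weight values, and the function name score_keywords are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical weights per sentence element; the abstract does not state
# the actual values or the full set of roles used in the thesis.
ROLE_WEIGHTS = {"subject": 1.0, "object": 0.9, "predicate": 0.8, "modifier": 0.5}
DEFAULT_WEIGHT = 0.3  # assumed fallback for roles not listed above


def score_keywords(tagged_sentences, role_weights=ROLE_WEIGHTS, top_k=3):
    """Rank words by term frequency times their best syntactic-role weight.

    tagged_sentences: list of sentences, each a list of (word, role) pairs,
    assumed to come from an upstream segmenter and parser.
    """
    counts = Counter()
    best_role = defaultdict(float)
    total = 0
    for sentence in tagged_sentences:
        for word, role in sentence:
            counts[word] += 1
            total += 1
            weight = role_weights.get(role, DEFAULT_WEIGHT)
            best_role[word] = max(best_role[word], weight)
    if total == 0:
        return []
    # Relative frequency scaled by the highest-weighted role the word filled.
    scores = {w: (c / total) * best_role[w] for w, c in counts.items()}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    doc = [
        [("parser", "subject"), ("builds", "predicate"), ("tree", "object")],
        [("parser", "subject"), ("uses", "predicate"), ("chart", "modifier")],
    ]
    for word, score in score_keywords(doc):
        print(word, round(score, 3))
```

Taking the maximum role weight per word is one design choice among several; averaging the weights over all occurrences would instead favor words that consistently fill important roles.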

Keywords/Search Tags:

Keyword extraction, Syntax analysis, segmentation

Related items

1	Research On Keyword Extraction Technology Oriented To Conversational Text
2	Research On Chinese Word Segmentation And Keyword Extraction Model Based On Deep Learning
3	Research On Keyword Extraction And Sentiment Analysis For Chinese Text
4	The Effective Text Keyword Extraction Technologies And Their Applications
5	Chinese Keyword Extraction And Analysis Based On Tourism Weibo
6	Research And Application Of Collaborative Filtering Algorithm Based On Keyword Extraction Technology
7	Research On Multi Feature Based Extract Text Keyword Algorithm
8	Research On Patent Novelty Analysis Technology Based On Keyword Extraction
9	Automatic Abstract Extraction Based On Keyword And Graph Model
10	Keyword Automatic Extraction Based On Similar Documents