Analysis of Subject Indexing Rules Based on Syntactic Analysis

Posted on: 2018-02-07  Degree: Master  Type: Thesis
Country: China  Candidate: D Zhao  Full Text: PDF
GTID: 2348330521451313  Subject: Books intelligence
Abstract/Summary:
With the development of Internet databases, people rely mainly on the Internet when they look for information, and a large amount of relevant information of every kind is now available online. Because the volume of literature is so large, the workload of subject indexing staff is heavy, and indexers often do not know where to begin. To meet the needs of the times and of users, and to reduce the burden on indexers, it is necessary to establish a method that can automatically identify the subject of a document. Given the particular nature of subject indexing, this paper attempts to analyze subject indexing rules using syntactic parsing methods.

The work makes three contributions. First, a syntactic parsing tool is used to obtain the required lexical information from the literature to be indexed. Second, the relevant theme structure is derived from the proportion of the required lexical information in the indexed literature and from the relationships among those lexical items. Third, subject indexing of the literature is achieved by refining the required lexical information. In addition, improvements to the syntactic analysis method are proposed for special sentences that contain multiple topics, a topic of a topic, or nested topics...

This paper applies syntactic parsing only to the titles of the literature and obtains a syntax tree structure for each. Different designs are made for different syntactic parsing methods, with the purpose of obtaining the relationship between the various factors and the indexing terms. The main steps are as follows.

First, the topic sentences are extracted and analyzed by syntactic parsing. Several analysis methods are designed for topic sentences with different syntactic structures, so that the words or phrases required for the subject indexing of each document can be obtained. As a novel step, the probability of these words and phrases in the literature is also calculated.

Second, the theme structure is summarized from the extracted topic sentences: the content of the theme is extracted, analyzed by syntactic parsing, and turned into a syntax tree. Using the POS tagging, phrase structure, and sentence component labeling produced by the parser, the dependency relationships between words and phrases are analyzed. For each sentence structure, a different syntactic parsing method is used to obtain the words or phrases for indexing, and these are matched to the relevant theme factors. For sentences with the same structure, the probability that the required indexing words or phrases occur in the literature is calculated, and words with high probability are recorded as the significant theme factors to index.

Finally, topic sentences with special structures are considered. Complex text and special structures are analyzed in terms of their syntactic meaning and structure, and an improved method is proposed for each, so that syntactic parsing can be applied to subject indexing more effectively. The paper also analyzes the problems of the syntax analysis method it uses, so that the method can deliver value in actual indexing.
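The probability step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes the titles have already been parsed into (word, role) pairs, and it estimates by relative frequency how often a word in a given syntactic role is also an indexing term. The role labels (`head`, `modifier`), the toy data, the function names, and the 0.6 threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: each title is already parsed into (word, role)
# pairs and comes with the indexing terms assigned by a human indexer.
parsed_titles = [
    {"tokens": [("study", "head"), ("library", "modifier"), ("automation", "modifier")],
     "index_terms": {"library", "automation"}},
    {"tokens": [("analysis", "head"), ("syntax", "modifier")],
     "index_terms": {"syntax", "analysis"}},
]

def role_probabilities(titles):
    """Estimate P(word is an indexing term | syntactic role) by relative frequency."""
    seen = Counter()  # how many words occurred in each role
    hits = Counter()  # how many of those words were indexing terms
    for title in titles:
        for word, role in title["tokens"]:
            seen[role] += 1
            if word in title["index_terms"]:
                hits[role] += 1
    return {role: hits[role] / seen[role] for role in seen}

def candidate_terms(tokens, probs, threshold=0.6):
    """Keep the words whose role has a high enough estimated probability."""
    return [word for word, role in tokens if probs.get(role, 0.0) >= threshold]

probs = role_probabilities(parsed_titles)
new_title = [("survey", "head"), ("retrieval", "modifier")]
candidates = candidate_terms(new_title, probs)
```

With this toy data, words in the `modifier` role are always indexing terms and words in the `head` role are indexing terms half the time, so only `retrieval` is proposed for the new title. A real system would take the roles from a parser's sentence component labels and would need a much larger indexed corpus for the estimates to be meaningful.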
Keywords/Search Tags: Subject indexing, Syntax analysis, Sentence component labeling, Topic sentence, Syntax tree