
Research And Application Of Back-of-the-Book Indexes Generation Technology Based On Book Content

Posted on: 2018-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: D Z Yang
GTID: 2348330518475626
Subject: Computer Science and Technology
Abstract/Summary:
In today's information age, readers often lack the time to read a book from cover to cover. Back-of-the-book indexes help readers quickly locate proper nouns, and they also serve as important references for explaining academic terminology, proper nouns, and related issues. They are an important way to add value to a book. An index takes the units of information in the book's content as its entries, so readers can find and locate information quickly through the index phrases. Before reading, readers can use the index to pick out the information they need; after reading, they can use it to review what they have read. However, automatically generating indexes for Chinese books raises several problems. First, because Chinese grammar rules are more complex than those of English and Chinese part-of-speech tagging algorithms are not accurate enough, extracting candidate index terms with part-of-speech matching rules alone does not give good results. Second, supervised extraction of candidate index terms lacks a sufficient training corpus. Finally, whether a candidate becomes an index entry depends not only on its indexability and context, but also on the book's subject area and the users' interests. Therefore, on the one hand, we construct a set of high-frequency part-of-speech rules for Chinese terminology and combine them with mutual information, information entropy, and phrase-combination patterns to extract candidate index terms. On the other hand, taking into account the terminological status of the candidates, their indexability, their context weight, and their position, we develop an algorithm that automatically generates back-of-the-book indexes. The algorithm has been applied in the Knowledge Service System for Science and Engineering Books.
Keywords/Search Tags: Back-of-the-Book Indexes, Phrase Extraction, Online Reading
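
The abstract names mutual information and information entropy as signals for extracting candidate index terms, but gives no formulas or thresholds. The following is a minimal sketch of how such signals are commonly computed for adjacent word pairs, not the thesis's actual method. It assumes the text has already been segmented into a token list (for Chinese, by a word segmenter), and the function names `pmi`, `boundary_entropy`, and `candidates`, as well as both thresholds, are hypothetical. The part-of-speech rules and phrase-combination patterns mentioned in the abstract are not modelled here.

```python
import math
from collections import Counter, defaultdict


def pmi(tokens):
    """Pointwise mutual information (log base 2) of each adjacent token pair."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


def _entropy(counter):
    """Shannon entropy (bits) of a Counter of neighbour frequencies."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def boundary_entropy(tokens):
    """Map each adjacent pair to (left-neighbour entropy, right-neighbour entropy)."""
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    pairs = set()
    for i in range(len(tokens) - 1):
        pair = (tokens[i], tokens[i + 1])
        pairs.add(pair)
        if i > 0:
            left[pair][tokens[i - 1]] += 1
        if i + 2 < len(tokens):
            right[pair][tokens[i + 2]] += 1
    return {p: (_entropy(left[p]), _entropy(right[p])) for p in pairs}


def candidates(tokens, min_pmi, min_entropy):
    """Pairs that are cohesive (high PMI) and occur in varied contexts (high entropy).

    Both thresholds are illustrative assumptions, not values from the thesis.
    """
    pmi_scores = pmi(tokens)
    entropies = boundary_entropy(tokens)
    return sorted(
        pair
        for pair, score in pmi_scores.items()
        if score >= min_pmi and min(entropies[pair]) >= min_entropy
    )


tokens = ["the", "back", "index", "is", "a", "back", "index",
          "for", "each", "back", "index", "entry"]
result = candidates(tokens, min_pmi=2.0, min_entropy=1.0)
```

In this toy input, rare pairs such as ("is", "a") score a higher PMI than ("back", "index"), which is why the sketch pairs PMI with neighbour entropy: only a pair that recurs with varied neighbours on both sides passes both filters.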