
Research on Building Lexical Chains for Vietnamese News Based on Fused Network Semantic Knowledge

Posted on: 2016-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: W X Yu
GTID: 2308330470470752
Subject: Computer technology
Abstract/Summary:
With the rapid development of computing technology, and of Internet technology in particular, information technology is having a profound impact on people's lives. The ocean of text data formed by daily news events creates an urgent need for efficient text information processing services. Text information processing includes text categorization, text clustering, text mining, approximate query processing and related tasks. Keyword extraction and lexical chain construction are widely used in all of them: they are both the basis and the prerequisite for carrying out these tasks, and an important part of building databases of Internet information. Automatic keyword extraction and lexical chain construction underlie information retrieval and summary generation, and are widely applied in Web page retrieval, document clustering, document summary extraction and text mining.

First, the thesis briefly introduces natural language processing, background on text information, and preprocessing topics such as feature items. It analyzes and compares commonly used keyword extraction algorithms, discusses related work including the GenEx system for English keyword extraction and Naive Bayes, PAT tree and maximum entropy approaches for Chinese text processing, and classifies these methods.

Then, taking the features of the Vietnamese language into account, it presents TFLD (Term Frequency, Location & Distance), an algorithm that computes the keyword weight of each candidate word from three text feature items: word frequency, regional location, and word distance. The algorithm is used to extract keywords from Vietnamese news events.

At the same time, lexical chains express the coherence created by semantic relationships between words, and so provide clues about the structure and themes of the news.
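The abstract names the three TFLD features but does not give the weighting formula. The following Python sketch is therefore only an illustration of how frequency, location and distance could be combined into one score. The function names, the linear combination, the `title_len` parameter and the weights `w_tf`, `w_loc`, `w_dist` are all assumptions and are not taken from the thesis. The input is assumed to be already word-segmented.

```python
from collections import defaultdict

def tfld_scores(tokens, title_len=0, w_tf=1.0, w_loc=1.0, w_dist=1.0):
    """Score candidate words by frequency, location and distance.

    Hypothetical formulation (not the thesis formula):
      tf   = occurrences / total tokens
      loc  = 1.0 if the word first appears in the title region,
             otherwise higher for earlier first occurrences
      dist = span between first and last occurrence / total tokens
    """
    n = len(tokens)
    if n == 0:
        return {}
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        positions[tok].append(i)

    scores = {}
    for term, pos in positions.items():
        tf = len(pos) / n
        loc = 1.0 if pos[0] < title_len else 1.0 - pos[0] / n
        dist = (pos[-1] - pos[0]) / n
        scores[term] = w_tf * tf + w_loc * loc + w_dist * dist
    return scores

def top_keywords(tokens, k=3, **weights):
    """Return the k highest-scoring words, ties broken alphabetically."""
    scores = tfld_scores(tokens, **weights)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

# Toy segmented headline plus body; diacritics dropped to keep the example ASCII.
tokens = ["bao", "mien_trung", "bao", "mua", "thiet_hai", "bao"]
print(top_keywords(tokens, k=2, title_len=2))
```

In this toy input the first two tokens are treated as the title region, so words that appear there and recur across the text rank highest. A real system would need a Vietnamese word segmenter and stop-word filtering before this step.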
Lexical chains are then constructed by combining the networked semantic knowledge resources HowNet, WordNet and Wikipedia and applying semantic relatedness and word sense disambiguation; the resulting chains express the clue information of news events. Finally, building on the above work, a prototype system is implemented for multilingual news event keyword extraction and lexical chain construction.
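The abstract does not describe the chain-building procedure in detail. As a rough illustration of the general idea, the Python sketch below uses a common greedy strategy: each word joins the existing chain it is most related to, or starts a new chain if no relatedness reaches a threshold. The `TOY_RELATEDNESS` table is a hypothetical stand-in for scores that would come from HowNet, WordNet or Wikipedia, and word sense disambiguation is omitted entirely.

```python
# Hypothetical relatedness scores; a real system would derive these
# from a semantic knowledge base rather than a hand-written table.
TOY_RELATEDNESS = {
    frozenset(("storm", "flood")): 0.8,
    frozenset(("flood", "rain")): 0.7,
    frozenset(("minister", "government")): 0.9,
}

def toy_related(a, b):
    """Return a relatedness score in [0, 1] for two words."""
    if a == b:
        return 1.0
    return TOY_RELATEDNESS.get(frozenset((a, b)), 0.0)

def build_lexical_chains(words, related=toy_related, threshold=0.5):
    """Greedily group words into chains of semantically related words."""
    chains = []
    for word in words:
        best_chain, best_score = None, 0.0
        for chain in chains:
            # Strength of the link between the word and this chain.
            score = max(related(word, member) for member in chain)
            if score > best_score:
                best_chain, best_score = chain, score
        if best_chain is None or best_score < threshold:
            chains.append([word])       # no chain is close enough
        elif word not in best_chain:
            best_chain.append(word)     # extend the closest chain
    return chains

words = ["storm", "minister", "flood", "government", "rain", "storm"]
for chain in build_lexical_chains(words):
    print(chain)
```

Each resulting chain groups words around one theme of the text, which is the sense in which lexical chains give clues about the structure and topics of a news item.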
Keywords/Search Tags: Natural language processing, Keyword extraction, Semantic knowledge base, Disambiguation, Lexical chain construction