
Building A Connected Semantic Knowledge Base Using Heterogeneous Chinese Encyclopedias

Posted on: 2014-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: X Niu
Full Text: PDF
GTID: 2248330392960908
Subject: Computer Science and Technology
Abstract/Summary:
Humankind has never stopped archiving knowledge and building knowledge bases: from the ancient Yongle Encyclopedia and the authoritative Encyclopedia Britannica to the global, free, online Wikipedia. Today, highly developed computing techniques enable us to build machine-readable semantic knowledge bases, which support reasoning and pave the way toward the magnificent goal of artificial intelligence. Recently, increasing attention has been paid to building semantic knowledge bases using Web encyclopedias and Semantic Web techniques. In the English-language community, such work has been under way for years and commercial products have begun to spring up, but little related work can be found for the Chinese-language community. The number of lemmas in Chinese online encyclopedias (e.g. Baidu Baike and Hudong Baike) matches that of English Wikipedia. We therefore exploited these abundant resources to explore effective ways to build a Chinese semantic knowledge base. We employed heuristic rules, Chinese word segmentation, and association rule mining algorithms to accomplish semantic information extraction, cleansing, and mining. Integrating the data harvested from heterogeneous data sources is the key challenge of this project. In particular, we propose a semi-supervised learning algorithm that iteratively refines matching rules and finds new high-confidence matches based on these rules. This dramatically relieves users of the burden of defining rules and similarity metrics while still giving high-quality matching results. Finally, we briefly introduce how we publish our connected Chinese knowledge base (Zhishi.me) according to the Linked Data standard.
Keywords/Search Tags: Knowledge Base, Linked Data, Encyclopedia, Chinese, Semantic Web