Topic Chain-based Topic Information Extraction From Chinese Food Complaint Documents

Posted on: 2013-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Tian
Full Text: PDF
GTID: 2248330395472419
Subject: Computer software and theory
Abstract/Summary:
In recent years, many food safety incidents have come to public attention, for instance the “melamine” and “clenbuterol” cases. These incidents have pushed food safety to a new level of concern. Food safety affects not only public health but also the stability of the country. Such frequent incidents do significant harm to the people and the economy, and they also erode consumers' confidence in products, in the market, and even in the country.

Because hidden dangers exist in food and the government encourages consumers to file complaints through public websites, a large number of food complaint documents have accumulated. A method to organize these documents and extract information from them could provide early warning of food safety issues.

Topic chain-based topic information extraction from Chinese food complaint documents can make food safety information management more general and intelligent. Traditional information extraction extracts only individual words or single pieces of information. In this thesis, by contrast, we introduce a domain ontology to guide extraction from Chinese food complaint documents in the form of topic chains. This method can extract the dangerous topics. Moreover, it can clarify the topics and make the extraction results more intuitive.

The work consists of four basic modules, in order: the “pretreatment module”, the “dynamically allocating windows module”, the “extracting topic chain module”, and the “semantics clarifying module”. The four modules cooperate as follows. The “pretreatment module” parses the food-domain ontology and builds the knowledge base. The “dynamically allocating windows module” first analyzes the attributes of the words produced by pretreatment, then dynamically allocates windows according to those attributes and stores the related information. The “extracting topic chain module” removes duplicates among the extracted topic-related words and integrates the results, then extracts the first topic chain and the second topic chain, which represent the topic information. The “semantics clarifying module” automatically clarifies the Chinese food complaint documents according to the dangerous topic obtained from the topic chains. With this method, consumers can intuitively see the hot topics in the complaint documents, and supervisory departments can monitor them more conveniently.
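The four-module flow described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not define how windows are sized, how a topic chain is built, or how the first and second chains are chosen, so every rule below is an assumption made for illustration. The knowledge base is a hypothetical five-term dictionary standing in for the parsed food-domain ontology, and English tokens are used for readability. Real Chinese complaint documents would first need a word segmenter.

```python
import re

# Hypothetical stand-in for the knowledge base built from the food-domain
# ontology: each term maps to an attribute. Terms and attributes are
# illustrative only.
KNOWLEDGE_BASE = {
    "milk": "food",
    "yogurt": "food",
    "pork": "food",
    "melamine": "hazard",
    "clenbuterol": "hazard",
}

# Assumed rule: hazard terms get a wider context window than food terms.
WINDOW_RADIUS = {"hazard": 4, "food": 2}


def pretreat(text):
    """Tokenize and keep (position, word, attribute) for knowledge-base terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [(i, t, KNOWLEDGE_BASE[t]) for i, t in enumerate(tokens)
            if t in KNOWLEDGE_BASE]


def allocate_windows(hits):
    """Give each hit a (start, end) token window sized by its attribute."""
    return [(pos - WINDOW_RADIUS[attr], pos + WINDOW_RADIUS[attr], word, attr)
            for pos, word, attr in hits]


def extract_topic_chains(hits, windows):
    """Build one chain per hazard term from the terms inside its window."""
    chains = {}
    for start, end, anchor, attr in windows:
        if attr != "hazard":
            continue  # in this sketch only hazard windows seed a chain
        chain = chains.setdefault(anchor, [anchor])
        for pos, word, _ in hits:
            if start <= pos <= end and word not in chain:  # drop repeats
                chain.append(word)
    # Assumed ranking: longest chain first, ties broken alphabetically.
    ranked = sorted(chains.values(), key=lambda c: (-len(c), c[0]))
    first = ranked[0] if ranked else []
    second = ranked[1] if len(ranked) > 1 else []
    return first, second


def clarify(first_chain):
    """Label the document with the hazard that heads the first topic chain."""
    return first_chain[0] if first_chain else None


def run_pipeline(text):
    """Run the four stages in order and collect their results."""
    hits = pretreat(text)
    windows = allocate_windows(hits)
    first, second = extract_topic_chains(hits, windows)
    return {"first_chain": first, "second_chain": second,
            "dangerous_topic": clarify(first)}
```

Calling `run_pipeline` on a short complaint that mentions milk and yogurt near "melamine" and pork near "clenbuterol" yields one chain per hazard term, with the longer chain reported first and its hazard used as the document's dangerous topic.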
Keywords/Search Tags:Topic chain, Domain ontology, Similarity computing, Topic Information Extraction