
Research Of Web News Event Clustering Based On Event Words And Co-Reference

Posted on: 2009-03-16
Degree: Master
Type: Thesis
Country: China
Candidate: J S Zhang
GTID: 2178360242989527
Subject: Computer software and theory
Abstract/Summary:
News is an important information source: it is reported anytime and anywhere, and it is disseminated across geographic barriers through the Internet. Detecting the occurrence of new events and tracking how those events develop is useful for decision-making in this fast-changing network era. Event clustering automatically groups documents, in temporal order, by the events they describe. The research issues behind event clustering include: how many features can be used to determine event clustering, which cue patterns can be employed to relate news stories about the same event, how clustering strategies affect clustering performance on retrospective data or on-line data, how the time factor affects clustering performance, and how multilingual data is clustered. Event clustering on streaming news aims to group news documents by event automatically. This thesis employs co-reference chains to extract the most representative sentences, uses those sentences to select the most important summaries, and uses those summaries to determine event clustering. Because events can span a long time, a fixed-threshold approach prevents later documents from being clustered with their event and thus decreases performance. A dynamic threshold that uses a time decay function and a spanning window is therefore proposed. In the last part of the thesis, two models that use both co-reference chains and event words are proposed to determine which factor is more important for event clustering. The experimental results show that both event words and co-reference chains are useful for event clustering, and that event words are the more important of the two. The experiments were run on Windows XP, with Visual Studio 2005 as the development tool.
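The abstract does not give the form of the dynamic threshold, so the following Python sketch is only an illustration of the general idea, not the thesis's method. It assumes an exponential decay of the threshold with the number of days since a cluster was last updated, a hard spanning window beyond which a cluster is no longer a candidate, and plain term-frequency cosine similarity. All names and parameter values (`dynamic_threshold`, `cluster_stream`, `base`, `decay`, `floor`, `window_days`) are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def dynamic_threshold(base, gap_days, decay=0.95, floor=0.1):
    """Assumed form: the threshold relaxes exponentially with the time gap."""
    return max(floor, base * decay ** gap_days)


def cluster_stream(docs, base=0.5, decay=0.95, window_days=30):
    """Single-pass clustering of (doc_id, day, tokens) tuples in time order.

    A document joins the most similar cluster whose similarity reaches the
    time-dependent threshold; clusters last updated more than `window_days`
    ago are skipped (the assumed spanning window).
    """
    clusters = []
    for doc_id, day, tokens in docs:
        vec = Counter(tokens)
        best, best_sim = None, 0.0
        for c in clusters:
            gap = day - c["last_day"]
            if gap > window_days:
                continue  # outside the spanning window
            sim = cosine(vec, c["centroid"])
            if sim >= dynamic_threshold(base, gap, decay) and sim > best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append(
                {"centroid": Counter(vec), "last_day": day, "members": [doc_id]}
            )
        else:
            best["centroid"].update(vec)  # accumulate term counts
            best["last_day"] = day
            best["members"].append(doc_id)
    return [c["members"] for c in clusters]


docs = [
    ("d1", 0, ["quake", "city", "rescue"]),
    ("d2", 1, ["quake", "rescue", "team"]),
    ("d3", 2, ["election", "vote", "result"]),
    ("d4", 20, ["quake", "aid", "rebuild", "city", "fund"]),
]
decayed = cluster_stream(docs)             # threshold relaxes over time
fixed = cluster_stream(docs, decay=1.0)    # decay=1.0 gives a fixed threshold
```

In this toy stream, the late follow-up story `d4` shares only part of its vocabulary with the earlier earthquake reports. With `decay=1.0` (a fixed threshold) it falls below the cut-off and starts a new cluster, while the decayed threshold lets it join the original event, which is the behaviour the abstract attributes to long-running events.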
Keywords/Search Tags: Event Words, Co-Reference, Clustering, F-Measure, TF-IDF