Research On Techniques Of Large-scale Information Filtering And Its Application In Web Question Answering System

Posted on: 2004-01-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H B Xu
GTID: 1118360185996976
Subject: Computer software and theory
Abstract/Summary:
This dissertation addresses Information Filtering techniques in knowledge discovery and data mining. It analyzes several key problems in Information Filtering, especially in Adaptive Filtering, and proposes a large-scale, high-performance integrated information filtering method. The method is reported to produce the best results among the information filtering systems current at the time.

User requirements and user interest profiles are the basis of Information Filtering. The dissertation introduces several methods for expanding user requirements, using traditional thesaurus-based and pseudo-relevance feedback techniques, and then proposes a two-layer profile initialization method based on an unbiased technique for selecting pseudo-relevant documents. To address the difficulty of filtering a "small topic", the dissertation proposes a method that automatically identifies and optimizes small topics. This method effectively improves filtering performance on small topics.

After summarizing and analyzing traditional feature selection methods, the dissertation proposes a flexible feature selection method driven by topic granularity. The method automatically divides the original user requirements into coarse topics (requirements with large granularity) and detailed topics (requirements with small granularity), and then automatically determines the best feature selection method for each topic granularity. The dissertation also investigates smoothing techniques in term weighting.

The dissertation discusses the problem of learning from uncertain information in Adaptive Filtering and proposes an adaptive learning method that uses such information. It thoroughly examines how different ways of handling unjudged documents during profile updating affect performance, and from this arrives at the most robust and effective profile updating method among those examined.

Threshold optimization is one of the most important and difficult problems in Adaptive Filtering. The dissertation points out general drawbacks of current threshold optimization methods and then proposes a target-oriented threshold optimization method, which takes the evaluation measure itself as the target function to optimize. The dissertation also addresses holistic and local target function optimization strategies, summarizes their advantages and disadvantages, and comprehensively compares the performance obtained when holistic or local target optimization guides threshold optimization. From the differences between the two strategies, the dissertation draws the important conclusion that threshold optimization guided by local target optimization performs much better in Adaptive Filtering.
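The target-oriented idea in the abstract, taking the evaluation measure itself as the function to optimize, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the dissertation: the linear utility (2 x relevant retrieved minus non-relevant retrieved), the function names `linear_utility` and `best_threshold`, and the `window` parameter, which is only one possible reading of "local" optimization (using the most recent judged documents instead of the whole history).

```python
from typing import List, Optional, Tuple


def linear_utility(relevant_retrieved: int, nonrelevant_retrieved: int) -> int:
    # Assumed evaluation measure: reward 2 per relevant document retrieved,
    # penalize 1 per non-relevant document retrieved.
    return 2 * relevant_retrieved - nonrelevant_retrieved


def best_threshold(
    judged: List[Tuple[float, bool]],
    window: Optional[int] = None,
) -> Tuple[float, int]:
    """Return the (threshold, utility) pair that maximizes the utility.

    judged: (score, is_relevant) pairs in arrival order.
    window: if given, only the most recent `window` pairs are used
            (a hypothetical "local" variant; None uses the full history).
    A document counts as retrieved when score >= threshold.
    """
    sample = judged[-window:] if window else judged
    ranked = sorted(sample, key=lambda pair: pair[0], reverse=True)

    # Retrieving nothing is always possible and has utility 0.
    best_t, best_u = float("inf"), 0
    relevant = nonrelevant = 0
    for i, (score, is_relevant) in enumerate(ranked):
        if is_relevant:
            relevant += 1
        else:
            nonrelevant += 1
        # Evaluate only after the last document of a tied score group,
        # because a threshold equal to that score retrieves all of them.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        utility = linear_utility(relevant, nonrelevant)
        if utility > best_u:
            best_t, best_u = score, utility
    return best_t, best_u


if __name__ == "__main__":
    history = [(0.9, True), (0.8, False), (0.7, True),
               (0.6, True), (0.4, False), (0.3, False)]
    print(best_threshold(history))            # whole history
    print(best_threshold(history, window=2))  # most recent two judgments only
```

The sketch scans candidate thresholds at the observed scores and keeps the one with the highest utility, so the threshold is chosen directly by the measure used for evaluation rather than by a proxy criterion.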
Keywords/Search Tags:Information Filtering, Adaptive Filtering, user requirement, user interest profile, small topic, topic granularity, uncertainty information, profile updating, threshold optimization, local target optimization, converse filtering