
Dynamically discovering similar resources in large-scale information networks

Posted on: 1998-11-16
Degree: Ph.D
Type: Dissertation
University: University of California, Los Angeles
Candidate: Perry, Bradley John
Full Text: PDF
GTID: 1468390014975467
Subject: Computer Science
Abstract/Summary:
This dissertation develops a novel approach to computing and searching for context-specific similarity relationships between HTML (HyperText Markup Language) resources. The basis of the approach is a technique that captures the similarity between two HTML resources in a multi-dimensional feature vector. The similarity is computed by decomposing resources into parts, finding similar parts among resources, and then extracting the pattern of matched parts into a feature vector. As a result, the vectors capture both the localized content similarity and the overall organizational similarity between any two resources. This general approach to resource-resource linking is termed part-linking, and its specialization to HTML resources is termed html-linking. Both concepts are introduced and described in this dissertation, and the html-linking method is fully defined and implemented atop live data sources. Given the vectors describing resource-resource associations, the following search process can be employed: "Find all resources similar to...
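The three steps named in the abstract (decompose, match parts, extract a feature vector) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the set of tags that start a new part (`BLOCK_TAGS`), the use of Jaccard overlap on word sets as the part-similarity measure, the `threshold` value, and the four vector dimensions (coverage of each resource, mean match score, and an order-preservation score standing in for organizational similarity).

```python
from html.parser import HTMLParser
import re

# Assumption: these tags start a new "part". The abstract does not state
# the dissertation's actual decomposition rules.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "li", "td", "div"}


class PartSplitter(HTMLParser):
    """Collects the words of an HTML resource, grouped into parts."""

    def __init__(self):
        super().__init__()
        self.parts = [[]]

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append([])  # a block-level tag opens a new part

    def handle_data(self, data):
        self.parts[-1].extend(re.findall(r"[a-z0-9]+", data.lower()))


def decompose(html):
    """Return the non-empty parts of a resource as sets of words."""
    splitter = PartSplitter()
    splitter.feed(html)
    splitter.close()
    return [set(words) for words in splitter.parts if words]


def jaccard(a, b):
    """Word-set overlap in [0, 1]; both sets are non-empty here."""
    return len(a & b) / len(a | b)


def similarity_vector(html_a, html_b, threshold=0.5):
    """Hypothetical 4-dimensional vector describing how A resembles B."""
    parts_a, parts_b = decompose(html_a), decompose(html_b)
    matches = []  # (index in A, index in B, score)
    for i, pa in enumerate(parts_a):
        score, j = max(
            ((jaccard(pa, pb), j) for j, pb in enumerate(parts_b)),
            default=(0.0, -1),
        )
        if score >= threshold:
            matches.append((i, j, score))
    if not matches:
        return [0.0, 0.0, 0.0, 0.0]
    coverage_a = len({i for i, _, _ in matches}) / len(parts_a)
    coverage_b = len({j for _, j, _ in matches}) / len(parts_b)
    mean_score = sum(s for _, _, s in matches) / len(matches)
    # Organizational similarity: share of consecutive matches whose
    # partner parts appear in the same relative order in B.
    js = [j for _, j, _ in matches]
    in_order = sum(1 for x, y in zip(js, js[1:]) if x < y)
    order = in_order / (len(js) - 1) if len(js) > 1 else 1.0
    return [coverage_a, coverage_b, mean_score, order]
```

Under these assumptions, a "find all resources similar to X" search would compute `similarity_vector(x, candidate)` for each candidate and filter or rank on whichever dimensions matter for the context at hand.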
Keywords/Search Tags: Resources, Similar