Study On How To Describe Characters Of Web Pages In Personalization Service

Posted on: 2005-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: S Wang
Full Text: PDF
GTID: 2168360125963845
Subject: Computer software and theory
Abstract/Summary:
The development of Internet technology gives people convenient and fast ways of obtaining information, but when faced with a large quantity of information, people often do not know how to proceed. Many researchers have turned their attention to one question: how to provide people quickly with accurate information that meets their needs. A new concept, the Personalization Service, has emerged in response. It studies users' interests and behavior by collecting and analyzing user information, so that information can be recommended to users proactively.

To describe users' interests accurately, a Personalization Service system must accurately describe the Web pages that users have visited and were interested in; that is, it must describe the content of Web pages with succinct and representative features. Whether Web pages can be described accurately therefore determines how accurately users' interests are captured, which makes this a key aspect of Personalization Service. So far there have been no systematic studies on describing the features of Web pages. This thesis studies methods for describing the features of Web pages, covering three aspects.

First, extracting the areas that carry characteristic information. This thesis selects the computer field and analyzes the organization and structure of its pages. Based on the characteristic content of Web pages in this field, and using the technique of Semantically Significant Phrases, it extracts the title, boldface text, first paragraph, and last paragraph as the Semantically Significant Phrases of a Web page. This differs from the earlier method, which extracts only the title and abstract. After word segmentation of the semantically significant phrases, a dictionary is formed that includes computer-field words and some common words. The dictionary serves as a reference for obtaining keywords from Web pages in the same field, and it reduces the workload of word segmentation.

Second, keyword standardization and lexical disambiguation. An ontology model is used to standardize words, so that, unlike earlier practice, not every surface form of an expression is put into the keyword feature vector. An ontology dictionary holds computer-field terms and their thesaurus entries, connected by links, together with some common words and their thesaurus entries. For disambiguation, an expanded ontology model is formed, which contains comprehensive information about words and provides an effective structure for storage. Disambiguation is performed with a database of pairs for the field.

Third, term weighting. After analyzing shortcomings of existing methods, this thesis proposes a new weighting method adapted to Web pages, which emphasizes information located in the title. The experimental results show that the new method is more effective at recommending Web pages to users.
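The three aspects above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the synonym table stands in for the ontology dictionary, the stopword list and region weights are invented for illustration, tokenization is a simple regular expression rather than word segmentation, and disambiguation is not modeled. The names `RegionParser`, `describe_page`, `SYNONYMS`, `STOPWORDS`, `REGION_WEIGHT`, and `SAMPLE_PAGE` are all hypothetical, and the input is assumed to be well-formed HTML.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# All tables below are hypothetical stand-ins, not taken from the thesis.
SYNONYMS = {"pc": "computer", "db": "database", "databases": "database",
            "indexes": "index", "indexing": "index"}
STOPWORDS = {"a", "an", "the", "of", "on", "to", "up", "use"}
# Assumed weights: title text counts most, then boldface, then paragraphs.
REGION_WEIGHT = {"title": 3.0, "bold": 2.0, "first": 1.0, "last": 1.0}


class RegionParser(HTMLParser):
    """Collects title text, bold text, and paragraph text from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.bold = ""
        self.paragraphs = []
        self._open = {"title": 0, "b": 0, "strong": 0, "p": 0}

    def handle_starttag(self, tag, attrs):
        if tag in self._open:
            self._open[tag] += 1
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if self._open.get(tag, 0) > 0:
            self._open[tag] -= 1

    def handle_data(self, data):
        if self._open["title"]:
            self.title += data
        if self._open["b"] or self._open["strong"]:
            self.bold += data + " "
        # Bold text inside a paragraph is counted in both regions.
        if self._open["p"] and self.paragraphs:
            self.paragraphs[-1] += data


def tokenize(text):
    """Lowercase, split on non-letters, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOPWORDS]


def describe_page(html):
    """Return a {standardized term: weight} description of one page."""
    parser = RegionParser()
    parser.feed(html)
    parser.close()
    paragraphs = [p for p in parser.paragraphs if p.strip()]
    regions = {
        "title": parser.title,
        "bold": parser.bold,
        "first": paragraphs[0] if paragraphs else "",
        # Avoid counting a single paragraph twice.
        "last": paragraphs[-1] if len(paragraphs) > 1 else "",
    }
    weights = Counter()
    for region, text in regions.items():
        for token in tokenize(text):
            # Standardize: map each surface form to one canonical term.
            weights[SYNONYMS.get(token, token)] += REGION_WEIGHT[region]
    return dict(weights)


SAMPLE_PAGE = """<html><head><title>Database indexing on a PC</title></head>
<body><p>Databases use <b>indexes</b> to speed up queries.</p>
<p>Middle paragraph about unrelated details.</p>
<p>A good index keeps the DB fast.</p></body></html>"""

if __name__ == "__main__":
    ranked = sorted(describe_page(SAMPLE_PAGE).items(),
                    key=lambda item: item[1], reverse=True)
    for term, weight in ranked:
        print(term, weight)
```

In this sketch the middle paragraph is ignored, variant forms such as "Databases" and "DB" collapse into one term, and a term that appears in the title ranks above one that appears only in the body text.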
Keywords/Search Tags: describe character, standardization, word disambiguation, term-weighting, personalization