Drawing on traditional informatics theory and practice, this paper applies automatic document indexing technology to Web pages. First, the current state and shortcomings of search engines are described; second, the characteristics of Web page data are analyzed; finally, the author presents an indexing scheme for Chinese Web pages and develops an experimental search engine for economic information using network technology.

The automatic indexing of Chinese Web pages is based on a knowledge database. In effect, the knowledge database is an experience-based expert system that includes a library classification scheme, a thesaurus, a concordance between class numbers and descriptors, a synonym dictionary, keyword lists, stop-word lists, etc.

After the indexing data of a Web page are determined, a weighted word-frequency method, combined with statistical algorithms, is used to perform subject indexing of Chinese Web pages. The paper then uses a literal similarity measure, based on a large body of empirical classification data, to classify the Chinese Web pages.

The author then uses Borland Delphi and Visual FoxPro to develop an automatic indexing system for processing Chinese Web pages. The experimental system comprises Web page text analysis, automatic word extraction, automatic subject indexing, classification, confirmation of indexing results, and knowledge database maintenance. The design procedure, workflow, usage, and running conditions of the system are also detailed.

In line with the trend toward integrating classifications and thesauri, the paper also separately designs a keyword retrieval system, a directory search system, and an integrated system.

At the end of the paper, the author tests and evaluates several aspects of the automatic indexing system; the deficiencies of the system are also discussed objectively.
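To make the two core techniques concrete, the following is a minimal sketch in Python, not the Delphi and Visual FoxPro implementation described in the paper. It assumes the page text has already been segmented into words. The field weights, the function names, and the choice of a character-overlap (Dice-style) formula for literal similarity are all illustrative assumptions; the abstract does not specify them.

```python
from collections import Counter

# Hypothetical field weights: a word in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "body": 1.0}


def weighted_term_scores(fields, stop_words, synonyms):
    """Sum field-weighted frequencies per term.

    fields:     dict mapping a field name to a list of already-segmented words
    stop_words: set of words to ignore
    synonyms:   dict mapping a synonym to its preferred descriptor
    """
    scores = Counter()
    for field, words in fields.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for word in words:
            if word in stop_words:
                continue
            # Normalize synonyms to one descriptor so their counts merge.
            scores[synonyms.get(word, word)] += weight
    return scores


def top_terms(scores, k):
    """Return the k highest-scoring terms (ties broken by term order)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def literal_similarity(a, b):
    """Character-overlap similarity: 2 * shared characters / total length.

    This is one common way to define literal similarity; it is an
    assumption here, not necessarily the measure used in the paper.
    """
    if not a and not b:
        return 0.0
    shared = sum((Counter(a) & Counter(b)).values())
    return 2.0 * shared / (len(a) + len(b))
```

In a pipeline like the one the abstract outlines, the top-scoring terms would serve as candidate subject terms, and the similarity function could compare them against descriptors or previously classified examples in the knowledge database to suggest a class number.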