Design And Implementation Of Science And Technology Information Portal

Posted on: 2001-07-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z X Zhang
GTID: 1118360002450806
Subject: Library science
Abstract/Summary:
The Internet is the largest and fastest-growing information resource in the world. Organizing the information resources on the Internet effectively and putting them to good use is a challenge for documentation and information institutes. This dissertation introduces the concept of the "portal site" to the library and information domain and proposes developing a "Science and Technology Information Portal (STIP)" system. With a portal that can automatically crawl, organize, and distribute information, a documentation and information institute can improve its capability to exploit science and technology information on the Internet and enhance its information processing and information services.

After studying the information technologies used on the Internet, such as markup languages and metadata, robots, text summarization and automatic classification, information retrieval, and information distribution, the author designed STIP, a vertical portal dedicated to science and technology.
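As an illustration of how markup and metadata can be harvested from a page, the following is a minimal sketch in Python using only the standard library. It is not the author's implementation; the class name `MetadataExtractor`, the function `extract_metadata`, and the choice of fields (title, `<meta>` name/content pairs, link targets) are assumptions made for this example.

```python
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Hypothetical helper: collects the title, <meta> pairs, and link targets."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}      # lower-cased meta name -> content
        self.links = []     # href values in document order
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases tag and attribute names.
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html):
    """Parse an HTML string and return the collected metadata as a dict."""
    parser = MetadataExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "meta": parser.meta, "links": parser.links}
```

A crawler could queue the collected links for later visits, while the title and `<meta>` values would feed the organizing and retrieval stages.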
So far the author has developed the following subsystems: (1) STRobot, a robot that crawls web pages, parses HTML files, and extracts metadata; (2) STBrowser, a tool that helps gather information on the Internet; (3) STPortal, a web portal that serves users on the Internet and provides the following services: information retrieval, information browsing, topic-specific reports, science and technology news, personalized services, a science and technology forum, a list of the most-visited URLs, web site presentation, etc.; (4) STMirror, a mirroring system that provides a full-text search engine.

During system design and implementation, the author made innovative attempts in several areas, such as: employing Microsoft's advanced Windows DNA architecture and its component technology to develop the STIP system; using Java technology to develop a multi-threaded robot with a robust parser; taking advantage of the characteristics of an RDBMS to develop Boolean retrieval, weighted-term retrieval, and natural language retrieval; and making full use of channels, having implemented a personalized dynamic channel to push information. Based on this research, we have obtained a set of solutions for building a vertical portal and some core technologies for crawling, organizing, and distributing information on the Internet. This has laid a good foundation for documentation and information institutes to exploit science and technology information on the Internet.
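To make the idea of Boolean retrieval on top of an RDBMS concrete, here is a small sketch using Python's built-in `sqlite3` module. The abstract does not describe the actual schema or queries, so the `postings` table, the whitespace tokenizer, and the function names below are all assumptions chosen for illustration.

```python
import sqlite3


def build_index(docs):
    """Store an inverted index as (term, doc_id) rows in an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE postings (term TEXT, doc_id INTEGER, PRIMARY KEY (term, doc_id))"
    )
    for doc_id, text in docs.items():
        # Naive whitespace tokenizer (an assumption); set() avoids duplicate rows.
        for term in set(text.lower().split()):
            conn.execute("INSERT INTO postings VALUES (?, ?)", (term, doc_id))
    return conn


def search_and(conn, terms):
    """AND: documents that contain every term (GROUP BY plus HAVING count)."""
    terms = sorted({t.lower() for t in terms})
    if not terms:
        return []
    marks = ",".join("?" * len(terms))
    rows = conn.execute(
        f"SELECT doc_id FROM postings WHERE term IN ({marks}) "
        "GROUP BY doc_id HAVING COUNT(*) = ? ORDER BY doc_id",
        (*terms, len(terms)),
    )
    return [row[0] for row in rows]


def search_or(conn, terms):
    """OR: documents that contain at least one of the terms."""
    terms = sorted({t.lower() for t in terms})
    if not terms:
        return []
    marks = ",".join("?" * len(terms))
    rows = conn.execute(
        f"SELECT DISTINCT doc_id FROM postings WHERE term IN ({marks}) ORDER BY doc_id",
        terms,
    )
    return [row[0] for row in rows]


def search_not(conn, include, exclude):
    """NOT: documents that contain `include` but not `exclude` (SQL EXCEPT)."""
    rows = conn.execute(
        "SELECT doc_id FROM postings WHERE term = ? "
        "EXCEPT SELECT doc_id FROM postings WHERE term = ? ORDER BY doc_id",
        (include.lower(), exclude.lower()),
    )
    return [row[0] for row in rows]
```

Expressing the Boolean operators as set operations in SQL lets the database engine handle indexing and query planning; weighted-term retrieval could plausibly extend the same table with a weight column, though that is not shown here.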
Keywords/Search Tags: STIP, robot, multi-thread crawling, information retrieval, personalized services, personalized dynamic channel