Design And Implementation Of A Document Tendency Analysis Based On The Associated Field

Posted on: 2014-05-02
Degree: Master
Type: Thesis
Country: China
Candidate: J Hu
Full Text: PDF
GTID: 2268330422964504
Subject: Software engineering
Abstract/Summary:
Researchers can now conveniently obtain the documents they want. However, a prominent problem remains: researchers obtain massive amounts of information but not the knowledge behind it, and must analyse and extract that knowledge from the large body of scientific and technical literature themselves. Analysing the documents of a domain automatically, quickly, and accurately, and reporting the trends of that domain, would therefore help researchers, especially newcomers. Existing methods mostly focus on web news, and few address scientific and technical literature. The little research that does has a shortcoming: it ignores the importance of individual articles, which can affect the trend of a thematic area. Documents of different importance influence the domain trend to different degrees. On this basis, we propose a system that automatically reports domain trends, together with a set of algorithms that analyse the trends of different fields. The basic idea is as follows. Using scientific literature as the source, and based on an association analysis of documents, authors, and conferences/journals, we develop a set of document importance metrics, rank the documents accordingly, and assign each a quantified importance value. Each article is then segmented into words and converted into a text vector. The text vectors are then clustered, with each cluster representing a related field. Finally, by analysing the final weight of each term, the system shows the directions within a field that have good development prospects.
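The pipeline described above (importance scoring, word segmentation, text vectors, clustering, term weighting) can be illustrated with a minimal sketch. It is not the thesis's implementation: the importance scores are hypothetical placeholders for the output of the association analysis, the regular-expression tokenizer stands in for the word segmentation step, and single-pass cosine clustering with a fixed threshold is an assumed choice of clustering method. All function names are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for the word segmentation step."""
    return re.findall(r"[a-z]+", text.lower())


def tf_vector(tokens):
    """Sparse term-frequency vector, normalised to unit length."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def single_pass_cluster(vectors, threshold=0.3):
    """Put each vector in the first cluster whose seed (first member)
    is similar enough; otherwise start a new cluster."""
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def hot_terms(clusters, vectors, importance, top_n=3):
    """Rank the terms of each cluster by importance-weighted term weight."""
    ranked = []
    for members in clusters:
        scores = Counter()
        for i in members:
            for term, weight in vectors[i].items():
                scores[term] += importance[i] * weight
        ranked.append([term for term, _ in scores.most_common(top_n)])
    return ranked


# Toy corpus; the importance values are hypothetical, not computed here.
docs = [
    "cluster analysis of text vectors",
    "text vectors and cluster analysis methods",
    "protein folding simulation",
]
importance = [0.9, 0.4, 0.7]

vectors = [tf_vector(tokenize(d)) for d in docs]
clusters = single_pass_cluster(vectors)
print(clusters)
print(hot_terms(clusters, vectors, importance))
```

Weighting each term by the importance of the document it appears in means that a term used in a highly ranked article contributes more to its field's trend than the same term in a minor one, which is the effect the abstract says earlier approaches ignored.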
Our system uses a B/S (browser/server) architecture: users send a request to the server and immediately obtain the hot terms of a domain and the trends of related domains from the system's analysis. Testing shows that the trend analysis system based on associated fields analyses the document set well, and that the generated tree of thematic areas closely matches the actual division of the thematic areas. Compared with other systems, our system leads in accuracy and response rate. The final statistics on emerging trend terms are convenient for subsequent analysis, and the graphical interface displays the trends directly.
Keywords/Search Tags: Vector space model, Cluster analysis, Theme detection, Emerging trends detection