
Design And Implementation Of Data Visualization System Based On Archive Resources

Posted on: 2022-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Li
GTID: 2518306746451974
Subject: Computer technology
Abstract/Summary:
Archival resources are large in volume, come from a wide range of sources, and are often political and confidential in nature. Most of the full text in the archives has not been declassified, so its content cannot be accessed directly, which makes it difficult to find the required content quickly and accurately among a large number of archives. In addition, archive statistics play a crucial part in archives management. In the past, statistical results were usually presented as a pile of numbers, which could not convey trends in how quantities change. To address these two problems, this thesis designs and implements a data visualization system for archival resources. The main work is summarized as follows:

(1) Build a foundation database of archive resources. It includes full-text data of archives from multiple fields, such as 410 full texts in public security, 548 in transportation, 148 in water conservancy, and 2,432 in taxation, together with 21,790 pieces of archive metadata from different periods. This database is the data foundation on which the system operates.

(2) Construct a keyword extraction model for archival texts that applies Word2Vec to the TextRank keyword extraction algorithm. The model improves the initial weights of word nodes by integrating multiple features, namely word frequency, part of speech, word position, and word length. It also uses the cosine similarity between word nodes to address a deficiency of TextRank, which assigns equal transition probabilities during the random walk. Together these changes allow archival keywords to be extracted and ranked more accurately.

(3) Design visualization functions for archival queries. By visualizing the relationships within the query result set and displaying archival keywords as a word cloud, the system lets the archivist quickly understand how the archives relate to one another and what their main content is, so that the queried archives can be located quickly and query efficiency improves.

(4) Design visualization functions for archival statistics. The system uses charts suited to expressing time, quantity, proportion, and geography to display statistical results, which represents data trends better than raw numbers.

The system is developed with Django as the backend framework, ECharts as the charting framework, Bootstrap as the frontend framework, Whoosh as the search engine, MySQL as the database, and uWSGI and Nginx for web serving. The resulting system offers improved security and strong practical value.
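
The keyword extraction idea in item (2) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's actual model: the feature combination in `initial_weights` (an equal-weight sum), the part-of-speech bonuses, the use of the feature weights as the restart distribution, and the toy two-dimensional vectors are all hypothetical. In a real setting the vectors would come from a trained Word2Vec model and the tokens and tags from a word segmenter.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def initial_weights(tokens, pos_tags, pos_bonus=None):
    """Combine frequency, part of speech, position and length into one
    normalized weight per word (equal-weight sum: an assumption)."""
    pos_bonus = pos_bonus or {"n": 1.0, "v": 0.6}  # hypothetical bonuses
    freq = Counter(tokens)
    first_seen = {}
    for i, word in enumerate(tokens):
        first_seen.setdefault(word, i)
    max_freq = max(freq.values())
    max_len = max(len(word) for word in freq)
    raw = {}
    for word in freq:
        tf = freq[word] / max_freq
        pos = pos_bonus.get(pos_tags.get(word, ""), 0.3)
        position = 1.0 - first_seen[word] / len(tokens)  # earlier is higher
        length = len(word) / max_len
        raw[word] = tf + pos + position + length
    total = sum(raw.values())
    return {word: score / total for word, score in raw.items()}


def weighted_textrank(tokens, pos_tags, vectors, window=3,
                      damping=0.85, iterations=50):
    """TextRank variant: feature-based initial weights, and transition
    probabilities proportional to cosine similarity instead of uniform."""
    init = initial_weights(tokens, pos_tags)
    words = list(init)

    # Undirected co-occurrence graph over a sliding window.
    neighbours = {word: set() for word in words}
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)

    def edge(a, b):
        # Every word must have a vector; negative similarities are clipped.
        return max(cosine(vectors[a], vectors[b]), 0.0)

    out_sum = {w: sum(edge(w, n) for n in neighbours[w]) for w in words}

    scores = dict(init)
    for _ in range(iterations):
        updated = {}
        for w in words:
            incoming = sum(
                scores[n] * edge(n, w) / out_sum[n]
                for n in neighbours[w] if out_sum[n] > 0
            )
            updated[w] = (1 - damping) * init[w] + damping * incoming
        scores = updated
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy input: tokens, tags and vectors are made up for illustration.
tokens = ["archive", "query", "archive", "keyword",
          "extraction", "archive", "statistics"]
pos_tags = {"archive": "n", "query": "v", "keyword": "n",
            "extraction": "n", "statistics": "n"}
vectors = {"archive": [1.0, 0.2], "query": [0.6, 0.8],
           "keyword": [0.9, 0.4], "extraction": [0.7, 0.6],
           "statistics": [0.3, 1.0]}

ranking = weighted_textrank(tokens, pos_tags, vectors)
for word, score in ranking[:3]:
    print(f"{word}\t{score:.4f}")
```

Using the feature weights as the restart distribution is one way to make them matter after convergence; if they were used only as starting values, the iteration would largely wash them out.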
Keywords/Search Tags: keyword extraction, archive management, archive query, data visualization, archive statistics