| With the rapid urbanization of China, urban flood disasters have become increasingly prominent and severely restrict sustainable economic and social development. Urban rainfall stations are one of the most important sources of rainfall observation data for disaster analysis, but they are affected by the surrounding environment and their construction requires substantial manpower, material, and financial resources. In many cities the rainfall station network is sparse and poorly distributed, which introduces great uncertainty into disaster analysis. With the advent of the big data era, the volume of urban flood disaster data has grown explosively, and social media data, an important part of it, contain a great deal of knowledge to be mined. During urban flooding, official media and ordinary users release flood-related information in real time, including large volumes of text, pictures, and videos. How to extract, process, store, and mine data with potential application value has become a prominent question in the era of data-driven development. This work is supported by the key project of the National Natural Science Foundation of China "Research on the Theory and Method of Urban Flood Disaster Forecast and Early Warning Based on Big Data" (No. 51739009). Drawing on the analysis and application methods of big data theory and on urban flood social media data, it studies how to optimize the layout of traditional urban rainfall stations. (1) The characteristics of urban flood social media data and their application to the study of urban flooding and rainfall processes are analyzed. The scale and characteristics of data from Weibo, a major social media platform of the Internet era, and their relationship with urban rainfall are summarized, showing that social media data have the typical characteristics of Internet big data. These data correlate significantly with urban rainfall data and can therefore serve as a supplementary source of unstructured 
data in traditional urban flood disaster research. (2) An urban flood social media corpus is constructed to store and manage unstructured social media data in a structured format. The significance of the corpus as the foundation of the research is discussed, and the necessity of extracting and formatting unstructured data in the Web 2.0 era is demonstrated. Using web crawler technology, the login process of the web version of the Weibo client is simulated and urban flood data pages are parsed and extracted; data processing methods are then applied to deduplicate and denoise the urban flood data, and the results are finally stored as two-dimensional tables that make up the urban flood social media corpus. (3) A dictionary of urban flood keywords is constructed to classify and summarize the search terms used during data extraction and to compensate for gaps in the corpus caused by missing or insufficient search terms. With the urban flood social media corpus as the data source, candidate keyword sentences are screened and segmented into words, keywords are filtered with the TF-IDF algorithm, and the keyword dictionary is supplemented iteratively, reducing the impact of missing keywords. (4) Based on social media data, simulated urban rainfall stations are constructed with Zhengzhou as a case study, which visualizes the social media data, and the optimal buffer zone for the simulated rainfall stations is established. Ordinary kriging, inverse distance weighting, and Bayesian interpolation are used to interpolate rainfall at the simulated rainfall stations. By comparison with the interpolation results of the traditional urban rainfall stations, the optimal number and distribution type of simulated rainfall stations are identified. |
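
Of the interpolation methods named in point (4), inverse distance weighting is the simplest to illustrate. The following is a minimal sketch of the general technique, not the implementation used in this work: the function name `idw_interpolate`, the power parameter of 2, and the station coordinates and rainfall values are all assumptions made up for illustration.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Estimate rainfall at `target` by inverse distance weighting.

    `stations` is a list of (x, y, rainfall) tuples and `target` is an
    (x, y) pair. Coordinates are assumed to be planar (for example km).
    """
    tx, ty = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, rainfall in stations:
        distance = math.hypot(x - tx, y - ty)
        if distance == 0.0:
            # The target coincides with a station: use its value directly.
            return rainfall
        weight = 1.0 / distance ** power
        weighted_sum += weight * rainfall
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total


# Hypothetical stations: (x in km, y in km, rainfall in mm).
example_stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]
estimate = idw_interpolate(example_stations, (1.0, 1.0))
print(f"Estimated rainfall: {estimate:.1f} mm")
```

Nearer stations receive larger weights, so the estimate always lies between the smallest and largest observed values. Ordinary kriging and Bayesian interpolation additionally model the spatial correlation of rainfall and are not covered by this sketch.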