| Against the background of global warming, extreme weather occurs frequently. Urban waterlogging caused by short-duration precipitation disrupts urban transportation and residents' travel, and can even cause property damage and casualties. At present, urban waterlogging monitoring relies mainly on equipment installed at fixed sites, whose coverage is insufficient; disaster information is therefore hard to obtain and needs to be supplemented by other data sources. Because it is real-time and draws on many sources, social media is widely used in disaster monitoring and is of great significance for disaster response and assessment. This paper uses three years of Sina Weibo data for Shanghai, from 2016 to 2018, to study how to extract waterlogging-related microblogs from complex microblog data, how to extract geographic information from the microblog text, and how to analyze and verify the extraction results. The main contents of the paper are as follows: (1) Extraction of urban waterlogging microblog text. Weibo posts are unstructured, varied in expression, and irregular, so accurately extracting the posts on a given topic is very difficult. Using a deep-learning-based text similarity analysis method, this paper finds words related to waterlogging in the corpus and, on this basis, selects keywords for urban waterlogging disasters. Extracting microblogs with these keywords yields more waterlogging-related microblogs than using the single keyword "waterlogging". (2) Extraction of disaster locations from Weibo text. Compared with the Weibo check-in location, the geographic information in the Weibo text reflects the location of a waterlogging event more accurately. Some existing methods recognize address nouns that already appear in the corpus well, but place names in Weibo text cover a wide range and vary widely, and many are not included
in the corpus. In this paper, the BiLSTM-CRF algorithm is used to extract place names from Weibo text, and its results are compared with those of the CRF algorithm. The comparison shows that the BiLSTM-CRF algorithm extracts more place names that are not in the corpus, with higher accuracy. (3) Analysis and application of urban waterlogging microblogs. The relationship between the number of microblogs and the weather is analyzed, and the number of microblogs is found to correspond well with rain and snow days. When the weather changes from no rain or snow to rain or snow, or from rain or snow to none, the number of microblogs increases significantly. Spatially locating the waterlogging-related microblogs that carry location information shows that most waterlogging points are concentrated at subway entrances, commercial plazas, and roads. Comparison with the historical waterlogging data of Gaode shows that about 20% of the points coincide with the Gaode data, indicating that the waterlogging information extracted from Weibo and the Gaode data complement each other. The research results of this paper can provide new data acquisition and extraction methods for urban waterlogging monitoring and more data support for decision-making. |
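
The comparison with the Gaode data can be illustrated by a minimal sketch. The abstract does not say how coincidence was determined, so the rule used here (a Weibo-derived point counts as coinciding if any reference point lies within a distance threshold) is an assumption, as are the 500 m threshold, the function names, and the toy coordinates. Both point sets are assumed to share one coordinate system.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def coincidence_rate(weibo_points, reference_points, max_dist_m=500.0):
    """Fraction of Weibo points with at least one reference point within max_dist_m.

    The matching rule and the default threshold are assumptions for illustration,
    not the criterion used in the paper.
    """
    if not weibo_points:
        return 0.0
    matched = sum(
        1
        for lat, lon in weibo_points
        if any(
            haversine_m(lat, lon, rlat, rlon) <= max_dist_m
            for rlat, rlon in reference_points
        )
    )
    return matched / len(weibo_points)


# Hypothetical coordinates (lat, lon) in the Shanghai area; not real data.
weibo_points = [
    (31.2304, 121.4737),
    (31.1940, 121.4360),
    (31.3000, 121.5000),
    (31.0500, 121.2000),
    (31.2400, 121.5000),
]
reference_points = [
    (31.2310, 121.4740),
    (31.4000, 121.6000),
]

rate = coincidence_rate(weibo_points, reference_points)
print(f"coincidence rate: {rate:.0%}")
```

In this toy input only the first Weibo point has a reference point within 500 m. A brute-force pairwise search is adequate for a few thousand points; a spatial index would be the natural replacement for larger sets.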