| Text classification is a vital branch of natural language processing that has attracted a lot of attention in recent years. Most text online is generated and supplied manually by users, since text data is easy to update. Information retrieval systems therefore benefit greatly from uniform text processing and from text classification at various levels of granularity. Traditional text classification processes text as sequential data and predicts the context from the central word. As a result, the classifier learns from each new sequence while retaining information from earlier ones, and so captures the information of the whole text sequence. Most text datasets contain not only sequence information but also interactions between texts that form network-like structures, such as knowledge graphs and social networks. To model this graph information, many researchers have proposed methods based on graph neural networks. With a graph neural network, sequential text data can be constructed into a graph structure. Each word in the dataset is used as a vertex of the graph, and the interactions between texts are used as its edges. Information is passed along the edges between vertices to aggregate features. To sum up, we take graph neural networks as the research method and geographic text and user retrieval log text as the research objects for text classification. The main contributions of our work are as follows: (1) We propose a graph neural network for geographic text classification. This method constructs the geographic information in geographic text into a graph structure with global information, and introduces an attention mechanism on top of the graph convolutional network to encode high-order information of the geographic text. The graph attention mechanism enhances the network's ability to capture the important information in the text and achieves better performance in classifying texts containing geographic information. To verify the effectiveness of the proposed method, we construct a text classification dataset containing geographic information: we summarize the available public Chinese datasets, annotate them manually, and generate a novel dataset suitable for geographic text classification.
(2) We propose a divide-aggregate hypergraph neural network. The goal of the model is to identify the user's real search intention from the search text typed into an online search engine, which is essentially a multi-class text classification task. Our method also addresses the challenge of graph data modeling for large-scale user retrieval logs and captures high-order semantic information between texts to achieve multi-class text classification. We divide the million-level dataset into several small sub-datasets, construct a hypergraph for each sub-dataset, and aggregate features at the implicit level. Our proposed framework not only completes the large-scale user intention detection task but also generates a global feature representation of the whole dataset. To verify the hypergraph network's ability to generate global features and its high accuracy in text classification, we construct a multi-class text classification dataset containing multi-attribute user search logs. The dataset contains 3 million search logs from user search engines, each containing many text fields and two labels. |
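The graph construction described above (words as vertices, interactions as edges, features aggregated along edges) can be illustrated with a minimal sketch. It is not the proposed model: the names `build_word_graph` and `aggregate_once` are hypothetical, the edge rule (two words share an edge when they occur in the same text) is an assumption made for illustration, and plain mean aggregation stands in for the learned graph convolution and attention weights.

```python
from collections import defaultdict
from itertools import combinations


def build_word_graph(texts):
    """Return an adjacency map: word -> set of words sharing a text with it.

    Assumption: co-occurrence in the same text defines an edge.
    """
    adjacency = defaultdict(set)
    for text in texts:
        words = sorted(set(text.lower().split()))
        for word in words:
            adjacency[word]  # touch the key so isolated words get a vertex
        for a, b in combinations(words, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def aggregate_once(adjacency, features):
    """One round of message passing: average each vertex with its neighbours.

    Assumption: an unweighted mean replaces learned attention coefficients.
    """
    updated = {}
    for vertex, neighbours in adjacency.items():
        group = [vertex, *sorted(neighbours)]
        dim = len(features[vertex])
        updated[vertex] = [
            sum(features[v][i] for v in group) / len(group) for i in range(dim)
        ]
    return updated


# Toy corpus and hand-made 2-dimensional word features (illustrative only).
texts = ["river bridge", "bridge road", "road"]
graph = build_word_graph(texts)
features = {"river": [1.0, 0.0], "bridge": [0.0, 1.0], "road": [1.0, 1.0]}
smoothed = aggregate_once(graph, features)
```

After one round, each word's feature vector mixes in the features of the words it co-occurs with; an attention-based variant would replace the uniform average with learned per-edge weights.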