The rapid growth of the Internet has created a huge demand for data analysis. Traditional data processing can only run on disk after the data has accumulated, so computation is slow and the volume of data that can be handled is limited. Analytical methods based on traditional offline batch processing of big data no longer meet the needs of increasingly complex and diverse data processing. In addition, in the era following general-purpose search engines, search within a single domain has become more widely used, but vertical domain search often provides no analysis of that domain's search data, so search behavior in the domain cannot be observed effectively. To solve these problems, this paper investigates the implementation of vertical search data analysis under streaming computing. It uses streaming computing techniques to meet the requirements of real-time data flows and provides different optimization schemes for system performance problems to improve data processing efficiency. In addition, because data analysis results often lack relevance to one another and their visual presentation is too rigid, a scheme is proposed that uses search to optimize the analysis results: the results are imported into a search engine to build an index and to establish correlations between the analysis modules. This paper studies the process of search data analysis under streaming computing and designs a system implementation of search data analysis, which provides a common solution for analyzing data in different domains and improves data management capabilities.

The main research work in this paper is as follows. First, the streaming computing process is studied. Its implementation is reviewed in detail in terms of the computational model, task management, and other aspects, and it is compared with the traditional offline batch computing method. The performance optimization of streaming computing processes is also investigated, and scenario-specific optimization solutions are provided to improve system performance. Second, a scheme for using search to optimize data analysis results is proposed. It addresses the lack of relevance among analysis results and the confusion of analysis indicators by using a search engine to retrieve and classify the results, which makes them easier to find. Third, a vertical search data analysis system that combines streaming computing with offline batch processing is designed, and the big data architecture and system architecture it requires are studied and implemented according to the characteristics and volume of the search data. In addition, based on the system implementation, the search log analysis metrics are organized to provide dynamic, real-time data analysis and visual presentation from both the search and the user perspectives.