Research And Application Of Kafka-based E-commerce Enterprise Search Engine Data Comprehensive Processing System

Posted on: 2020-12-02  Degree: Master  Type: Thesis
Country: China  Candidate: X C Wu  Full Text: PDF
GTID: 2438330572499556  Subject: Engineering
Abstract/Summary:
With the rapid development of the e-commerce industry, the integration of online and offline services has accelerated. As businesses continue to grow, the amount of data stored in their systems has increased significantly, and the demand for collecting, querying, sorting and filtering commodity data, inventory data, store data and other kinds of data has become increasingly prominent. This places higher requirements on the search function of an enterprise e-commerce platform. Building a commodity search engine that fits the enterprise's business scenarios can greatly improve the customer shopping experience, which is of great strategic and practical significance to the enterprise. A comprehensive data processing system solves the data-source problem of the commodity search engine and is therefore an important prerequisite for building it.

This thesis aims to build a comprehensive data processing system with multi-source import, based on Kafka. Through data collection and comprehensive processing, the system produces structured wide-table data that meets business needs, solving the problem of importing and updating data for the e-commerce search engine so that search engine services can be built and developed quickly. The study found that data processing based on traditional relational databases runs into bottlenecks when faced with complex business logic and scale expansion. Common general-purpose ETL tools, on the one hand, do not respond in real time well enough for this scenario; on the other hand, they offer limited customizability, need professional maintenance staff, and carry high costs. Based on the actual needs of the enterprise, this thesis chooses a self-developed comprehensive data processing system after sorting and screening the requirements and weighing factors such as project cost, personnel cost, project schedule, functional completeness and system extensibility.

First, the requirements are collected, the surrounding ecosystem and functional positioning of the system are defined, and the non-functional requirements of the system are determined. Next, the functional positioning and boundaries of the subsystems of the comprehensive data processing system are defined, and the requirements and responsibilities of each module are clarified. Then the data receiving subsystem, data processing subsystem, data submission subsystem and task scheduling subsystem are designed and implemented in detail. By introducing open source distributed components such as Kafka, Cassandra, the Vert.x framework and Elastic-Job, the system's high performance and scalability are ensured from the start of the design. A unified receiving process specification is designed to ensure the reliability of data receiving. A time-slice data processing mechanism exploits the characteristics of Cassandra, the Vert.x framework and Elastic-Job so that the system can process multiple types of tasks in parallel and implement wide-table merging in a lightweight way. The problem of conflicting task states is solved by relying on the order in which data is received and the uniqueness of versions. Finally, the system is verified with functional and non-functional tests.

Through this research on the comprehensive data processing system, the thesis finds that current general-purpose solutions have problems with scalability, real-time response, customization and cost. By developing the comprehensive data processing system in-house on top of open source distributed components, the performance and scalability problems of general-purpose systems are solved. In addition, the receiving process specification and the time-slice data processing mechanism increase the reliability of the system and helped ensure its smooth launch.
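
The abstract gives no implementation details, so the following is only a minimal sketch of how time-slice processing, version-based conflict handling and wide-table merging could fit together. Everything in it is an assumption made for illustration: the event fields (`sku`, `source`, `version`, `ts`, `fields`), the slice width `SLICE_SECONDS`, and the function names are hypothetical, and an in-memory dictionary stands in for the Kafka, Cassandra, Vert.x and Elastic-Job components the thesis actually uses.

```python
from collections import defaultdict

SLICE_SECONDS = 5  # hypothetical time-slice width, not taken from the thesis


def slice_of(ts):
    """Map a receive timestamp (in seconds) to its time-slice index."""
    return int(ts // SLICE_SECONDS)


def merge_wide_rows(events):
    """Merge per-source update events into one wide row per SKU.

    Each event is a dict with the assumed keys:
    sku, source, version, ts (receive time) and fields (column updates).
    """
    # Bucket events by time slice, keeping arrival order inside each slice.
    slices = defaultdict(list)
    for event in events:
        slices[slice_of(event["ts"])].append(event)

    wide = {}      # sku -> merged wide row
    versions = {}  # (sku, source) -> highest version applied so far

    # Process slices in time order; within a slice, apply in arrival order.
    for index in sorted(slices):
        for event in slices[index]:
            key = (event["sku"], event["source"])
            if event["version"] <= versions.get(key, -1):
                continue  # stale or duplicate version: skip to avoid a conflict
            versions[key] = event["version"]
            row = wide.setdefault(event["sku"], {"sku": event["sku"]})
            row.update(event["fields"])
    return wide
```

In this sketch, an event from one source that arrives with an older version than one already applied for the same SKU is dropped, while events from different sources contribute different columns to the same wide row.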
Keywords/Search Tags:KAFKA, Comprehensive data processing, Search engine, Multi-source import, ETL, Wide table