| With the development of information technology and the Internet, the massive amount of data online has caused severe information overload. Recommendation systems emerged to address this problem: their purpose is to guide users, based on their intentions, to the products or information they are interested in. At present, the recommendation system architecture of Netease KaoLa can no longer cope with a catalog that has grown to millions of goods. Finding the products a user is interested in among millions of goods takes a long time. As the number of goods grows, goods information is updated more frequently, and these updates consume substantial system resources, which makes service jitter increasingly frequent. At the same time, as the business expands, more and more scenarios become suitable for the recommendation system. Developers must write new code for each scenario to meet its business requirements, and the resulting duplicate code reduces development efficiency. The thesis summarizes the related concepts and core technologies of recommendation systems, analyzes the current business needs of the Netease KaoLa recommendation system, and on that basis proposes a new distributed recall engine. The system designed in the thesis contains four core modules: the Distributed Parallel module, the Recall Filter module, the Scenario Configuration module and the Data Update module. Together they address multi-scenario recommendation, millions of information updates and service stability using mainstream technologies. The Distributed Parallel module is built with Dubbo and Akka and uses a cluster to solve the problem of storing and updating massive data, which improves the stability and processing speed of the service. The Recall Filter module uses Redis and Ehcache as cache middleware, which further improves the execution speed of business processes. The Scenario Configuration module uses configuration files to manage all business scenarios, which solves the problem of multi-scenario recommendation. The Data Update module uses file updates and Kafka to ensure the timeliness and validity of the data. After functional and performance testing, the system runs well online. A single machine in the cluster can process 2000 requests per second. The response time and garbage collection time of the recommendation system interfaces are 50% lower than before. The system supports dozens of current business scenarios and can accommodate possible future expansion. As the number of goods continues to grow, the system can continue to provide a stable recall and filter service. |