
A Hybrid Data And Model Transfer Framework For Distributed ML

Posted on: 2021-05-11  Degree: Master  Type: Thesis
Country: China  Candidate: J M Yan  Full Text: PDF
GTID: 2428330614968279  Subject: Information and Communication Engineering
Abstract/Summary:
Large-scale machine learning (ML) relies on data, computing power, and algorithms. However, with the explosion of data, the failure of Moore's Law, and the rapid growth of neural network (NN) model sizes, it is unrealistic to train a large-scale ML model on a single processor, so distributed machine learning (DML) is becoming increasingly popular. The mainstream DML framework, Parameter Server (PS), is built on large-scale computing clusters in the cloud: it first requires all local data to be transferred to the cloud, then splits the global data and assigns IID (Independent and Identically Distributed) data shards to the distributed computing nodes, which leads to significant time delay. Moreover, PS is a centralized framework that relies on a central node to aggregate the parameters of all nodes, generate an averaged global model, and distribute that global model. This centralized configuration places a heavy computation load on the central parameter server and incurs heavy communication cost. In the forthcoming 5G (Fifth Generation) wireless network, many Non-IID (Non-Independent and Identically Distributed) data sources are accessible to each node, and abundant storage, computation, and communication are carried out locally. In such a big data network, we exploit the computing and communication capability of every node and propose a DML framework based on centralized model ensemble and decentralized instance transfer. The contributions of this thesis are as follows.

First, instead of having a center redistribute all data, we keep the Non-IID data local. Based on the Non-IID data, we propose a DML framework named Model Ensemble to relieve the burden of centralized communication. Unlike the PS framework, which shares a single model through a center, the distributed nodes in Model Ensemble maintain independent local models and upload them to a fusion node, which combines them in an ensemble fashion to yield a global model. Numerical results show that the Model Ensemble framework brings significant performance improvement when the local models have low bias.

Second, to compensate for the performance of the Model Ensemble framework when the local models have high bias, we allow instances to be transferred between neighboring nodes in a decentralized way. Furthermore, inspired by transfer learning, we use weighted transfer to raise the transfer efficiency and set a transfer threshold to reduce the communication cost.

Finally, we propose a meta-learning-based communication scheme to further reduce the decentralized communication cost. Specifically, we introduce a meta-learner/base-learner framework on each node. The meta learner adaptively and actively makes the next most promising transfer decision by trading off communication cost against learning performance. Compared with the heuristic transfer scheme, the meta-learning-based scheme is more selective and forward-looking. Simulation results validate that it significantly reduces the communication cost while achieving greater performance improvement.

In summary, the proposed DML framework is based on hybrid transfer, combining centralized and decentralized transfer, and the transferred objects consist of model parameters and training instances. The framework exploits the communication, computation, and storage resources of the distributed network, devolving the burden of the central node. Moreover, the communication and computation processes are closely integrated to bring a network with high efficiency and intelligence.
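To make the two transfer mechanisms concrete, the following Python sketch illustrates, under stated assumptions, a centralized model-ensemble step (a fusion node averaging uploaded local parameters) and a decentralized, threshold-gated weighted instance transfer between neighboring nodes. The class and function names, the logistic-regression local learner, the plain parameter average, and the confidence-based weighting rule are illustrative assumptions, not the thesis' actual ensemble or weighting schemes.

```python
"""Minimal sketch of centralized model ensemble plus decentralized,
threshold-gated weighted instance transfer (illustrative assumptions)."""
import numpy as np


class LocalNode:
    """A node that trains a local model on its own Non-IID data shard."""

    def __init__(self, node_id, features, labels):
        self.node_id = node_id
        self.features = features              # shape: (n_samples, n_features)
        self.labels = labels                   # shape: (n_samples,)
        self.weights = np.zeros(features.shape[1])

    def train(self, lr=0.1, epochs=50):
        """Plain logistic-regression SGD as a stand-in for the local learner."""
        for _ in range(epochs):
            preds = 1.0 / (1.0 + np.exp(-self.features @ self.weights))
            grad = self.features.T @ (preds - self.labels) / len(self.labels)
            self.weights -= lr * grad
        return self.weights

    def transfer_instances(self, neighbor, threshold=0.6):
        """Decentralized weighted instance transfer (illustrative rule):
        send only the samples the neighbor's current model is uncertain
        about, i.e. whose predicted confidence is below `threshold`."""
        probs = 1.0 / (1.0 + np.exp(-self.features @ neighbor.weights))
        confidence = np.maximum(probs, 1.0 - probs)
        mask = confidence < threshold          # transfer threshold gate
        importance = 1.0 - confidence[mask]    # transfer weights
        return self.features[mask], self.labels[mask], importance


def fuse_models(nodes):
    """Centralized model-ensemble step at the fusion node: here a simple
    unweighted average of the uploaded local parameters."""
    return np.mean([node.weights for node in nodes], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = rng.normal(size=5)
    nodes = []
    for i in range(3):
        # Skew each node's feature distribution to mimic Non-IID shards.
        x = rng.normal(loc=i - 1, size=(200, 5))
        y = (x @ true_w + 0.1 * rng.normal(size=200) > 0).astype(float)
        nodes.append(LocalNode(i, x, y))

    for node in nodes:
        node.train()

    global_w = fuse_models(nodes)                           # model ensemble
    x_t, y_t, w_t = nodes[0].transfer_instances(nodes[1])   # instance transfer
    print("global model:", np.round(global_w, 2))
    print("instances transferred from node 0 to node 1:", len(y_t))
```

In this sketch the transfer threshold directly trades communication cost against learning performance: lowering it sends fewer, more heavily weighted instances, which is the dial the meta learner in the third contribution is described as tuning adaptively.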
Keywords/Search Tags:distributed machine learning, big data network, model ensemble, transfer learning, meta learning