Abstract
Even when managed by an auto-scaler in the cloud, applications may still be overloaded by sudden flash crowds or resource failures, because the auto-scaler takes time to make scaling decisions and provision resources. With more cloud providers building geographically dispersed data centers, applications are commonly deployed in multiple data centers to better serve customers worldwide. In this case, instead of over-provisioning each data center enough to handle occasional overloads on its own, it is more cost-efficient to over-provision each data center with a small amount of capacity and to balance the extra load among them when resources in any data center are suddenly saturated. In this paper, we present a decentralized system that promptly detects short-term overload situations and autonomously handles them using geographical load balancing and admission control to minimize the resulting performance degradation. Our approach also includes a new algorithm that optimally distributes the excess load to remote data centers with minimal increase in overall response times. We developed a prototype and evaluated it on Amazon Web Services. The results show that our approach maintains acceptable quality of service while greatly increasing the number of requests served during overload periods.
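As an illustration only, and not the paper's actual algorithm, the following Python sketch shows one way excess load might be spread greedily across remote data centers so that the estimated increase in overall response time stays small, with leftover load rejected by admission control. All names (`DataCenter`, `spare_capacity`, `rtt_ms`), the cost model, and the capacity figures are hypothetical assumptions.

```python
# Hypothetical sketch: greedily offload excess requests to remote data centers,
# always picking the destination with the lowest estimated marginal response-time cost.
from dataclasses import dataclass
import heapq


@dataclass
class DataCenter:
    name: str
    spare_capacity: int      # extra requests/sec the site can absorb before saturating (assumed)
    rtt_ms: float            # round-trip network latency from the overloaded site (assumed)
    base_service_ms: float   # current average service time at the remote site (assumed)

    def marginal_cost(self, extra_load: int) -> float:
        # Toy cost model: network latency plus a queueing penalty that grows
        # as the remote site approaches its own spare capacity.
        utilization = extra_load / max(self.spare_capacity, 1)
        return self.rtt_ms + self.base_service_ms * (1.0 + utilization)


def distribute_excess_load(excess_rps: int, remotes: list[DataCenter]) -> dict[str, int]:
    """Assign excess requests/sec to remote sites one unit at a time,
    always choosing the site with the lowest current marginal cost."""
    assigned = {dc.name: 0 for dc in remotes}
    heap = [(dc.marginal_cost(0), dc.name, dc) for dc in remotes]
    heapq.heapify(heap)
    for _ in range(excess_rps):
        while heap:
            cost, name, dc = heapq.heappop(heap)
            if assigned[name] < dc.spare_capacity:
                assigned[name] += 1
                heapq.heappush(heap, (dc.marginal_cost(assigned[name]), name, dc))
                break
        else:
            # All remote spare capacity is exhausted; the remaining requests
            # would be shed by admission control rather than forwarded.
            break
    return assigned


if __name__ == "__main__":
    remotes = [
        DataCenter("us-west", spare_capacity=300, rtt_ms=70, base_service_ms=40),
        DataCenter("eu-central", spare_capacity=200, rtt_ms=95, base_service_ms=35),
    ]
    print(distribute_excess_load(excess_rps=450, remotes=remotes))
```

The greedy, unit-by-unit allocation is just one plausible way to trade off remote latency against remote saturation; the paper's optimal distribution algorithm may differ.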
| Original language | English |
| --- | --- |
| Article number | e4126 |
| Number of pages | 15 |
| Journal | Concurrency and Computation: Practice and Experience |
| Volume | 29 |
| Issue number | 12 |
| Publication status | Published - 25 Jun 2017 |
Bibliographical note
Publisher Copyright: © 2017 John Wiley & Sons, Ltd.
Keywords
- cloud computing
- load balancing
- web applications