Progressive datacenter recovery over optical core networks after a large-scale disaster
TORNATORE, MASSIMO
2016-01-01
Abstract
Today's cloud systems are composed of geographically distributed datacenters interconnected by high-speed optical networks. Disaster failures can severely affect both the communication network and the datacenter infrastructure, preventing users from accessing cloud services. After large-scale disasters, recovery efforts on both the network and the datacenters may take days and, in some cases, weeks or months. Traditionally, the repair of the communication network has been treated as a separate problem from the repair of datacenters. While past research has mostly focused on network recovery, how to efficiently recover a cloud system while jointly considering the limited computing and networking resources remains an important open research problem. In this work, we investigate the problem of progressive datacenter recovery after a large-scale disaster failure, given that a network-recovery plan has already been made. We seek an efficient recovery plan that determines which datacenters should be recovered at each recovery stage so as to maximize cumulative content reachability from any source, subject to the limited available network resources. We devise an Integer Linear Program (ILP) formulation to model the associated optimization problem. Our numerical examples using the ILP show that an efficient progressive datacenter-recovery plan can significantly increase the reachability of contents during the network-recovery phase. Compared to a random-recovery strategy, our plan increases the number of important contents reachable in the early stages of recovery, at the cost of a slight increase in resource consumption.

File | Size | Format |
---|---|---|
Ferdousi_DRCN_16.pdf (restricted access; description: Ferdousi_DRCN_2016; Pre-Print (or Pre-Refereeing)) | 2.8 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.