DRL-based progressive recovery for quantum-key-distribution networks

Li, Mengyao; Zhang, Qiaolun; Gatto, Alberto; Bregni, Stefano; Verticale, Giacomo; Tornatore, Massimo

doi:10.1364/jocn.526014

With progressive network recovery, operators restore network connectivity after massive failures over multiple stages by identifying the sequence of repair actions that maximizes carried live traffic. Motivated by the initial deployments of quantum-key-distribution (QKD) over optical networks in several locations worldwide, in this work we model and solve the progressive QKD network recovery (PQNR) problem to accelerate recovery after failures. We formulate an integer linear programming (ILP) model to maximize the accumulative key rate achievable during recovery for four different QKD network architectures, which differ in their ability to use trusted relay and optical bypass. Because the ILP model does not scale computationally, we propose a deep reinforcement learning (DRL) algorithm based on the twin delayed deep deterministic policy gradient (TD3) framework to solve the PQNR problem for large-scale topologies. Simulation results show that our proposed algorithm closely approaches the optimal solution and outperforms several baseline algorithms. Moreover, using optical bypass jointly with trusted relay can improve the key rate by 14% and 18% compared to the cases where only optical bypass and only trusted relay are applied, respectively.
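
To make the notion of progressive recovery concrete, the following is a minimal toy sketch in Python, not the ILP model or the TD3 algorithm from the paper. It assumes that exactly one failed link is repaired per stage, that a demand is served in full whenever its endpoints are connected, and that the objective is the sum over stages of the served rate. It ignores key generation physics, trusted relay, and optical bypass. The names `connected`, `served`, and `best_repair_order`, and the four-node example network, are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def connected(edges, src, dst):
    """Return True if src and dst are joined by a path over undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, stack = {src}, [src]
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def served(edges, demands):
    """Total rate of the demands whose endpoints are currently connected."""
    return sum(rate for s, d, rate in demands if connected(edges, s, d))


def best_repair_order(working, failed, demands):
    """Exhaustively search repair orders (toy assumption: one link per stage).

    Returns the order that maximizes the rate accumulated over all stages.
    Ties are broken in favor of the first order enumerated.
    """
    best_order, best_total = None, -1
    for order in permutations(failed):
        edges = set(working)
        total = 0
        for link in order:
            edges.add(link)                  # repair one link in this stage
            total += served(edges, demands)  # rate carried after the repair
        if total > best_total:
            best_order, best_total = order, total
    return best_order, best_total


# Hypothetical network: one surviving link, three failed links.
working = [("A", "B")]
failed = [("B", "C"), ("C", "D"), ("A", "D")]
demands = [("A", "C", 5), ("A", "D", 3)]  # (source, destination, rate)

order, total = best_repair_order(working, failed, demands)
print(order, total)
```

In this example, repairing B-C first accumulates a total of 21, whereas starting with C-D accumulates only 16, which illustrates why the repair sequence matters. The exhaustive search grows factorially with the number of failed links, which is consistent with the abstract's motivation for a learning-based method on large topologies.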