TY - JOUR
T1 - Distributed online bandit optimization under random quantization
AU - Yuan, Deming
AU - Zhang, Baoyong
AU - Ho, Daniel W. C.
AU - Zheng, Wei Xing
AU - Xu, Shengyuan
PY - 2022
Y1 - 2022
N2 - This paper considers the problem of solving distributed online optimization over a network that consists of multiple interacting nodes. Each node in the network is endowed with a sequence of loss functions, each of which is revealed to the node after a decision has been committed. The goal of the network is to minimize the cumulative loss functions of nodes in a distributed fashion, while subject to two types of information constraints, namely, message quantization and bandit feedback. To this end, a quantized distributed online bandit optimization algorithm is proposed by adopting random quantization operation and one-point gradient estimator. We show the convergence of the algorithm by establishing an O(dT3/4) regret bound, where d is the dimension of states and T is the total number of rounds. Finally, an online distributed quadratic programming problem is investigated to validate the theoretical findings of the paper.
AB - This paper considers the problem of solving distributed online optimization over a network that consists of multiple interacting nodes. Each node in the network is endowed with a sequence of loss functions, each of which is revealed to the node after a decision has been committed. The goal of the network is to minimize the cumulative loss functions of nodes in a distributed fashion, while subject to two types of information constraints, namely, message quantization and bandit feedback. To this end, a quantized distributed online bandit optimization algorithm is proposed by adopting random quantization operation and one-point gradient estimator. We show the convergence of the algorithm by establishing an O(dT3/4) regret bound, where d is the dimension of states and T is the total number of rounds. Finally, an online distributed quadratic programming problem is investigated to validate the theoretical findings of the paper.
UR - https://hdl.handle.net/1959.7/uws:69878
U2 - 10.1016/j.automatica.2022.110590
DO - 10.1016/j.automatica.2022.110590
M3 - Article
SN - 0005-1098
VL - 146
JO - Automatica
JF - Automatica
M1 - 110590
ER -