Hierarchical optimal synchronization for linear systems via reinforcement learning: a Stackelberg-Nash game perspective

Man Li, Jiahu Qin, Qichao Ma, Wei Xing Zheng, Yu Kang

Research output: Contribution to journal › Article › peer-review

43 Citations (Scopus)

Abstract

In the real world, a certain agent may have an advantage that lets it act before others. Motivated by this, a novel hierarchical optimal synchronization problem for linear systems, composed of one major agent and multiple minor agents, is formulated and studied in this article from a Stackelberg-Nash game perspective. The major agent makes its decision first, and then all the minor agents determine their actions simultaneously. To seek the optimal controllers, the Hamilton-Jacobi-Bellman (HJB) equations are established in coupled form, and their solutions are further proven to be stable and to constitute the Stackelberg-Nash equilibrium. Because the agents play asymmetric roles, the established HJB equations are more strongly coupled and more difficult to solve than those given in most existing works. Therefore, we propose a new reinforcement learning (RL) algorithm, namely a two-level value iteration (VI) algorithm, which does not rely on complete system matrices. Furthermore, the proposed algorithm is shown to be convergent, and the converged values are exactly the optimal ones. To implement this VI algorithm, neural networks (NNs) are employed to approximate the value functions, and the gradient descent method is used to update the NN weights. Finally, an illustrative example is provided to verify the effectiveness of the proposed algorithm.
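
As a point of reference for the value iteration idea mentioned in the abstract, the following is a minimal sketch of model-based value iteration for a single-agent, discrete-time linear-quadratic regulator. It is not the article's two-level algorithm: it has one agent, no leader-follower structure and no neural-network approximation, and it uses the full matrices A and B, which the article's method is stated to avoid. The function name, the example matrices and the tolerances are illustrative assumptions.

```python
import numpy as np


def lqr_value_iteration(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Iterate the Riccati recursion from P = 0 until it stops changing.

    Hypothetical helper for illustration only (single agent, known model).
    The value function is V_k(x) = x' P_k x, and each sweep applies
        P_{k+1} = Q + A' P_k A - A' P_k B (R + B' P_k B)^{-1} B' P_k A.
    """
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(max_iter):
        # Greedy feedback gain for the current value estimate.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Value update; algebraically equal to the recursion in the docstring.
        P_next = Q + A.T @ P @ (A - B @ K)
        P_next = 0.5 * (P_next + P_next.T)  # keep the iterate symmetric
        if np.max(np.abs(P_next - P)) < tol:
            # Recompute the gain from the final value estimate.
            K = np.linalg.solve(R + B.T @ P_next @ B, B.T @ P_next @ A)
            return P_next, K
        P = P_next
    raise RuntimeError("value iteration did not converge")


# Assumed example: a discretized double integrator with unit weights.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = lqr_value_iteration(A, B, Q, R)
print("P =\n", P)
print("K =", K)
```

The control law associated with the returned gain is u = -K x. The iteration starts from P = 0, so unlike policy iteration it does not need an initial stabilizing controller.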
Original language: English
Pages (from-to): 1600-1611
Number of pages: 12
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 32
Issue number: 4
Publication status: Published - Apr 2021

Bibliographical note

Publisher Copyright:
© 2012 IEEE.

Keywords

  • linear systems
  • mathematical optimization
  • neural networks (computer science)
  • synchronization
