Reinforcement Learning for Constrained Energy Trading Games With Incomplete Information

Huiwei Wang, Tingwen Huang, Xiaofeng Liao, Haitham Abu-Rub, Guo Chen

Research output: Contribution to journal › Article

14 Citations (Scopus)

Abstract

This paper considers the problem of designing adaptive learning algorithms to seek the Nash equilibrium (NE) of a constrained energy trading game among individually strategic players with incomplete information. In this game, each player uses a learning automaton scheme to generate an action probability distribution, based on his/her private information, so as to maximize his/her own averaged utility. It is shown that if an admissible mixed strategy converges to the NE with probability one, then the averaged utility and the trading quantity almost surely converge to their expected values, respectively. For the given discontinuous pricing function, the utility function is proved to be upper semicontinuous and payoff secure, which guarantees the existence of a mixed-strategy NE. The uniqueness of the NE is also guaranteed by the strict diagonal concavity of the regularized Lagrange function. Finally, an adaptive learning algorithm is provided to generate the strategy probability distribution for seeking the mixed-strategy NE.
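
The abstract describes a learning automaton that updates each player's action probability distribution from realized utility feedback. The following is a minimal, hypothetical Python sketch of one such scheme (a linear reward-inaction update for a single player with a discretized trading-quantity set). It is not the paper's constrained algorithm, which additionally handles the trading constraints through a regularized Lagrange function; all names and numerical values (num_actions, learning_rate, normalized_utility) are illustrative assumptions. Schemes of this type shift probability mass toward actions that yield higher observed reward, which is the kind of adaptive update the paper analyzes for convergence toward the mixed-strategy NE.

# Minimal sketch of a linear reward-inaction learning automaton update.
# Generic illustration only; the paper's exact update rule, utility model,
# and constraint handling are not reproduced here.
import numpy as np

rng = np.random.default_rng(0)

num_actions = 5          # size of a player's discretized trading-quantity set (assumed)
learning_rate = 0.05     # automaton step size (assumed)
prob = np.full(num_actions, 1.0 / num_actions)   # action probability distribution

def normalized_utility(action_index: int) -> float:
    # Placeholder for the player's normalized utility feedback in [0, 1].
    # In the game described by the abstract this would come from the player's
    # averaged utility under the (possibly discontinuous) pricing function.
    payoff_table = np.linspace(0.2, 0.9, num_actions)   # illustrative payoffs
    noise = rng.normal(0.0, 0.05)
    return float(np.clip(payoff_table[action_index] + noise, 0.0, 1.0))

for step in range(2000):
    # Sample an action from the current mixed strategy.
    action = rng.choice(num_actions, p=prob)
    reward = normalized_utility(action)

    # Reward-inaction update: move probability mass toward the chosen action
    # in proportion to the received (normalized) reward.
    unit = np.zeros(num_actions)
    unit[action] = 1.0
    prob += learning_rate * reward * (unit - prob)
    prob /= prob.sum()   # guard against numerical drift

print("learned mixed strategy:", np.round(prob, 3))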

Original language: English
Journal: IEEE Transactions on Cybernetics
DOIs: 10.1109/TCYB.2016.2539300
Publication status: Accepted/In press - 27 Apr 2016

Fingerprint

  • Reinforcement learning
  • Adaptive algorithms
  • Probability distributions
  • Learning algorithms
  • Costs

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

Reinforcement Learning for Constrained Energy Trading Games With Incomplete Information. / Wang, Huiwei; Huang, Tingwen; Liao, Xiaofeng; Abu-Rub, Haitham; Chen, Guo.

In: IEEE Transactions on Cybernetics, 27.04.2016.

Research output: Contribution to journal › Article

@article{6a0688a2ba25433c8962894f2a584f6e,
title = "Reinforcement Learning for Constrained Energy Trading Games With Incomplete Information",
abstract = "This paper considers the problem of designing adaptive learning algorithms to seek the Nash equilibrium (NE) of a constrained energy trading game among individually strategic players with incomplete information. In this game, each player uses a learning automaton scheme to generate an action probability distribution, based on his/her private information, so as to maximize his/her own averaged utility. It is shown that if an admissible mixed strategy converges to the NE with probability one, then the averaged utility and the trading quantity almost surely converge to their expected values, respectively. For the given discontinuous pricing function, the utility function is proved to be upper semicontinuous and payoff secure, which guarantees the existence of a mixed-strategy NE. The uniqueness of the NE is also guaranteed by the strict diagonal concavity of the regularized Lagrange function. Finally, an adaptive learning algorithm is provided to generate the strategy probability distribution for seeking the mixed-strategy NE.",
author = "Huiwei Wang and Tingwen Huang and Xiaofeng Liao and Haitham Abu-Rub and Guo Chen",
year = "2016",
month = "4",
day = "27",
doi = "10.1109/TCYB.2016.2539300",
language = "English",
journal = "IEEE Transactions on Cybernetics",
issn = "2168-2267",
publisher = "IEEE Advancing Technology for Humanity",

}

TY - JOUR

T1 - Reinforcement Learning for Constrained Energy Trading Games With Incomplete Information

AU - Wang, Huiwei

AU - Huang, Tingwen

AU - Liao, Xiaofeng

AU - Abu-Rub, Haitham

AU - Chen, Guo

PY - 2016/4/27

Y1 - 2016/4/27

N2 - This paper considers the problem of designing adaptive learning algorithms to seek the Nash equilibrium (NE) of a constrained energy trading game among individually strategic players with incomplete information. In this game, each player uses a learning automaton scheme to generate an action probability distribution, based on his/her private information, so as to maximize his/her own averaged utility. It is shown that if an admissible mixed strategy converges to the NE with probability one, then the averaged utility and the trading quantity almost surely converge to their expected values, respectively. For the given discontinuous pricing function, the utility function is proved to be upper semicontinuous and payoff secure, which guarantees the existence of a mixed-strategy NE. The uniqueness of the NE is also guaranteed by the strict diagonal concavity of the regularized Lagrange function. Finally, an adaptive learning algorithm is provided to generate the strategy probability distribution for seeking the mixed-strategy NE.

AB - This paper considers the problem of designing adaptive learning algorithms to seek the Nash equilibrium (NE) of a constrained energy trading game among individually strategic players with incomplete information. In this game, each player uses a learning automaton scheme to generate an action probability distribution, based on his/her private information, so as to maximize his/her own averaged utility. It is shown that if an admissible mixed strategy converges to the NE with probability one, then the averaged utility and the trading quantity almost surely converge to their expected values, respectively. For the given discontinuous pricing function, the utility function is proved to be upper semicontinuous and payoff secure, which guarantees the existence of a mixed-strategy NE. The uniqueness of the NE is also guaranteed by the strict diagonal concavity of the regularized Lagrange function. Finally, an adaptive learning algorithm is provided to generate the strategy probability distribution for seeking the mixed-strategy NE.

UR - http://www.scopus.com/inward/record.url?scp=84964680181&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84964680181&partnerID=8YFLogxK

U2 - 10.1109/TCYB.2016.2539300

DO - 10.1109/TCYB.2016.2539300

M3 - Article

JO - IEEE Transactions on Cybernetics

JF - IEEE Transactions on Cybernetics

SN - 2168-2267

ER -