Tree-Based Solution Methods for Multiagent POMDPs with Delayed Communication
by Frans A. Oliehoek and Matthijs T. J. Spaan
Abstract:
Multiagent Partially Observable Markov Decision Processes (MPOMDPs) provide a powerful framework for optimal decision making under the assumption of instantaneous communication. We focus on a delayed communication setting (MPOMDP-DC), in which broadcasted information is delayed by at most one time step. This model allows agents to act on their most recent (private) observation. Such an assumption is a strict generalization over having agents wait until the global information is available and is more appropriate for applications in which response time is critical. In this setting, however, value function backups are significantly more costly, and naive application of incremental pruning, the core of many state-of-the-art optimal POMDP techniques, is intractable. In this paper, we overcome this problem by demonstrating that computation of the MPOMDP-DC backup can be structured as a tree and introducing two novel tree-based pruning techniques that exploit this structure in an effective way. We experimentally show that these methods have the potential to outperform naive incremental pruning by orders of magnitude, allowing for the solution of larger problems.
Reference:
Tree-Based Solution Methods for Multiagent POMDPs with Delayed Communication (Frans A. Oliehoek and Matthijs T. J. Spaan), In Proceedings of the National Conference on Artificial Intelligence, 2012.
Bibtex Entry:
@InProceedings{Oliehoek12AAAI_TBP,
 author = {Frans A. Oliehoek and 
 Matthijs T. J. Spaan},
 title = {Tree-Based Solution Methods for Multiagent POMDPs
 with Delayed Communication},
 booktitle = {Proceedings of the National Conference on Artificial Intelligence},
 month = jul,
 keywords = {Multiagent},
 year = 2012,
 OPTpages = {},
 bib2html_rescat = {Multiagent systems - decentralized (approximate) planning under uncertainty},
 bib2html_pubtype = {Refereed Conference (International)},
 abstract =  {
 Multiagent Partially Observable Markov Decision Processes
 (MPOMDPs) provide a powerful framework for optimal decision
 making under the assumption of instantaneous communication.
 We focus on a delayed communication setting (MPOMDP-DC), in
 which broadcasted information is delayed by at most one time
 step. This model allows agents to act on their most recent
 (private) observation. Such an assumption is a strict
 generalization over having agents wait until the global
 information is available and is more appropriate for
 applications in which response time is critical. In this
 setting, however, value function backups are significantly
 more costly, and naive application of incremental pruning, the
 core of many state-of-the-art optimal POMDP techniques, is
 intractable. In this paper, we overcome this problem by
 demonstrating that computation of the MPOMDP-DC backup can be
 structured as a tree and introducing two novel tree-based
 pruning techniques that exploit this structure in an effective
 way. We experimentally show that these methods have the
 potential to outperform naive incremental pruning by orders of
 magnitude, allowing for the solution of larger problems.
 },
 url = {http://people.csail.mit.edu/fao/docs/Oliehoek12MSDM.pdf}
}