COLT-MDP-962387 Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback

Summary

This is a publication. If this page does not link to the publication, you can try the pre-formatted search via the search engines listed on this page.

Authors: Tiancheng Jin, Tal Lancewicki, Haipeng Luo, Yishay Mansour, Aviv Rosenberg

Journal title: NeurIPS 2022b

Journal publisher: NeurIPS 2022, arXiv, European Workshop on Reinforcement Learning (EWRL 2022), Complex Feedback in Online Learning (CFOL), 2022

Published year: 2022

Associated projects

COLT-MDP - Computational Learning Theory: compact representation, efficient computation, and societal challenges in learning MDPs

Organisations

Not specified