Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback

Summary

Authors: Tiancheng Jin, Tal Lancewicki, Haipeng Luo, Yishay Mansour, Aviv Rosenberg

Journal title: NeurIPS 2022b

Journal publisher: NeurIPS 2022, arXiv, European Workshop on Reinforcement Learning (EWRL 2022), Complex Feedback in Online Learning (CFOL), 2022

Published year: 2022