I found some useful papers on PAC learning for MDPs and the associated VC (and …) dimensions. To read later:
Rahul Jain and Pravin Varaiya, “PAC Learning for Markov Decision Processes and Dynamic Games,” ?, 2004.
Rahul Jain and Pravin Varaiya, “Extension of PAC Learning for Partially Observable Markov Decision Processes,” ?, 2004.
Yishay Mansour, “Reinforcement Learning and Mistake Bounded Algorithms,” ?, 1999.