Department of Mathematics - Seminar on Statistics - Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning

July 18

4:00pm - 5:00pm

Seminar, Lecture, Talk

In this paper, we propose a robust policy evaluation algorithm for reinforcement learning that accommodates outlier contamination and heavy-tailed reward distributions. We further develop a fully online method for conducting statistical inference on the model parameters. Our method converges to the minimum asymptotic variance faster than classical temporal difference (TD) learning and avoids the selection of step sizes. Numerical experiments in real-world reinforcement learning settings demonstrate the effectiveness of the proposed algorithm and highlight its efficiency and robustness compared with the existing online bootstrap method. This work is joint with Jiyuan Tu (SUFE), Xi Chen (NYU), and Weidong Liu (SJTU).

Venue

Room 2302 (Lifts 17/18)

Speaker

Prof. Yichen ZHANG
Purdue University

Organizer

Department of Mathematics

Audience

Alumni, Faculty and staff, PG students, UG students

Language

English
