Campus Event Calendar

Event Entry

What and Who

Reward Design for Reinforcement Learning Agents

Rati Devidze
Max Planck Institute for Software Systems
SWS Student Defense Talks - Thesis Proposal
AG 1, AG 2, AG 3, INET, AG 4, AG 5, D6, SWS, RG1, MMCI  
AG Audience
English

Date, Time and Location

Monday, 25 March 2024
16:00
60 Minutes
E1 5
029
Saarbrücken

Abstract

Reward functions are central to reinforcement learning (RL) because they implicitly specify the optimal behavior of the learning agent. Since the RL agent's policy is updated based on the reward signals it receives, the choice of reward function can strongly affect how fast the learning algorithm converges. Reward design is one of the most popular approaches to speeding up learning: it replaces the original rewards with designed rewards that make the problem easier to learn. In our work, we propose reward design strategies that guarantee desired characteristics of the designed reward functions. In particular, we want our designed rewards to satisfy three main characteristics: invariance, i.e., reward signals should capture the desired behavior without reward bugs; interpretability, i.e., reward signals should be easy to diagnose and verify; and informativeness, i.e., reward signals should lead to effective learning. Theoretical analysis and empirical evaluations across various RL tasks highlight the effectiveness of our proposed methods.
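
As a rough illustration of what replacing original rewards with designed rewards can look like, the sketch below uses potential-based reward shaping, a standard textbook technique that is not necessarily one of the methods presented in the talk. The chain environment, the goal position, and the names original_reward, potential, and designed_reward are all assumptions made for this example.

```python
from typing import List

GOAL = 5  # hypothetical goal position on a 1-D chain of states 0..5


def original_reward(state: int, next_state: int) -> float:
    """Sparse original reward: 1 only on the step that reaches the goal."""
    return 1.0 if next_state == GOAL else 0.0


def potential(state: int) -> float:
    """Assumed potential function: negative distance to the goal."""
    return -float(abs(GOAL - state))


def designed_reward(state: int, next_state: int, gamma: float = 0.9) -> float:
    """Original reward plus the shaping term gamma * phi(s') - phi(s)."""
    shaping = gamma * potential(next_state) - potential(state)
    return original_reward(state, next_state) + shaping


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Discounted sum of a reward sequence."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


gamma = 0.9
trajectory = [0, 1, 2, 3, 4, 5]  # walk straight to the goal
steps = list(zip(trajectory[:-1], trajectory[1:]))

original_return = discounted_return(
    [original_reward(s, n) for s, n in steps], gamma
)
designed_return = discounted_return(
    [designed_reward(s, n, gamma) for s, n in steps], gamma
)

# The shaping terms telescope, so the two returns differ only by a quantity
# that depends on the first and last state of the trajectory.
offset = gamma ** len(steps) * potential(trajectory[-1]) - potential(trajectory[0])

print(designed_reward(0, 1, gamma), designed_reward(1, 0, gamma))
print(designed_return - original_return, offset)
```

In this toy setting the designed reward gives feedback on every step (positive when moving toward the goal, negative when moving away), while the difference between designed and original returns depends only on the start and end states. That loosely mirrors the informativeness and invariance goals named in the abstract.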


Please contact grad-office@mpi-sws.org for Zoom details.

Contact

Susanne Girard
+49 631 9303 9605

Susanne Girard, 03/22/2024 12:32 -- Created document.