Posts
-
The Exploration Problem
A reinforcement learning agent's observed data distribution depends on the actions of the agent itself, thus leading to the exploration-exploitation tradeoff. This post details efficient strategies proposed for performing exploration in complex environments.
[Under Construction]
A reinforcement learning agent's observed data distribution depends on the actions of the agent itself, thus leading to the exploration-exploitation tradeoff. This post details efficient strategies proposed for performing exploration in complex environments.