Homework 5
In this homework, we'll hardcode a policy for the slippery version of Frozen Lake. Recall from the documentation:
https://gymnasium.farama.org/environments/toy_text/frozen_lake/#is_slippy
that if you make the environment with is_slippery=True, the player moves in the intended direction with probability 1/3 and in each of the two perpendicular directions with probability 1/3. Our aim is to create a policy that yields a nonzero return with probability larger than 1/2.
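The slip rule can be made concrete with a short sketch. It assumes the action encoding from the Gymnasium documentation (0 = left, 1 = down, 2 = right, 3 = up); the helper name slippery_outcomes is hypothetical and only mirrors the rule described above.

```python
# Action encoding from the Gymnasium Frozen Lake documentation.
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3
ACTION_NAMES = {LEFT: "left", DOWN: "down", RIGHT: "right", UP: "up"}


def slippery_outcomes(action):
    """Return the three equally likely directions for an intended action.

    Hypothetical helper: the intended direction and its two perpendicular
    neighbours each occur with probability 1/3.
    """
    return [(action - 1) % 4, action, (action + 1) % 4]


for a in (LEFT, DOWN, RIGHT, UP):
    outcomes = [ACTION_NAMES[b] for b in slippery_outcomes(a)]
    print(f"intended {ACTION_NAMES[a]:>5} -> possible moves: {outcomes}")
```

Note that the direction opposite to the intended one never occurs. This suggests one way to think about a policy: next to a hole, choosing the action that points away from the hole guarantees the player does not step into it on that move.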
- Make the environment just like in Notebook 0307, except for changing is_slippery to True.
- Hardcode a deterministic policy.
- Collect the returns of 1000 runs without making videos. Check that more than half of them are nonzero.
- Sort the returns and make a scatterplot of them, like in Notebook 0207.
- Make another run, this time with a video.