Homework 5

In this homework, we'll hardcode a policy for the slippery version of Frozen Lake. Recall from the documentation:
https://gymnasium.farama.org/environments/toy_text/frozen_lake/#is_slippy
that if you make the environment with is_slippery=True, the player moves in the intended direction with probability 1/3 and in each of the two perpendicular directions with probability 1/3. Our aim is to create a policy that yields a nonzero return with probability larger than 1/2.
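
To make the slipping rule concrete, here is a minimal sketch in plain Python. It assumes the Frozen Lake action encoding (0 = left, 1 = down, 2 = right, 3 = up); the helper names `slippery_outcomes` and `sample_actual_move` are made up for illustration and are not part of Gymnasium.

```python
import random

# Frozen Lake action encoding (assumed): 0 = left, 1 = down, 2 = right, 3 = up.
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3


def slippery_outcomes(action):
    """Return the three equally likely actual moves for an intended action.

    In the cyclic order left -> down -> right -> up, the two neighbours of an
    action are exactly its two perpendicular directions.
    """
    return [(action - 1) % 4, action, (action + 1) % 4]


def sample_actual_move(action, rng=random):
    """Sample the actual move: each of the three outcomes has probability 1/3."""
    return rng.choice(slippery_outcomes(action))


# Intending to go right can result in down, right, or up, but never left.
print(slippery_outcomes(RIGHT))
```

One consequence worth noticing: the player never moves opposite to the intended direction, which is what makes a hardcoded policy able to steer away from holes.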

  1. Make the environment just as in Notebook 0307, except with is_slippery set to True.
  2. Hardcode a deterministic policy.
  3. Collect the returns of 1000 runs without making videos. Check that more than half of them are nonzero.
  4. Sort the returns and make a scatterplot of them, like in Notebook 0207.
  5. Make another run, this time with a video.
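
Steps 3 and 4 might be organized along the following lines. This is only a sketch under stated assumptions: it assumes the environment follows the Gymnasium interface (`reset()` returns `(state, info)` and `step(action)` returns `(state, reward, terminated, truncated, info)`) and that the policy is a list indexed by state. The names `run_episode`, `collect_returns`, `fraction_nonzero` and the `CoinFlipEnv` class are hypothetical; `CoinFlipEnv` is a stand-in used only so the sketch runs without Gymnasium installed, and it is not Frozen Lake.

```python
import random


def run_episode(env, policy):
    """Play one episode with a deterministic policy; return the sum of rewards."""
    state, _info = env.reset()
    total = 0.0
    done = False
    while not done:
        state, reward, terminated, truncated, _info = env.step(policy[state])
        total += reward
        done = terminated or truncated
    return total


def collect_returns(env, policy, n_runs=1000):
    """Return the sorted returns of n_runs independent episodes."""
    return sorted(run_episode(env, policy) for _ in range(n_runs))


def fraction_nonzero(returns):
    """Fraction of episodes whose return is nonzero."""
    return sum(r != 0 for r in returns) / len(returns)


class CoinFlipEnv:
    """Hypothetical stand-in environment (NOT Frozen Lake).

    One state, one step per episode, reward 1 with probability p.
    """

    def __init__(self, p, seed=0):
        self.p = p
        self.rng = random.Random(seed)

    def reset(self):
        return 0, {}

    def step(self, action):
        reward = 1.0 if self.rng.random() < self.p else 0.0
        return 0, reward, True, False, {}


# Demo on the stand-in; replace with the real environment and policy.
demo_returns = collect_returns(CoinFlipEnv(p=1.0), policy=[0], n_runs=5)
print(fraction_nonzero(demo_returns))
```

With the real environment, `env` would be the one created in step 1 and `policy` a list of 16 actions, one per grid cell of the default 4x4 map. The check in step 3 then amounts to `fraction_nonzero(returns) > 0.5`, and the sorted list can be passed directly to the scatterplot in step 4.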