{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Automatic Differentiation in `pytorch`\n",
    "\n",
    "## Setup\n",
    "\n",
    "### Imports\n",
    "\n",
    "Import the following:\n",
    "\n",
    "1. The `Callable` class from the module `collections.abc`,\n",
    "2. The module `matplotlib.pyplot` as `plt`, and\n",
    "3. The module `torch`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Configuration\n",
    "\n",
    "Create a configuration dictionary with the following key-value pairs:\n",
    "- `learning_rate`: `float`  \n",
    "    This should be the learning rate in our gradient descent algorithm. You can make it `1.`, for starters.\n",
    "- `steps_num`: `int`  \n",
    "    The number of steps our gradient descent algorithm should take. To begin with, make this `10`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Calculating Gradients\n",
    "\n",
    "You can initialize a tensor by the function `torch.tensor`.\n",
    "\n",
    "1. The first, positional argument declares the data.\n",
    "    Make this a small list of numbers.\n",
    "2. The `dtype` keyword argument declares the data type.\n",
    "    Make this a floating point type such as `torch.float32`.\n",
    "3. The `requires_grad` keyword argument decides if gradients should be\n",
    "    tracked in calculations using this tensor. To save computation time,\n",
    "    this is `False` by default. Set it to `True`.\n",
    "\n",
    "Apply a function to this tensor you know the derivative of.\n",
    "For example, you could take squares elementwise.\n",
    "Print the result. If all went well, in the output you can see that\n",
    "the operation resulting in the new tensor was recorded\n",
    "for gradient tracking.\n",
    "\n",
    "More precisely, `pytorch` keeps a\n",
    "*computation graph* of tensors that require gradient calculations.\n",
    "See more here:\n",
    "https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html\n",
    "\n",
    "You can see \"Backward\" in `grad_fn` as you will get the gradient\n",
    "with respect to the input tensor via *backpropagation*, that is\n",
    "repeated use of the chain rule. We'll discuss this at length\n",
    "when we introduce Dense Neural Networks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can only take the gradient of a scalar-valued function.\n",
    "So take your output tensor and sum up its elements\n",
    "by using the `sum` method of the tensor or the `torch.sum` function.\n",
    "(If you don't know what *method* or *attribute* mean,\n",
    "check this for a quick intro:\n",
    "https://exercism.org/tracks/python/concepts/classes  \n",
    "Using `pytorch`, tensors have type `torch.Tensor`)\n",
    "\n",
    "Print out the result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can initiate backpropagation by calling the `backward` method\n",
    "or the `torch.autograd.backward` function on the scalar tensor you got.\n",
    "Then you can access the gradient of the input tensor\n",
    "as the `grad` attribute of the input tensor. Is it what you would expect?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
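  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For comparison, here is a minimal sketch of the three steps above\n",
    "on a different function, the elementwise sine.\n",
    "The names `t`, `s` and `total` are illustrative and not part of the exercise.\n",
    "Since the derivative of the sine is the cosine,\n",
    "we'd expect `t.grad` to agree with the cosine of `t`.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "# Illustrative input; any small list of floats would do.\n",
    "t = torch.tensor([0.0, 1.0, 2.0], dtype=torch.float32, requires_grad=True)\n",
    "\n",
    "# The elementwise sine is recorded in the computation graph.\n",
    "s = t.sin()\n",
    "print(s)  # the printed tensor carries a grad_fn entry\n",
    "\n",
    "# Reduce to a scalar so that backward() can be called without arguments.\n",
    "total = s.sum()\n",
    "total.backward()\n",
    "\n",
    "# d/dt sum(sin(t)) = cos(t), elementwise.\n",
    "print(t.grad)\n",
    "print(torch.allclose(t.grad, t.detach().cos()))\n",
    "```"
   ]
  },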
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Gradient Descent\n",
    "\n",
    "### Plotting a Scalar-to-Scalar Function\n",
    "\n",
    "Let's try to find the minimum of a scalar-to-scalar function\n",
    "with Gradient Descent (GD). I provide the function as `get_loss` below.\n",
    "For starters, let's make a graph on the interval `[-10,10]`.\n",
    "Recall that if you provide one sequence `s` of values to `plt.plot`,\n",
    "then it will make a line plot between the points `(i, s[i])`.\n",
    "On the other hand, if you provide two sequences `x` and `y`,\n",
    "then `plt.plot` will make a line plot between the points `(x[i], y[i])`.\n",
    "\n",
    "To make a detailed plot of the function `get_loss`, we should make `x`\n",
    "a collection of *ticks*\n",
    "```\n",
    "x[i] = -10 + i * (10 - -10) / (num_ticks - 1), i = 0, ..., num_ticks - 1\n",
    "```\n",
    "where the value `num_ticks` determines the degree of detail of the plot.\n",
    "Then we can make `y` the values at the ticks:\n",
    "```\n",
    "y[i] = get_loss(x[i])\n",
    "```\n",
    "\n",
    "Such a tensor `x` of ticks can be gotten via the function `torch.linspace`\n",
    "The three positional arguments are\n",
    "1. the lower boundary,\n",
    "2. the upper boundary and\n",
    "3. the number of ticks.\n",
    "Then you can get `y` as `get_loss(x)`.\n",
    "\n",
    "Set the variable `num_ticks` to 1000 and make the plot.\n",
    "As usual, show the canvas and clear it."
   ]
  },
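  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you want to convince yourself that `torch.linspace` follows the tick formula above,\n",
    "you can compare the two on a small number of ticks.\n",
    "This is only a sketch: the names `ticks` and `by_hand` are illustrative,\n",
    "and I use 5 ticks so that the output is easy to read.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "lower, upper, num_ticks = -10.0, 10.0, 5\n",
    "\n",
    "# Ticks as produced by torch.linspace(lower, upper, num_ticks).\n",
    "ticks = torch.linspace(lower, upper, num_ticks)\n",
    "\n",
    "# The same ticks computed from the formula in the text.\n",
    "by_hand = torch.tensor(\n",
    "    [lower + i * (upper - lower) / (num_ticks - 1) for i in range(num_ticks)]\n",
    ")\n",
    "\n",
    "print(ticks)\n",
    "print(torch.allclose(ticks, by_hand))\n",
    "```"
   ]
  },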
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_loss(x: torch.Tensor) -> torch.Tensor:\n",
    "    return x ** 2 / 10 + 10 / (1 + (-x).exp())\n",
    "\n",
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For ease of use, let's write a function `draw_line_plot`\n",
    "that makes the line plot above!\n",
    "Rewriting code with the same functionality but easier use or readability\n",
    "is called *refactoring*. So we refactor the plotting functionality\n",
    "into a function `draw_line_plot`.\n",
    "\n",
    "Look out that we should only make the `draw_line_plot` draw the plot,\n",
    "not show the canvas or clear it. This is because we'll want to draw\n",
    "gradient descent steps on the same canvas.\n",
    "\n",
    "Note that I made\n",
    "1. the function `f` to plot,\n",
    "2. the lower and upper boundary, and\n",
    "3. the number of ticks\n",
    "\n",
    "keyword arguments with defaults as above. This means that you can quickly\n",
    "make a line plot with the same settings as above, but you can also change\n",
    "the settings around.\n",
    "\n",
    "In the type hint for `f`, you can see the type\n",
    "`Callable[Collection[torch.Tensor], torch.Tensor]`\n",
    "This means that we expect `f` to be a callable object\n",
    "with one positional argument of type `torch.Tensor`\n",
    "and output of type `torch.Tensor`.\n",
    "\n",
    "Write the function `draw_line_plot`,\n",
    "then call it, finally show and close the canvas."
   ]
  },
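  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate how a `Callable` type hint reads in practice, here is a small sketch.\n",
    "The function `apply_twice` is hypothetical and not needed for the exercise.\n",
    "The inner list `[torch.Tensor]` lists the argument types,\n",
    "and the last entry is the return type.\n",
    "\n",
    "```python\n",
    "from collections.abc import Callable\n",
    "\n",
    "import torch\n",
    "\n",
    "\n",
    "def apply_twice(\n",
    "    f: Callable[[torch.Tensor], torch.Tensor],\n",
    "    x: torch.Tensor,\n",
    ") -> torch.Tensor:\n",
    "    # f takes one tensor and returns one tensor, so it can be chained.\n",
    "    return f(f(x))\n",
    "\n",
    "\n",
    "# torch.sqrt fits the hint: one tensor in, one tensor out.\n",
    "print(apply_twice(torch.sqrt, torch.tensor([16.0, 81.0])))\n",
    "```"
   ]
  },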
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def draw_line_plot(\n",
    "    f: Callable[[torch.Tensor], torch.Tensor]=get_loss,\n",
    "    lower=-10.,\n",
    "    num_ticks=1000,\n",
    "    upper=10.\n",
    "):\n",
    "    \"\"\"\n",
    "    Draws a line plot of a function over a given interval\n",
    "    with a given number of ticks.\n",
    "\n",
    "    Parameters \n",
    "    ----------\n",
    "    f : Callable[[torch.Tensor], torch.Tensor], optional\n",
    "        The function to plot. Default:\n",
    "        ```\n",
    "        f(x) = x ** 2 / 10 + 10 / (1 + (-x).exp())\n",
    "        ```\n",
    "    lower : float, optional\n",
    "        The lower boundary of the interval to plot `f` on. Default: -10.\n",
    "    num_ticker : int, optional\n",
    "        The number of ticks in the interval to plot `f` on. Default: 1000\n",
    "    upper : float, optional\n",
    "        The upper boundary of the interval to plot `f` on. Default: 10.\n",
    "    \"\"\"\n",
    "    raise NotImplementedError\n",
    "\n",
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Gradient Descent by Hand\n",
    "\n",
    "#### Initialize Parameters\n",
    "\n",
    "This time, we only have 1 parameter, the value `x`.\n",
    "Initialize a tensor storing this value:\n",
    "1. You can input the starting value as the first positional argument.\n",
    "    Make this a 7.\n",
    "2. Don't forget to make the datatype a floating point type\n",
    "    and enable gradient tracking."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Initialize Variables\n",
    "\n",
    "Initialize two empty lists, one for `x` values and another for `y` values.\n",
    "During training, we'll want to record the values at each step.\n",
    "Afterwards, we'll plot all values at once."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Training Loop\n",
    "\n",
    "Now we want to perform a given number of train steps.\n",
    "The canonical way perform an operation a given number of times\n",
    "is via a `for` loop over a `range`.\n",
    "Check this if you don't know how this works:  \n",
    "https://exercism.org/tracks/python/concepts/loops\n",
    "\n",
    "You need to perform the following operation in a train step:\n",
    "1. Set `x.grad` to `None`.\n",
    "    This will reset gradient calculation between steps.\n",
    "2. Calculate the value `y`, that is the loss at parameter `x`.\n",
    "3. Append the current values `x` and `y` to the lists of `x` and `y` values.\n",
    "    1. You can call the method `append` of a list to append a value to it.\n",
    "    2. As for plotting we only need `x` and `y` as numbers,\n",
    "        call their `item` method before appending.\n",
    "        This method returns the number stored in a scalar tensor.\n",
    "4. Call the `backward` method on `y` to backpropagate gradients.\n",
    "5. Print the `x` and `y` values and the gradient of `x`.\n",
    "    To only print the numbers, call the `item` method on the tensors.\n",
    "6. Now we want to update the parameter `x` with the new gradient data.\n",
    "    1. While we update the value of `x`, we want to suspend gradient tracking. This is done using the `with torch.no_grad():` context block.\n",
    "        1. Check the autograd tutorial for a quick example:      \n",
    "            https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html#disabling-gradient-tracking\n",
    "        2. If interested what happens under the hood when you use `with`, check the relevant section of the Python reference library:      \n",
    "            https://docs.python.org/3/reference/compound_stmts.html#with\n",
    "    2. Use the `-=` in-place subtraction operator to update `x`\n",
    "        by subtracting learning rate times gradient from it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
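  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For reference, the core of such a training step can be sketched on a toy problem.\n",
    "Everything here is an assumption made for illustration:\n",
    "the loss `toy_loss`, the parameter name `w`, the learning rate `0.1` and the 50 steps.\n",
    "The sketch leaves out the printing and the recording of values.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "\n",
    "# Hypothetical toy loss with its minimum at w = 3.\n",
    "def toy_loss(w: torch.Tensor) -> torch.Tensor:\n",
    "    return (w - 3) ** 2\n",
    "\n",
    "\n",
    "w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)\n",
    "lr = 0.1  # illustrative learning rate\n",
    "\n",
    "for step in range(50):\n",
    "    w.grad = None          # reset the gradient from the previous step\n",
    "    loss = toy_loss(w)     # forward pass\n",
    "    loss.backward()        # fills w.grad with d(loss)/dw\n",
    "    with torch.no_grad():  # the update itself must not be tracked\n",
    "        w -= lr * w.grad\n",
    "\n",
    "print(w.item())  # approaches 3 as the number of steps grows\n",
    "```"
   ]
  },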
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. Plot the loss function using `draw_line_plot`.\n",
    "2. Make a scatter plot from the lists of `x` and `y` values.\n",
    "    Adjust the color and the scale at will.\n",
    "3. Show and clear the canvas."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Gradient Descent with a `torch.optim.Optimizer`\n",
    "\n",
    "As an alternative to updating the parameters by hand,\n",
    "`pytorch` provides a unified interface\n",
    "for gradient descent-based optimization:     \n",
    "https://pytorch.org/docs/stable/optim.html    \n",
    "We shall refactor our code to use this.\n",
    "For full batch gradient descent, it will be about the same length,\n",
    "but later on, it will be easy to swap the basic GD optimizer\n",
    "with more advanced ones.\n",
    "\n",
    "1. Copy the gradient descent training loop here\n",
    "    and make the following changes:\n",
    "    1. After you initialize `x`, initialize an optimizer of type `torch.optim.SGD`:\n",
    "        1. As first positional argument, you need to supply\n",
    "            an iterable of the parameters you want to optimize.\n",
    "            As we only have one parameter `x`, supply a list\n",
    "            with `x` as its unique element.\n",
    "        2. Set the learing rate with the `lr` keyword argument.\n",
    "2. Replace the line where you set `x.grad` to `None` by\n",
    "    calling the `zero_grad` method of the optimizer.\n",
    "3. Replace the two lines where you\n",
    "    open a `torch.no_grad()` context block and update `x` by\n",
    "    calling the `step` method of the optimizer.\n",
    "4. Copy the line and scatter plot code in the end to\n",
    "    see at once if you got the same result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "raise NotImplementedError"
   ]
  },
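  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The optimizer-based steps above can be sketched on a toy problem as follows.\n",
    "Again, the loss `toy_loss`, the parameter name `w`, the learning rate `0.1`\n",
    "and the 50 steps are assumptions made for illustration only.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "\n",
    "# Hypothetical toy loss with its minimum at w = 3.\n",
    "def toy_loss(w: torch.Tensor) -> torch.Tensor:\n",
    "    return (w - 3) ** 2\n",
    "\n",
    "\n",
    "w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)\n",
    "\n",
    "# The optimizer receives an iterable of parameters and the learning rate.\n",
    "optimizer = torch.optim.SGD([w], lr=0.1)\n",
    "\n",
    "for step in range(50):\n",
    "    optimizer.zero_grad()  # reset the gradients of all managed parameters\n",
    "    loss = toy_loss(w)     # forward pass\n",
    "    loss.backward()        # fills w.grad with d(loss)/dw\n",
    "    optimizer.step()       # performs the update w -= lr * w.grad\n",
    "\n",
    "print(w.item())  # approaches 3 as the number of steps grows\n",
    "```\n",
    "\n",
    "With plain `torch.optim.SGD` and no momentum or weight decay,\n",
    "each call to `step` performs the same update as subtracting\n",
    "the learning rate times the gradient by hand."
   ]
  },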
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# License\n",
    "\n",
    "This work is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/4.0/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "dml",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}