Homework 2

We shall train affine models on the CIFAR-10 dataset:
https://huggingface.co/datasets/uoft-cs/cifar10
If you get low accuracy values, do not fret: this is a more difficult task than MNIST. Just make sure your accuracies are above the 10% expected from guessing at random among the 10 classes.

  1. Load the dataset. Note that it includes a "train" and a "test" split.
  2. Perform a train-validation split on its "train" split.
    1. Make the validation set as big as the "test" split.
    2. Make the random split using a fixed seed for reproducibility.
  3. Transform the train, validation and test sets to pairs of feature matrices and label vectors, in torch.Tensor format.
  4. Save the 6 tensors you get in a dictionary, with torch.save.
  5. Optional (no grade, but can help): At startup, have your code check whether a file already exists at the location where you would save the preprocessed tensors. If it does, skip preprocessing and load the tensors directly.
    1. You can use os.path.exists for this.
  6. Fit a linear regression model on the dataset, as if class indices were a 1-dimensional target value.
    1. Round the predicted values to the nearest integers.
    2. Print the accuracy you get.
  7. Train logistic regression models on the dataset:
    1. Using Negative Log Likelihood as loss.
    2. Using 7 different learning rates: 10 ** i where i = -2, -1.5, ..., 1.
    3. Make a line plot with
      1. log10 of the learning rates, that is -2, -1.5, ..., 1 on the horizontal axis and
      2. the validation accuracies on the vertical axis.
  8. Repeat step 7 with the Brier score as the loss function. This is the squared Euclidean distance between:
    1. The model output as 10 probability values. To take the softmax of a torch.Tensor, you can use its softmax method. The dim keyword specifies the dimension along which the softmax is taken.
    2. The one-hot vector of the true label. Besides the approach of Notebook 0207, you can also use torch.nn.functional.one_hot.
  9. Optional (no grade, but can help): To help with points 7 and 8, write a train function that
    1. takes the learning rate and the loss function as input and
    2. returns the validation accuracy.
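
For step 2, one possible way to get a reproducible random split is to shuffle the indices with a seeded torch.Generator. This is only a sketch: the function name split_indices, the seed value and the small sizes are illustrative assumptions, not part of the assignment.

```python
import torch


def split_indices(n_total, n_val, seed=0):
    """Return (train_idx, val_idx), a reproducible random partition of range(n_total)."""
    # A fixed seed makes the shuffle identical on every run.
    generator = torch.Generator().manual_seed(seed)
    permutation = torch.randperm(n_total, generator=generator)
    return permutation[n_val:], permutation[:n_val]


# Stand-in sizes for illustration. With the real data, n_total would be the size
# of the "train" split and n_val the size of the "test" split.
train_idx, val_idx = split_indices(n_total=50, n_val=10, seed=0)
```

If you keep the data as a Hugging Face dataset at this point, its train_test_split method with the test_size and seed arguments may be a more convenient alternative.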
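
Steps 4 and 5 could be combined along the following lines. This is a minimal sketch under assumptions: load_or_preprocess, fake_preprocess, the dictionary keys and the file name are hypothetical, and the random tensors merely stand in for the real preprocessed data.

```python
import os
import tempfile

import torch


def load_or_preprocess(path, preprocess):
    """Load cached tensors from `path` if the file exists; otherwise build and save them."""
    if os.path.exists(path):
        return torch.load(path)
    tensors = preprocess()
    torch.save(tensors, path)
    return tensors


def fake_preprocess():
    # Placeholder for the real preprocessing: random tensors with made-up shapes.
    return {
        "train_x": torch.rand(8, 4), "train_y": torch.randint(0, 10, (8,)),
        "val_x": torch.rand(2, 4), "val_y": torch.randint(0, 10, (2,)),
        "test_x": torch.rand(2, 4), "test_y": torch.randint(0, 10, (2,)),
    }


# Hypothetical file name inside a temporary directory, so the sketch leaves no clutter.
cache_path = os.path.join(tempfile.mkdtemp(), "cifar10_tensors.pt")
first = load_or_preprocess(cache_path, fake_preprocess)   # runs preprocessing and saves
second = load_or_preprocess(cache_path, fake_preprocess)  # finds the file and loads it
```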
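
The learning-rate grid of step 7 can be written out explicitly. The sketch below uses plain Python; the variable names are illustrative.

```python
# Exponents -2, -1.5, ..., 1 (7 values) and the matching learning rates 10 ** i.
exponents = [-2 + 0.5 * k for k in range(7)]
learning_rates = [10 ** e for e in exponents]

for e, lr in zip(exponents, learning_rates):
    print(f"log10(lr) = {e:+.1f}  ->  lr = {lr:.4g}")
```

The exponents list can then serve directly as the horizontal-axis values of the line plot. torch.logspace(-2, 1, steps=7) should give the same learning rates as a tensor.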
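
The Brier score of step 8 might be sketched as follows. The function name brier_score and the choice to average over the batch are assumptions made for illustration. The labels are assumed to be an integer (int64) tensor of class indices, which is what torch.nn.functional.one_hot expects.

```python
import torch
import torch.nn.functional as F


def brier_score(logits, labels, num_classes=10):
    """Mean squared distance between softmax probabilities and one-hot labels."""
    probabilities = logits.softmax(dim=1)  # softmax over the class dimension
    targets = F.one_hot(labels, num_classes).to(probabilities.dtype)
    # Sum the squared differences over classes, then average over the batch.
    return ((probabilities - targets) ** 2).sum(dim=1).mean()


# Toy check: all-zero logits give a uniform probability of 0.1 for each class.
logits = torch.zeros(3, 10)
labels = torch.tensor([0, 4, 9])
uniform_score = brier_score(logits, labels)
```

For uniform probabilities the score is (0.1 - 1)^2 + 9 * 0.1^2 = 0.9 whatever the true label is, which makes a handy sanity check before training.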