Homework 2
We shall train affine models on the CIFAR-10 dataset:
https://huggingface.co/datasets/uoft-cs/cifar10
If you get low accuracy values, do not fret. This is a more difficult task than MNIST. Just make sure your values are above what random guessing would give (10% for 10 balanced classes).
- Load the dataset. Note that it includes a "train" and a "test" split.
- Perform a train-validation split on its "train" split.
- Make the validation set as big as the "test" split.
- Make the random split using a fixed seed for reproducibility.
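As a hint, one possible way to make the seeded split is sketched below. It only produces index tensors, so it assumes you then select rows with them. The helper name `split_indices` and the seed value are illustrative choices, not part of the assignment, and the sizes assume the standard CIFAR-10 splits (50,000 "train" and 10,000 "test" images).

```python
import torch

def split_indices(n_total, n_val, seed=0):
    """Shuffle range(n_total) with a seeded generator and cut off n_val indices."""
    generator = torch.Generator().manual_seed(seed)  # fixed seed -> same split every run
    perm = torch.randperm(n_total, generator=generator)
    return perm[n_val:], perm[:n_val]  # (train indices, validation indices)

# Assumed sizes: 50,000 "train" images, validation as big as the 10,000-image "test" split.
train_idx, val_idx = split_indices(n_total=50_000, n_val=10_000, seed=0)
print(len(train_idx), len(val_idx))
```

If you load the data with the Hugging Face `datasets` library, `Dataset.train_test_split(test_size=10_000, seed=0)` is an alternative that shuffles and splits in one call.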
- Transform the train, validation and test sets to pairs of feature matrices and label vectors, in `torch.Tensor` format.
- Save the 6 tensors you get in a dictionary, with `torch.save`.
- Optional (no grade, but can help): Write your code so that, at startup, if it finds a file at the location where you would save the preprocessed tensors, it skips preprocessing and loads the tensors directly.
- You can use `os.path.exists` for this.
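The conversion, saving and caching steps could look roughly like the sketch below. It uses small random tensors as a stand-in for the real images, so that it runs without downloading anything. The dictionary keys, the helper names and the choice to scale pixel values to [0, 1] are assumptions made for illustration.

```python
import os
import tempfile
import torch

def to_features(images):
    """Flatten (N, 32, 32, 3) uint8 images into an (N, 3072) float matrix in [0, 1]."""
    return images.reshape(images.shape[0], -1).float() / 255.0

def fake_preprocess():
    """Stand-in for the real preprocessing: builds the 6 tensors from random data."""
    generator = torch.Generator().manual_seed(0)
    tensors = {}
    for split, n in [("train", 8), ("val", 4), ("test", 4)]:
        images = torch.randint(0, 256, (n, 32, 32, 3), generator=generator).to(torch.uint8)
        tensors[f"x_{split}"] = to_features(images)
        tensors[f"y_{split}"] = torch.randint(0, 10, (n,), generator=generator)
    return tensors

def load_or_preprocess(path, preprocess):
    """Load cached tensors if `path` exists; otherwise preprocess and cache them."""
    if os.path.exists(path):
        return torch.load(path)
    tensors = preprocess()
    torch.save(tensors, path)
    return tensors

# Hypothetical cache location inside a temporary directory.
cache_path = os.path.join(tempfile.mkdtemp(), "cifar10_tensors.pt")
first = load_or_preprocess(cache_path, fake_preprocess)   # runs the preprocessing
second = load_or_preprocess(cache_path, fake_preprocess)  # loads the saved file
```

In your own code, `fake_preprocess` would be replaced by the function that loads and splits the real dataset.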
- Fit a linear regression model on the dataset, as if class indices were a 1-dimensional target value.
- Round the predicted values to the nearest integers.
- Print the accuracy you get.
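A minimal sketch of the regression step follows, assuming a closed-form least-squares fit with `torch.linalg.lstsq` (gradient descent would also be a valid choice). The data here is a synthetic stand-in, not CIFAR-10, and all function names are illustrative.

```python
import torch

def add_bias_column(x):
    """Append a column of ones so the fitted model is affine, not just linear."""
    return torch.cat([x, torch.ones(x.shape[0], 1, dtype=x.dtype)], dim=1)

def fit_linear_regression(x, y):
    """Least-squares fit with the class indices treated as real-valued targets."""
    targets = y.to(x.dtype).unsqueeze(1)
    return torch.linalg.lstsq(add_bias_column(x), targets).solution

def rounded_accuracy(x, y, weights):
    """Round predictions to the nearest integer and compare them with the labels."""
    preds = (add_bias_column(x) @ weights).squeeze(1).round().long()
    return (preds == y).float().mean().item()

# Synthetic stand-in data (NOT CIFAR-10): labels depend linearly on the first feature.
generator = torch.Generator().manual_seed(0)
x = torch.rand(200, 3, generator=generator)
y = (9 * x[:, 0]).round().long()  # class indices in 0..9

weights = fit_linear_regression(x, y)
accuracy = rounded_accuracy(x, y, weights)
print(f"accuracy: {accuracy:.3f}")
```

On the real dataset you would fit on the training tensors and report the accuracy on held-out data. Expect a much lower value there, since class indices have no natural ordering.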
- Train logistic regression models on the dataset:
- Using Negative Log Likelihood as loss.
- Using 7 different learning rates: `10 ** i` where `i = -2, -1.5, ..., 1`.
- Make a line plot with log10 of the learning rates, that is `-2, -1.5, ..., 1`, on the horizontal axis and the validation accuracies on the vertical axis.
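The learning-rate grid and the plot can be set up along these lines. This is only a sketch: the function names and the output file name are hypothetical, and using matplotlib for the plot is an assumption, since the assignment does not prescribe a plotting library.

```python
def learning_rate_grid(start=-2.0, stop=1.0, step=0.5):
    """Return (exponents, learning rates) for 10 ** i, i = start, start + step, ..., stop."""
    n = int(round((stop - start) / step)) + 1
    exponents = [start + k * step for k in range(n)]
    return exponents, [10 ** e for e in exponents]

def plot_validation_accuracies(exponents, accuracies, path="lr_sweep.png"):
    """Line plot: log10(learning rate) on the x-axis, validation accuracy on the y-axis."""
    import matplotlib.pyplot as plt  # imported here so the grid helper works without it

    fig, ax = plt.subplots()
    ax.plot(exponents, accuracies, marker="o")
    ax.set_xlabel("log10(learning rate)")
    ax.set_ylabel("validation accuracy")
    fig.savefig(path)
    plt.close(fig)

exponents, learning_rates = learning_rate_grid()
print(exponents)
```

Once you have one validation accuracy per learning rate, pass them as a list to `plot_validation_accuracies(exponents, accuracies)`.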
- Repeat step 7 with Brier Score as the loss function. This is the squared distance between:
- The model output as 10 probability values. To take the softmax of a `torch.Tensor`, you can use the `softmax` method. With the `dim` keyword, you can specify along which dimension to take the softmax.
- A one-hot vector. Besides the approach of Notebook 0207, you can also use `torch.nn.functional.one_hot`.
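Put together, a Brier Score loss might look like the sketch below. It assumes the model outputs raw scores (logits) of shape `(batch, 10)` and that the per-sample squared distances are averaged over the batch. The averaging and the function name `brier_loss` are choices made here, not requirements of the assignment.

```python
import torch
import torch.nn.functional as F

def brier_loss(logits, labels, n_classes=10):
    """Mean squared distance between softmax probabilities and one-hot labels."""
    probs = logits.softmax(dim=1)                              # (batch, n_classes)
    targets = F.one_hot(labels, num_classes=n_classes).float() # (batch, n_classes)
    return ((probs - targets) ** 2).sum(dim=1).mean()

# Uniform predictions: every class gets probability 0.1.
uniform_logits = torch.zeros(4, 10)
labels = torch.tensor([0, 3, 5, 9])
uniform_loss = brier_loss(uniform_logits, labels)

# Very confident, correct predictions: loss close to 0.
confident_logits = 100.0 * F.one_hot(labels, num_classes=10).float()
confident_loss = brier_loss(confident_logits, labels)
print(uniform_loss.item(), confident_loss.item())
```

For uniform predictions over 10 classes, each sample contributes 0.9² + 9 · 0.1² = 0.9, which gives a quick sanity check for your own implementation.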
- Optional (no grade, but can help): To help with points 7 and 8, make a `train` function that takes the learning rate and the loss function as input and outputs the validation accuracy.
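One possible shape for such a `train` function is sketched below, under several assumptions: the data tensors are passed in as arguments, training is full-batch gradient descent with `torch.optim.SGD`, and the number of steps is fixed. None of these are prescribed by the assignment. The demo uses synthetic 3-class data rather than CIFAR-10, and `F.cross_entropy` as the Negative Log Likelihood loss (it applies log-softmax to the raw model outputs).

```python
import torch
import torch.nn.functional as F

def train(x_train, y_train, x_val, y_val, lr, loss_fn, n_steps=200, n_classes=10):
    """Fit an affine model by gradient descent and return the validation accuracy."""
    torch.manual_seed(0)  # same initial weights for every call (sets the global seed)
    model = torch.nn.Linear(x_train.shape[1], n_classes)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(n_steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x_train), y_train)  # loss_fn takes (logits, labels)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        preds = model(x_val).argmax(dim=1)
    return (preds == y_val).float().mean().item()

# Synthetic stand-in data (NOT CIFAR-10): three well-separated 2-D clusters.
generator = torch.Generator().manual_seed(0)
centers = torch.tensor([[4.0, 0.0], [-4.0, 0.0], [0.0, 4.0]])

def make_split(n):
    labels = torch.randint(0, 3, (n,), generator=generator)
    features = centers[labels] + 0.5 * torch.randn(n, 2, generator=generator)
    return features, labels

x_train, y_train = make_split(300)
x_val, y_val = make_split(90)

val_accuracy = train(x_train, y_train, x_val, y_val,
                     lr=0.1, loss_fn=F.cross_entropy, n_classes=3)
print(f"validation accuracy: {val_accuracy:.3f}")
```

Any loss with the signature `loss_fn(logits, labels)` can be passed in, so the same function can serve both loss functions and all 7 learning rates.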