{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "86fe2213-fb44-4bd4-a371-a541cba6a744",
   "metadata": {},
   "source": [
    "\n",
    "<img width=\"800px\" src=\"../fidle/img/header.svg\"></img>\n",
    "## <!-- TITLE --> [LMNIST2] - Simple classification with CNN\n",
    "<!-- DESC --> An example of classification using a convolutional neural network for the famous MNIST dataset, using PyTorch Lightning\n",
    "<!-- AUTHOR : MBOGOL Touye Achille (AI/ML Engineer MIAI/SIMaP) -->\n",
    "\n",
    "## Objectives :\n",
    " - Recognizing handwritten numbers\n",
    " - Understanding the principle of a classifier DNN network \n",
    " - Implementation with pytorch lightning \n",
    "\n",
    "\n",
    "The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) is a must for Deep Learning.  \n",
    "It consists of 60,000 small images of handwritten numbers for learning and 10,000 for testing.\n",
    "\n",
    "\n",
    "## What we're going to do :\n",
    "\n",
    " - Retrieve data\n",
    " - Preparing the data\n",
    " - Create a model\n",
    " - Train the model\n",
    " - Evaluate the result\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f16101a-6612-4e02-93e9-c45ce1ac911d",
   "metadata": {},
   "source": [
    "## Step 1 - Init python stuff"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "743c77d3-0983-491c-90be-ef2219861a47",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "import lightning.pytorch as pl\n",
    "import torch.nn.functional as F\n",
    "import torchvision.transforms as T\n",
    "\n",
    "import sys,os\n",
    "import multiprocessing\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from torchvision import datasets\n",
    "from torchmetrics.functional import accuracy\n",
    "from torch.utils.data import Dataset, DataLoader\n",
    "from modules.progressbar import CustomTrainProgressBar\n",
    "from lightning.pytorch.loggers import TensorBoardLogger\n",
    "\n",
    "\n",
    "# Init Fidle environment\n",
    "import fidle\n",
    "\n",
    "run_id, run_dir, datasets_dir = fidle.init('LMNIST2')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df10dcda-aa63-476b-8665-9b1610fe51c6",
   "metadata": {},
   "source": [
    "## Step 2 Retrieve data\n",
    "\n",
    "MNIST is one of the most famous historic dataset include in torchvision Datasets. `torchvision` provides many built-in datasets in the `torchvision.datasets`.  \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6668e50c-f0c6-43cf-b733-9ac29d6a3900",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load data sets \n",
    "train_dataset = datasets.MNIST(root=\"data\", train=True, download=True, transform=None)\n",
    "\n",
    "test_dataset= datasets.MNIST(root=\"data\", train=False, download=False, transform=None)\n"
   ]
  },
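  {
   "cell_type": "markdown",
   "id": "9c1e2f3a-4b5d-4e6f-8a9b-0c1d2e3f4a5b",
   "metadata": {},
   "source": [
    "With `transform=None`, indexing the dataset returns a raw sample as a (PIL image, integer label) pair; a quick check:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d2e3f4a-5b6c-4d7e-8f9a-1b2c3d4e5f6a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# A raw sample (no transform) is a (PIL image, integer label) pair\n",
    "image, label = train_dataset[0]\n",
    "print(type(image), image.size, '- label:', label)"
   ]
  },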
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a14d6fc2-b913-4eaa-9cde-5ca6785bfa12",
   "metadata": {},
   "outputs": [],
   "source": [
    "# print info for train data\n",
    "print(train_dataset)\n",
    "\n",
    "print()\n",
    "\n",
    "# print info for test data\n",
    "print(test_dataset)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "44a489f5-3e53-4a2b-8069-f265b2814dc0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# See the shape of train data and test data\n",
    "print(\"x_train : \",train_dataset.data.shape)\n",
    "print(\"y_train : \",train_dataset.targets.shape)\n",
    "\n",
    "print()\n",
    "\n",
    "print(\"x_test  : \",test_dataset.data.shape)\n",
    "print(\"y_test  : \",test_dataset.targets.shape)\n",
    "\n",
    "print()\n",
    "\n",
    "# print number of targets and  values targets\n",
    "print(\"Number of Targets :\",len(np.unique(train_dataset.targets)))\n",
    "print(\"Targets Values    :\",    np.unique(train_dataset.targets))\n",
    "\n",
    "print()\n",
    "\n",
    "print(\"Remark that we work with torch tensors and not numpy array, not tensorflow tensor\")\n",
    "print(\" -> x_train.dtype = \",train_dataset.data.dtype)\n",
    "print(\" -> y_train.dtype = \",train_dataset.targets.dtype)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b418adb7-33ea-450c-9793-3cdce5d5fa8c",
   "metadata": {},
   "source": [
    "## Step 3 -  Preparing your data for training with DataLoaders\n",
    "The Dataset retrieves our dataset’s features and labels one sample at a time. While training a model, we typically want to pass samples in `minibatches`, reshuffle the data at every epoch to reduce model overfitting, and use Python’s multiprocessing to speed up data retrieval. DataLoader is an iterable that abstracts this complexity for us in an easy API."
   ]
  },
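  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b3c4d5e-6f70-4a81-92a3-b4c5d6e7f809",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Minimal DataLoader sketch on a toy dataset (illustrative only)\n",
    "from torch.utils.data import TensorDataset\n",
    "\n",
    "toy = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.arange(10))\n",
    "for xb, yb in DataLoader(toy, batch_size=4, shuffle=True):\n",
    "    print('batch labels :', yb.tolist())"
   ]
  },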
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8af0bc4c-acb3-46d9-aae2-143b0327d970",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Before normalization:\n",
    "x_train=train_dataset.data\n",
    "print('Before normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))\n",
    "\n",
    "# After normalization:\n",
    "## T.Compose creates a pipeline where the provided transformations are run in sequence\n",
    "transforms = T.Compose(\n",
    "    [\n",
    "        # This transforms takes a np.array or a PIL image of integers\n",
    "        # in the range 0-255 and transforms it to a float tensor in the\n",
    "        # range 0.0 - 1.0\n",
    "        T.ToTensor(),\n",
    "\n",
    "    ]\n",
    ")\n",
    "\n",
    "\n",
    "train_dataset = datasets.MNIST(root=\"data\", train=True, download=True, transform=transforms)\n",
    "test_dataset= datasets.MNIST(root=\"data\", train=False, download=True, transform=transforms)\n",
    "\n",
    "\n",
    "# print image and label After normalization. \n",
    "# iter() followed by next() is used to get some images and label\n",
    "image,label=next(iter(train_dataset))\n",
    "print('After normalization  : Min={}, max={}'.format(image.min(),image.max()))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35d50a57-8274-4660-8765-d0f2bf7214bd",
   "metadata": {},
   "source": [
    "### Have a look"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a172ebc5-8858-4f30-8e2c-1e9c123ae0ee",
   "metadata": {},
   "outputs": [],
   "source": [
    "x_train=train_dataset.data\n",
    "y_train=train_dataset.targets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5a487760-b43a-4f7c-bfd8-1ce2c9652769",
   "metadata": {},
   "outputs": [],
   "source": [
    "fidle.scrawler.images(x_train, y_train, [27],  x_size=5,y_size=5, colorbar=True, save_as='01-one-digit')\n",
    "fidle.scrawler.images(x_train, y_train, range(5,41), columns=12, save_as='02-many-digits')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ca0a63ae-e6d6-4940-b8ff-9b11cb2737bb",
   "metadata": {},
   "outputs": [],
   "source": [
    "# train bacth data\n",
    "train_loader= DataLoader(\n",
    "  dataset=train_dataset, \n",
    "  shuffle=True, \n",
    "  batch_size=512,\n",
    "  num_workers=2\n",
    ")\n",
    "\n",
    "# test batch data\n",
    "test_loader= DataLoader(\n",
    "  dataset=test_dataset, \n",
    "  shuffle=False, \n",
    "  batch_size=512,\n",
    "  num_workers=2 \n",
    ")\n",
    "\n",
    "# print image and label  After normalization and batch_size.\n",
    "image, label=next(iter(train_loader))\n",
    "print('Shape of first training data batch after use pytorch dataloader :\\nbatch images = {} \\nbatch labels = {}'.format(image.shape,label.shape))      "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "51bf21ee-76ca-42fa-b67f-066dbd239a72",
   "metadata": {},
   "source": [
    "## Step 4 - Create Model\n",
    "About informations about : \n",
    " - [Optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)\n",
    " - [Activation](https://www.tensorflow.org/api_docs/python/tf/keras/activations)\n",
    " - [Loss](https://www.tensorflow.org/api_docs/python/tf/keras/losses)\n",
    " - [Metrics](https://www.tensorflow.org/api_docs/python/tf/keras/metrics)\n",
    "   \n",
    " `Note :` PyTorch provides losses such as the cross-entropy loss (`nn.CrossEntropyLoss`) usually use for classification problem. we're using the softmax function to predict class probabilities. With a softmax output, you want to use cross-entropy as the loss. To actually calculate the loss, we need to pass in the raw output of our network into the loss, not the output of the softmax function. This raw output is usually called the *logits* or *scores*. because in pytorch the cross entropy contain softmax function already."
   ]
  },
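  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3c4d5e6f-7081-4a92-b3c4-d5e6f708192a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity check: cross_entropy(logits) equals nll_loss(log_softmax(logits))\n",
    "logits  = torch.randn(4, 10)                   # a batch of 4 raw score vectors\n",
    "targets = torch.randint(0, 10, (4,))           # 4 random class labels\n",
    "loss1 = F.cross_entropy(logits, targets)\n",
    "loss2 = F.nll_loss(F.log_softmax(logits, dim=1), targets)\n",
    "print(loss1.item(), loss2.item())              # identical up to float precision"
   ]
  },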
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16701119-71eb-4f59-a50a-f153b07a74ae",
   "metadata": {},
   "outputs": [],
   "source": [
    "class MyNet(nn.Module):\n",
    "    \n",
    "    def __init__(self,num_class=10):\n",
    "        super().__init__()\n",
    "        self.num_class=num_class\n",
    "        self.model = nn.Sequential(\n",
    "            # first convolution  \n",
    "            nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, stride=1, padding=0),\n",
    "            nn.ReLU(),\n",
    "            nn.MaxPool2d((2,2)),  \n",
    "            nn.Dropout2d(0.1),            # Combat overfitting\n",
    "            \n",
    "            # second convolution  \n",
    "            nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=1, padding=0),\n",
    "            nn.ReLU(),\n",
    "            nn.MaxPool2d((2,2)),   \n",
    "            nn.Dropout2d(0.1),            # Combat overfitting\n",
    "            \n",
    "            nn.Flatten(),                 # convert feature map into feature vectors\n",
    "            \n",
    "            # MLP network   \n",
    "            nn.Linear(16*5*5,100),\n",
    "            nn.ReLU(),\n",
    "            nn.Dropout1d(0.1),            # Combat overfitting\n",
    "        \n",
    "            nn.Linear(100, num_class),    # logits outpout\n",
    "        )\n",
    "        \n",
    "    def forward(self, x):\n",
    "        x=self.model(x)                   # forward pass\n",
    "        return x"
   ]
  },
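  {
   "cell_type": "markdown",
   "id": "4d5e6f70-8192-4aa3-b4c5-d6e7f8091a2b",
   "metadata": {},
   "source": [
    "The input size `16*5*5` of the first linear layer comes from the shape arithmetic: 28 → 26 (3×3 conv, no padding) → 13 (2×2 pool) → 11 (3×3 conv) → 5 (2×2 pool), with 16 channels at the end. A dummy forward pass confirms it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e6f7081-92a3-4bb4-8c5d-e7f8091a2b3c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Dummy forward pass on one fake 28x28 grayscale image, to check the shapes\n",
    "net   = MyNet()\n",
    "dummy = torch.zeros(1, 1, 28, 28)\n",
    "print('features :', net.model[:9](dummy).shape)   # up to Flatten -> [1, 400]\n",
    "print('logits   :', net(dummy).shape)             # full model    -> [1, 10]"
   ]
  },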
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "37abf99b-f8ec-4048-a65d-f173ee18b234",
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "class LitModel(pl.LightningModule):\n",
    "    \n",
    "    def __init__(self, MyNet):\n",
    "        super().__init__()\n",
    "        self.MyNet = MyNet\n",
    "\n",
    "    # forward pass\n",
    "    def forward(self, x): \n",
    "        return self.MyNet(x)\n",
    "    \n",
    "\n",
    "     # optimizer\n",
    "    def configure_optimizers(self):\n",
    "        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)\n",
    "        return optimizer \n",
    "\n",
    "    \n",
    "    def training_step(self, batch, batch_idx):\n",
    "        # defines the train loop\n",
    "        x, y  = batch\n",
    "        \n",
    "        # forward pass \n",
    "        y_hat = self.MyNet(x) \n",
    "        \n",
    "        # loss function\n",
    "        loss= F.cross_entropy(y_hat, y) \n",
    "\n",
    "        # metrics accuracy\n",
    "        acc=accuracy(y_hat, y,task=\"multiclass\",num_classes=10)    \n",
    "        \n",
    "        metrics = {\"train_loss\": loss, \"train_acc\": acc}\n",
    "        \n",
    "        # logs metrics for each training_step\n",
    "        self.log_dict(metrics,\n",
    "                      on_step=False ,\n",
    "                      on_epoch=True, \n",
    "                      prog_bar=True, \n",
    "                      logger=True)\n",
    "        \n",
    "        return loss\n",
    "\n",
    "        \n",
    "    def validation_step(self, batch, batch_idx):\n",
    "        # defines the valid loop.\n",
    "        x, y = batch\n",
    "        \n",
    "        # forward pass\n",
    "        y_hat= self.MyNet(x) \n",
    "\n",
    "        # loss function MSE\n",
    "        loss = F.cross_entropy(y_hat, y)                                \n",
    "\n",
    "        # metrics accuracy\n",
    "        acc=accuracy(y_hat, y, task=\"multiclass\", num_classes=10)        \n",
    "\n",
    "        \n",
    "        metrics = {\"test_loss\": loss, \"test_acc\": acc}\n",
    "        \n",
    "        # logs metrics for each validation_step\n",
    "        self.log_dict(metrics,\n",
    "                      on_step  = False,\n",
    "                      on_epoch = True, \n",
    "                      prog_bar = True, \n",
    "                      logger   = True\n",
    "                     ) \n",
    "        \n",
    "        return metrics\n",
    "        \n",
    "    \n",
    "    def predict_step(self, batch, batch_idx):\n",
    "        # defnie the predict loop \n",
    "        x, y = batch\n",
    "\n",
    "        # forward pass\n",
    "        y_hat = self.MyNet(x)\n",
    "        \n",
    "        return y_hat\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "489af62f-8f7c-4d1b-a6d0-5a0417e79869",
   "metadata": {},
   "outputs": [],
   "source": [
    "# print summary model\n",
    "model=LitModel(MyNet())\n",
    "print(model) "
   ]
  },
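  {
   "cell_type": "markdown",
   "id": "6f708192-a3b4-4cc5-8d6e-7f8091a2b3c4",
   "metadata": {},
   "source": [
    "`print(model)` only lists the modules. For a Keras-like summary with output shapes and parameter counts, the third-party `torchinfo` package can be used (an optional sketch, assuming `torchinfo` is installed):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "708192a3-b4c5-4dd6-8e7f-8091a2b3c4d5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional: Keras-style summary (assumes: pip install torchinfo)\n",
    "# from torchinfo import summary\n",
    "# summary(model, input_size=(1, 1, 28, 28))"
   ]
  },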
  {
   "cell_type": "markdown",
   "id": "fb32e85d-bd92-4ca5-a3dc-ddb5ed50ba6b",
   "metadata": {},
   "source": [
    "## Step 5 - Train Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "756d5e19-6a10-42b8-8971-411389f7d19c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# loggers data\n",
    "os.makedirs(f'{run_dir}/logs',   mode=0o750, exist_ok=True)\n",
    "logger= TensorBoardLogger(save_dir=f'{run_dir}/logs',name=\"CNN_logs\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce975c03-d05d-40c4-92ff-0cc90699c13e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# train model\n",
    "# trainer = pl.Trainer(accelerator='auto',\n",
    "#                      max_epochs=16,\n",
    "#                      logger=logger,\n",
    "#                      num_sanity_val_steps=0,\n",
    "#                      callbacks=[CustomTrainProgressBar()]\n",
    "#                     )\n",
    "\n",
    "trainer = pl.Trainer(accelerator='auto',\n",
    "                     max_epochs=16,\n",
    "                     logger=logger,\n",
    "                     num_sanity_val_steps=0\n",
    "trainer.fit(model=model, train_dataloaders=train_loader, val_dataloaders=test_loader)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a1191f05-4454-415c-a5ed-e63d9ae56651",
   "metadata": {},
   "source": [
    "## Step 6 - Evaluate\n",
    "### 6.1 - Final loss and accuracy\n",
    "Note : With a DNN, we had a precision of the order of : 97.7%"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9f45316e-0d2d-4fc1-b9a8-5fb8aaf5586a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# evaluate your model\n",
    "score=trainer.validate(model=model,dataloaders=test_loader, verbose=False)\n",
    "\n",
    "print('x_test / acc      : {:5.4f}'.format(score[0]['test_acc']))\n",
    "print('x_test / loss     : {:5.4f}'.format(score[0]['test_loss']))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e352e48d-b473-4162-a1aa-72d6d4f7aa38",
   "metadata": {},
   "source": [
    "## 6.2 - Plot history\n",
    "To access logs with tensorboad :\n",
    "- Under **Docker**, from a terminal launched via the jupyterlab launcher, use the following command:<br>\n",
    "```tensorboard --logdir <path-to-logs> --host 0.0.0.0```\n",
    "- If you're **not using Docker**, from a terminal :<br>\n",
    "```tensorboard --logdir <path-to-logs>```  \n",
    "\n",
    "**Note:** One tensorboard instance can be used simultaneously."
   ]
  },
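  {
   "cell_type": "markdown",
   "id": "8192a3b4-c5d6-4ee7-8f80-91a2b3c4d5e6",
   "metadata": {},
   "source": [
    "From inside Jupyter, the TensorBoard extension can also be used directly (a sketch; replace `<path-to-logs>` with your actual log directory):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92a3b4c5-d6e7-4ff8-8091-a2b3c4d5e6f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional: inline TensorBoard (uncomment and adjust the path)\n",
    "# %load_ext tensorboard\n",
    "# %tensorboard --logdir <path-to-logs>"
   ]
  },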
  {
   "cell_type": "markdown",
   "id": "f00ded6b-a7db-4c5d-b1b2-72264db20bdb",
   "metadata": {},
   "source": [
    "###  6.3 - Plot results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e387a70d-9c23-4d16-8ef7-879aec7791e2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# logits outpout by batch size\n",
    "y_logits= trainer.predict(model=model,dataloaders=test_loader)\n",
    "\n",
    "# Concat into single tensor\n",
    "y_logits= torch.cat(y_logits)\n",
    "\n",
    "# output probabilities values\n",
    "y_pred_values=F.softmax(y_logits,dim=1)\n",
    "\n",
    "# Returns the indices of the maximum output probabilities values \n",
    "y_pred=torch.argmax(y_pred_values,dim=-1)"
   ]
  },
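  {
   "cell_type": "markdown",
   "id": "a3b4c5d6-e7f8-4009-81a2-b3c4d5e6f708",
   "metadata": {},
   "source": [
    "The softmax output can also be read as the model's confidence; as an illustrative sketch, a look at its most 'doubtful' predictions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b4c5d6e7-f809-411a-82b3-c4d5e6f70819",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Per-sample confidence = highest class probability\n",
    "confidence, _ = y_pred_values.max(dim=1)\n",
    "least_sure    = torch.argsort(confidence)[:10]    # the 10 least confident samples\n",
    "for i in least_sure.tolist():\n",
    "    print('sample {:5d} : predicted {} with p={:.2f}'.format(i, y_pred[i].item(), confidence[i].item()))"
   ]
  },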
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fb2b2eeb-fcd8-453c-93ef-59a960a8bbd5",
   "metadata": {},
   "outputs": [],
   "source": [
    "x_test=test_dataset.data\n",
    "y_test=test_dataset.targets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "71187fa9-2ad3-4b23-94b9-1846045bd070",
   "metadata": {},
   "outputs": [],
   "source": [
    "fidle.scrawler.images(x_test, y_test, range(0,200), columns=12, x_size=1, y_size=1, y_pred=y_pred, save_as='04-predictions')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2fc7b2b9-9115-4848-9aae-2798bf7aa79a",
   "metadata": {},
   "source": [
    "### 6.4 - Plot some errors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e55f17c4-fce7-423a-9adf-f2511c534ef5",
   "metadata": {},
   "outputs": [],
   "source": [
    "errors=[ i for i in range(len(x_test)) if y_pred[i]!=y_test[i] ]\n",
    "errors=errors[:min(24,len(errors))]\n",
    "fidle.scrawler.images(x_test, y_test, errors[:15], columns=6, x_size=2, y_size=2, y_pred=y_pred, save_as='05-some-errors')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fea1b396-70ca-4b00-851d-0538a4b347fb",
   "metadata": {},
   "outputs": [],
   "source": [
    "fidle.scrawler.confusion_matrix(y_test,y_pred,range(10),normalize=True, save_as='06-confusion-matrix')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e982c032-cce8-4c71-8cdc-2af4b31b2914",
   "metadata": {},
   "outputs": [],
   "source": [
    "fidle.end()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "233838c2-c97f-4489-8c79-9247d7b7456b",
   "metadata": {},
   "source": [
    "<div class=\"todo\">\n",
    "    A few things you can do for fun:\n",
    "    <ul>\n",
    "        <li>Changing the network architecture (layers, number of neurons, etc.)</li>\n",
    "        <li>Display a summary of the network</li>\n",
    "        <li>Retrieve and display the softmax output of the network, to evaluate its \"doubts\".</li>\n",
    "    </ul>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "51b87aa0-d4e9-48bb-8205-4b583f4b0b61",
   "metadata": {},
   "source": [
    "---\n",
    "<img width=\"80px\" src=\"../fidle/img/logo-paysage.svg\"></img>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}