From 748bf82a9c6aa8223e747687e51847cfc329fdf1 Mon Sep 17 00:00:00 2001
From: Achille Mbogol Touye <achille.mbogol-touye@univ-grenoble-alpes.fr>
Date: Tue, 26 Sep 2023 17:33:41 +0200
Subject: [PATCH] 02-CNN-MNIST_Lightning

---
 MNIST_Lightning/02-CNN-MNIST_Lightning.ipynb | 567 +++++++++++++++++++
 1 file changed, 567 insertions(+)
 create mode 100644 MNIST_Lightning/02-CNN-MNIST_Lightning.ipynb

diff --git a/MNIST_Lightning/02-CNN-MNIST_Lightning.ipynb b/MNIST_Lightning/02-CNN-MNIST_Lightning.ipynb
new file mode 100644
index 0000000..11fff32
--- /dev/null
+++ b/MNIST_Lightning/02-CNN-MNIST_Lightning.ipynb
@@ -0,0 +1,567 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "86fe2213-fb44-4bd4-a371-a541cba6a744",
+   "metadata": {},
+   "source": [
+    "\n",
+    "<img width=\"800px\" src=\"../fidle/img/00-Fidle-header-01.svg\"></img>\n",
+    "\n",
+    "## <!-- TITLE --> [MNIST2] - Simple classification with CNN using lightning\n",
+    "<!-- DESC --> An example of classification using a convolutional neural network for the famous MNIST dataset\n",
+    "<!-- AUTHOR : MBOGOL Touye Achille (AI/ML Engineer MIAI/SIMaP) -->\n",
+    "\n",
+    "## Objectives :\n",
+    " - Recognizing handwritten numbers\n",
+    " - Understanding the principle of a classifier DNN network \n",
+    " - Implementation with pytorch lightning \n",
+    "\n",
+    "\n",
+    "The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) is a must for Deep Learning.  \n",
+    "It consists of 60,000 small images of handwritten numbers for learning and 10,000 for testing.\n",
+    "\n",
+    "\n",
+    "## What we're going to do :\n",
+    "\n",
+    " - Retrieve data\n",
+    " - Preparing the data\n",
+    " - Create a model\n",
+    " - Train the model\n",
+    " - Evaluate the result\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7f16101a-6612-4e02-93e9-c45ce1ac911d",
+   "metadata": {},
+   "source": [
+    "## Step 1 - Init python stuff"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "743c77d3-0983-491c-90be-ef2219861a47",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "\n",
+    "import torch\n",
+    "import torch.nn as nn\n",
+    "import lightning.pytorch as pl\n",
+    "import torch.nn.functional as F\n",
+    "import torchvision.transforms as T\n",
+    "\n",
+    "import sys,os\n",
+    "import multiprocessing\n",
+    "import matplotlib.pyplot as plt\n",
+    "\n",
+    "from lightning.pytorch.loggers.tensorboard import TensorBoardLogger\n",
+    "from torch.utils.data import Dataset, DataLoader\n",
+    "from torchvision import datasets\n",
+    "from torchmetrics.functional import accuracy\n",
+    "\n",
+    "# Init Fidle environment\n",
+    "import fidle\n",
+    "\n",
+    "run_id, run_dir, datasets_dir = fidle.init('MNIST_Ligthning')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "df10dcda-aa63-476b-8665-9b1610fe51c6",
+   "metadata": {},
+   "source": [
+    "## Step 2 Retrieve data\n",
+    "\n",
+    "MNIST is one of the most famous historic dataset include in torchvision Datasets. `torchvision` provides many built-in datasets in the `torchvision.datasets`.  \n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6668e50c-f0c6-43cf-b733-9ac29d6a3900",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Load data sets \n",
+    "train_dataset = datasets.MNIST(root=\"data\", train=True, download=True, transform=None)\n",
+    "\n",
+    "test_dataset= datasets.MNIST(root=\"data\", train=False, download=False, transform=None)\n",
+    "\n",
+    "# print info for train data\n",
+    "print(train_dataset)\n",
+    "\n",
+    "print()\n",
+    "\n",
+    "# print info for test data\n",
+    "print(test_dataset)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "44a489f5-3e53-4a2b-8069-f265b2814dc0",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# See the shape of train data and test data\n",
+    "print(\"x_train : \",train_dataset.data.shape)\n",
+    "print(\"y_train : \",train_dataset.targets.shape)\n",
+    "\n",
+    "print()\n",
+    "\n",
+    "print(\"x_test  : \",test_dataset.data.shape)\n",
+    "print(\"y_test  : \",test_dataset.targets.shape)\n",
+    "\n",
+    "print()\n",
+    "\n",
+    "# print number of targets and  values targets\n",
+    "print(\"Number of Targets :\",len(np.unique(train_dataset.targets)))\n",
+    "print(\"Targets Values    :\",    np.unique(train_dataset.targets))\n",
+    "\n",
+    "print()\n",
+    "\n",
+    "print(\"Remark that we work with torch tensors and not numpy array, not tensorflow tensor\")\n",
+    "print(\" -> x_train.dtype = \",train_dataset.data.dtype)\n",
+    "print(\" -> y_train.dtype = \",train_dataset.targets.dtype)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b418adb7-33ea-450c-9793-3cdce5d5fa8c",
+   "metadata": {},
+   "source": [
+    "## Step 3 -  Preparing your data for training with DataLoaders\n",
+    "The Dataset retrieves our datasetâ€™s features and labels one sample at a time. While training a model, we typically want to pass samples in `minibatches`, reshuffle the data at every epoch to reduce model overfitting, and use Pythonâ€™s multiprocessing to speed up data retrieval. DataLoader is an iterable that abstracts this complexity for us in an easy API."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8af0bc4c-acb3-46d9-aae2-143b0327d970",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Before normalization:\n",
+    "x_train=train_dataset.data\n",
+    "print('Before normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))\n",
+    "\n",
+    "# After normalization:\n",
+    "## T.Compose creates a pipeline where the provided transformations are run in sequence\n",
+    "transforms = T.Compose(\n",
+    "    [\n",
+    "        # This transforms takes a np.array or a PIL image of integers\n",
+    "        # in the range 0-255 and transforms it to a float tensor in the\n",
+    "        # range 0.0 - 1.0\n",
+    "        T.ToTensor(),\n",
+    "\n",
+    "        # This then renormalizes the tensor to be between -1.0 and 1.0,\n",
+    "        # which is a better range for modern activation functions like\n",
+    "        # Relu\n",
+    "        T.Normalize((0.5), (0.5)),\n",
+    "    ]\n",
+    ")\n",
+    "\n",
+    "\n",
+    "train_dataset = datasets.MNIST(root=\"data\", train=True, download=True, transform=transforms)\n",
+    "test_dataset= datasets.MNIST(root=\"data\", train=False, download=True, transform=transforms)\n",
+    "\n",
+    "\n",
+    "# print image and label After normalization. \n",
+    "# iter() followed by next() is used to get some images and label\n",
+    "image,label=next(iter(train_dataset))\n",
+    "print('After normalization  : Min={}, max={}'.format(image.min(),image.max()))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "35d50a57-8274-4660-8765-d0f2bf7214bd",
+   "metadata": {},
+   "source": [
+    "### Have a look"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a172ebc5-8858-4f30-8e2c-1e9c123ae0ee",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "x_train=train_dataset.data\n",
+    "y_train=train_dataset.targets"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5a487760-b43a-4f7c-bfd8-1ce2c9652769",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fidle.scrawler.images(x_train, y_train, [27],  x_size=5,y_size=5, colorbar=True, save_as='01-one-digit')\n",
+    "fidle.scrawler.images(x_train, y_train, range(5,41), columns=12, save_as='02-many-digits')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ca0a63ae-e6d6-4940-b8ff-9b11cb2737bb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# get the number of CPUs in your system \n",
+    "n_workers = multiprocessing.cpu_count()\n",
+    "\n",
+    "# train bacth data\n",
+    "train_loader= DataLoader(\n",
+    "  dataset=train_dataset, \n",
+    "  shuffle=True, \n",
+    "  batch_size=512,\n",
+    "  num_workers=n_workers \n",
+    ")\n",
+    "\n",
+    "# test batch data\n",
+    "test_loader= DataLoader(\n",
+    "  dataset=test_dataset, \n",
+    "  shuffle=False, \n",
+    "  batch_size=512,\n",
+    "  num_workers=n_workers \n",
+    ")\n",
+    "\n",
+    "# print image and label  After normalization and batch_size.\n",
+    "image, label=next(iter(train_loader))\n",
+    "print('Shape of first training data batch after use pytorch dataloader :\\nbatch images = {} \\nbatch labels = {}'.format(image.shape,label.shape))      "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "51bf21ee-76ca-42fa-b67f-066dbd239a72",
+   "metadata": {},
+   "source": [
+    "## Step 4 - Create Model\n",
+    "About informations about : \n",
+    " - [Optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)\n",
+    " - [Activation](https://www.tensorflow.org/api_docs/python/tf/keras/activations)\n",
+    " - [Loss](https://www.tensorflow.org/api_docs/python/tf/keras/losses)\n",
+    " - [Metrics](https://www.tensorflow.org/api_docs/python/tf/keras/metrics)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "16701119-71eb-4f59-a50a-f153b07a74ae",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class MyNet(nn.Module):\n",
+    "    \n",
+    "    def __init__(self,num_class=10):\n",
+    "        super().__init__()\n",
+    "        self.num_class=num_class\n",
+    "        self.model = nn.Sequential(\n",
+    "            # first convolution  \n",
+    "            nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, stride=1, padding=0),\n",
+    "            nn.ReLU(),\n",
+    "            nn.MaxPool2d((2,2)),  \n",
+    "            nn.Dropout2d(0.2),            # Combat overfitting\n",
+    "            \n",
+    "            # second convolution  \n",
+    "            nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=1, padding=0),\n",
+    "            nn.ReLU(),\n",
+    "            nn.MaxPool2d((2,2)),   \n",
+    "            nn.Dropout2d(0.2),            # Combat overfitting\n",
+    "            \n",
+    "            nn.Flatten(),                 # convert feature map into feature vectors\n",
+    "            \n",
+    "            # MLP network   \n",
+    "            nn.Linear(16*5*5,100),\n",
+    "            nn.ReLU(),\n",
+    "            nn.Dropout1d(0.2),            # Combat overfitting\n",
+    "        \n",
+    "            nn.Linear(100, num_class),    # logits outpout\n",
+    "        )\n",
+    "        \n",
+    "    def forward(self, x):\n",
+    "        x=self.model(x)                   # forward pass\n",
+    "        return x"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "37abf99b-f8ec-4048-a65d-f173ee18b234",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "class LitModel(pl.LightningModule):\n",
+    "    \n",
+    "    def __init__(self, MyNet):\n",
+    "        super().__init__()\n",
+    "        self.MyNet = MyNet\n",
+    "                               \n",
+    "    def forward(self, x):                                               # forward pass\n",
+    "        return self.MyNet(x)\n",
+    "    \n",
+    "     # defines the train loop\n",
+    "    def training_step(self, batch, batch_idx):\n",
+    "        x, y = batch\n",
+    "        y_hat= self.MyNet(x)                                            # forward pass \n",
+    "        loss= F.cross_entropy(y_hat, y)                                 # loss function\n",
+    "        acc=accuracy(y_hat, y,task=\"multiclass\",num_classes=10)         # metrics    \n",
+    "        metrics = {\"train_loss\": loss, \"train_acc\": acc}\n",
+    "        \n",
+    "        # logs metrics for each training_step\n",
+    "        self.log_dict(metrics,\n",
+    "                      on_step=False ,\n",
+    "                      on_epoch=True, \n",
+    "                      prog_bar=True, \n",
+    "                      logger=True) \n",
+    "        return loss\n",
+    "\n",
+    "        \n",
+    "    # defines the valid loop.\n",
+    "    def validation_step(self, batch, batch_idx):\n",
+    "        x, y = batch\n",
+    "        y_hat= self.MyNet(x)                                             # forward pass\n",
+    "        loss = F.cross_entropy(y_hat, y)                                 # loss function MSE\n",
+    "        acc=accuracy(y_hat, y, task=\"multiclass\", num_classes=10)        # metrics\n",
+    "        metrics = {\"test_loss\": loss, \"test_acc\": acc}\n",
+    "        \n",
+    "        # logs metrics for each validation_step\n",
+    "        self.log_dict(metrics,\n",
+    "                      on_step=False,\n",
+    "                      on_epoch=True, \n",
+    "                      prog_bar=True, \n",
+    "                      logger=True\n",
+    "                     ) \n",
+    "        \n",
+    "        return metrics\n",
+    "        \n",
+    "    \n",
+    "    # complete prediction step\n",
+    "    def predict_step(self, batch, batch_idx):\n",
+    "        x, y = batch\n",
+    "        y_hat = self.MyNet(x)                                            # forward pass\n",
+    "        return y_hat\n",
+    "    \n",
+    "    \n",
+    "    # optimizer\n",
+    "    def configure_optimizers(self):\n",
+    "        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)\n",
+    "        return optimizer\n",
+    "\n",
+    "    \n",
+    "    \n",
+    "# print summary model\n",
+    "model=LitModel(MyNet())\n",
+    "print(model) "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fb32e85d-bd92-4ca5-a3dc-ddb5ed50ba6b",
+   "metadata": {},
+   "source": [
+    "## Step 5 - Train Model"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ce975c03-d05d-40c4-92ff-0cc90699c13e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# loggers data\n",
+    "logger = TensorBoardLogger(save_dir='MNIST2_logs',name=\"history_logs\")\n",
+    "\n",
+    "# train model\n",
+    "trainer=pl.Trainer(accelerator='auto',max_epochs=16,logger=logger)\n",
+    "trainer.fit(model=model, train_dataloaders=train_loader, val_dataloaders=test_loader)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a1191f05-4454-415c-a5ed-e63d9ae56651",
+   "metadata": {},
+   "source": [
+    "## Step 6 - Evaluate\n",
+    "### 6.1 - Final loss and accuracy\n",
+    "Note : With a DNN, we had a precision of the order of : 97.7%"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9f45316e-0d2d-4fc1-b9a8-5fb8aaf5586a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# evaluate your model\n",
+    "score=trainer.validate(model=model,dataloaders=test_loader, verbose=False)\n",
+    "\n",
+    "print('x_test / acc      : {:5.4f}'.format(score[0]['test_acc']))\n",
+    "print('x_test / loss     : {:5.4f}'.format(score[0]['test_loss']))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e352e48d-b473-4162-a1aa-72d6d4f7aa38",
+   "metadata": {},
+   "source": [
+    "## 6.2 - Plot history"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5b1c6d11-b897-4e2b-8615-c207c8344d07",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# launch Tensorboard \n",
+    "%reload_ext tensorboard\n",
+    "%tensorboard  --logdir MNIST2_logs/history_logs/ "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f00ded6b-a7db-4c5d-b1b2-72264db20bdb",
+   "metadata": {},
+   "source": [
+    "###  6.3 - Plot results"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e387a70d-9c23-4d16-8ef7-879aec7791e2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# logits outpout by batch size\n",
+    "y_logits=trainer.predict(model=model,dataloaders=test_loader)\n",
+    "\n",
+    "# Concat into single tensor\n",
+    "y_logits=torch.cat(y_logits)\n",
+    "\n",
+    "# output probabilities values\n",
+    "y_pred_values=F.softmax(y_logits,dim=1)\n",
+    "\n",
+    "# Returns the indices of the maximum output probabilities values \n",
+    "y_pred=torch.argmax(y_pred_values,dim=-1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fb2b2eeb-fcd8-453c-93ef-59a960a8bbd5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "x_test=test_dataset.data\n",
+    "y_test=test_dataset.targets"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "71187fa9-2ad3-4b23-94b9-1846045bd070",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fidle.scrawler.images(x_test, y_test, range(0,200), columns=12, x_size=1, y_size=1, y_pred=y_pred, save_as='04-predictions')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2fc7b2b9-9115-4848-9aae-2798bf7aa79a",
+   "metadata": {},
+   "source": [
+    "### 6.4 - Plot some errors"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e55f17c4-fce7-423a-9adf-f2511c534ef5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "errors=[ i for i in range(len(x_test)) if y_pred[i]!=y_test[i] ]\n",
+    "errors=errors[:min(24,len(errors))]\n",
+    "fidle.scrawler.images(x_test, y_test, errors[:15], columns=6, x_size=2, y_size=2, y_pred=y_pred, save_as='05-some-errors')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fea1b396-70ca-4b00-851d-0538a4b347fb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fidle.scrawler.confusion_matrix(y_test,y_pred,range(10),normalize=True, save_as='06-confusion-matrix')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e982c032-cce8-4c71-8cdc-2af4b31b2914",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fidle.end()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "233838c2-c97f-4489-8c79-9247d7b7456b",
+   "metadata": {},
+   "source": [
+    "<div class=\"todo\">\n",
+    "    A few things you can do for fun:\n",
+    "    <ul>\n",
+    "        <li>Changing the network architecture (layers, number of neurons, etc.)</li>\n",
+    "        <li>Display a summary of the network</li>\n",
+    "        <li>Retrieve and display the softmax output of the network, to evaluate its \"doubts\".</li>\n",
+    "    </ul>\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "51b87aa0-d4e9-48bb-8205-4b583f4b0b61",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "<img width=\"80px\" src=\"../fidle/img/00-Fidle-logo-01.svg\"></img>"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.10"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
-- 
GitLab