03-Tracking-and-visualizing.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img width=\"800px\" src=\"../fidle/img/00-Fidle-header-01.svg\"></img>\n",
    "\n",
    "# <!-- TITLE --> [GTS3] - CNN with GTSRB dataset - Monitoring \n",
    "<!-- DESC --> Episode 3: Monitoring and analysing training, managing checkpoints\n",
    "<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->\n",
    "\n",
    "## Objectives :\n",
    "  - **Understand** what happens during the **training** process\n",
    "  - Implement **monitoring**, **backup** and **recovery** solutions\n",
    "  \n",
    "The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from about 40 classes.  \n",
    "The final aim is to recognise them !  \n",
    "Description is available there : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset\n",
    "\n",
    "\n",
    "## What we're going to do :\n",
    "\n",
    " - Monitoring and understanding our model training \n",
    " - Add recovery points\n",
    " - Analyze the results \n",
    " - Restore and run recovery points\n",
    "\n",
    "## Step 1 - Import and init"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from tensorflow import keras\n",
    "from tensorflow.keras.callbacks import TensorBoard\n",
    "\n",
    "import numpy as np\n",
    "import h5py\n",
    "\n",
    "from sklearn.metrics import confusion_matrix\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sn\n",
    "import os, sys, time, random\n",
    "\n",
    "from importlib import reload\n",
    "\n",
    "sys.path.append('..')\n",
    "import fidle.pwk as ooo\n",
    "\n",
    "ooo.init()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2 - Load dataset\n",
    "Dataset is one of the saved dataset: RGB25, RGB35, L25, L35, etc.  \n",
    "First of all, we're going to use a smart dataset : **set-24x24-L**  \n",
    "(with a GPU, it only takes 35'' compared to more than 5' with a CPU !)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "def read_dataset(name):\n",
    "    '''Reads h5 dataset from ./data\n",
    "\n",
    "    Arguments:  dataset name, without .h5\n",
    "    Returns:    x_train,y_train,x_test,y_test data'''\n",
    "    # ---- Read dataset\n",
    "    filename='./data/'+name+'.h5'\n",
    "    with  h5py.File(filename) as f:\n",
    "        x_train = f['x_train'][:]\n",
    "        y_train = f['y_train'][:]\n",
    "        x_test  = f['x_test'][:]\n",
    "        y_test  = f['y_test'][:]\n",
    "        x_meta  = f['x_meta'][:]\n",
    "        y_meta  = f['y_meta'][:]\n",
    "\n",
    "    # ---- done\n",
    "    print('Dataset \"{}\" is loaded. ({:.1f} Mo)\\n'.format(name,os.path.getsize(filename)/(1024*1024)))\n",
    "    return x_train,y_train,x_test,y_test,x_meta,y_meta\n",
    "\n",
    "x_train,y_train,x_test,y_test,x_meta,y_meta = read_dataset('set-24x24-L')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 3 - Have a look to the dataset\n",
    "Note: Data must be reshape for matplotlib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"x_train : \", x_train.shape)\n",
    "print(\"y_train : \", y_train.shape)\n",
    "print(\"x_test  : \", x_test.shape)\n",
    "print(\"y_test  : \", y_test.shape)\n",
    "\n",
    "ooo.plot_images(x_train, y_train, range(12), columns=6,  x_size=2, y_size=2)\n",
    "ooo.plot_images(x_train, y_train, range(36), columns=12, x_size=1, y_size=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 4 - Create model\n",
    "We will now build a model and train it...\n",
    "\n",
    "Some models... "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# A basic model\n",
    "#\n",
    "def get_model_v1(lx,ly,lz):\n",
    "    \n",
    "    model = keras.models.Sequential()\n",
    "    \n",
    "    model.add( keras.layers.Conv2D(96, (3,3), activation='relu', input_shape=(lx,ly,lz)))\n",
    "    model.add( keras.layers.MaxPooling2D((2, 2)))\n",
    "    model.add( keras.layers.Dropout(0.2))\n",
    "\n",
    "    model.add( keras.layers.Conv2D(192, (3, 3), activation='relu'))\n",
    "    model.add( keras.layers.MaxPooling2D((2, 2)))\n",
    "    model.add( keras.layers.Dropout(0.2))\n",
    "\n",
    "    model.add( keras.layers.Flatten()) \n",
    "    model.add( keras.layers.Dense(1500, activation='relu'))\n",
    "    model.add( keras.layers.Dropout(0.5))\n",
    "\n",
    "    model.add( keras.layers.Dense(43, activation='softmax'))\n",
    "    return model\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 5 - Prepare callbacks  \n",
    "We will add 2 callbacks :  \n",
    " - **TensorBoard**  \n",
    "Training logs, which can be visualised with Tensorboard.  \n",
    "`#tensorboard --logdir ./run/logs`  \n",
    "IMPORTANT : Relancer tensorboard à chaque run\n",
    " - **Model backup**  \n",
    " It is possible to save the model each xx epoch or at each improvement.  \n",
    " The model can be saved completely or partially (weight).  \n",
    " For full format, we can use HDF5 format."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "%%bash\n",
    "# To clean old logs and saved model, run this cell\n",
    "#\n",
    "/bin/rm -r ./run/logs   2>/dev/null\n",
    "/bin/rm -r ./run/models 2>/dev/null\n",
    "/bin/mkdir -p -m 755 ./run/logs\n",
    "/bin/mkdir -p -m 755 ./run/models\n",
    "echo -e \"Reset directories : ./run/logs and ./run/models .\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ooo.mkdir('./run/models')\n",
    "ooo.mkdir('./run/logs')\n",
    "\n",
    "# ---- Callback tensorboard\n",
    "log_dir = \"./run/logs/tb_\" + ooo.tag_now()\n",
    "tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)\n",
    "\n",
    "# ---- Callback ModelCheckpoint - Save best model\n",
    "save_dir = \"./run/models/best-model.h5\"\n",
    "bestmodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, monitor='accuracy', save_best_only=True)\n",
    "\n",
    "# ---- Callback ModelCheckpoint - Save model each epochs\n",
    "save_dir = \"./run/models/model-{epoch:04d}.h5\"\n",
    "savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, save_freq=2000*5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 6 - Train the model\n",
    "**Get the shape of my data :**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(n,lx,ly,lz) = x_train.shape\n",
    "print(\"Images of the dataset have this folowing shape : \",(lx,ly,lz))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Get and compile a model, with the data shape :**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = get_model_v1(lx,ly,lz)\n",
    "\n",
    "# model.summary()\n",
    "\n",
    "model.compile(optimizer='adam',\n",
    "              loss='sparse_categorical_crossentropy',\n",
    "              metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Train it :**  \n",
    "Note: The training curve is visible in real time with Tensorboard :     \n",
    "`#tensorboard --logdir ./run/logs`  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "batch_size = 64\n",
    "epochs     = 30\n",
    "\n",
    "# ---- Shuffle train data\n",
    "x_train,y_train=ooo.shuffle_np_dataset(x_train,y_train)\n",
    "\n",
    "# ---- Train\n",
    "# Note: To be faster in our example, we can take only 2000 values\n",
    "#\n",
    "history = model.fit(  x_train, y_train,\n",
    "                      batch_size=batch_size,\n",
    "                      epochs=epochs,\n",
    "                      verbose=1,\n",
    "                      validation_data=(x_test, y_test),\n",
    "                      callbacks=[tensorboard_callback, bestmodel_callback, savemodel_callback] )\n",
    "\n",
    "model.save('./run/models/last-model.h5')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Evaluate it :**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "max_val_accuracy = max(history.history[\"val_accuracy\"])\n",
    "print(\"Max validation accuracy is : {:.4f}\".format(max_val_accuracy))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "score = model.evaluate(x_test, y_test, verbose=0)\n",
    "\n",
    "print('Test loss      : {:5.4f}'.format(score[0]))\n",
    "print('Test accuracy  : {:5.4f}'.format(score[1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 7 - History\n",
    "The return of model.fit() returns us the learning history"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ooo.plot_history(history)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 8 - Evaluation and confusion"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_pred   = model.predict_classes(x_test)\n",
    "conf_mat = confusion_matrix(y_test,y_pred, normalize=\"true\", labels=range(43))\n",
    "\n",
    "ooo.plot_confusion_matrix(conf_mat)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 9 - Restore and evaluate\n",
    "### 9.1 - List saved models :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!find ./run/models/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9.2 - Restore a model :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "loaded_model = tf.keras.models.load_model('./run/models/best-model.h5')\n",
    "# loaded_model.summary()\n",
    "print(\"Loaded.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9.3 - Evaluate it :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "score = loaded_model.evaluate(x_test, y_test, verbose=0)\n",
    "\n",
    "print('Test loss      : {:5.4f}'.format(score[0]))\n",
    "print('Test accuracy  : {:5.4f}'.format(score[1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9.4 - Make a prediction :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Get a random image\n",
    "#\n",
    "i   = random.randint(1,len(x_test))\n",
    "x,y = x_test[i], y_test[i]\n",
    "\n",
    "# ---- Do prediction\n",
    "#\n",
    "predictions = loaded_model.predict( np.array([x]) )\n",
    "\n",
    "# ---- A prediction is just the output layer\n",
    "#\n",
    "print(\"\\nOutput layer from model is (x100) :\\n\")\n",
    "with np.printoptions(precision=2, suppress=True, linewidth=95):\n",
    "    print(predictions*100)\n",
    "\n",
    "# ---- Graphic visualisation\n",
    "#\n",
    "print(\"\\nGraphically :\\n\")\n",
    "plt.figure(figsize=(12,2))\n",
    "plt.bar(range(43), predictions[0], align='center', alpha=0.5)\n",
    "plt.ylabel('Probability')\n",
    "plt.ylim((0,1))\n",
    "plt.xlabel('Class')\n",
    "plt.title('Trafic Sign prediction')\n",
    "plt.show()\n",
    "\n",
    "# ---- Predict class\n",
    "#\n",
    "p = np.argmax(predictions)\n",
    "\n",
    "# ---- Show result\n",
    "#\n",
    "print(\"\\nPrediction on the left, real stuff on the right :\\n\")\n",
    "ooo.plot_images([x,x_meta[y]], [p,y], range(2),  columns=3,  x_size=3, y_size=2)\n",
    "\n",
    "if p==y:\n",
    "    print(\"YEEES ! that's right!\")\n",
    "else:\n",
    "    print(\"oups, that's wrong ;-(\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "<img width=\"80px\" src=\"../fidle/img/00-Fidle-logo-01.svg\"></img>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}