{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "<img width=\"800px\" src=\"../fidle/img/00-Fidle-header-01.svg\"></img>\n", "\n", "# <!-- TITLE --> [GTSRB4] - Data augmentation \n", "<!-- DESC --> Episode 4 : Adding data by data augmentation when we lack it, to improve our results\n", "<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->\n", "\n", "## Objectives :\n", " - Trying to improve training by **enhancing the data**\n", " - Using Keras' **data augmentation utilities**, finding their limits...\n", " \n", "The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from 43 classes.  \n", "The final aim is to recognise them !  \n", "\n", "The description is available here : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset\n", "\n", "\n", "## What we're going to do :\n", " - Increase and improve the training dataset\n", " - Identify the limits of these tools\n", "\n", "## Step 1 - Import and init\n", "### 1.1 - Python stuff" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "from tensorflow import keras\n", "from tensorflow.keras.callbacks import TensorBoard\n", "\n", "import numpy as np\n", "import h5py\n", "\n", "from sklearn.metrics import confusion_matrix\n", "\n", "import matplotlib.pyplot as plt\n", "import os, sys, time, random\n", "\n", "from importlib import reload\n", "\n", "sys.path.append('..')\n", "import fidle.pwk as pwk\n", "\n", "run_dir = './run/GTSRB4.001'\n", "datasets_dir = pwk.init('GTSRB4', run_dir)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2 - Parameters\n", "`scale` is the proportion of the dataset that will be used during the training. 
(1 means 100%)\\\n", "`fit_verbosity` is the verbosity during training : 0 = silent, 1 = progress bar, 2 = one line per epoch" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "enhanced_dir = './data'\n", "# enhanced_dir = f'{datasets_dir}/GTSRB/enhanced'\n", "\n", "dataset_name = 'set-24x24-L'\n", "batch_size = 64\n", "epochs = 20\n", "scale = 1\n", "fit_verbosity = 1\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Override parameters (batch mode) - Just forget this cell" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pwk.override('enhanced_dir', 'dataset_name', 'batch_size', 'epochs', 'scale', 'fit_verbosity')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2 - Load dataset\n", "The dataset is one of the saved datasets: RGB25, RGB35, L25, L35, etc.  \n", "First of all, we're going to use a smart dataset : **set-24x24-L**  \n", "(with a GPU, it only takes 35'' compared to more than 5' with a CPU !)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def read_dataset(enhanced_dir, dataset_name):\n", " '''Reads h5 dataset\n", " Args:\n", "     enhanced_dir : directory containing the h5 datasets\n", "     dataset_name : dataset name, without .h5\n", " Returns:\n", "     x_train,y_train, x_test,y_test, x_meta,y_meta'''\n", " # ---- Read dataset\n", " pwk.chrono_start()\n", " filename = f'{enhanced_dir}/{dataset_name}.h5'\n", " with h5py.File(filename,'r') as f:\n", "     x_train = f['x_train'][:]\n", "     y_train = f['y_train'][:]\n", "     x_test = f['x_test'][:]\n", "     y_test = f['y_test'][:]\n", "     x_meta = f['x_meta'][:]\n", "     y_meta = f['y_meta'][:]\n", " print(x_train.shape, y_train.shape)\n", " # ---- Shuffle\n", " x_train,y_train=pwk.shuffle_np_dataset(x_train,y_train)\n", "\n", " # ---- done\n", " duration = pwk.chrono_stop(hdelay=True)\n", " size = pwk.hsize(os.path.getsize(filename))\n", " print(f'Dataset \"{dataset_name}\" is loaded 
and shuffled. ({size} in {duration})')\n", " return x_train,y_train, x_test,y_test, x_meta,y_meta\n", "\n", "# ---- Read dataset\n", "#\n", "x_train,y_train,x_test,y_test, x_meta,y_meta = read_dataset(enhanced_dir, dataset_name)\n", "\n", "# ---- Rescale \n", "#\n", "x_train,y_train, x_test,y_test = pwk.rescale_dataset(x_train,y_train,x_test,y_test, scale=scale)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3 - Models\n", "We will now build a model and train it...\n", "\n", "This is my model ;-) " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A basic model\n", "#\n", "def get_model_v1(lx,ly,lz):\n", "\n", "    model = keras.models.Sequential()\n", "\n", "    model.add( keras.layers.Conv2D(96, (3,3), activation='relu', input_shape=(lx,ly,lz)))\n", "    model.add( keras.layers.MaxPooling2D((2, 2)))\n", "    model.add( keras.layers.Dropout(0.2))\n", "\n", "    model.add( keras.layers.Conv2D(192, (3, 3), activation='relu'))\n", "    model.add( keras.layers.MaxPooling2D((2, 2)))\n", "    model.add( keras.layers.Dropout(0.2))\n", "\n", "    model.add( keras.layers.Flatten()) \n", "    model.add( keras.layers.Dense(1500, activation='relu'))\n", "    model.add( keras.layers.Dropout(0.5))\n", "\n", "    model.add( keras.layers.Dense(43, activation='softmax'))\n", "    return model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4 - Callbacks \n", "We prepare two kinds of callbacks : TensorBoard and model backup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pwk.mkdir(run_dir + '/models')\n", "pwk.mkdir(run_dir + '/logs')\n", "\n", "# ---- Callback tensorboard\n", "log_dir = run_dir + \"/logs/tb_\" + pwk.tag_now()\n", "tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)\n", "\n", "# ---- Callback ModelCheckpoint - Save best model\n", "save_dir = run_dir + \"/models/best-model.h5\"\n", "bestmodel_callback = 
tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, monitor='accuracy', save_best_only=True)\n", "\n", "# ---- Callback ModelCheckpoint - Save model at each epoch\n", "save_dir = run_dir + \"/models/model-{epoch:04d}.h5\"\n", "savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0)\n", "\n", "path=os.path.abspath(f'{run_dir}/logs')\n", "print(f'To run tensorboard :\\ntensorboard --logdir {path}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5 - Data generator" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "datagen = keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,\n", " featurewise_std_normalization=False,\n", " width_shift_range=0.1,\n", " height_shift_range=0.1,\n", " zoom_range=0.2,\n", " shear_range=0.1,\n", " rotation_range=10.)\n", "datagen.fit(x_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 6 - Train the model\n", "**Get my data shape :**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(n,lx,ly,lz) = x_train.shape\n", "print(\"Images of the dataset have the following shape : \",(lx,ly,lz))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Get and compile a model, with the data shape :**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = get_model_v1(lx,ly,lz)\n", "\n", "# model.summary()\n", "\n", "model.compile(optimizer='adam',\n", " loss='sparse_categorical_crossentropy',\n", " metrics=['accuracy'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Train it :** " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pwk.chrono_start()\n", "\n", "history = model.fit( datagen.flow(x_train, y_train, batch_size=batch_size),\n", " steps_per_epoch = int(x_train.shape[0]/batch_size),\n", " epochs=epochs,\n", " 
verbose=fit_verbosity,\n", " validation_data=(x_test, y_test),\n", " callbacks=[tensorboard_callback, bestmodel_callback, savemodel_callback] )\n", "\n", "model.save(f'{run_dir}/models/last-model.h5')\n", "\n", "pwk.chrono_show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Evaluate it :**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "max_val_accuracy = max(history.history[\"val_accuracy\"])\n", "print(\"Max validation accuracy is : {:.4f}\".format(max_val_accuracy))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "score = model.evaluate(x_test, y_test, verbose=0)\n", "\n", "print('Test loss : {:5.4f}'.format(score[0]))\n", "print('Test accuracy : {:5.4f}'.format(score[1]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 7 - History\n", "The call to model.fit() returns the learning history" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pwk.plot_history(history, save_as='01-history')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 8 - Evaluate best model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 8.1 - Restore best model :" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "loaded_model = tf.keras.models.load_model(f'{run_dir}/models/best-model.h5')\n", "# loaded_model.summary()\n", "print(\"Loaded.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 8.2 - Evaluate it :" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "score = loaded_model.evaluate(x_test, y_test, verbose=0)\n", "\n", "print('Test loss : {:5.4f}'.format(score[0]))\n", "print('Test accuracy : {:5.4f}'.format(score[1]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Plot confusion matrix**" ] }, { "cell_type": "code", "execution_count": 
null, "metadata": {}, "outputs": [], "source": [ "# Predict with the restored best model; the last layer is a softmax\n", "y_softmax = loaded_model.predict(x_test)\n", "y_pred = np.argmax(y_softmax, axis=-1)\n", "\n", "cmap = plt.get_cmap('Oranges')\n", "pwk.plot_confusion_matrix(y_test,y_pred,range(43), figsize=(16, 16),normalize=False, cmap=cmap, save_as='02-confusion-matrix')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pwk.end()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"todo\">\n", " What you can do:\n", " <ul>\n", " <li>Try different datasets / models</li>\n", " <li>Test different hyperparameters (epochs, batch size, optimization, etc.)</li>\n", " <li>What's the best strategy? How to compare?</li>\n", " </ul>\n", " \n", "</div>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "<img width=\"80px\" src=\"../fidle/img/00-Fidle-logo-01.svg\"></img>" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }