{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "<img width=\"800px\" src=\"../fidle/img/header.svg\"></img>\n", "\n", "# <!-- TITLE --> [K3GTSRB3] - Training monitoring\n", "<!-- DESC --> Episode 3 : Monitoring, analysis and check points during a training session, using Keras3\n", "<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->\n", "\n", "## Objectives :\n", " - **Understand** what happens during the **training** process\n", " - Implement **monitoring**, **backup** and **recovery** solutions\n", " \n", "The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from about 40 classes. \n", "The final aim is to recognise them ! \n", "A description is available here : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset\n", "\n", "## What we're going to do :\n", "\n", " - Monitor and understand our model training \n", " - Add recovery points\n", " - Analyze the results \n", " - Restore and run recovery points \n", "\n", "## Step 1 - Import and init\n", "### 1.1 - Python stuff" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "os.environ['KERAS_BACKEND'] = 'torch'\n", "\n", "import keras\n", "\n", "import numpy as np\n", "import random\n", "\n", "import fidle\n", "\n", "import modules.my_loader as my_loader\n", "import modules.my_models as my_models\n", "import modules.my_tools as my_tools\n", "from modules.my_TensorboardCallback import TensorboardCallback\n", "\n", "\n", "# Init Fidle environment\n", "run_id, run_dir, datasets_dir = fidle.init('K3GTSRB3')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2 - Parameters\n", "`scale` is the proportion of the dataset that will be used during the training. (1 means 100%) \n", "With 20% of the 24x24 L dataset and 10 epochs, training takes about 1'30 on a CPU laptop. (Accuracy=91.4)\\\n", "With 20% of the 48x48 RGB dataset and 10 epochs, training takes about 6'30 on a CPU laptop. 
(Accuracy=91.5)\\\n", "`fit_verbosity` is the verbosity during training : 0 = silent, 1 = progress bar, 2 = one line per epoch" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "enhanced_dir = './data'\n", "# enhanced_dir = f'{datasets_dir}/GTSRB/enhanced'\n", "\n", "model_name = 'model_01'\n", "dataset_name = 'set-24x24-L'\n", "batch_size = 64\n", "epochs = 10\n", "scale = 1\n", "fit_verbosity = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Override parameters (batch mode) - Just forget this cell" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fidle.override('enhanced_dir', 'model_name', 'dataset_name', 'batch_size', 'epochs', 'scale', 'fit_verbosity')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2 - Load dataset\n", "The dataset is one of the saved datasets..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_train,y_train,x_test,y_test, x_meta,y_meta = my_loader.read_dataset(enhanced_dir, dataset_name, scale)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3 - Have a look at the dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"x_train : \", x_train.shape)\n", "print(\"y_train : \", y_train.shape)\n", "print(\"x_test : \", x_test.shape)\n", "print(\"y_test : \", y_test.shape)\n", "\n", "fidle.scrawler.images(x_train, y_train, range(24), columns=8, x_size=1, y_size=1, save_as='02-dataset-small')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4 - Get a model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(n,lx,ly,lz) = x_train.shape\n", "\n", "model = my_models.get_model( model_name, lx,ly,lz )\n", "model.summary()\n", "\n", "model.compile(optimizer='adam',\n", " loss='sparse_categorical_crossentropy',\n", " 
metrics=['accuracy'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5 - Prepare callbacks \n", "We will add 2 callbacks : \n", "\n", "**TensorBoard** \n", "Training logs, which can be visualised using the [TensorBoard tool](https://www.tensorflow.org/tensorboard). \n", "\n", "**Model backup** \n", " The model can be saved every xx epochs or at each improvement. \n", " The model can be saved completely or partially (weights only). \n", " See [Keras documentation](https://keras.io/api/callbacks/)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fidle.utils.mkdir(run_dir + '/models')\n", "fidle.utils.mkdir(run_dir + '/logs')\n", "\n", "# ---- Callback for tensorboard (This one is homemade !)\n", "#\n", "tenseorboard_callback = TensorboardCallback(\n", " log_dir=run_dir + \"/logs/tb_\" + fidle.Chrono.tag_now())\n", "\n", "# ---- Callback to save best model\n", "#\n", "bestmodel_callback = keras.callbacks.ModelCheckpoint( \n", " filepath= run_dir + \"/models/best-model.keras\",\n", " monitor='val_accuracy', \n", " mode='max', \n", " save_best_only=True)\n", "\n", "# ---- Callback to save the model at each epoch\n", "#\n", "savemodel_callback = keras.callbacks.ModelCheckpoint(\n", " filepath= run_dir + \"/models/{epoch:02d}.keras\",\n", " save_freq=\"epoch\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 6 - Train the model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# %load_ext tensorboard\n", "# %tensorboard --logdir ./run/K3GTSRB3/logs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Train it :** \n", "Note: The training curve is visible in real time with Tensorboard (see step 5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "chrono=fidle.Chrono()\n", "chrono.start()\n", "\n", "# ---- Shuffle train data\n", 
"x_train,y_train=fidle.utils.shuffle_np_dataset(x_train,y_train)\n", "\n", "# ---- Train\n", "# Note: To be faster in our example, we can take only 2000 values\n", "#\n", "history = model.fit( x_train, y_train,\n", " batch_size=batch_size,\n", " epochs=epochs,\n", " verbose=fit_verbosity,\n", " validation_data=(x_test, y_test),\n", " callbacks=[tenseorboard_callback, bestmodel_callback, savemodel_callback] )\n", "\n", "model.save(f'{run_dir}/models/last-model.keras')\n", "\n", "chrono.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Evaluate it :**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "max_val_accuracy = max(history.history[\"val_accuracy\"])\n", "print(\"Max validation accuracy is : {:.4f}\".format(max_val_accuracy))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "score = model.evaluate(x_test, y_test, verbose=0)\n", "\n", "print('Test loss : {:5.4f}'.format(score[0]))\n", "print('Test accuracy : {:5.4f}'.format(score[1]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 7 - History\n", "The return of model.fit() returns us the learning history" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fidle.scrawler.history(history, save_as='03-history')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 8 - Evaluation and confusion" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y_sigmoid = model.predict(x_test, verbose=fit_verbosity)\n", "y_pred = np.argmax(y_sigmoid, axis=-1)\n", "\n", "fidle.scrawler.confusion_matrix(y_test,y_pred,range(43), figsize=(12, 12),normalize=False, save_as='04-confusion-matrix')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 9 - Restore and evaluate\n", "#### List saved models :" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, 
"outputs": [], "source": [ "# !ls -1rt \"$run_dir\"/models/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Restore a model :" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "loaded_model = keras.models.load_model(f'{run_dir}/models/best-model.keras')\n", "# loaded_model.summary()\n", "print(\"Loaded.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Evaluate it :" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "score = loaded_model.evaluate(x_test, y_test, verbose=0)\n", "\n", "print('Test loss : {:5.4f}'.format(score[0]))\n", "print('Test accuracy : {:5.4f}'.format(score[1]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Make a prediction :" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ---- Pick a random image\n", "#\n", "i = random.randint(1,len(x_test))\n", "x,y = x_test[i], y_test[i]\n", "\n", "# ---- Do prediction\n", "#\n", "prediction = loaded_model.predict( np.array([x]), verbose=fit_verbosity )\n", "\n", "# ---- Show result\n", "\n", "my_tools.show_prediction( prediction, x, y, x_meta )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fidle.end()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 10 - To go further ;-)\n", "What you can do:\n", "- Try differents models\n", "- Use a subset of the dataset\n", "- Try different datasets\n", "- Try to recognize exotic signs !\n", "- Test different hyperparameters (epochs, batch size, optimization, etc.\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "<img width=\"80px\" src=\"../fidle/img/logo-paysage.svg\"></img>" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.2 ('fidle-env')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 
3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" }, "vscode": { "interpreter": { "hash": "b3929042cc22c1274d74e3e946c52b845b57cb6d84f2d591ffe0519b38e4896d" } } }, "nbformat": 4, "nbformat_minor": 4 }