"README.md" did not exist on "bdc7b815d033f84e5538a1c8db87d3c061b1ca4c"
Newer
Older
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img width=\"800px\" src=\"../fidle/img/00-Fidle-header-01.svg\"></img>\n",
"# <!-- TITLE --> [GTSRB3] - Training monitoring\n",
"<!-- DESC --> Episode 3 : Monitoring, analysis and check points during a training session\n",
"<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->\n",
"\n",
"## Objectives :\n",
" - **Understand** what happens during the **training** process\n",
" - Implement **monitoring**, **backup** and **recovery** solutions\n",
" \n",
"The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from about 40 classes. \n",
"The final aim is to recognise them ! \n",
"Description is available there : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset\n",
" - Add recovery points\n",
"## Step 1 - Import and init\n",
"### 1.1 - Python stuffs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"from tensorflow import keras\n",
"from tensorflow.keras.callbacks import TensorBoard\n",
"\n",
"import numpy as np\n",
"\n",
"from sklearn.metrics import confusion_matrix\n",
"from skimage import io, transform, color\n",
"import matplotlib.pyplot as plt\n",
"run_dir = './run/GTSRB3.001'\n",
"datasets_dir = pwk.init('GTSRB3', run_dir)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1.2 - Parameters\n",
"`scale` is the proportion of the dataset that will be used during the training. (1 mean 100%) \n",
"A 24x24 dataset, with 5 epochs and a scale of 1, need 3'30 on a CPU laptop.\\\n",
"`fit_verbosity` is the verbosity during training : 0 = silent, 1 = progress bar, 2 = one line per epoch"
"enhanced_dir = './data'\n",
"# enhanced_dir = f'{datasets_dir}/GTSRB/enhanced'\n",
"\n",
"dataset_name = 'set-24x24-L'\n",
"batch_size = 64\n",
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Override parameters (batch mode) - Just forget this cell"
]
},
{
"cell_type": "code",
"pwk.override('enhanced_dir', 'dataset_name', 'batch_size', 'epochs', 'scale', 'fit_verbosity')"
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dataset is one of the saved dataset: RGB25, RGB35, L25, L35, etc. \n",
"First of all, we're going to use a smart dataset : **set-24x24-L** \n",
"(with a GPU, it only takes 35'' compared to more than 5' with a CPU !)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def read_dataset(enhanced_dir, dataset_name):\n",
" '''Reads h5 dataset\n",
" filename : datasets filename\n",
" dataset_name : dataset name, without .h5\n",
" Returns: x_train,y_train, x_test,y_test data, x_meta,y_meta'''\n",
" # ---- Read dataset\n",
" filename = f'{enhanced_dir}/{dataset_name}.h5'\n",
" with h5py.File(filename,'r') as f:\n",
" x_train = f['x_train'][:]\n",
" y_train = f['y_train'][:]\n",
" x_test = f['x_test'][:]\n",
" y_test = f['y_test'][:]\n",
" x_meta = f['x_meta'][:]\n",
" y_meta = f['y_meta'][:]\n",
" # ---- Shuffle\n",
" x_train,y_train=pwk.shuffle_np_dataset(x_train,y_train)\n",
"\n",
" # ---- done\n",
" duration = pwk.chrono_stop(hdelay=True)\n",
" size = pwk.hsize(os.path.getsize(filename))\n",
" print(f'Dataset \"{dataset_name}\" is loaded and shuffled. ({size} in {duration})')\n",
" return x_train,y_train, x_test,y_test, x_meta,y_meta\n",
"x_train,y_train,x_test,y_test, x_meta,y_meta = read_dataset(enhanced_dir, dataset_name)\n",
"x_train,y_train, x_test,y_test = pwk.rescale_dataset(x_train,y_train,x_test,y_test, scale=scale)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note: Data must be reshape for matplotlib"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"x_train : \", x_train.shape)\n",
"print(\"y_train : \", y_train.shape)\n",
"print(\"x_test : \", x_test.shape)\n",
"print(\"y_test : \", y_test.shape)\n",
"\n",
"pwk.plot_images(x_train, y_train, range(12), columns=6, x_size=2, y_size=2, save_as='01-dataset-medium')\n",
"pwk.plot_images(x_train, y_train, range(36), columns=12, x_size=1, y_size=1, save_as='02-dataset-small')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will now build a model and train it...\n",
"\n",
"Some models... "
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"# A basic model\n",
"#\n",
"def get_model_v1(lx,ly,lz):\n",
" \n",
" model = keras.models.Sequential()\n",
" model.add( keras.layers.Conv2D(96, (3,3), activation='relu', input_shape=(lx,ly,lz)))\n",
" model.add( keras.layers.MaxPooling2D((2, 2)))\n",
" model.add( keras.layers.Dropout(0.2))\n",
"\n",
" model.add( keras.layers.Conv2D(192, (3, 3), activation='relu'))\n",
" model.add( keras.layers.MaxPooling2D((2, 2)))\n",
" model.add( keras.layers.Dropout(0.2))\n",
"\n",
" model.add( keras.layers.Flatten()) \n",
" model.add( keras.layers.Dense(1500, activation='relu'))\n",
" model.add( keras.layers.Dropout(0.5))\n",
"\n",
" model.add( keras.layers.Dense(43, activation='softmax'))\n",
" return model\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will add 2 callbacks : \n",
"\n",
"**TensorBoard** \n",
"Training logs, which can be visualised using [Tensorboard tool](https://www.tensorflow.org/tensorboard). \n",
"\n",
"**Model backup** \n",
" It is possible to save the model each xx epoch or at each improvement. \n",
" The model can be saved completely or partially (weight). \n",
" For full format, we can use HDF5 format."
"execution_count": null,
"metadata": {},
"outputs": [],
"pwk.mkdir(run_dir + '/models')\n",
"pwk.mkdir(run_dir + '/logs')\n",
"tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)\n",
"\n",
"bestmodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, monitor='accuracy', save_best_only=True)\n",
"\n",
"# ---- Callback ModelCheckpoint - Save model each epochs\n",
"save_dir = run_dir + \"/models/model-{epoch:04d}.h5\"\n",
"savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0)\n",
"\n",
"path=os.path.abspath(f'{run_dir}/logs')\n",
"print(f'To run tensorboard :\\ntensorboard --logdir {path}')"
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Get the shape of my data :**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"(n,lx,ly,lz) = x_train.shape\n",
"print(\"Images of the dataset have this folowing shape : \",(lx,ly,lz))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Get and compile a model, with the data shape :**"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"model = get_model_v1(lx,ly,lz)\n",
"\n",
"# model.summary()\n",
"\n",
"model.compile(optimizer='adam',\n",
" loss='sparse_categorical_crossentropy',\n",
" metrics=['accuracy'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Train it :** \n",
"Note: The training curve is visible in real time with Tensorboard (see step 5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"x_train,y_train=pwk.shuffle_np_dataset(x_train,y_train)\n",
"# Note: To be faster in our example, we can take only 2000 values\n",
" batch_size=batch_size,\n",
" epochs=epochs,\n",
" validation_data=(x_test, y_test),\n",
" callbacks=[tensorboard_callback, bestmodel_callback, savemodel_callback] )\n",
"\n",
"model.save(f'{run_dir}/models/last-model.h5')\n",
"\n",
"pwk.chrono_show()"
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Evaluate it :**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"max_val_accuracy = max(history.history[\"val_accuracy\"])\n",
"print(\"Max validation accuracy is : {:.4f}\".format(max_val_accuracy))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"score = model.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print('Test loss : {:5.4f}'.format(score[0]))\n",
"print('Test accuracy : {:5.4f}'.format(score[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The return of model.fit() returns us the learning history"
]
},
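{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick, purely illustrative look at what this history object contains:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# ---- history.history is a plain dict : one list per metric, one value per epoch\n",
"print(history.history.keys())\n",
"print('val_accuracy per epoch :', history.history['val_accuracy'])"
]
},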
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"pwk.plot_history(history, save_as='03-history')"
{
"cell_type": "markdown",
"metadata": {},
"source": [
"execution_count": null,
"metadata": {},
"outputs": [],
"y_sigmoid = model.predict(x_test)\n",
"y_pred = np.argmax(y_sigmoid, axis=-1)\n",
"\n",
"pwk.plot_confusion_matrix(y_test,y_pred,range(43), figsize=(16, 16),normalize=False, save_as='04-confusion-matrix')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 9 - Restore and evaluate\n",
"### 9.1 - List saved models :"
"execution_count": null,
"metadata": {},
"outputs": [],
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 9.2 - Restore a model :"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"loaded_model = tf.keras.models.load_model(f'{run_dir}/models/best-model.h5')\n",
"print(\"Loaded.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"execution_count": null,
"metadata": {},
"outputs": [],
"score = loaded_model.evaluate(x_test, y_test, verbose=0)\n",
"\n",
"print('Test loss : {:5.4f}'.format(score[0]))\n",
"print('Test accuracy : {:5.4f}'.format(score[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"execution_count": null,
"metadata": {},
"outputs": [],
"x,y = x_test[i], y_test[i]\n",
"\n",
"# ---- Do prediction\n",
"#\n",
"predictions = loaded_model.predict( np.array([x]) )\n",
"\n",
"# ---- A prediction is just the output layer\n",
"#\n",
"print(\"\\nOutput layer from model is (x100) :\\n\")\n",
"with np.printoptions(precision=2, suppress=True, linewidth=95):\n",
"\n",
"# ---- Graphic visualisation\n",
"#\n",
"print(\"\\nGraphically :\\n\")\n",
"plt.figure(figsize=(12,2))\n",
"plt.bar(range(43), predictions[0], align='center', alpha=0.5)\n",
"plt.ylabel('Probability')\n",
"plt.ylim((0,1))\n",
"plt.xlabel('Class')\n",
"pwk.save_fig('05-prediction-proba')\n",
"plt.show()\n",
"\n",
"# ---- Predict class\n",
"#\n",
"p = np.argmax(predictions)\n",
"\n",
"# ---- Show result\n",
"#\n",
"print(\"\\nThe image : Prediction : Real stuff:\")\n",
"pwk.plot_images([x,x_meta[p], x_meta[y]], [p,p,y], range(3), columns=3, x_size=3, y_size=2, save_as='06-prediction-images')\n",
"\n",
"if p==y:\n",
" print(\"YEEES ! that's right!\")\n",
"else:\n",
" print(\"oups, that's wrong ;-(\")"
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pwk.end()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 10 - To go further ;-)\n",
"What you can do:\n",
"- Limit model saving: 1 save every 5 epochs\n",
"- Use a subset of the dataset\n",
"- Try different datasets\n",
"- Some exotic signs are waiting to be recognized in dataset_dir/extra !\n",
"- Test different hyperparameters (epochs, batch size, optimization, etc.\n",
" "
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"<img width=\"80px\" src=\"../fidle/img/00-Fidle-logo-01.svg\"></img>"
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
}
},
"nbformat": 4,
"nbformat_minor": 4
}