@@ -2,5 +2,6 @@
*/.ipynb_checkpoints/*
__pycache__
*/__pycache__/*
/run/**
run/
*/data/*
!/GTSRB/data/dataset.tar.gz
%% Cell type:markdown id: tags:
German Traffic Sign Recognition Benchmark (GTSRB)
=================================================
---
Introduction au Deep Learning (IDLE) - S. Arias, E. Maldonado, JL. Parouty - CNRS/SARI/DEVLOG - 2020
Version: 1.12
## Episode 1 : Preparation of data
- Understanding the dataset
- Preparing and formatting enhanced data
- Save enhanced datasets in h5 file format
%% Cell type:markdown id: tags:
## 1/ Import and init
%% Cell type:code id: tags:
``` python
import os, time, sys
import csv
import math, random
import numpy as np
import matplotlib.pyplot as plt
import h5py
from skimage.morphology import disk
from skimage.filters import rank
from skimage import io, color, exposure, transform
import idle.pwk as ooo
from importlib import reload
ooo.init()
```
%% Cell type:markdown id: tags:
## 2/ Read the dataset
The description is available here: http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset
- Each directory contains one CSV file with annotations ("GT-<ClassID>.csv") and the training images
- First line is fieldnames: Filename;Width;Height;Roi.X1;Roi.Y1;Roi.X2;Roi.Y2;ClassId
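Note that the per-class annotation files use `;` as separator, while the global index files (`Train.csv`, `Test.csv`) parsed below are comma-separated and provide `Path` and `ClassId` columns. A minimal sanity check of the header actually present (the path is the one used in section 2.2; adjust it to your own layout):
``` python
# Quick sanity check: print the header line of the index file we will parse below.
# Path as used in section 2.2 ; adjust it to your own data layout.
with open('./data/origine/Train.csv') as f:
    print(f.readline().strip())
```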
### 2.1/ Useful functions
%% Cell type:code id: tags:
``` python
def read_dataset_dir(csv_filename):
    '''Reads traffic sign data from German Traffic Sign Recognition Benchmark dataset.
    Arguments: csv filename
               Example /data/GTSRB/Train.csv
    Returns:   np array of images, np array of corresponding labels'''
    # ---- csv filename and path
    #
    name = os.path.basename(csv_filename)
    path = os.path.dirname(csv_filename)
    # ---- Read csv file
    #
    f,x,y = [],[],[]
    with open(csv_filename) as csv_file:
        reader = csv.DictReader(csv_file, delimiter=',')
        for row in reader:
            f.append( path+'/'+row['Path'] )
            y.append( int(row['ClassId']) )
    nb_images = len(f)
    # ---- Read images
    #
    for filename in f:
        image = io.imread(filename)
        x.append(image)
        ooo.update_progress(name,len(x),nb_images)
    # ---- Return
    #
    return np.array(x),np.array(y)
```
%% Cell type:markdown id: tags:
### 2.2/ Read the data
We will read the following datasets:
- **x_train, y_train** : Learning data
- **x_test, y_test** : Validation or test data
- x_meta, y_meta : Illustration data
The learning data will be randomly shuffled and the illustration data sorted.
This will take about 2-3 minutes.
%% Cell type:code id: tags:
``` python
%%time
# ---- Read datasets
(x_train,y_train) = read_dataset_dir('./data/origine/Train.csv')
(x_test ,y_test) = read_dataset_dir('./data/origine/Test.csv')
(x_meta ,y_meta) = read_dataset_dir('./data/origine/Meta.csv')
# ---- Shuffle train set
combined = list(zip(x_train,y_train))
random.shuffle(combined)
x_train,y_train = zip(*combined)
# ---- Sort Meta
combined = list(zip(x_meta,y_meta))
combined.sort(key=lambda x: x[1])
x_meta,y_meta = zip(*combined)
```
%% Cell type:markdown id: tags:
## 3/ Few statistics about train dataset
We want to know if our images are homogeneous in terms of size, ratio, width or height.
### 3.1/ Do statistics
%% Cell type:code id: tags:
``` python
train_size = []
train_ratio = []
train_lx = []
train_ly = []
test_size = []
test_ratio = []
test_lx = []
test_ly = []
for image in x_train:
    (lx,ly,lz) = image.shape
    train_size.append(lx*ly/1024)
    train_ratio.append(lx/ly)
    train_lx.append(lx)
    train_ly.append(ly)

for image in x_test:
    (lx,ly,lz) = image.shape
    test_size.append(lx*ly/1024)
    test_ratio.append(lx/ly)
    test_lx.append(lx)
    test_ly.append(ly)
```
%% Cell type:markdown id: tags:
### 3.2/ Show statistics
%% Cell type:code id: tags:
``` python
# ------ Global stuff
print("x_train size : ",len(x_train))
print("y_train size : ",len(y_train))
print("x_test size : ",len(x_test))
print("y_test size : ",len(y_test))
# ------ Statistics / sizes
plt.figure(figsize=(16,6))
plt.hist([train_size,test_size], bins=100)
plt.gca().set(title='Sizes in Kpixels - Train=[{:5.2f}, {:5.2f}]'.format(min(train_size),max(train_size)),
ylabel='Population',
xlim=[0,30])
plt.legend(['Train','Test'])
plt.show()
# ------ Statistics / ratio lx/ly
plt.figure(figsize=(16,6))
plt.hist([train_ratio,test_ratio], bins=100)
plt.gca().set(title='Ratio lx/ly - Train=[{:5.2f}, {:5.2f}]'.format(min(train_ratio),max(train_ratio)),
ylabel='Population',
xlim=[0.8,1.2])
plt.legend(['Train','Test'])
plt.show()
# ------ Statistics / lx
plt.figure(figsize=(16,6))
plt.hist([train_lx,test_lx], bins=100)
plt.gca().set(title='Images lx - Train=[{:5.2f}, {:5.2f}]'.format(min(train_lx),max(train_lx)),
ylabel='Population',
xlim=[20,150])
plt.legend(['Train','Test'])
plt.show()
# ------ Statistics / ly
plt.figure(figsize=(16,6))
plt.hist([train_ly,test_ly], bins=100)
plt.gca().set(title='Images ly - Train=[{:5.2f}, {:5.2f}]'.format(min(train_ly),max(train_ly)),
ylabel='Population',
xlim=[20,150])
plt.legend(['Train','Test'])
plt.show()
# ------ Statistics / classId
plt.figure(figsize=(16,6))
plt.hist([y_train,y_test], bins=43)
plt.gca().set(title='ClassesId',
ylabel='Population',
xlim=[0,43])
plt.legend(['Train','Test'])
plt.show()
```
%% Cell type:markdown id: tags:
## 4/ List of classes
What are the 43 classes of our images...
%% Cell type:code id: tags:
``` python
ooo.plot_images(x_meta,y_meta, range(43), columns=8, x_size=2, y_size=2,
colorbar=False, y_pred=None, cm='binary')
```
%% Cell type:markdown id: tags:
## 5/ What does it really look like?
%% Cell type:code id: tags:
``` python
# ---- Get and show few images
samples = [ random.randint(0,len(x_train)-1) for i in range(32)]
ooo.plot_images(x_train,y_train, samples, columns=8, x_size=2, y_size=2, colorbar=False, y_pred=None, cm='binary')
```
%% Cell type:markdown id: tags:
## 6/ Dataset cooking...
Images must have the **same size** to match the input size of the network.
It is possible to work on **rgb** or **monochrome** images and to **equalize** the histograms.
The data must be **normalized**.
See : [Exposure with scikit-image](https://scikit-image.org/docs/dev/api/skimage.exposure.html)
See : [Local histogram equalization](https://scikit-image.org/docs/dev/api/skimage.filters.rank.html#skimage.filters.rank.equalize)
See : [Histogram equalization](https://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.equalize_hist)
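Before defining the full cooking function, here is a minimal, illustrative sketch comparing the grayscale equalization variants mentioned above on a single image. It reuses the imports from section 1 and assumes `x_train` is available from section 2.2; variable names are ours, not part of the pipeline.
``` python
# Illustrative comparison of the grayscale equalization variants on one image.
# Assumes x_train is available (section 2.2) and the imports of section 1.
img = x_train[0]
if img.shape[2]==4:
    img = color.rgba2rgb(img)              # drop the alpha channel if present
img  = transform.resize(img, (48,48))      # resize to a fixed size
gray = color.rgb2gray(img)

variants = { 'L'       : gray,
             'L-HE'    : exposure.equalize_hist(gray),
             'L-LHE'   : rank.equalize(gray, disk(10))/255.,
             'L-CLAHE' : exposure.equalize_adapthist(gray) }

fig, axs = plt.subplots(1, len(variants), figsize=(12,3))
for ax,(name,v) in zip(axs, variants.items()):
    ax.imshow(v, cmap='binary')
    ax.set_title(name)
    ax.set_xticks([]); ax.set_yticks([])
plt.show()
```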
### 6.1/ Enhancement cook
%% Cell type:code id: tags:
``` python
def images_enhancement(images, width=25, height=25, mode='RGB'):
    '''
    Resize and convert images - doesn't change originals.
    input images must be RGBA or RGB.
    args:
        images :       images list
        width,height : new images size (25,25)
        mode :         RGB | RGB-HE | L | L-HE | L-LHE | L-CLAHE
    return:
        numpy array of enhanced images
    '''
    modes = { 'RGB':3, 'RGB-HE':3, 'L':1, 'L-HE':1, 'L-LHE':1, 'L-CLAHE':1}
    lz = modes[mode]
    out = []
    for img in images:
        # ---- if RGBA, convert to RGB
        if img.shape[2]==4:
            img = color.rgba2rgb(img)
        # ---- Resize
        img = transform.resize(img, (width,height))
        # ---- RGB / Histogram Equalization
        if mode=='RGB-HE':
            hsv = color.rgb2hsv(img.reshape(width,height,3))
            hsv[:, :, 2] = exposure.equalize_hist(hsv[:, :, 2])
            img = color.hsv2rgb(hsv)
        # ---- Grayscale
        if mode=='L':
            img = color.rgb2gray(img)
        # ---- Grayscale / Histogram Equalization
        if mode=='L-HE':
            img = color.rgb2gray(img)
            img = exposure.equalize_hist(img)
        # ---- Grayscale / Local Histogram Equalization
        if mode=='L-LHE':
            img = color.rgb2gray(img)
            img = rank.equalize(img, disk(10))/255.
        # ---- Grayscale / Contrast Limited Adaptive Histogram Equalization (CLAHE)
        if mode=='L-CLAHE':
            img = color.rgb2gray(img)
            img = exposure.equalize_adapthist(img)
        # ---- Add image in list of list
        out.append(img)
        ooo.update_progress('Enhancement: ',len(out),len(images))
    # ---- Reshape images
    #      (-1, width,height,1) for L
    #      (-1, width,height,3) for RGB
    #
    out = np.array(out, dtype='float64')
    out = out.reshape(-1, width, height, lz)
    return out
```
%% Cell type:markdown id: tags:
### 6.2/ To get an idea of the different recipes
%% Cell type:code id: tags:
``` python
i=random.randint(0,len(x_train)-16)
x_samples = x_train[i:i+16]
y_samples = y_train[i:i+16]
datasets = {}
datasets['RGB'] = images_enhancement( x_samples, width=25, height=25, mode='RGB' )
datasets['RGB-HE'] = images_enhancement( x_samples, width=25, height=25, mode='RGB-HE' )
datasets['L'] = images_enhancement( x_samples, width=25, height=25, mode='L' )
datasets['L-HE'] = images_enhancement( x_samples, width=25, height=25, mode='L-HE' )
datasets['L-LHE'] = images_enhancement( x_samples, width=25, height=25, mode='L-LHE' )
datasets['L-CLAHE'] = images_enhancement( x_samples, width=25, height=25, mode='L-CLAHE' )
print('\nEXPECTED (Meta) :\n')
x_expected=[ x_meta[i] for i in y_samples]
ooo.plot_images(x_expected, y_samples, range(16), columns=16, x_size=1, y_size=1, colorbar=False, y_pred=None, cm='binary')
print('\nORIGINAL IMAGES :\n')
ooo.plot_images(x_samples, y_samples, range(16), columns=16, x_size=1, y_size=1, colorbar=False, y_pred=None, cm='binary')
print('\nENHANCED :\n')
for k,d in datasets.items():
    print("dataset : {} min,max=[{:.3f},{:.3f}] shape={}".format(k,d.min(),d.max(), d.shape))
    ooo.plot_images(d, y_samples, range(16), columns=16, x_size=1, y_size=1, colorbar=False, y_pred=None, cm='binary')
```
%% Cell type:markdown id: tags:
### 6.3/ Cook and save
A function to save a dataset
%% Cell type:code id: tags:
``` python
def save_h5_dataset(x_train, y_train, x_test, y_test, x_meta,y_meta, h5name):
    # ---- Filename
    filename = './data/'+h5name
    # ---- Create h5 file
    with h5py.File(filename, "w") as f:
        f.create_dataset("x_train", data=x_train)
        f.create_dataset("y_train", data=y_train)
        f.create_dataset("x_test",  data=x_test)
        f.create_dataset("y_test",  data=y_test)
        f.create_dataset("x_meta",  data=x_meta)
        f.create_dataset("y_meta",  data=y_meta)
    # ---- done
    size = os.path.getsize(filename)/(1024*1024)
    print('Dataset : {:24s} shape : {:22s} size : {:6.1f} Mo (saved)\n'.format(filename, str(x_train.shape), size))
```
%% Cell type:markdown id: tags:
Create the enhanced datasets, and save them...
This will take about 7-8 minutes.
%% Cell type:code id: tags:
``` python
%%time
for s in [24, 48]:
    for m in ['RGB', 'RGB-HE', 'L', 'L-LHE']:
        # ---- A nice dataset name
        name = 'set-{}x{}-{}.h5'.format(s,s,m)
        print("\nDataset : ", name)
        # ---- Enhancement
        x_train_new = images_enhancement( x_train, width=s, height=s, mode=m )
        x_test_new  = images_enhancement( x_test,  width=s, height=s, mode=m )
        x_meta_new  = images_enhancement( x_meta,  width=s, height=s, mode='RGB' )
        # ---- Save
        save_h5_dataset( x_train_new, y_train, x_test_new, y_test, x_meta_new, y_meta, name)
        x_train_new, x_test_new = 0,0
```
%% Cell type:markdown id: tags:
## 7/ Reload data to be sure ;-)
%% Cell type:code id: tags:
``` python
%%time
dataset='set-48x48-L'
samples=range(24)
with h5py.File('./data/'+dataset+'.h5') as f:
    x_tmp = f['x_train'][:]
    y_tmp = f['y_train'][:]
    print("dataset loaded from h5 file.")

ooo.plot_images(x_tmp,y_tmp, samples, columns=8, x_size=2, y_size=2, colorbar=False, y_pred=None, cm='binary')
x_tmp,y_tmp=0,0
```
%% Cell type:markdown id: tags:
----
That's all folks !
%% Cell type:markdown id: tags:
German Traffic Sign Recognition Benchmark (GTSRB)
=================================================
---
Introduction au Deep Learning (IDLE) - S. Arias, E. Maldonado, JL. Parouty - CNRS/SARI/DEVLOG - 2020
Version : 1.2.1
## Episode 7 : Full Convolutions
Our main steps:
- Try n models with n datasets
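To make the bookkeeping concrete, here is a minimal, illustrative sketch of the report structure that `multi_run()` (section 6) builds, filled with two example rows taken from the results table at the end of this notebook:
``` python
import pandas as pd

# Illustrative only: the columns produced by multi_run() below,
# with two example rows from the results table at the end of the notebook.
report = {'Dataset'     : ['set-24x24-L', 'set-24x24-RGB'],
          'Size'        : [229, 684],          # dataset size (Mo)
          'v1 Accuracy' : [95.91, 96.60],      # best val_accuracy (%)
          'v1 Duration' : [75.04, 77.24]}      # training time (s)
print(pd.DataFrame(report))
```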
## 1/ Import and init
%% Cell type:code id: tags:
``` python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import TensorBoard
import numpy as np
import matplotlib.pyplot as plt
import h5py
import os,time
import pandas as pd
import idle.pwk as ooo
from importlib import reload
from IPython.display import display
ooo.init()
```
%% Output
IDLE 2020 - Practical Work Module
Version : 0.1.4
Run time : Friday 17 January 2020, 21:38:34
Matplotlib style : idle/talk.mplstyle
TensorFlow version : 2.0.0
Keras version : 2.2.4-tf
%% Cell type:markdown id: tags:
## 2/ Load dataset functions
%% Cell type:code id: tags:
``` python
def read_dataset(name):
    '''Reads h5 dataset from ./data
    Arguments: dataset name, without .h5
    Returns:   x_train,y_train,x_test,y_test data'''
    # ---- Read dataset
    filename = './data/'+name+'.h5'
    with h5py.File(filename) as f:
        x_train = f['x_train'][:]
        y_train = f['y_train'][:]
        x_test  = f['x_test'][:]
        y_test  = f['y_test'][:]
    return x_train,y_train,x_test,y_test
```
%% Cell type:markdown id: tags:
## 3/ Models collection
%% Cell type:code id: tags:
``` python
# A basic model
#
def get_model_v1(lx,ly,lz):
    model = keras.models.Sequential()
    model.add( keras.layers.Conv2D(96, (3,3), activation='relu', input_shape=(lx,ly,lz)))
    model.add( keras.layers.MaxPooling2D((2, 2)))
    model.add( keras.layers.Dropout(0.2))

    model.add( keras.layers.Conv2D(192, (3, 3), activation='relu'))
    model.add( keras.layers.MaxPooling2D((2, 2)))
    model.add( keras.layers.Dropout(0.2))

    model.add( keras.layers.Flatten())
    model.add( keras.layers.Dense(1500, activation='relu'))
    model.add( keras.layers.Dropout(0.5))

    model.add( keras.layers.Dense(43, activation='softmax'))
    return model

# A more sophisticated model
#
def get_model_v2(lx,ly,lz):
    model = keras.models.Sequential()
    model.add( keras.layers.Conv2D(64, (3, 3), padding='same', input_shape=(lx,ly,lz), activation='relu'))
    model.add( keras.layers.Conv2D(64, (3, 3), activation='relu'))
    model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add( keras.layers.Dropout(0.2))

    model.add( keras.layers.Conv2D(128, (3, 3), padding='same', activation='relu'))
    model.add( keras.layers.Conv2D(128, (3, 3), activation='relu'))
    model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add( keras.layers.Dropout(0.2))

    model.add( keras.layers.Conv2D(256, (3, 3), padding='same',activation='relu'))
    model.add( keras.layers.Conv2D(256, (3, 3), activation='relu'))
    model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add( keras.layers.Dropout(0.2))

    model.add( keras.layers.Flatten())
    model.add( keras.layers.Dense(512, activation='relu'))
    model.add( keras.layers.Dropout(0.5))
    model.add( keras.layers.Dense(43, activation='softmax'))
    return model
# My sophisticated model, but small and fast
#
def get_model_v3(lx,ly,lz):
    model = keras.models.Sequential()
    model.add( keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(lx,ly,lz)))
    model.add( keras.layers.MaxPooling2D((2, 2)))
    model.add( keras.layers.Dropout(0.5))

    model.add( keras.layers.Conv2D(64, (3, 3), activation='relu'))
    model.add( keras.layers.MaxPooling2D((2, 2)))
    model.add( keras.layers.Dropout(0.5))

    model.add( keras.layers.Conv2D(128, (3, 3), activation='relu'))
    model.add( keras.layers.MaxPooling2D((2, 2)))
    model.add( keras.layers.Dropout(0.5))

    model.add( keras.layers.Conv2D(256, (3, 3), activation='relu'))
    model.add( keras.layers.MaxPooling2D((2, 2)))
    model.add( keras.layers.Dropout(0.5))

    model.add( keras.layers.Flatten())
    model.add( keras.layers.Dense(1152, activation='relu'))
    model.add( keras.layers.Dropout(0.5))

    model.add( keras.layers.Dense(43, activation='softmax'))
    return model
```
%% Cell type:markdown id: tags:
## 4/ Callbacks
%% Cell type:code id: tags:
``` python
%%bash
# To clean old logs and saved model, run this cell
#
/bin/rm -r ./run/logs 2>/dev/null
/bin/rm -r ./run/models 2>/dev/null
/bin/ls -l ./run 2>/dev/null
```
%% Output
total 0
%% Cell type:code id: tags:
``` python
ooo.mkdir('./run/models')
ooo.mkdir('./run/logs')
# ---- Callback tensorboard
log_dir = "./run/logs/tb_" + ooo.tag_now()
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
# ---- Callback ModelCheckpoint - Save best model
save_dir = "./run/models/best-model.h5"
bestmodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, monitor='accuracy', save_best_only=True)
# ---- Callback ModelCheckpoint - Save the model periodically (save_freq is expressed in batches)
save_dir = "./run/models/model-{epoch:04d}.h5"
savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, save_freq=2000*5)
```
%% Cell type:markdown id: tags:
## 6/ Multiple datasets, multiple models ;-)
%% Cell type:code id: tags:
``` python
def multi_run(datasets, models, batch_size=64, epochs=16):
    # ---- Columns of report
    #
    report={}
    report['Dataset']=[]
    report['Size']   =[]
    for m in models:
        report[m+' Accuracy'] = []
        report[m+' Duration'] = []
    # ---- Let's go
    #
    for dname in datasets:
        print("\nDataset : ",dname)
        # ---- Read dataset
        x_train,y_train,x_test,y_test = read_dataset(dname)
        dsize=os.path.getsize('./data/'+dname+'.h5')/(1024*1024)
        report['Dataset'].append(dname)
        report['Size'].append(dsize)
        # ---- Get the shape
        (n,lx,ly,lz) = x_train.shape
        # ---- For each model
        for kmodel,fmodel in models.items():
            print(" Run model {} : ".format(kmodel), end='')
            # ---- get model
            try:
                model=fmodel(lx,ly,lz)
                # ---- Compile it
                model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
                # ---- Train
                start_time = time.time()
                history = model.fit( x_train[:1000], y_train[:1000],
                                     batch_size      = batch_size,
                                     epochs          = epochs,
                                     verbose         = 0,
                                     validation_data = (x_test, y_test),
                                     callbacks       = [tensorboard_callback, bestmodel_callback, savemodel_callback])
                # ---- Result
                end_time = time.time()
                duration = end_time-start_time
                accuracy = max(history.history["val_accuracy"])*100
                #
                report[kmodel+' Accuracy'].append(accuracy)
                report[kmodel+' Duration'].append(duration)
                print("Accuracy={:.2f} and Duration={:.2f}".format(accuracy,duration))
            except:
                report[kmodel+' Accuracy'].append('-')
                report[kmodel+' Duration'].append('-')
                print('-')
    print("\n")
    return report
```
%% Cell type:code id: tags:
``` python
%%time
# datasets = ['set-24x24-L', 'set-24x24-RGB', 'set-48x48-L', 'set-48x48-RGB', 'set-24x24-L-LHE', 'set-24x24-RGB-HE', 'set-48x48-L-LHE', 'set-48x48-RGB-HE']
# models = {'v1':get_model_v1, 'v2':get_model_v2, 'v3':get_model_v3}
datasets = ['set-24x24-L', 'set-24x24-RGB']
models = {'v1':get_model_v1, 'v3':get_model_v3}
out = multi_run(datasets, models, batch_size=64, epochs=2)
report = pd.DataFrame (out)
```
%% Output
Dataset : set-24x24-L
Run model v1 : Accuracy=9.46 and Duration=7.51
Run model v3 : -
Dataset : set-24x24-RGB
Run model v1 : Accuracy=15.95 and Duration=7.95
Run model v3 : -
CPU times: user 1min 35s, sys: 3.31 s, total: 1min 38s
Wall time: 17 s
%% Cell type:code id: tags:
``` python
display(report)
report.to_hdf('foo.h5', 'df')
```
%% Output
%% Cell type:code id: tags:
``` python
df=pd.read_hdf('foo.h5', 'df')
display(df)
```
%% Output
%% Cell type:markdown id: tags:
---
### Some results :
%% Cell type:markdown id: tags:
| Datasets | Size | Model : v1 | Model : v2 | Model : v3 |
|:------------------------:|:---------------:|:------------------:|:------------------:|:------------------:|
| set-24x24-L | 229 Mo | 95.91% 75.04s | 96.86% 102.28s | - - |
| set-24x24-RGB | 684 Mo | 96.60% 77.24s | 97.32% 103.93s | - - |
| set-48x48-L | 914 Mo | **96.71%** 123.94s | 97.68% 149.57s | 97.60% 91.53s |
| set-48x48-RGB | 2736 Mo | 96.36% 117.74s | **98.20%** 142.63s | 97.28% 91.29s |
| set-24x24-L-LHE | 229 Mo | 95.95% 66.12s | 96.75% 89.45s | - - |
| set-24x24-RGB-HE | 684 Mo | 95.30% 68.89s | 96.28% 92.15s | - - |
| set-48x48-L-LHE | 914 Mo | 96.69% 109.28s | 97.94% 135.17s | **97.97%** 83.80s |
| set-48x48-RGB-HE | 2736 Mo | 95.29% 117.70s | **98.13%** 141.56s | 97.00% 89.38s |
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
Running Tensorboard from Jupyter lab
====================================
---
Introduction au Deep Learning (IDLE) - S. Arias, E. Maldonado, JL. Parouty - CNRS/SARI/DEVLOG - 2020
Version : 1.0
%% Cell type:markdown id: tags:
## 1/ Method 1 : Shell execute
### 1.1/ Start/stop
%% Cell type:code id: tags:
``` python
%%bash
tensorboard_start --logdir ./run/logs
```
%% Output
Tensorboard started - pid is 211387
%% Cell type:code id: tags:
``` python
%%bash
tensorboard_status
```
%% Output
Tensorboard status - pid is 214798
%% Cell type:code id: tags:
``` python
%%bash
tensorboard_stop
```
%% Output
Tensorboard process not found...
%% Cell type:markdown id: tags:
### 1.3/ Scripts
%% Cell type:code id: tags:
``` python
%%writefile "~/bin/tensorboard_start"
#!/bin/bash
#
# -----------------------------------------------------------
# _____ _ _
# |_ _|__ _ __ ___ ___ _ __| |__ ___ __ _ _ __ __| |
# | |/ _ \ '_ \/ __|/ _ \| '__| '_ \ / _ \ / _` | '__/ _` |
# | | __/ | | \__ \ (_) | | | |_) | (_) | (_| | | | (_| |
# |_|\___|_| |_|___/\___/|_| |_.__/ \___/ \__,_|_| \__,_|
# Start
# -----------------------------------------------------------
# Start tensorboard, with a computed port number and the right host
# -----------------------------------------------------------
# Jean-Luc Parouty CNRS/SIMaP Janvier 2020 - version 1.02
VERSION='1.02'
# ---- Usage
#
if [ "$1" == "-?" ] || [ $# -eq 0 ]; then
echo -e "Start tensorboard in GRICAD environment. (v$VERSION)"
echo -e "Usage: $(basename $0) -h | --logdir <logdir> [tensorboard args]"
echo -e "Exemple : $(basename $0) --logdir ./run/logs\n"
exit
fi
# ---- Port number
#
PORT_JPY="$(/usr/bin/id -u)"
PORT_TSB="$(( $PORT_JPY + 10000 ))"
# ---- tmpdir (tmp bug)
#
export TMPDIR="/tmp/$(/usr/bin/id -un)"
/bin/mkdir -p "$TMPDIR"
# ---- Start it
#
tensorboard --port $PORT_TSB --host 0.0.0.0 $@ &>/dev/null &
# ---- Where is it ?
#
sleep 5
p="$(/bin/ps ax | /bin/grep "tensorboard --port $PORT_TSB" | /bin/grep -v grep | awk '{print $1}')"
if [ -z "$p" ]; then
echo "Tensorboard didn't start... check your parameters !"
else
echo "Tensorbord started - pid is $p"
fi
```
%% Output
Overwriting /home/paroutyj/bin/tensorboard_start
%% Cell type:code id: tags:
``` python
%%writefile "~/bin/tensorboard_status"
#!/bin/bash
#
# -----------------------------------------------------------
# _____ _ _
# |_ _|__ _ __ ___ ___ _ __| |__ ___ __ _ _ __ __| |
# | |/ _ \ '_ \/ __|/ _ \| '__| '_ \ / _ \ / _` | '__/ _` |
# | | __/ | | \__ \ (_) | | | |_) | (_) | (_| | | | (_| |
# |_|\___|_| |_|___/\___/|_| |_.__/ \___/ \__,_|_| \__,_|
# Status
# -----------------------------------------------------------
# Report the status of tensorboard previously started with tensorboard_start
# -----------------------------------------------------------
# Jean-Luc Parouty CNRS/SIMaP Janvier 2020
VERSION="1.02"
# ---- Usage
#
if [ "$1" == "-?" ] ; then
echo -e "Tensorboard status in GRICAD environment. (v$VERSION)"
echo -e "Usage: $(basename $0) [-h ]"
echo -e "Exemple : $(basename $0)\n"
exit
fi
# ---- Process id
#
PORT_JPY="$(id -u)"
PORT_TSB="$(( $PORT_JPY + 10000 ))"
# ---- Where is it ?
#
p="$(ps ax | grep "tensorboard --port $PORT_TSB" | grep -v grep | awk '{print $1}')"
if [ -z "$p" ]; then
echo "Tensorboard status - not found..."
else
echo "Tensorboard status - pid is $p"
fi
```
%% Output
Writing /home/paroutyj/bin/tensorboard_status
%% Cell type:code id: tags:
``` python
%%writefile "~/bin/tensorboard_stop"
#!/bin/bash
#
# -----------------------------------------------------------
# _____ _ _
# |_ _|__ _ __ ___ ___ _ __| |__ ___ __ _ _ __ __| |
# | |/ _ \ '_ \/ __|/ _ \| '__| '_ \ / _ \ / _` | '__/ _` |
# | | __/ | | \__ \ (_) | | | |_) | (_) | (_| | | | (_| |
# |_|\___|_| |_|___/\___/|_| |_.__/ \___/ \__,_|_| \__,_|
# Stop
# -----------------------------------------------------------
# Stop tensorboard previously started with tensorboard_start
# -----------------------------------------------------------
# Jean-Luc Parouty CNRS/SIMaP Janvier 2020
VERSION="1.02"
# ---- Usage
#
if [ "$1" == "-?" ] ; then
echo -e "Stop tensorboard in GRICAD environment. (v$VERSION)"
echo -e "Usage: $(basename $0) [-h ]"
echo -e "Exemple : $(basename $0)\n"
exit
fi
# ---- Process id
#
PORT_JPY="$(id -u)"
PORT_TSB="$(( $PORT_JPY + 10000 ))"
# ---- Where is it ?
#
p="$(ps ax | grep "tensorboard --port $PORT_TSB" | grep -v grep | awk '{print $1}')"
if [ -z "$p" ]; then
echo "Tensorboard process not found..."
else
kill $p
echo "Tensorbord stopped - pid was $p"
fi
```
%% Output
Overwriting /home/paroutyj/bin/tensorboard_stop
%% Cell type:markdown id: tags:
**Set script permissions:**
%% Cell type:code id: tags:
``` python
%%bash
/bin/chmod 755 ~/bin/tensorboard_* 2>/dev/null
/bin/ls -l ~/bin/tensorboard_* 2>/dev/null
```
%% Output
-rwxr-xr-x 1 paroutyj l-simap 1560 Jan 14 15:41 /home/paroutyj/bin/tensorboard_start
-rwxr-xr-x 1 paroutyj l-simap 1227 Jan 14 15:52 /home/paroutyj/bin/tensorboard_status
-rwxr-xr-x 1 paroutyj l-simap 1239 Jan 14 15:40 /home/paroutyj/bin/tensorboard_stop
%% Cell type:markdown id: tags:
**Check**
%% Cell type:markdown id: tags:
## Method 2 : Magic command
**Start**
%% Cell type:code id: tags:
``` python
%load_ext tensorboard
```
%% Cell type:code id: tags:
``` python
%tensorboard --port 21277 --host 0.0.0.0 --logdir ./run/logs
```
%% Output
%% Cell type:markdown id: tags:
**Stop**
No way... use bash method
## Method 3 : Tensorboard module
**Start**
%% Cell type:code id: tags:
``` python
import tensorboard.notebook as tsb
```
%% Cell type:code id: tags:
``` python
tsb.start('--port 21277 --host 0.0.0.0 --logdir ./run/logs')
```
%% Output
%% Cell type:markdown id: tags:
**Check**
%% Cell type:code id: tags:
``` python
a=tsb.list()
```
%% Output
No known TensorBoard instances running.
%% Cell type:markdown id: tags:
**Stop**
No way... use bash method
%% Cell type:code id: tags:
``` python
!kill 214798
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
German Traffic Sign Recognition Benchmark (GTSRB)
=================================================
---
Introduction au Deep Learning (IDLE)
S. Arias, E. Maldonado, JL. Parouty
CNRS/SARI/DEVLOG - 2020
Objectives of this practical work
---------------------------------
Traffic sign classification with **CNN**, using Tensorflow and **Keras**
Prerequisite
------------
Environment, with the following packages :
- Python 3.6
- numpy
- Tensorflow 2.0
- scikit-image
- scikit-learn
- Matplotlib
- seaborn
You can create it from the `environment.yml` file :
```
# conda env create -f environment.yml
```
To manage conda environments, see [the conda documentation](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#)
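A minimal sketch to check that the main packages listed above are importable in the new environment (the versions printed are simply whatever conda installed):
``` python
# Print the versions of the main packages required by these notebooks.
import sys
import numpy, tensorflow, skimage, sklearn, matplotlib, seaborn

print('Python       :', sys.version.split()[0])
print('numpy        :', numpy.__version__)
print('TensorFlow   :', tensorflow.__version__)
print('scikit-image :', skimage.__version__)
print('scikit-learn :', sklearn.__version__)
print('matplotlib   :', matplotlib.__version__)
print('seaborn      :', seaborn.__version__)
```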
About the dataset
-----------------
Name : [German Traffic Sign Recognition Benchmark (GTSRB)](http://benchmark.ini.rub.de/?section=gtsrb)
Available [here](https://sid.erda.dk/public/archives/daaeac0d7ce1152aea9b61d9f1e19370/published-archive.html)
or on **[kaggle](https://www.kaggle.com/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign)**
A nice example from : [Alex Staravoitau](https://navoshta.com/traffic-signs-classification/)
In a few words:
- Images : Variable dimensions, rgb
- Train set : 39209 images
- Test set : 12630 images
- Classes : 0 to 42
Episodes
--------
**[01 - Preparation of data](01-Preparation-of-data.ipynb)**
- Understanding the dataset
- Preparing and formatting data
- Organize and backup data
**[02 - First convolutions](02-First-convolutions.ipynb)**
- Read dataset
- Build a model
- Train the model
- Model evaluation
%% Cell type:code id: tags:
``` python
```
VERSION='0.1a'
# ==================================================================
# ____ _ _ _ __ __ _
# | _ \ _ __ __ _ ___| |_(_) ___ __ _| | \ \ / /__ _ __| | __
# | |_) | '__/ _` |/ __| __| |/ __/ _` | | \ \ /\ / / _ \| '__| |/ /
# | __/| | | (_| | (__| |_| | (_| (_| | | \ V V / (_) | | | <
# |_| |_| \__,_|\___|\__|_|\___\__,_|_| \_/\_/ \___/|_| |_|\_\
# module pwk
# ==================================================================
# A simple module to host some common functions for practical work
# pjluc 2019
import os
import glob
import itertools
import datetime
import math
import numpy as np
import tensorflow as tf
from tensorflow import keras
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sn
VERSION='0.1.4'
# -------------------------------------------------------------
# init_all
# -------------------------------------------------------------
#
def init(mplstyle='idle/talk.mplstyle'):
    global VERSION
    # ---- matplotlib
    matplotlib.style.use(mplstyle)
    # ---- Hello world
    now = datetime.datetime.now()
    print('IDLE 2020 - Practical Work Module')
    print('  Version            :', VERSION)
    print('  Run time           : {}'.format(now.strftime("%A %-d %B %Y, %H:%M:%S")))
    print('  Matplotlib style   :', mplstyle)
    print('  TensorFlow version :', tf.__version__)
    print('  Keras version      :', tf.keras.__version__)
# -------------------------------------------------------------
# Folder cooking
# -------------------------------------------------------------
#
def tag_now():
    return datetime.datetime.now().strftime("%Y-%m-%d_%Hh%Mm%Ss")

def mkdir(path):
    os.makedirs(path, mode=0o750, exist_ok=True)

def get_directory_size(path):
    """
    Return the directory size, but only 1 level
    args:
        path : directory path
    return:
        size in MB
    """
    size=0
    for f in os.listdir(path):
        if os.path.isfile(path+'/'+f):
            size+=os.path.getsize(path+'/'+f)
    return size/(1024*1024)
# -------------------------------------------------------------
# shuffle_dataset
# -------------------------------------------------------------
#
def shuffle_np_dataset(x, y):
    assert (len(x) == len(y)), "x and y must have same size"
    p = np.random.permutation(len(x))
    return x[p], y[p]

def update_progress(what,i,imax):
    bar_length = min(40,imax)
    if (i%int(imax/bar_length))!=0 and i<imax:
        return
    progress  = float(i/imax)
    block     = int(round(bar_length * progress))
    endofline = '\r' if progress<1 else '\n'
    text = "{:16s} [{}] {:>5.1f}% of {}".format( what, "#"*block+"-"*(bar_length-block), progress*100, imax)
    print(text, end=endofline)
# -------------------------------------------------------------
# show_images
# -------------------------------------------------------------
#
def plot_images(x,y, indices, columns=12, x_size=1, y_size=1, colorbar=False, y_pred=None, cm='binary'):
    """
    Show some images in a grid, with legends
    args:
        x: images - Shapes must be (-1,lx,ly,1) or (-1,lx,ly,3)
        y: real classes
        indices: indices of images to show
        columns: number of columns (12)
        x_size,y_size: figure size
        colorbar: show colorbar (False)
        y_pred: predicted classes (None)
        cm: Matplotlib color map
    returns:
        nothing
    """
    rows   = math.ceil(len(indices)/columns)
    fig    = plt.figure(figsize=(columns*x_size, rows*(y_size+0.35)))
    n      = 1
    errors = 0
    if y_pred is None:
        y_pred = y
    for i in indices:
        axs = fig.add_subplot(rows, columns, n)
        n += 1
        # Shapes must be different for RGB and L
        (lx,ly,lz) = x[i].shape
        if lz==1:
            img = axs.imshow(x[i].reshape(lx,ly),    cmap=cm, interpolation='lanczos')
        else:
            img = axs.imshow(x[i].reshape(lx,ly,lz), cmap=cm, interpolation='lanczos')
        axs.spines['right'].set_visible(True)
        axs.spines['left'].set_visible(True)
        axs.spines['top'].set_visible(True)
        axs.spines['bottom'].set_visible(True)
        axs.set_yticks([])
        axs.set_xticks([])
        if y[i]!=y_pred[i]:
            axs.set_xlabel('{} ({})'.format(y_pred[i],y[i]))
            axs.xaxis.label.set_color('red')
            errors += 1
        else:
            axs.set_xlabel(y[i])
        if colorbar:
            fig.colorbar(img, orientation="vertical", shrink=0.65)
    plt.show()
def plot_image(x, cm='binary', figsize=(4,4)):
    (lx,ly,lz) = x.shape
    plt.figure(figsize=figsize)
    if lz==1:
        plt.imshow(x.reshape(lx,ly),    cmap=cm, interpolation='lanczos')
    else:
        plt.imshow(x.reshape(lx,ly,lz), cmap=cm, interpolation='lanczos')
    plt.show()
# -------------------------------------------------------------
# show_history
# -------------------------------------------------------------
#
def plot_history(history, figsize=(8,6)):
    """
    Show training history
    args:
        history: Keras history object
        figsize: figure size (8,6)
    """
    # Accuracy
    plt.figure(figsize=figsize)
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title('Model accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Test'], loc='upper left')
    plt.show()

    # Loss values
    plt.figure(figsize=figsize)
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('Model loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Test'], loc='upper left')
    plt.show()
# -------------------------------------------------------------
# plot_confusion_matrix
# -------------------------------------------------------------
#
def plot_confusion_matrix(cm,
                          title='Confusion matrix',
                          figsize=(12,8),
                          cmap="gist_heat_r",
                          vmin=0,
                          vmax=1,
                          xticks=5, yticks=5):
    """
    Given a sklearn confusion matrix (cm), make a nice plot
    Args:
        cm:        confusion matrix from sklearn.metrics.confusion_matrix
        title:     the text to display at the top of the matrix
        figsize:   figure size (12,8)
        cmap:      color map (gist_heat_r)
        vmin,vmax: min/max values (0 and 1)
    """
    accuracy = np.trace(cm) / float(np.sum(cm))
    misclass = 1 - accuracy

    plt.figure(figsize=figsize)
    sn.heatmap(cm, linewidths=1, linecolor="#ffffff", square=True,
               cmap=cmap, xticklabels=xticks, yticklabels=yticks,
               vmin=vmin, vmax=vmax)
    plt.ylabel('True label')
    plt.xlabel('Predicted label\naccuracy={:0.4f}; misclass={:0.4f}'.format(accuracy, misclass))
    plt.show()
# See : https://matplotlib.org/users/customizing.html
axes.titlesize : 24
axes.labelsize : 20
axes.edgecolor : dimgrey
axes.labelcolor : dimgrey
axes.linewidth : 2
axes.grid : False
axes.prop_cycle : cycler('color', ['steelblue', 'tomato', '2ca02c', 'd62728', '9467bd', '8c564b', 'e377c2', '7f7f7f', 'bcbd22', '17becf'])
lines.linewidth : 3
lines.markersize : 10
xtick.color : black
xtick.labelsize : 18
ytick.color : black
ytick.labelsize : 18
axes.spines.left : True
axes.spines.bottom : True
axes.spines.top : False
axes.spines.right : False
savefig.dpi : 300 # figure dots per inch or 'figure'
savefig.facecolor : white # figure facecolor when saving
savefig.edgecolor : white # figure edgecolor when saving
savefig.format : svg
savefig.bbox : tight
savefig.pad_inches : 0.1
savefig.transparent : True
savefig.jpeg_quality: 95
name: deeplearning2
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _tflow_select=2.1.0=gpu
- absl-py=0.8.1=py37_0
- astor=0.8.0=py37_0
- attrs=19.3.0=py_0
- backcall=0.1.0=py37_0
- blas=1.0=mkl
- bleach=3.1.0=py_0
- blosc=1.16.3=hd408876_0
- bzip2=1.0.8=h7b6447c_0
- c-ares=1.15.0=h7b6447c_1001
- ca-certificates=2019.11.27=0
- certifi=2019.11.28=py37_0
- cloudpickle=1.2.2=py_0
- cudatoolkit=10.0.130=0
- cudnn=7.6.4=cuda10.0_0
- cupti=10.0.130=0
- cycler=0.10.0=py37_0
- cytoolz=0.10.1=py37h7b6447c_0
- dask-core=2.9.0=py_0
- dbus=1.13.12=h746ee38_0
- decorator=4.4.1=py_0
- defusedxml=0.6.0=py_0
- entrypoints=0.3=py37_0
- expat=2.2.6=he6710b0_0
- fontconfig=2.13.0=h9420a91_0
- freetype=2.9.1=h8a8886c_1
- gast=0.2.2=py37_0
- glib=2.63.1=h5a9c865_0
- gmp=6.1.2=h6c8ec71_1
- google-pasta=0.1.8=py_0
- grpcio=1.16.1=py37hf8bcb03_1
- gst-plugins-base=1.14.0=hbbd80ab_1
- gstreamer=1.14.0=hb453b48_1
- h5py=2.9.0=py37h7918eee_0
- hdf5=1.10.4=hb1b8bf9_0
- icu=58.2=h9c2bf20_1
- imageio=2.6.1=py37_0
- importlib_metadata=1.3.0=py37_0
- intel-openmp=2019.4=243
- ipykernel=5.1.3=py37h39e3cac_0
- ipython=7.10.2=py37h39e3cac_0
- ipython_genutils=0.2.0=py37_0
- jedi=0.15.1=py37_0
- jinja2=2.10.3=py_0
- joblib=0.14.1=py_0
- jpeg=9b=h024ee3a_2
- json5=0.8.5=py_0
- jsonschema=3.2.0=py37_0
- jupyter_client=5.3.4=py37_0
- jupyter_core=4.6.1=py37_0
- jupyterlab=1.2.4=pyhf63ae98_0
- jupyterlab_server=1.0.6=py_0
- keras-applications=1.0.8=py_0
- keras-preprocessing=1.1.0=py_1
- kiwisolver=1.1.0=py37he6710b0_0
- libedit=3.1.20181209=hc058e9b_0
- libffi=3.2.1=hd88cf55_4
- libgcc-ng=9.1.0=hdf63c60_0
- libgfortran-ng=7.3.0=hdf63c60_0
- libpng=1.6.37=hbc83047_0
- libprotobuf=3.11.2=hd408876_0
- libsodium=1.0.16=h1bed415_0
- libstdcxx-ng=9.1.0=hdf63c60_0
- libtiff=4.1.0=h2733197_0
- libuuid=1.0.3=h1bed415_2
- libxcb=1.13=h1bed415_1
- libxml2=2.9.9=hea5a465_1
- lz4-c=1.8.1.2=h14c3975_0
- lzo=2.10=h49e0be7_2
- markdown=3.1.1=py37_0
- markupsafe=1.1.1=py37h7b6447c_0
- matplotlib=3.1.1=py37h5429711_0
- mistune=0.8.4=py37h7b6447c_0
- mkl=2019.4=243
- mkl-service=2.3.0=py37he904b0f_0
- mkl_fft=1.0.15=py37ha843d7b_0
- mkl_random=1.1.0=py37hd6b4f25_0
- mock=3.0.5=py37_0
- more-itertools=8.0.2=py_0
- nbconvert=5.6.1=py37_0
- nbformat=4.4.0=py37_0
- ncurses=6.1=he6710b0_1
- networkx=2.4=py_0
- notebook=6.0.2=py37_0
- numexpr=2.7.0=py37h9e4a6bb_0
- numpy=1.17.4=py37hc1035e2_0
- numpy-base=1.17.4=py37hde5b4d6_0
- olefile=0.46=py_0
- openssl=1.1.1d=h7b6447c_3
- opt_einsum=3.1.0=py_0
- pandas=0.25.3=py37he6710b0_0
- pandoc=2.2.3.2=0
- pandocfilters=1.4.2=py37_1
- parso=0.5.2=py_0
- patsy=0.5.1=py37_0
- pcre=8.43=he6710b0_0
- pexpect=4.7.0=py37_0
- pickleshare=0.7.5=py37_0
- pillow=6.2.1=py37h34e0f95_0
- pip=19.3.1=py37_0
- prometheus_client=0.7.1=py_0
- prompt_toolkit=3.0.2=py_0
- protobuf=3.11.2=py37he6710b0_0
- ptyprocess=0.6.0=py37_0
- pygments=2.5.2=py_0
- pyparsing=2.4.5=py_0
- pyqt=5.9.2=py37h05f1152_2
- pyrsistent=0.15.6=py37h7b6447c_0
- pytables=3.6.1=py37h71ec239_0
- python=3.7.5=h0371630_0
- python-dateutil=2.8.1=py_0
- pytz=2019.3=py_0
- pywavelets=1.1.1=py37h7b6447c_0
- pyzmq=18.1.0=py37he6710b0_0
- qt=5.9.7=h5867ecd_1
- readline=7.0=h7b6447c_5
- scikit-image=0.15.0=py37he6710b0_0
- scikit-learn=0.22=py37hd81dba3_0
- scipy=1.3.2=py37h7c811a0_0
- seaborn=0.9.0=pyh91ea838_1
- send2trash=1.5.0=py37_0
- setuptools=42.0.2=py37_0
- sip=4.19.8=py37hf484d3e_0
- six=1.13.0=py37_0
- snappy=1.1.7=hbae5bb6_3
- sqlite=3.30.1=h7b6447c_0
- statsmodels=0.10.1=py37hdd07704_0
- tensorboard=2.0.0=pyhb38c66f_1
- tensorflow=2.0.0=gpu_py37h768510d_0
- tensorflow-base=2.0.0=gpu_py37h0ec5d1f_0
- tensorflow-estimator=2.0.0=pyh2649769_0
- tensorflow-gpu=2.0.0=h0d30ee6_0
- termcolor=1.1.0=py37_1
- terminado=0.8.3=py37_0
- testpath=0.4.4=py_0
- tk=8.6.8=hbc83047_0
- toolz=0.10.0=py_0
- tornado=6.0.3=py37h7b6447c_0
- traitlets=4.3.3=py37_0
- wcwidth=0.1.7=py37_0
- webencodings=0.5.1=py37_1
- werkzeug=0.16.0=py_0
- wheel=0.33.6=py37_0
- wrapt=1.11.2=py37h7b6447c_0
- xz=5.2.4=h14c3975_4
- zeromq=4.3.1=he6710b0_3
- zipp=0.6.0=py_0
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
prefix: /home/pjluc/anaconda3/envs/deeplearning2