@@ -2,5 +2,10 @@
*/.ipynb_checkpoints/*
__pycache__
*/__pycache__/*
/run/**
*/data/*
run/
GTSRB/data
IMDB/data
MNIST/data
VAE/data
BHPD/data/*
!BHPD/data/BostonHousing.csv
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [BHP1] - Regression with a Dense Network (DNN)
<!-- DESC --> A simple regression with a Dense Neural Network (DNN) - BHPD dataset
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Predict **housing prices** from a set of house features.
- Understand the **principle** and the **architecture** of a regression with a **dense neural network**.
The **[Boston Housing Dataset](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html)** consists of the prices of houses in various places in Boston.
Alongside the price, the dataset also provides information such as the crime rate, the proportion of non-retail business acres in the town,
the age of the houses and many other attributes...
## What we're going to do :
- Retrieve data
- Prepare the data
- Build a model
- Train the model
- Evaluate the result
%% Cell type:markdown id: tags:
## Step 1 - Import and init
%% Cell type:code id: tags:
``` python
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os,sys
from importlib import reload
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Output
FIDLE 2020 - Practical Work Module
Version : 0.2.9
Run time : Wednesday 19 February 2020, 09:49:10
TensorFlow version : 2.0.0
Keras version : 2.2.4-tf
%% Cell type:markdown id: tags:
## Step 2 - Retrieve data
### 2.1 - Option 1 : From Keras
Boston housing is a famous historic dataset, so we can get it directly from [Keras datasets](https://www.tensorflow.org/api_docs/python/tf/keras/datasets)
%% Cell type:raw id: tags:
(x_train, y_train), (x_test, y_test) = keras.datasets.boston_housing.load_data(test_split=0.2, seed=113)
%% Cell type:markdown id: tags:
### 2.2 - Option 2 : From a csv file
More fun!
%% Cell type:code id: tags:
``` python
data = pd.read_csv('./data/BostonHousing.csv', header=0)
display(data.head(5).style.format("{0:.2f}"))
print('Missing data : ',data.isna().sum().sum(), ' Shape is : ', data.shape)
```
%% Output
Missing data : 0 Shape is : (506, 14)
%% Cell type:markdown id: tags:
## Step 3 - Preparing the data
### 3.1 - Split data
We will use 70% of the data for training and 30% for validation.
x will be the input data and y the expected output.
%% Cell type:code id: tags:
``` python
# ---- Split => train, test
#
data_train = data.sample(frac=0.7, axis=0)
data_test = data.drop(data_train.index)
# ---- Split => x,y (medv is price)
#
x_train = data_train.drop('medv', axis=1)
y_train = data_train['medv']
x_test = data_test.drop('medv', axis=1)
y_test = data_test['medv']
print('Original data shape was : ',data.shape)
print('x_train : ',x_train.shape, 'y_train : ',y_train.shape)
print('x_test : ',x_test.shape, 'y_test : ',y_test.shape)
```
%% Output
Original data shape was : (506, 14)
x_train : (354, 13) y_train : (354,)
x_test : (152, 13) y_test : (152,)
%% Cell type:markdown id: tags:
### 3.2 - Data normalization
**Note :**
- All input data must be normalized, train and test.
- To do this we will **subtract the mean** and **divide by the standard deviation**.
- But test data should not be used in any way, even for normalization.
- The mean and the standard deviation will therefore only be calculated with the train data.
%% Cell type:code id: tags:
``` python
display(x_train.describe().style.format("{0:.2f}").set_caption("Before normalization :"))
mean = x_train.mean()
std = x_train.std()
x_train = (x_train - mean) / std
x_test = (x_test - mean) / std
display(x_train.describe().style.format("{0:.2f}").set_caption("After normalization :"))
x_train, y_train = np.array(x_train), np.array(y_train)
x_test, y_test = np.array(x_test), np.array(y_test)
```
%% Output
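%% Cell type:markdown id: tags:
As an optional sanity check, the per-feature mean of the normalized training set should be close to 0 and its standard deviation close to 1. A minimal sketch, assuming `x_train` is the normalized NumPy array produced above :
%% Cell type:code id: tags:
``` python
# ---- Optional check : per-feature statistics after normalization
#      Means should be ~0 and standard deviations ~1 ; small deviations are
#      expected because np.std uses ddof=0 while pandas used ddof=1.
print('means :', np.round(x_train.mean(axis=0), 4))
print('stds  :', np.round(x_train.std(axis=0),  4))
```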
%% Cell type:markdown id: tags:
## Step 4 - Build a model
More information about :
- [Optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)
- [Activation](https://www.tensorflow.org/api_docs/python/tf/keras/activations)
- [Loss](https://www.tensorflow.org/api_docs/python/tf/keras/losses)
- [Metrics](https://www.tensorflow.org/api_docs/python/tf/keras/metrics)
%% Cell type:code id: tags:
``` python
def get_model_v1(shape):
    model = keras.models.Sequential()
    model.add(keras.layers.Input(shape, name="InputLayer"))
    model.add(keras.layers.Dense(64, activation='relu', name='Dense_n1'))
    model.add(keras.layers.Dense(64, activation='relu', name='Dense_n2'))
    model.add(keras.layers.Dense(1, name='Output'))
    model.compile(optimizer = 'rmsprop',
                  loss      = 'mse',
                  metrics   = ['mae', 'mse'] )
    return model
```
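%% Cell type:markdown id: tags:
The string shortcuts passed to `compile()` above ('rmsprop', 'mse', 'mae') are shorthand for explicit Keras objects. As an illustration only, a minimal sketch of an equivalent compilation (the hypothetical `model_v1b` is used just to avoid touching the model trained below) :
%% Cell type:code id: tags:
``` python
# ---- Illustrative sketch : same compilation with explicit objects
#      instead of the string shortcuts used in get_model_v1()
model_v1b = get_model_v1( (13,) )
model_v1b.compile(optimizer = keras.optimizers.RMSprop(),
                  loss      = keras.losses.MeanSquaredError(),
                  metrics   = [keras.metrics.MeanAbsoluteError(),
                               keras.metrics.MeanSquaredError()])
```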
%% Cell type:markdown id: tags:
## Step 5 - Train the model
### 5.1 - Get it
%% Cell type:code id: tags:
``` python
model=get_model_v1( (13,) )
model.summary()
keras.utils.plot_model( model, to_file='./run/model.png', show_shapes=True, show_layer_names=True, dpi=96)
```
%% Output
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Dense_n1 (Dense) (None, 64) 896
_________________________________________________________________
Dense_n2 (Dense) (None, 64) 4160
_________________________________________________________________
Output (Dense) (None, 1) 65
=================================================================
Total params: 5,121
Trainable params: 5,121
Non-trainable params: 0
_________________________________________________________________
<IPython.core.display.Image object>
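%% Cell type:markdown id: tags:
The parameter counts above can be checked by hand : each Dense layer has (inputs × units) weights plus one bias per unit.
- Dense_n1 : 13 × 64 + 64 = 896
- Dense_n2 : 64 × 64 + 64 = 4,160
- Output : 64 × 1 + 1 = 65
Total : 896 + 4,160 + 65 = 5,121 trainable parameters.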
%% Cell type:markdown id: tags:
### 5.2 - Train it
%% Cell type:code id: tags:
``` python
history = model.fit(x_train,
                    y_train,
                    epochs          = 100,
                    batch_size      = 10,
                    verbose         = 1,
                    validation_data = (x_test, y_test))
```
%% Output
Train on 354 samples, validate on 152 samples
Epoch 1/100
354/354 [==============================] - 1s 2ms/sample - loss: 536.0845 - mae: 21.3335 - mse: 536.0846 - val_loss: 439.6562 - val_mae: 19.3198 - val_mse: 439.6562
Epoch 2/100
354/354 [==============================] - 0s 216us/sample - loss: 354.0647 - mae: 16.8618 - mse: 354.0648 - val_loss: 231.3198 - val_mae: 13.5154 - val_mse: 231.3199
Epoch 3/100
354/354 [==============================] - 0s 194us/sample - loss: 155.7450 - mae: 9.9432 - mse: 155.7450 - val_loss: 69.8093 - val_mae: 6.2267 - val_mse: 69.8093
Epoch 4/100
354/354 [==============================] - 0s 170us/sample - loss: 55.4497 - mae: 5.2375 - mse: 55.4497 - val_loss: 28.5090 - val_mae: 4.0794 - val_mse: 28.5090
Epoch 5/100
354/354 [==============================] - 0s 172us/sample - loss: 31.6844 - mae: 4.0017 - mse: 31.6844 - val_loss: 21.9792 - val_mae: 3.3949 - val_mse: 21.9792
Epoch 6/100
354/354 [==============================] - 0s 175us/sample - loss: 24.5126 - mae: 3.4343 - mse: 24.5126 - val_loss: 18.8066 - val_mae: 3.1393 - val_mse: 18.8066
Epoch 7/100
354/354 [==============================] - 0s 176us/sample - loss: 21.5744 - mae: 3.2008 - mse: 21.5744 - val_loss: 16.6019 - val_mae: 3.0136 - val_mse: 16.6019
Epoch 8/100
354/354 [==============================] - 0s 174us/sample - loss: 19.6449 - mae: 3.0134 - mse: 19.6449 - val_loss: 15.8376 - val_mae: 2.9888 - val_mse: 15.8376
Epoch 9/100
354/354 [==============================] - 0s 170us/sample - loss: 18.6252 - mae: 2.9144 - mse: 18.6252 - val_loss: 15.3001 - val_mae: 2.9692 - val_mse: 15.3001
Epoch 10/100
354/354 [==============================] - 0s 173us/sample - loss: 17.0981 - mae: 2.7810 - mse: 17.0981 - val_loss: 14.8818 - val_mae: 2.9166 - val_mse: 14.8818
Epoch 11/100
354/354 [==============================] - 0s 169us/sample - loss: 16.0782 - mae: 2.6914 - mse: 16.0782 - val_loss: 14.3696 - val_mae: 2.8419 - val_mse: 14.3696
Epoch 12/100
354/354 [==============================] - 0s 174us/sample - loss: 15.5677 - mae: 2.6683 - mse: 15.5677 - val_loss: 13.9912 - val_mae: 2.8576 - val_mse: 13.9912
Epoch 13/100
354/354 [==============================] - 0s 185us/sample - loss: 14.8428 - mae: 2.5991 - mse: 14.8428 - val_loss: 14.3104 - val_mae: 2.8784 - val_mse: 14.3104
Epoch 14/100
354/354 [==============================] - 0s 174us/sample - loss: 14.3035 - mae: 2.5320 - mse: 14.3035 - val_loss: 13.7014 - val_mae: 2.7929 - val_mse: 13.7014
Epoch 15/100
354/354 [==============================] - 0s 174us/sample - loss: 13.6874 - mae: 2.4875 - mse: 13.6874 - val_loss: 13.2517 - val_mae: 2.7346 - val_mse: 13.2517
Epoch 16/100
354/354 [==============================] - 0s 169us/sample - loss: 13.3831 - mae: 2.4476 - mse: 13.3831 - val_loss: 13.0551 - val_mae: 2.7135 - val_mse: 13.0551
Epoch 17/100
354/354 [==============================] - 0s 173us/sample - loss: 13.1403 - mae: 2.4844 - mse: 13.1403 - val_loss: 13.0990 - val_mae: 2.6770 - val_mse: 13.0990
Epoch 18/100
354/354 [==============================] - 0s 167us/sample - loss: 12.7370 - mae: 2.3913 - mse: 12.7370 - val_loss: 12.6409 - val_mae: 2.6264 - val_mse: 12.6409
Epoch 19/100
354/354 [==============================] - 0s 175us/sample - loss: 12.3546 - mae: 2.3600 - mse: 12.3546 - val_loss: 12.5174 - val_mae: 2.7141 - val_mse: 12.5174
Epoch 20/100
354/354 [==============================] - 0s 166us/sample - loss: 12.1547 - mae: 2.3828 - mse: 12.1547 - val_loss: 12.1408 - val_mae: 2.6063 - val_mse: 12.1408
Epoch 21/100
354/354 [==============================] - 0s 179us/sample - loss: 11.8888 - mae: 2.3270 - mse: 11.8888 - val_loss: 11.9719 - val_mae: 2.5967 - val_mse: 11.9719
Epoch 22/100
354/354 [==============================] - 0s 189us/sample - loss: 11.6794 - mae: 2.3303 - mse: 11.6794 - val_loss: 11.8047 - val_mae: 2.5511 - val_mse: 11.8047
Epoch 23/100
354/354 [==============================] - 0s 170us/sample - loss: 11.3378 - mae: 2.3021 - mse: 11.3378 - val_loss: 12.4017 - val_mae: 2.6941 - val_mse: 12.4017
Epoch 24/100
354/354 [==============================] - 0s 186us/sample - loss: 10.9016 - mae: 2.3034 - mse: 10.9016 - val_loss: 12.3386 - val_mae: 2.5292 - val_mse: 12.3386
Epoch 25/100
354/354 [==============================] - 0s 202us/sample - loss: 10.7163 - mae: 2.3021 - mse: 10.7163 - val_loss: 12.2563 - val_mae: 2.5674 - val_mse: 12.2563
Epoch 26/100
354/354 [==============================] - 0s 192us/sample - loss: 10.8481 - mae: 2.2104 - mse: 10.8481 - val_loss: 11.2348 - val_mae: 2.4873 - val_mse: 11.2348
Epoch 27/100
354/354 [==============================] - 0s 192us/sample - loss: 10.7446 - mae: 2.2232 - mse: 10.7446 - val_loss: 11.4269 - val_mae: 2.5686 - val_mse: 11.4269
Epoch 28/100
354/354 [==============================] - 0s 187us/sample - loss: 10.1381 - mae: 2.1918 - mse: 10.1381 - val_loss: 13.4143 - val_mae: 2.6246 - val_mse: 13.4143
Epoch 29/100
354/354 [==============================] - 0s 176us/sample - loss: 10.5442 - mae: 2.1971 - mse: 10.5442 - val_loss: 11.4616 - val_mae: 2.4741 - val_mse: 11.4616
Epoch 30/100
354/354 [==============================] - 0s 218us/sample - loss: 10.2099 - mae: 2.1867 - mse: 10.2099 - val_loss: 11.4631 - val_mae: 2.4684 - val_mse: 11.4631
Epoch 31/100
354/354 [==============================] - 0s 202us/sample - loss: 9.5920 - mae: 2.1342 - mse: 9.5920 - val_loss: 12.5109 - val_mae: 2.6033 - val_mse: 12.5109
Epoch 32/100
354/354 [==============================] - 0s 179us/sample - loss: 9.9940 - mae: 2.1424 - mse: 9.9940 - val_loss: 11.1528 - val_mae: 2.4392 - val_mse: 11.1528
Epoch 33/100
354/354 [==============================] - 0s 197us/sample - loss: 9.5950 - mae: 2.1156 - mse: 9.5950 - val_loss: 12.0327 - val_mae: 2.6225 - val_mse: 12.0327
Epoch 34/100
354/354 [==============================] - 0s 228us/sample - loss: 9.6256 - mae: 2.0962 - mse: 9.6256 - val_loss: 10.8296 - val_mae: 2.4168 - val_mse: 10.8296
Epoch 35/100
354/354 [==============================] - 0s 179us/sample - loss: 9.3365 - mae: 2.1271 - mse: 9.3365 - val_loss: 10.7088 - val_mae: 2.5094 - val_mse: 10.7088
Epoch 36/100
354/354 [==============================] - 0s 184us/sample - loss: 9.2796 - mae: 2.0914 - mse: 9.2796 - val_loss: 10.7439 - val_mae: 2.4282 - val_mse: 10.7439
Epoch 37/100
354/354 [==============================] - 0s 186us/sample - loss: 8.7178 - mae: 2.0390 - mse: 8.7178 - val_loss: 13.1923 - val_mae: 2.5942 - val_mse: 13.1923
Epoch 38/100
354/354 [==============================] - 0s 202us/sample - loss: 8.8195 - mae: 2.0927 - mse: 8.8195 - val_loss: 10.9034 - val_mae: 2.5152 - val_mse: 10.9034
Epoch 39/100
354/354 [==============================] - 0s 190us/sample - loss: 8.9152 - mae: 2.0784 - mse: 8.9152 - val_loss: 11.3023 - val_mae: 2.4404 - val_mse: 11.3023
Epoch 40/100
354/354 [==============================] - 0s 196us/sample - loss: 8.8418 - mae: 2.0187 - mse: 8.8418 - val_loss: 10.7721 - val_mae: 2.5067 - val_mse: 10.7721
Epoch 41/100
354/354 [==============================] - 0s 181us/sample - loss: 8.6890 - mae: 2.0260 - mse: 8.6890 - val_loss: 11.0856 - val_mae: 2.5693 - val_mse: 11.0856
Epoch 42/100
354/354 [==============================] - 0s 174us/sample - loss: 8.4768 - mae: 2.0517 - mse: 8.4768 - val_loss: 11.3269 - val_mae: 2.4414 - val_mse: 11.3269
Epoch 43/100
354/354 [==============================] - 0s 171us/sample - loss: 8.5229 - mae: 1.9943 - mse: 8.5229 - val_loss: 10.4669 - val_mae: 2.4794 - val_mse: 10.4669
Epoch 44/100
354/354 [==============================] - 0s 172us/sample - loss: 8.0707 - mae: 1.9900 - mse: 8.0707 - val_loss: 11.6943 - val_mae: 2.5034 - val_mse: 11.6943
Epoch 45/100
354/354 [==============================] - 0s 172us/sample - loss: 8.1752 - mae: 1.9715 - mse: 8.1752 - val_loss: 10.6043 - val_mae: 2.3636 - val_mse: 10.6043
Epoch 46/100
354/354 [==============================] - 0s 174us/sample - loss: 8.2037 - mae: 1.9739 - mse: 8.2037 - val_loss: 10.5447 - val_mae: 2.3784 - val_mse: 10.5447
Epoch 47/100
354/354 [==============================] - 0s 173us/sample - loss: 7.9866 - mae: 1.9744 - mse: 7.9866 - val_loss: 10.6746 - val_mae: 2.4501 - val_mse: 10.6746
Epoch 48/100
354/354 [==============================] - 0s 165us/sample - loss: 7.7703 - mae: 1.9705 - mse: 7.7703 - val_loss: 10.4041 - val_mae: 2.4620 - val_mse: 10.4041
Epoch 49/100
354/354 [==============================] - 0s 182us/sample - loss: 7.8774 - mae: 1.9809 - mse: 7.8774 - val_loss: 10.6823 - val_mae: 2.4969 - val_mse: 10.6823
Epoch 50/100
354/354 [==============================] - 0s 167us/sample - loss: 7.8654 - mae: 1.9666 - mse: 7.8654 - val_loss: 10.6351 - val_mae: 2.4191 - val_mse: 10.6351
Epoch 51/100
354/354 [==============================] - 0s 180us/sample - loss: 7.6560 - mae: 1.9236 - mse: 7.6560 - val_loss: 10.3918 - val_mae: 2.3943 - val_mse: 10.3918
Epoch 52/100
354/354 [==============================] - 0s 170us/sample - loss: 7.3560 - mae: 1.8763 - mse: 7.3560 - val_loss: 10.3560 - val_mae: 2.5009 - val_mse: 10.3560
Epoch 53/100
354/354 [==============================] - 0s 163us/sample - loss: 7.5076 - mae: 1.8973 - mse: 7.5076 - val_loss: 10.5798 - val_mae: 2.4698 - val_mse: 10.5798
Epoch 54/100
354/354 [==============================] - 0s 164us/sample - loss: 7.4315 - mae: 1.8962 - mse: 7.4315 - val_loss: 10.0018 - val_mae: 2.3756 - val_mse: 10.0018
Epoch 55/100
354/354 [==============================] - 0s 170us/sample - loss: 7.2476 - mae: 1.9127 - mse: 7.2476 - val_loss: 10.0664 - val_mae: 2.4074 - val_mse: 10.0664
Epoch 56/100
354/354 [==============================] - 0s 168us/sample - loss: 7.1336 - mae: 1.8297 - mse: 7.1336 - val_loss: 10.5519 - val_mae: 2.4670 - val_mse: 10.5519
Epoch 57/100
354/354 [==============================] - 0s 177us/sample - loss: 7.0707 - mae: 1.8462 - mse: 7.0707 - val_loss: 11.4684 - val_mae: 2.7035 - val_mse: 11.4684
Epoch 58/100
354/354 [==============================] - 0s 173us/sample - loss: 6.9632 - mae: 1.8780 - mse: 6.9632 - val_loss: 10.6361 - val_mae: 2.4145 - val_mse: 10.6361
Epoch 59/100
354/354 [==============================] - 0s 208us/sample - loss: 7.1218 - mae: 1.8522 - mse: 7.1218 - val_loss: 10.3080 - val_mae: 2.3628 - val_mse: 10.3080
Epoch 60/100
354/354 [==============================] - 0s 261us/sample - loss: 6.7623 - mae: 1.7823 - mse: 6.7623 - val_loss: 10.3923 - val_mae: 2.3174 - val_mse: 10.3923
Epoch 61/100
354/354 [==============================] - 0s 166us/sample - loss: 6.9012 - mae: 1.8504 - mse: 6.9012 - val_loss: 10.1488 - val_mae: 2.3802 - val_mse: 10.1488
Epoch 62/100
354/354 [==============================] - 0s 171us/sample - loss: 6.6419 - mae: 1.8210 - mse: 6.6419 - val_loss: 10.7578 - val_mae: 2.5222 - val_mse: 10.7578
Epoch 63/100
354/354 [==============================] - 0s 181us/sample - loss: 6.5397 - mae: 1.8096 - mse: 6.5397 - val_loss: 10.5892 - val_mae: 2.5217 - val_mse: 10.5892
Epoch 64/100
354/354 [==============================] - 0s 171us/sample - loss: 6.4273 - mae: 1.7990 - mse: 6.4273 - val_loss: 10.7066 - val_mae: 2.4491 - val_mse: 10.7066
Epoch 65/100
354/354 [==============================] - 0s 164us/sample - loss: 6.2635 - mae: 1.7888 - mse: 6.2635 - val_loss: 10.2444 - val_mae: 2.4960 - val_mse: 10.2444
Epoch 66/100
354/354 [==============================] - 0s 173us/sample - loss: 6.3313 - mae: 1.7769 - mse: 6.3313 - val_loss: 10.1284 - val_mae: 2.3855 - val_mse: 10.1284
Epoch 67/100
354/354 [==============================] - 0s 169us/sample - loss: 6.2141 - mae: 1.7620 - mse: 6.2141 - val_loss: 10.3170 - val_mae: 2.4570 - val_mse: 10.3170
Epoch 68/100
354/354 [==============================] - 0s 183us/sample - loss: 6.1732 - mae: 1.7589 - mse: 6.1732 - val_loss: 9.7494 - val_mae: 2.3912 - val_mse: 9.7494
Epoch 69/100
354/354 [==============================] - 0s 173us/sample - loss: 6.1812 - mae: 1.7704 - mse: 6.1812 - val_loss: 10.7702 - val_mae: 2.3626 - val_mse: 10.7702
Epoch 70/100
354/354 [==============================] - 0s 171us/sample - loss: 6.1634 - mae: 1.8019 - mse: 6.1634 - val_loss: 9.6836 - val_mae: 2.3618 - val_mse: 9.6836
Epoch 71/100
354/354 [==============================] - 0s 169us/sample - loss: 6.0410 - mae: 1.7080 - mse: 6.0410 - val_loss: 9.8525 - val_mae: 2.3718 - val_mse: 9.8525
Epoch 72/100
354/354 [==============================] - 0s 166us/sample - loss: 5.7556 - mae: 1.7068 - mse: 5.7556 - val_loss: 11.4228 - val_mae: 2.4962 - val_mse: 11.4228
Epoch 73/100
354/354 [==============================] - 0s 176us/sample - loss: 5.8854 - mae: 1.7138 - mse: 5.8854 - val_loss: 9.8943 - val_mae: 2.4214 - val_mse: 9.8943
Epoch 74/100
354/354 [==============================] - 0s 177us/sample - loss: 5.6033 - mae: 1.6994 - mse: 5.6033 - val_loss: 10.2695 - val_mae: 2.3981 - val_mse: 10.2695
Epoch 75/100
354/354 [==============================] - 0s 173us/sample - loss: 5.7909 - mae: 1.6973 - mse: 5.7909 - val_loss: 10.0138 - val_mae: 2.3440 - val_mse: 10.0138
Epoch 76/100
354/354 [==============================] - 0s 171us/sample - loss: 5.4470 - mae: 1.6519 - mse: 5.4470 - val_loss: 9.7148 - val_mae: 2.4004 - val_mse: 9.7148
Epoch 77/100
354/354 [==============================] - 0s 176us/sample - loss: 5.6775 - mae: 1.6463 - mse: 5.6775 - val_loss: 10.6783 - val_mae: 2.3670 - val_mse: 10.6783
Epoch 78/100
354/354 [==============================] - 0s 172us/sample - loss: 5.4289 - mae: 1.7021 - mse: 5.4289 - val_loss: 10.2150 - val_mae: 2.3861 - val_mse: 10.2150
Epoch 79/100
354/354 [==============================] - 0s 166us/sample - loss: 5.4991 - mae: 1.6477 - mse: 5.4991 - val_loss: 9.6550 - val_mae: 2.3681 - val_mse: 9.6550
Epoch 80/100
354/354 [==============================] - 0s 176us/sample - loss: 5.3646 - mae: 1.6555 - mse: 5.3646 - val_loss: 11.0607 - val_mae: 2.4424 - val_mse: 11.0607
Epoch 81/100
354/354 [==============================] - 0s 174us/sample - loss: 5.3874 - mae: 1.6344 - mse: 5.3874 - val_loss: 11.2996 - val_mae: 2.6303 - val_mse: 11.2996
Epoch 82/100
354/354 [==============================] - 0s 167us/sample - loss: 5.3116 - mae: 1.6345 - mse: 5.3116 - val_loss: 10.2543 - val_mae: 2.3943 - val_mse: 10.2543
Epoch 83/100
354/354 [==============================] - 0s 166us/sample - loss: 5.1442 - mae: 1.6227 - mse: 5.1442 - val_loss: 10.5314 - val_mae: 2.3998 - val_mse: 10.5314
Epoch 84/100
354/354 [==============================] - 0s 171us/sample - loss: 5.2872 - mae: 1.6288 - mse: 5.2872 - val_loss: 9.8682 - val_mae: 2.3268 - val_mse: 9.8682
Epoch 85/100
354/354 [==============================] - 0s 170us/sample - loss: 5.1584 - mae: 1.6282 - mse: 5.1584 - val_loss: 10.2676 - val_mae: 2.4443 - val_mse: 10.2676
Epoch 86/100
354/354 [==============================] - 0s 173us/sample - loss: 5.0609 - mae: 1.6078 - mse: 5.0609 - val_loss: 10.0901 - val_mae: 2.4020 - val_mse: 10.0901
Epoch 87/100
354/354 [==============================] - 0s 163us/sample - loss: 5.1753 - mae: 1.6148 - mse: 5.1753 - val_loss: 10.7763 - val_mae: 2.3816 - val_mse: 10.7763
Epoch 88/100
354/354 [==============================] - 0s 169us/sample - loss: 5.0408 - mae: 1.6055 - mse: 5.0408 - val_loss: 10.1056 - val_mae: 2.3234 - val_mse: 10.1056
Epoch 89/100
354/354 [==============================] - 0s 173us/sample - loss: 5.0175 - mae: 1.6009 - mse: 5.0175 - val_loss: 9.6620 - val_mae: 2.3334 - val_mse: 9.6620
Epoch 90/100
354/354 [==============================] - 0s 173us/sample - loss: 4.7522 - mae: 1.5615 - mse: 4.7522 - val_loss: 9.8084 - val_mae: 2.3036 - val_mse: 9.8084
Epoch 91/100
354/354 [==============================] - 0s 169us/sample - loss: 4.8323 - mae: 1.5873 - mse: 4.8323 - val_loss: 10.7285 - val_mae: 2.4886 - val_mse: 10.7285
Epoch 92/100
354/354 [==============================] - 0s 165us/sample - loss: 4.8179 - mae: 1.5678 - mse: 4.8179 - val_loss: 10.1033 - val_mae: 2.3372 - val_mse: 10.1033
Epoch 93/100
354/354 [==============================] - 0s 168us/sample - loss: 4.7970 - mae: 1.5422 - mse: 4.7970 - val_loss: 9.8511 - val_mae: 2.3521 - val_mse: 9.8511
Epoch 94/100
354/354 [==============================] - 0s 180us/sample - loss: 4.7676 - mae: 1.5674 - mse: 4.7676 - val_loss: 10.1749 - val_mae: 2.4087 - val_mse: 10.1749
Epoch 95/100
354/354 [==============================] - 0s 170us/sample - loss: 4.7223 - mae: 1.5431 - mse: 4.7222 - val_loss: 10.2481 - val_mae: 2.3268 - val_mse: 10.2481
Epoch 96/100
354/354 [==============================] - 0s 164us/sample - loss: 4.6685 - mae: 1.5333 - mse: 4.6685 - val_loss: 10.7347 - val_mae: 2.5154 - val_mse: 10.7347
Epoch 97/100
354/354 [==============================] - 0s 177us/sample - loss: 4.5642 - mae: 1.5675 - mse: 4.5642 - val_loss: 11.3132 - val_mae: 2.4601 - val_mse: 11.3132
Epoch 98/100
354/354 [==============================] - 0s 177us/sample - loss: 4.3886 - mae: 1.4906 - mse: 4.3886 - val_loss: 12.2466 - val_mae: 2.7436 - val_mse: 12.2466
Epoch 99/100
354/354 [==============================] - 0s 177us/sample - loss: 4.4689 - mae: 1.5368 - mse: 4.4689 - val_loss: 10.4188 - val_mae: 2.3596 - val_mse: 10.4188
Epoch 100/100
354/354 [==============================] - 0s 168us/sample - loss: 4.6496 - mae: 1.5348 - mse: 4.6496 - val_loss: 10.0829 - val_mae: 2.3822 - val_mse: 10.0829
%% Cell type:markdown id: tags:
## Step 6 - Evaluate
### 6.1 - Model evaluation
MAE = Mean Absolute Error (between the labels and the predictions)
An MAE of 3 represents an average prediction error of $3k (prices are expressed in thousands of dollars).
%% Cell type:code id: tags:
``` python
score = model.evaluate(x_test, y_test, verbose=0)
print('x_test / loss : {:5.4f}'.format(score[0]))
print('x_test / mae : {:5.4f}'.format(score[1]))
print('x_test / mse : {:5.4f}'.format(score[2]))
```
%% Output
x_test / loss : 10.0829
x_test / mae : 2.3822
x_test / mse : 10.0829
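%% Cell type:markdown id: tags:
As an optional cross-check, the MAE and MSE returned by `evaluate()` can be recomputed by hand from the raw predictions; a minimal sketch :
%% Cell type:code id: tags:
``` python
# ---- Optional cross-check : recompute MAE and MSE from the raw predictions
y_pred = model.predict(x_test).reshape(-1)          # shape (152,)
print('MAE : {:5.4f}'.format( np.mean(np.abs(y_test - y_pred)) ))
print('MSE : {:5.4f}'.format( np.mean((y_test - y_pred)**2)    ))
```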
%% Cell type:markdown id: tags:
### 6.2 - Training history
What was the best result during our training?
%% Cell type:code id: tags:
``` python
df=pd.DataFrame(data=history.history)
df.describe()
```
%% Output
loss mae mse val_loss val_mae val_mse
count 100.000000 100.000000 100.000000 100.000000 100.000000 100.000000
mean 19.466892 2.462477 19.466893 18.670107 2.852570 18.670107
std 64.483863 2.592690 64.483872 48.257937 2.039701 48.257935
min 4.388600 1.490624 4.388600 9.655048 2.303586 9.655047
25% 5.658976 1.698877 5.658976 10.269067 2.393491 10.269066
50% 7.713175 1.945081 7.713175 10.750849 2.469115 10.750849
75% 10.770471 2.242925 10.770470 12.249026 2.610316 12.249027
max 536.084498 21.333506 536.084595 439.656211 19.319771 439.656189
%% Cell type:code id: tags:
``` python
print("min( val_mae ) : {:.4f}".format( min(history.history["val_mae"]) ) )
```
%% Output
min( val_mae ) : 2.3036
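%% Cell type:markdown id: tags:
To also find out at which epoch this minimum was reached, `np.argmin` can be applied to the same history; a minimal sketch :
%% Cell type:code id: tags:
``` python
# ---- Epoch (1-based) at which val_mae was lowest
best_epoch = int(np.argmin(history.history['val_mae'])) + 1
print('Best val_mae at epoch {:d}'.format(best_epoch))
```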
%% Cell type:code id: tags:
``` python
ooo.plot_history(history, plot={'MSE' :['mse', 'val_mse'],
'MAE' :['mae', 'val_mae'],
'LOSS':['loss','val_loss']})
```
%% Output
%% Cell type:markdown id: tags:
## Step 7 - Make a prediction
%% Cell type:code id: tags:
``` python
my_data = [ 1.26425925, -0.48522739, 1.0436489 , -0.23112788, 1.37120745,
-2.14308942, 1.13489104, -1.06802005, 1.71189006, 1.57042287,
0.77859951, 0.14769795, 2.7585581 ]
real_price = 10.4
my_data=np.array(my_data).reshape(1,13)
```
%% Cell type:code id: tags:
``` python
predictions = model.predict( my_data )
print("Prédiction : {:.2f} K$".format(predictions[0][0]))
print("Reality : {:.2f} K$".format(real_price))
```
%% Output
Prediction : 9.70 K$
Reality : 10.40 K$
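%% Cell type:markdown id: tags:
Note that `my_data` above is already expressed in normalized units. To predict from a raw feature vector, it must first be normalized with the `mean` and `std` computed on the training set in step 3.2; a minimal sketch with a purely illustrative raw sample :
%% Cell type:code id: tags:
``` python
# ---- Predicting from a raw (un-normalized) sample :
#      normalize it with the training-set mean and std from step 3.2
#      (the 13 raw values below are purely illustrative)
raw_sample = [ 0.02, 0.0, 7.0, 0.0, 0.47, 6.4, 78.0, 4.9, 2.0, 242.0, 17.8, 396.0, 9.1 ]
raw_sample = (np.array(raw_sample) - mean.values) / std.values
prediction = model.predict( raw_sample.reshape(1,13) )
print("Prediction : {:.2f} K$".format(prediction[0][0]))
```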
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [BHP2] - Regression with a Dense Network (DNN) - Advanced code
<!-- DESC --> A more advanced example of DNN code - BHPD dataset
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Predict **housing prices** from a set of house features.
- Understand the principle and the architecture of a regression with a dense neural network, with saving and restoring of the trained model.
The **[Boston Housing Dataset](https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html)** consists of the prices of houses in various places in Boston.
Alongside the price, the dataset also provides information such as the crime rate, the proportion of non-retail business acres in the town,
the age of the houses and many other attributes...
## What we're going to do :
- (Retrieve data)
- (Prepare the data)
- (Build a model)
- Train and save the model
- Restore the saved model
- Evaluate the model
- Make some predictions
%% Cell type:markdown id: tags:
## Step 1 - Import and init
%% Cell type:code id: tags:
``` python
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os,sys
from IPython.display import display, Markdown
from importlib import reload
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
os.makedirs('./run/models', mode=0o750, exist_ok=True)
```
%% Output
FIDLE 2020 - Practical Work Module
Version : 0.2.9
Run time : Wednesday 19 February 2020, 10:13:01
TensorFlow version : 2.0.0
Keras version : 2.2.4-tf
%% Cell type:markdown id: tags:
## Step 2 - Retrieve data
### 2.1 - Option 1 : From Keras
Boston housing is a famous historic dataset, so we can get it directly from [Keras datasets](https://www.tensorflow.org/api_docs/python/tf/keras/datasets)
%% Cell type:raw id: tags:
(x_train, y_train), (x_test, y_test) = keras.datasets.boston_housing.load_data(test_split=0.2, seed=113)
%% Cell type:markdown id: tags:
### 2.2 - Option 2 : From a csv file
More fun!
%% Cell type:code id: tags:
``` python
data = pd.read_csv('./data/BostonHousing.csv', header=0)
display(data.head(5).style.format("{0:.2f}"))
print('Missing data : ',data.isna().sum().sum(), ' Shape is : ', data.shape)
```
%% Output
Missing data : 0 Shape is : (506, 14)
%% Cell type:markdown id: tags:
## Step 3 - Preparing the data
### 3.1 - Split data
We will use 70% of the data for training and 30% for validation.
x will be the input data and y the expected output.
%% Cell type:code id: tags:
``` python
# ---- Split => train, test
#
data_train = data.sample(frac=0.7, axis=0)
data_test = data.drop(data_train.index)
# ---- Split => x,y (medv is price)
#
x_train = data_train.drop('medv', axis=1)
y_train = data_train['medv']
x_test = data_test.drop('medv', axis=1)
y_test = data_test['medv']
print('Original data shape was : ',data.shape)
print('x_train : ',x_train.shape, 'y_train : ',y_train.shape)
print('x_test : ',x_test.shape, 'y_test : ',y_test.shape)
```
%% Output
Original data shape was : (506, 14)
x_train : (354, 13) y_train : (354,)
x_test : (152, 13) y_test : (152,)
%% Cell type:markdown id: tags:
### 3.2 - Data normalization
**Note :**
- All input data must be normalized, train and test.
- To do this we will subtract the mean and divide by the standard deviation.
- But test data should not be used in any way, even for normalization.
- The mean and the standard deviation will therefore only be calculated with the train data.
%% Cell type:code id: tags:
``` python
display(x_train.describe().style.format("{0:.2f}").set_caption("Before normalization :"))
mean = x_train.mean()
std = x_train.std()
x_train = (x_train - mean) / std
x_test = (x_test - mean) / std
display(x_train.describe().style.format("{0:.2f}").set_caption("After normalization :"))
x_train, y_train = np.array(x_train), np.array(y_train)
x_test, y_test = np.array(x_test), np.array(y_test)
```
%% Output
%% Cell type:markdown id: tags:
## Step 4 - Build a model
More information about :
- [Optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)
- [Activation](https://www.tensorflow.org/api_docs/python/tf/keras/activations)
- [Loss](https://www.tensorflow.org/api_docs/python/tf/keras/losses)
- [Metrics](https://www.tensorflow.org/api_docs/python/tf/keras/metrics)
%% Cell type:code id: tags:
``` python
def get_model_v1(shape):
    model = keras.models.Sequential()
    model.add(keras.layers.Input(shape, name="InputLayer"))
    model.add(keras.layers.Dense(64, activation='relu', name='Dense_n1'))
    model.add(keras.layers.Dense(64, activation='relu', name='Dense_n2'))
    model.add(keras.layers.Dense(1, name='Output'))
    model.compile(optimizer = 'rmsprop',
                  loss      = 'mse',
                  metrics   = ['mae', 'mse'] )
    return model
```
%% Cell type:markdown id: tags:
## Step 5 - Train the model
### 5.1 - Get it
%% Cell type:code id: tags:
``` python
model=get_model_v1( (13,) )
model.summary()
keras.utils.plot_model( model, to_file='./run/model.png', show_shapes=True, show_layer_names=True, dpi=96)
```
%% Output
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Dense_n1 (Dense) (None, 64) 896
_________________________________________________________________
Dense_n2 (Dense) (None, 64) 4160
_________________________________________________________________
Output (Dense) (None, 1) 65
=================================================================
Total params: 5,121
Trainable params: 5,121
Non-trainable params: 0
_________________________________________________________________
<IPython.core.display.Image object>
%% Cell type:markdown id: tags:
### 5.2 - Add callback
%% Cell type:code id: tags:
``` python
os.makedirs('./run/models', mode=0o750, exist_ok=True)
save_dir = "./run/models/best_model.h5"
savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, save_best_only=True)
```
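%% Cell type:markdown id: tags:
Other callbacks can be combined with the checkpoint; for instance, an early-stopping callback could halt training once `val_mae` stops improving. A minimal sketch, not passed to `model.fit` below :
%% Cell type:code id: tags:
``` python
# ---- Optional : stop training early when val_mae no longer improves
#      (illustrative sketch ; simply add it to the callbacks list of model.fit)
earlystop_callback = tf.keras.callbacks.EarlyStopping(monitor='val_mae',
                                                      patience=10,
                                                      restore_best_weights=True)
```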
%% Cell type:markdown id: tags:
### 5.3 - Train it
%% Cell type:code id: tags:
``` python
history = model.fit(x_train,
                    y_train,
                    epochs          = 100,
                    batch_size      = 10,
                    verbose         = 1,
                    validation_data = (x_test, y_test),
                    callbacks       = [savemodel_callback])
```
%% Output
Train on 354 samples, validate on 152 samples
Epoch 1/100
354/354 [==============================] - 1s 2ms/sample - loss: 483.2389 - mae: 20.0008 - mse: 483.2388 - val_loss: 421.2562 - val_mae: 18.2848 - val_mse: 421.2561
Epoch 2/100
354/354 [==============================] - 0s 232us/sample - loss: 270.7346 - mae: 13.9655 - mse: 270.7346 - val_loss: 187.7437 - val_mae: 10.9696 - val_mse: 187.7437
Epoch 3/100
354/354 [==============================] - 0s 223us/sample - loss: 108.5703 - mae: 7.5648 - mse: 108.5703 - val_loss: 70.6387 - val_mae: 6.1047 - val_mse: 70.6387
Epoch 4/100
354/354 [==============================] - 0s 234us/sample - loss: 53.8803 - mae: 5.1135 - mse: 53.8803 - val_loss: 40.0765 - val_mae: 4.6628 - val_mse: 40.0765
Epoch 5/100
354/354 [==============================] - 0s 223us/sample - loss: 34.2413 - mae: 4.1321 - mse: 34.2413 - val_loss: 29.1298 - val_mae: 3.8690 - val_mse: 29.1298
Epoch 6/100
354/354 [==============================] - 0s 226us/sample - loss: 24.9834 - mae: 3.4851 - mse: 24.9834 - val_loss: 22.8731 - val_mae: 3.3820 - val_mse: 22.8731
Epoch 7/100
354/354 [==============================] - 0s 228us/sample - loss: 21.2207 - mae: 3.2139 - mse: 21.2207 - val_loss: 20.9766 - val_mae: 3.2391 - val_mse: 20.9766
Epoch 8/100
354/354 [==============================] - 0s 223us/sample - loss: 19.2641 - mae: 3.0025 - mse: 19.2641 - val_loss: 19.5046 - val_mae: 3.0795 - val_mse: 19.5046
Epoch 9/100
354/354 [==============================] - 0s 220us/sample - loss: 17.8432 - mae: 2.8878 - mse: 17.8432 - val_loss: 18.3068 - val_mae: 3.0834 - val_mse: 18.3068
Epoch 10/100
354/354 [==============================] - 0s 224us/sample - loss: 16.7673 - mae: 2.7365 - mse: 16.7673 - val_loss: 17.4260 - val_mae: 3.0035 - val_mse: 17.4260
Epoch 11/100
354/354 [==============================] - 0s 225us/sample - loss: 15.6927 - mae: 2.6873 - mse: 15.6927 - val_loss: 17.3096 - val_mae: 3.0492 - val_mse: 17.3096
Epoch 12/100
354/354 [==============================] - 0s 222us/sample - loss: 15.4113 - mae: 2.6274 - mse: 15.4113 - val_loss: 15.7095 - val_mae: 2.8104 - val_mse: 15.7095
Epoch 13/100
354/354 [==============================] - 0s 220us/sample - loss: 14.7243 - mae: 2.5393 - mse: 14.7243 - val_loss: 15.6497 - val_mae: 2.9052 - val_mse: 15.6497
Epoch 14/100
354/354 [==============================] - 0s 228us/sample - loss: 14.2611 - mae: 2.5371 - mse: 14.2611 - val_loss: 14.9650 - val_mae: 2.8165 - val_mse: 14.9650
Epoch 15/100
354/354 [==============================] - 0s 222us/sample - loss: 14.0530 - mae: 2.5289 - mse: 14.0530 - val_loss: 14.8840 - val_mae: 2.8196 - val_mse: 14.8840
Epoch 16/100
354/354 [==============================] - 0s 224us/sample - loss: 13.3820 - mae: 2.4568 - mse: 13.3820 - val_loss: 13.7568 - val_mae: 2.6754 - val_mse: 13.7568
Epoch 17/100
354/354 [==============================] - 0s 218us/sample - loss: 13.2232 - mae: 2.4318 - mse: 13.2232 - val_loss: 13.6934 - val_mae: 2.6355 - val_mse: 13.6934
Epoch 18/100
354/354 [==============================] - 0s 183us/sample - loss: 12.8038 - mae: 2.3743 - mse: 12.8038 - val_loss: 13.7276 - val_mae: 2.6466 - val_mse: 13.7276
Epoch 19/100
354/354 [==============================] - 0s 223us/sample - loss: 12.4826 - mae: 2.3804 - mse: 12.4826 - val_loss: 13.0037 - val_mae: 2.5279 - val_mse: 13.0037
Epoch 20/100
354/354 [==============================] - 0s 222us/sample - loss: 12.2345 - mae: 2.3264 - mse: 12.2345 - val_loss: 12.8911 - val_mae: 2.5583 - val_mse: 12.8911
Epoch 21/100
354/354 [==============================] - 0s 231us/sample - loss: 12.0720 - mae: 2.3410 - mse: 12.0720 - val_loss: 12.5983 - val_mae: 2.5747 - val_mse: 12.5983
Epoch 22/100
354/354 [==============================] - 0s 224us/sample - loss: 11.7805 - mae: 2.2897 - mse: 11.7805 - val_loss: 12.1645 - val_mae: 2.5094 - val_mse: 12.1645
Epoch 23/100
354/354 [==============================] - 0s 174us/sample - loss: 11.4012 - mae: 2.2581 - mse: 11.4012 - val_loss: 13.6673 - val_mae: 2.7201 - val_mse: 13.6673
Epoch 24/100
354/354 [==============================] - 0s 227us/sample - loss: 11.2741 - mae: 2.2712 - mse: 11.2741 - val_loss: 11.6918 - val_mae: 2.4039 - val_mse: 11.6918
Epoch 25/100
354/354 [==============================] - 0s 179us/sample - loss: 11.2056 - mae: 2.2226 - mse: 11.2056 - val_loss: 12.3935 - val_mae: 2.6021 - val_mse: 12.3935
Epoch 26/100
354/354 [==============================] - 0s 173us/sample - loss: 10.8629 - mae: 2.2289 - mse: 10.8629 - val_loss: 11.9155 - val_mae: 2.3744 - val_mse: 11.9155
Epoch 27/100
354/354 [==============================] - 0s 218us/sample - loss: 11.0500 - mae: 2.2151 - mse: 11.0500 - val_loss: 11.2193 - val_mae: 2.3695 - val_mse: 11.2193
Epoch 28/100
354/354 [==============================] - 0s 180us/sample - loss: 10.4915 - mae: 2.1578 - mse: 10.4915 - val_loss: 11.9919 - val_mae: 2.5344 - val_mse: 11.9919
Epoch 29/100
354/354 [==============================] - 0s 182us/sample - loss: 10.5519 - mae: 2.1307 - mse: 10.5519 - val_loss: 11.3573 - val_mae: 2.4664 - val_mse: 11.3573
Epoch 30/100
354/354 [==============================] - 0s 170us/sample - loss: 10.0504 - mae: 2.1281 - mse: 10.0504 - val_loss: 11.7304 - val_mae: 2.5102 - val_mse: 11.7304
Epoch 31/100
354/354 [==============================] - 0s 216us/sample - loss: 9.8992 - mae: 2.1397 - mse: 9.8992 - val_loss: 10.9137 - val_mae: 2.3602 - val_mse: 10.9137
Epoch 32/100
354/354 [==============================] - 0s 175us/sample - loss: 9.9473 - mae: 2.0665 - mse: 9.9473 - val_loss: 11.1929 - val_mae: 2.4503 - val_mse: 11.1929
Epoch 33/100
354/354 [==============================] - 0s 168us/sample - loss: 9.6057 - mae: 2.0609 - mse: 9.6057 - val_loss: 11.5105 - val_mae: 2.4419 - val_mse: 11.5105
Epoch 34/100
354/354 [==============================] - 0s 178us/sample - loss: 9.6783 - mae: 2.0484 - mse: 9.6783 - val_loss: 11.0130 - val_mae: 2.4072 - val_mse: 11.0130
Epoch 35/100
354/354 [==============================] - 0s 211us/sample - loss: 9.3834 - mae: 2.0337 - mse: 9.3834 - val_loss: 10.8769 - val_mae: 2.3960 - val_mse: 10.8769
Epoch 36/100
354/354 [==============================] - 0s 222us/sample - loss: 9.4563 - mae: 2.0349 - mse: 9.4563 - val_loss: 10.7918 - val_mae: 2.4397 - val_mse: 10.7918
Epoch 37/100
354/354 [==============================] - 0s 223us/sample - loss: 9.4023 - mae: 2.0246 - mse: 9.4023 - val_loss: 10.4927 - val_mae: 2.3926 - val_mse: 10.4927
Epoch 38/100
354/354 [==============================] - 0s 175us/sample - loss: 8.9702 - mae: 2.0006 - mse: 8.9702 - val_loss: 10.9715 - val_mae: 2.4245 - val_mse: 10.9715
Epoch 39/100
354/354 [==============================] - 0s 174us/sample - loss: 9.0225 - mae: 2.0207 - mse: 9.0225 - val_loss: 10.9499 - val_mae: 2.4785 - val_mse: 10.9499
Epoch 40/100
354/354 [==============================] - 0s 177us/sample - loss: 8.8586 - mae: 1.9994 - mse: 8.8586 - val_loss: 10.5540 - val_mae: 2.3401 - val_mse: 10.5540
Epoch 41/100
354/354 [==============================] - 0s 214us/sample - loss: 8.7666 - mae: 1.9705 - mse: 8.7666 - val_loss: 10.3300 - val_mae: 2.3298 - val_mse: 10.3300
Epoch 42/100
354/354 [==============================] - 0s 177us/sample - loss: 8.4090 - mae: 1.9556 - mse: 8.4090 - val_loss: 11.9413 - val_mae: 2.5568 - val_mse: 11.9413
Epoch 43/100
354/354 [==============================] - 0s 216us/sample - loss: 8.4974 - mae: 1.9809 - mse: 8.4974 - val_loss: 10.2694 - val_mae: 2.2804 - val_mse: 10.2694
Epoch 44/100
354/354 [==============================] - 0s 179us/sample - loss: 8.4512 - mae: 1.9371 - mse: 8.4512 - val_loss: 10.6134 - val_mae: 2.3782 - val_mse: 10.6134
Epoch 45/100
354/354 [==============================] - 0s 168us/sample - loss: 8.3356 - mae: 1.9116 - mse: 8.3356 - val_loss: 10.5007 - val_mae: 2.3672 - val_mse: 10.5007
Epoch 46/100
354/354 [==============================] - 0s 220us/sample - loss: 8.0746 - mae: 1.9163 - mse: 8.0746 - val_loss: 9.9081 - val_mae: 2.1968 - val_mse: 9.9081
Epoch 47/100
354/354 [==============================] - 0s 183us/sample - loss: 8.2374 - mae: 1.9080 - mse: 8.2374 - val_loss: 10.2771 - val_mae: 2.3529 - val_mse: 10.2771
Epoch 48/100
354/354 [==============================] - 0s 216us/sample - loss: 8.0765 - mae: 1.9000 - mse: 8.0765 - val_loss: 9.7120 - val_mae: 2.1879 - val_mse: 9.7120
Epoch 49/100
354/354 [==============================] - 0s 163us/sample - loss: 7.7848 - mae: 1.8825 - mse: 7.7848 - val_loss: 10.2084 - val_mae: 2.2360 - val_mse: 10.2084
Epoch 50/100
354/354 [==============================] - 0s 178us/sample - loss: 7.5973 - mae: 1.8669 - mse: 7.5973 - val_loss: 10.1582 - val_mae: 2.2808 - val_mse: 10.1582
Epoch 51/100
354/354 [==============================] - 0s 168us/sample - loss: 7.8596 - mae: 1.9102 - mse: 7.8596 - val_loss: 9.9785 - val_mae: 2.3041 - val_mse: 9.9785
Epoch 52/100
354/354 [==============================] - 0s 172us/sample - loss: 7.5027 - mae: 1.8527 - mse: 7.5027 - val_loss: 10.2315 - val_mae: 2.3614 - val_mse: 10.2315
Epoch 53/100
354/354 [==============================] - 0s 174us/sample - loss: 7.3160 - mae: 1.8556 - mse: 7.3160 - val_loss: 10.7149 - val_mae: 2.4225 - val_mse: 10.7149
Epoch 54/100
354/354 [==============================] - 0s 178us/sample - loss: 7.4478 - mae: 1.8692 - mse: 7.4478 - val_loss: 13.1244 - val_mae: 2.7923 - val_mse: 13.1244
Epoch 55/100
354/354 [==============================] - 0s 222us/sample - loss: 7.2579 - mae: 1.8375 - mse: 7.2579 - val_loss: 9.4053 - val_mae: 2.1927 - val_mse: 9.4053
Epoch 56/100
354/354 [==============================] - 0s 178us/sample - loss: 7.3045 - mae: 1.8785 - mse: 7.3045 - val_loss: 10.3231 - val_mae: 2.4311 - val_mse: 10.3231
Epoch 57/100
354/354 [==============================] - 0s 168us/sample - loss: 6.8708 - mae: 1.8047 - mse: 6.8708 - val_loss: 11.3678 - val_mae: 2.6010 - val_mse: 11.3678
Epoch 58/100
354/354 [==============================] - 0s 180us/sample - loss: 6.9471 - mae: 1.8179 - mse: 6.9471 - val_loss: 10.2855 - val_mae: 2.3937 - val_mse: 10.2855
Epoch 59/100
354/354 [==============================] - 0s 217us/sample - loss: 6.8858 - mae: 1.7987 - mse: 6.8858 - val_loss: 9.1795 - val_mae: 2.1552 - val_mse: 9.1795
Epoch 60/100
354/354 [==============================] - 0s 179us/sample - loss: 6.8982 - mae: 1.7783 - mse: 6.8982 - val_loss: 10.0291 - val_mae: 2.3000 - val_mse: 10.0291
Epoch 61/100
354/354 [==============================] - 0s 168us/sample - loss: 6.8502 - mae: 1.7688 - mse: 6.8502 - val_loss: 9.5141 - val_mae: 2.2370 - val_mse: 9.5141
Epoch 62/100
354/354 [==============================] - 0s 173us/sample - loss: 6.6801 - mae: 1.7737 - mse: 6.6801 - val_loss: 9.6853 - val_mae: 2.2719 - val_mse: 9.6853
Epoch 63/100
354/354 [==============================] - 0s 178us/sample - loss: 6.5468 - mae: 1.7479 - mse: 6.5468 - val_loss: 9.5858 - val_mae: 2.2346 - val_mse: 9.5858
Epoch 64/100
354/354 [==============================] - 0s 172us/sample - loss: 6.3406 - mae: 1.6985 - mse: 6.3406 - val_loss: 9.8893 - val_mae: 2.2439 - val_mse: 9.8893
Epoch 65/100
354/354 [==============================] - 0s 177us/sample - loss: 6.4070 - mae: 1.7780 - mse: 6.4071 - val_loss: 10.4085 - val_mae: 2.3908 - val_mse: 10.4085
Epoch 66/100
354/354 [==============================] - 0s 170us/sample - loss: 6.4227 - mae: 1.7042 - mse: 6.4227 - val_loss: 9.5313 - val_mae: 2.1998 - val_mse: 9.5313
Epoch 67/100
354/354 [==============================] - 0s 178us/sample - loss: 6.3353 - mae: 1.7095 - mse: 6.3353 - val_loss: 9.9436 - val_mae: 2.2965 - val_mse: 9.9436
Epoch 68/100
354/354 [==============================] - 0s 173us/sample - loss: 5.8545 - mae: 1.6760 - mse: 5.8545 - val_loss: 9.9311 - val_mae: 2.2837 - val_mse: 9.9311
Epoch 69/100
354/354 [==============================] - 0s 171us/sample - loss: 6.1148 - mae: 1.7286 - mse: 6.1148 - val_loss: 9.6456 - val_mae: 2.1932 - val_mse: 9.6456
Epoch 70/100
354/354 [==============================] - 0s 179us/sample - loss: 6.0462 - mae: 1.7194 - mse: 6.0462 - val_loss: 10.7485 - val_mae: 2.3224 - val_mse: 10.7485
Epoch 71/100
354/354 [==============================] - 0s 171us/sample - loss: 5.8132 - mae: 1.7049 - mse: 5.8132 - val_loss: 9.8704 - val_mae: 2.1916 - val_mse: 9.8704
Epoch 72/100
354/354 [==============================] - 0s 174us/sample - loss: 5.7957 - mae: 1.6492 - mse: 5.7957 - val_loss: 10.0593 - val_mae: 2.3159 - val_mse: 10.0593
Epoch 73/100
354/354 [==============================] - 0s 178us/sample - loss: 5.9002 - mae: 1.6952 - mse: 5.9002 - val_loss: 10.1425 - val_mae: 2.3594 - val_mse: 10.1425
Epoch 74/100
354/354 [==============================] - 0s 174us/sample - loss: 5.5721 - mae: 1.6277 - mse: 5.5721 - val_loss: 9.9564 - val_mae: 2.2284 - val_mse: 9.9564
Epoch 75/100
354/354 [==============================] - 0s 177us/sample - loss: 5.6730 - mae: 1.6669 - mse: 5.6730 - val_loss: 10.0358 - val_mae: 2.2259 - val_mse: 10.0358
Epoch 76/100
354/354 [==============================] - 0s 168us/sample - loss: 5.5947 - mae: 1.6216 - mse: 5.5947 - val_loss: 9.7815 - val_mae: 2.2282 - val_mse: 9.7815
Epoch 77/100
354/354 [==============================] - 0s 175us/sample - loss: 5.2870 - mae: 1.6492 - mse: 5.2870 - val_loss: 9.3813 - val_mae: 2.1987 - val_mse: 9.3813
Epoch 78/100
354/354 [==============================] - 0s 166us/sample - loss: 5.6015 - mae: 1.6183 - mse: 5.6015 - val_loss: 9.5577 - val_mae: 2.2139 - val_mse: 9.5577
Epoch 79/100
354/354 [==============================] - 0s 191us/sample - loss: 5.3793 - mae: 1.6202 - mse: 5.3793 - val_loss: 9.4099 - val_mae: 2.1957 - val_mse: 9.4099
Epoch 80/100
354/354 [==============================] - 0s 172us/sample - loss: 5.4258 - mae: 1.5943 - mse: 5.4258 - val_loss: 9.7489 - val_mae: 2.2233 - val_mse: 9.7489
Epoch 81/100
354/354 [==============================] - 0s 181us/sample - loss: 5.3006 - mae: 1.5934 - mse: 5.3006 - val_loss: 10.0298 - val_mae: 2.2258 - val_mse: 10.0298
Epoch 82/100
354/354 [==============================] - 0s 177us/sample - loss: 5.2590 - mae: 1.5854 - mse: 5.2590 - val_loss: 9.9642 - val_mae: 2.2718 - val_mse: 9.9642
Epoch 83/100
354/354 [==============================] - 0s 178us/sample - loss: 5.1325 - mae: 1.5765 - mse: 5.1325 - val_loss: 10.0795 - val_mae: 2.2524 - val_mse: 10.0795
Epoch 84/100
354/354 [==============================] - 0s 174us/sample - loss: 5.0736 - mae: 1.5846 - mse: 5.0736 - val_loss: 10.1607 - val_mae: 2.3146 - val_mse: 10.1607
Epoch 85/100
354/354 [==============================] - 0s 168us/sample - loss: 5.0863 - mae: 1.5598 - mse: 5.0863 - val_loss: 10.0663 - val_mae: 2.2961 - val_mse: 10.0663
Epoch 86/100
354/354 [==============================] - 0s 175us/sample - loss: 5.0422 - mae: 1.5758 - mse: 5.0422 - val_loss: 9.3842 - val_mae: 2.2033 - val_mse: 9.3842
Epoch 87/100
354/354 [==============================] - 0s 179us/sample - loss: 4.8308 - mae: 1.5587 - mse: 4.8308 - val_loss: 9.4605 - val_mae: 2.1797 - val_mse: 9.4605
Epoch 88/100
354/354 [==============================] - 0s 172us/sample - loss: 4.7424 - mae: 1.5468 - mse: 4.7424 - val_loss: 12.0587 - val_mae: 2.6306 - val_mse: 12.0587
Epoch 89/100
354/354 [==============================] - 0s 172us/sample - loss: 4.9329 - mae: 1.5937 - mse: 4.9329 - val_loss: 9.9514 - val_mae: 2.2366 - val_mse: 9.9514
Epoch 90/100
354/354 [==============================] - 0s 176us/sample - loss: 4.7181 - mae: 1.5625 - mse: 4.7181 - val_loss: 9.6245 - val_mae: 2.1626 - val_mse: 9.6245
Epoch 91/100
354/354 [==============================] - 0s 182us/sample - loss: 4.6726 - mae: 1.5040 - mse: 4.6726 - val_loss: 9.9543 - val_mae: 2.2394 - val_mse: 9.9543
Epoch 92/100
354/354 [==============================] - 0s 180us/sample - loss: 4.7058 - mae: 1.5416 - mse: 4.7058 - val_loss: 10.6368 - val_mae: 2.3900 - val_mse: 10.6368
Epoch 93/100
354/354 [==============================] - 0s 176us/sample - loss: 4.6515 - mae: 1.5235 - mse: 4.6515 - val_loss: 10.0118 - val_mae: 2.2661 - val_mse: 10.0118
Epoch 94/100
354/354 [==============================] - 0s 163us/sample - loss: 4.6973 - mae: 1.5262 - mse: 4.6973 - val_loss: 9.4214 - val_mae: 2.1961 - val_mse: 9.4214
Epoch 95/100
354/354 [==============================] - 0s 174us/sample - loss: 4.7056 - mae: 1.5392 - mse: 4.7056 - val_loss: 9.6110 - val_mae: 2.1998 - val_mse: 9.6110
Epoch 96/100
354/354 [==============================] - 0s 167us/sample - loss: 4.4156 - mae: 1.4496 - mse: 4.4156 - val_loss: 10.1083 - val_mae: 2.3143 - val_mse: 10.1083
Epoch 97/100
354/354 [==============================] - 0s 173us/sample - loss: 4.5201 - mae: 1.5019 - mse: 4.5201 - val_loss: 9.7179 - val_mae: 2.2635 - val_mse: 9.7179
Epoch 98/100
354/354 [==============================] - 0s 179us/sample - loss: 4.3824 - mae: 1.4403 - mse: 4.3824 - val_loss: 10.2802 - val_mae: 2.2846 - val_mse: 10.2802
Epoch 99/100
354/354 [==============================] - 0s 175us/sample - loss: 4.3252 - mae: 1.4806 - mse: 4.3252 - val_loss: 9.5943 - val_mae: 2.1745 - val_mse: 9.5943
Epoch 100/100
354/354 [==============================] - 0s 178us/sample - loss: 4.4134 - mae: 1.4451 - mse: 4.4134 - val_loss: 12.2396 - val_mae: 2.6152 - val_mse: 12.2396
%% Cell type:markdown id: tags:
## Step 6 - Evaluate
### 6.1 - Model evaluation
MAE = Mean Absolute Error (between the labels and the predictions)
An MAE of 3 represents an average prediction error of $3k (prices are expressed in thousands of dollars).
%% Cell type:code id: tags:
``` python
score = model.evaluate(x_test, y_test, verbose=0)
print('x_test / loss : {:5.4f}'.format(score[0]))
print('x_test / mae : {:5.4f}'.format(score[1]))
print('x_test / mse : {:5.4f}'.format(score[2]))
```
%% Output
x_test / loss : 12.2396
x_test / mae : 2.6152
x_test / mse : 12.2396
%% Cell type:markdown id: tags:
### 6.2 - Training history
What was the best result during our training?
%% Cell type:code id: tags:
``` python
print("min( val_mae ) : {:.4f}".format( min(history.history["val_mae"]) ) )
```
%% Output
min( val_mae ) : 2.1552
%% Cell type:code id: tags:
``` python
ooo.plot_history(history, plot={'MSE' :['mse', 'val_mse'],
'MAE' :['mae', 'val_mae'],
'LOSS':['loss','val_loss']})
```
%% Output
%% Cell type:markdown id: tags:
## Step 7 - Restore a model :
%% Cell type:markdown id: tags:
### 7.1 - Reload model
%% Cell type:code id: tags:
``` python
loaded_model = tf.keras.models.load_model('./run/models/best_model.h5')
loaded_model.summary()
print("Loaded.")
```
%% Output
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Dense_n1 (Dense) (None, 64) 896
_________________________________________________________________
Dense_n2 (Dense) (None, 64) 4160
_________________________________________________________________
Output (Dense) (None, 1) 65
=================================================================
Total params: 5,121
Trainable params: 5,121
Non-trainable params: 0
_________________________________________________________________
Loaded.
%% Cell type:markdown id: tags:
### 7.2 - Evaluate it :
%% Cell type:code id: tags:
``` python
score = loaded_model.evaluate(x_test, y_test, verbose=0)
print('x_test / loss : {:5.4f}'.format(score[0]))
print('x_test / mae : {:5.4f}'.format(score[1]))
print('x_test / mse : {:5.4f}'.format(score[2]))
```
%% Output
x_test / loss : 9.1795
x_test / mae : 2.1552
x_test / mse : 9.1795
%% Cell type:markdown id: tags:
### 7.3 - Make a prediction
%% Cell type:code id: tags:
``` python
mon_test=[-0.20113196, -0.48631663, 1.23572348, -0.26929877, 2.67879106,
-0.89623587, 1.09961251, -1.05826704, -0.55823117, -0.06159088,
-1.76085159, -1.97039608, 0.52775666]
mon_test=np.array(mon_test).reshape(1,13)
```
%% Cell type:code id: tags:
``` python
predictions = loaded_model.predict( mon_test )
print("Prédiction : {:.2f} K$ Reality : {:.2f} K$".format(predictions[0][0], y_train[13]))
```
%% Output
Prediction : 16.51 K$ Reality : 20.20 K$
%% Cell type:markdown id: tags:
---
![](../fidle/img/00-Fidle-logo-01_s.png)
"crim","zn","indus","chas","nox","rm","age","dis","rad","tax","ptratio","b","lstat","medv"
0.00632,18,2.31,"0",0.538,6.575,65.2,4.09,1,296,15.3,396.9,4.98,24
0.02731,0,7.07,"0",0.469,6.421,78.9,4.9671,2,242,17.8,396.9,9.14,21.6
0.02729,0,7.07,"0",0.469,7.185,61.1,4.9671,2,242,17.8,392.83,4.03,34.7
0.03237,0,2.18,"0",0.458,6.998,45.8,6.0622,3,222,18.7,394.63,2.94,33.4
0.06905,0,2.18,"0",0.458,7.147,54.2,6.0622,3,222,18.7,396.9,5.33,36.2
0.02985,0,2.18,"0",0.458,6.43,58.7,6.0622,3,222,18.7,394.12,5.21,28.7
0.08829,12.5,7.87,"0",0.524,6.012,66.6,5.5605,5,311,15.2,395.6,12.43,22.9
0.14455,12.5,7.87,"0",0.524,6.172,96.1,5.9505,5,311,15.2,396.9,19.15,27.1
0.21124,12.5,7.87,"0",0.524,5.631,100,6.0821,5,311,15.2,386.63,29.93,16.5
0.17004,12.5,7.87,"0",0.524,6.004,85.9,6.5921,5,311,15.2,386.71,17.1,18.9
0.22489,12.5,7.87,"0",0.524,6.377,94.3,6.3467,5,311,15.2,392.52,20.45,15
0.11747,12.5,7.87,"0",0.524,6.009,82.9,6.2267,5,311,15.2,396.9,13.27,18.9
0.09378,12.5,7.87,"0",0.524,5.889,39,5.4509,5,311,15.2,390.5,15.71,21.7
0.62976,0,8.14,"0",0.538,5.949,61.8,4.7075,4,307,21,396.9,8.26,20.4
0.63796,0,8.14,"0",0.538,6.096,84.5,4.4619,4,307,21,380.02,10.26,18.2
0.62739,0,8.14,"0",0.538,5.834,56.5,4.4986,4,307,21,395.62,8.47,19.9
1.05393,0,8.14,"0",0.538,5.935,29.3,4.4986,4,307,21,386.85,6.58,23.1
0.7842,0,8.14,"0",0.538,5.99,81.7,4.2579,4,307,21,386.75,14.67,17.5
0.80271,0,8.14,"0",0.538,5.456,36.6,3.7965,4,307,21,288.99,11.69,20.2
0.7258,0,8.14,"0",0.538,5.727,69.5,3.7965,4,307,21,390.95,11.28,18.2
1.25179,0,8.14,"0",0.538,5.57,98.1,3.7979,4,307,21,376.57,21.02,13.6
0.85204,0,8.14,"0",0.538,5.965,89.2,4.0123,4,307,21,392.53,13.83,19.6
1.23247,0,8.14,"0",0.538,6.142,91.7,3.9769,4,307,21,396.9,18.72,15.2
0.98843,0,8.14,"0",0.538,5.813,100,4.0952,4,307,21,394.54,19.88,14.5
0.75026,0,8.14,"0",0.538,5.924,94.1,4.3996,4,307,21,394.33,16.3,15.6
0.84054,0,8.14,"0",0.538,5.599,85.7,4.4546,4,307,21,303.42,16.51,13.9
0.67191,0,8.14,"0",0.538,5.813,90.3,4.682,4,307,21,376.88,14.81,16.6
0.95577,0,8.14,"0",0.538,6.047,88.8,4.4534,4,307,21,306.38,17.28,14.8
0.77299,0,8.14,"0",0.538,6.495,94.4,4.4547,4,307,21,387.94,12.8,18.4
1.00245,0,8.14,"0",0.538,6.674,87.3,4.239,4,307,21,380.23,11.98,21
1.13081,0,8.14,"0",0.538,5.713,94.1,4.233,4,307,21,360.17,22.6,12.7
1.35472,0,8.14,"0",0.538,6.072,100,4.175,4,307,21,376.73,13.04,14.5
1.38799,0,8.14,"0",0.538,5.95,82,3.99,4,307,21,232.6,27.71,13.2
1.15172,0,8.14,"0",0.538,5.701,95,3.7872,4,307,21,358.77,18.35,13.1
1.61282,0,8.14,"0",0.538,6.096,96.9,3.7598,4,307,21,248.31,20.34,13.5
0.06417,0,5.96,"0",0.499,5.933,68.2,3.3603,5,279,19.2,396.9,9.68,18.9
0.09744,0,5.96,"0",0.499,5.841,61.4,3.3779,5,279,19.2,377.56,11.41,20
0.08014,0,5.96,"0",0.499,5.85,41.5,3.9342,5,279,19.2,396.9,8.77,21
0.17505,0,5.96,"0",0.499,5.966,30.2,3.8473,5,279,19.2,393.43,10.13,24.7
0.02763,75,2.95,"0",0.428,6.595,21.8,5.4011,3,252,18.3,395.63,4.32,30.8
0.03359,75,2.95,"0",0.428,7.024,15.8,5.4011,3,252,18.3,395.62,1.98,34.9
0.12744,0,6.91,"0",0.448,6.77,2.9,5.7209,3,233,17.9,385.41,4.84,26.6
0.1415,0,6.91,"0",0.448,6.169,6.6,5.7209,3,233,17.9,383.37,5.81,25.3
0.15936,0,6.91,"0",0.448,6.211,6.5,5.7209,3,233,17.9,394.46,7.44,24.7
0.12269,0,6.91,"0",0.448,6.069,40,5.7209,3,233,17.9,389.39,9.55,21.2
0.17142,0,6.91,"0",0.448,5.682,33.8,5.1004,3,233,17.9,396.9,10.21,19.3
0.18836,0,6.91,"0",0.448,5.786,33.3,5.1004,3,233,17.9,396.9,14.15,20
0.22927,0,6.91,"0",0.448,6.03,85.5,5.6894,3,233,17.9,392.74,18.8,16.6
0.25387,0,6.91,"0",0.448,5.399,95.3,5.87,3,233,17.9,396.9,30.81,14.4
0.21977,0,6.91,"0",0.448,5.602,62,6.0877,3,233,17.9,396.9,16.2,19.4
0.08873,21,5.64,"0",0.439,5.963,45.7,6.8147,4,243,16.8,395.56,13.45,19.7
0.04337,21,5.64,"0",0.439,6.115,63,6.8147,4,243,16.8,393.97,9.43,20.5
0.0536,21,5.64,"0",0.439,6.511,21.1,6.8147,4,243,16.8,396.9,5.28,25
0.04981,21,5.64,"0",0.439,5.998,21.4,6.8147,4,243,16.8,396.9,8.43,23.4
0.0136,75,4,"0",0.41,5.888,47.6,7.3197,3,469,21.1,396.9,14.8,18.9
0.01311,90,1.22,"0",0.403,7.249,21.9,8.6966,5,226,17.9,395.93,4.81,35.4
0.02055,85,0.74,"0",0.41,6.383,35.7,9.1876,2,313,17.3,396.9,5.77,24.7
0.01432,100,1.32,"0",0.411,6.816,40.5,8.3248,5,256,15.1,392.9,3.95,31.6
0.15445,25,5.13,"0",0.453,6.145,29.2,7.8148,8,284,19.7,390.68,6.86,23.3
0.10328,25,5.13,"0",0.453,5.927,47.2,6.932,8,284,19.7,396.9,9.22,19.6
0.14932,25,5.13,"0",0.453,5.741,66.2,7.2254,8,284,19.7,395.11,13.15,18.7
0.17171,25,5.13,"0",0.453,5.966,93.4,6.8185,8,284,19.7,378.08,14.44,16
0.11027,25,5.13,"0",0.453,6.456,67.8,7.2255,8,284,19.7,396.9,6.73,22.2
0.1265,25,5.13,"0",0.453,6.762,43.4,7.9809,8,284,19.7,395.58,9.5,25
0.01951,17.5,1.38,"0",0.4161,7.104,59.5,9.2229,3,216,18.6,393.24,8.05,33
0.03584,80,3.37,"0",0.398,6.29,17.8,6.6115,4,337,16.1,396.9,4.67,23.5
0.04379,80,3.37,"0",0.398,5.787,31.1,6.6115,4,337,16.1,396.9,10.24,19.4
0.05789,12.5,6.07,"0",0.409,5.878,21.4,6.498,4,345,18.9,396.21,8.1,22
0.13554,12.5,6.07,"0",0.409,5.594,36.8,6.498,4,345,18.9,396.9,13.09,17.4
0.12816,12.5,6.07,"0",0.409,5.885,33,6.498,4,345,18.9,396.9,8.79,20.9
0.08826,0,10.81,"0",0.413,6.417,6.6,5.2873,4,305,19.2,383.73,6.72,24.2
0.15876,0,10.81,"0",0.413,5.961,17.5,5.2873,4,305,19.2,376.94,9.88,21.7
0.09164,0,10.81,"0",0.413,6.065,7.8,5.2873,4,305,19.2,390.91,5.52,22.8
0.19539,0,10.81,"0",0.413,6.245,6.2,5.2873,4,305,19.2,377.17,7.54,23.4
0.07896,0,12.83,"0",0.437,6.273,6,4.2515,5,398,18.7,394.92,6.78,24.1
0.09512,0,12.83,"0",0.437,6.286,45,4.5026,5,398,18.7,383.23,8.94,21.4
0.10153,0,12.83,"0",0.437,6.279,74.5,4.0522,5,398,18.7,373.66,11.97,20
0.08707,0,12.83,"0",0.437,6.14,45.8,4.0905,5,398,18.7,386.96,10.27,20.8
0.05646,0,12.83,"0",0.437,6.232,53.7,5.0141,5,398,18.7,386.4,12.34,21.2
0.08387,0,12.83,"0",0.437,5.874,36.6,4.5026,5,398,18.7,396.06,9.1,20.3
0.04113,25,4.86,"0",0.426,6.727,33.5,5.4007,4,281,19,396.9,5.29,28
0.04462,25,4.86,"0",0.426,6.619,70.4,5.4007,4,281,19,395.63,7.22,23.9
0.03659,25,4.86,"0",0.426,6.302,32.2,5.4007,4,281,19,396.9,6.72,24.8
0.03551,25,4.86,"0",0.426,6.167,46.7,5.4007,4,281,19,390.64,7.51,22.9
0.05059,0,4.49,"0",0.449,6.389,48,4.7794,3,247,18.5,396.9,9.62,23.9
0.05735,0,4.49,"0",0.449,6.63,56.1,4.4377,3,247,18.5,392.3,6.53,26.6
0.05188,0,4.49,"0",0.449,6.015,45.1,4.4272,3,247,18.5,395.99,12.86,22.5
0.07151,0,4.49,"0",0.449,6.121,56.8,3.7476,3,247,18.5,395.15,8.44,22.2
0.0566,0,3.41,"0",0.489,7.007,86.3,3.4217,2,270,17.8,396.9,5.5,23.6
0.05302,0,3.41,"0",0.489,7.079,63.1,3.4145,2,270,17.8,396.06,5.7,28.7
0.04684,0,3.41,"0",0.489,6.417,66.1,3.0923,2,270,17.8,392.18,8.81,22.6
0.03932,0,3.41,"0",0.489,6.405,73.9,3.0921,2,270,17.8,393.55,8.2,22
0.04203,28,15.04,"0",0.464,6.442,53.6,3.6659,4,270,18.2,395.01,8.16,22.9
0.02875,28,15.04,"0",0.464,6.211,28.9,3.6659,4,270,18.2,396.33,6.21,25
0.04294,28,15.04,"0",0.464,6.249,77.3,3.615,4,270,18.2,396.9,10.59,20.6
0.12204,0,2.89,"0",0.445,6.625,57.8,3.4952,2,276,18,357.98,6.65,28.4
0.11504,0,2.89,"0",0.445,6.163,69.6,3.4952,2,276,18,391.83,11.34,21.4
0.12083,0,2.89,"0",0.445,8.069,76,3.4952,2,276,18,396.9,4.21,38.7
0.08187,0,2.89,"0",0.445,7.82,36.9,3.4952,2,276,18,393.53,3.57,43.8
0.0686,0,2.89,"0",0.445,7.416,62.5,3.4952,2,276,18,396.9,6.19,33.2
0.14866,0,8.56,"0",0.52,6.727,79.9,2.7778,5,384,20.9,394.76,9.42,27.5
0.11432,0,8.56,"0",0.52,6.781,71.3,2.8561,5,384,20.9,395.58,7.67,26.5
0.22876,0,8.56,"0",0.52,6.405,85.4,2.7147,5,384,20.9,70.8,10.63,18.6
0.21161,0,8.56,"0",0.52,6.137,87.4,2.7147,5,384,20.9,394.47,13.44,19.3
0.1396,0,8.56,"0",0.52,6.167,90,2.421,5,384,20.9,392.69,12.33,20.1
0.13262,0,8.56,"0",0.52,5.851,96.7,2.1069,5,384,20.9,394.05,16.47,19.5
0.1712,0,8.56,"0",0.52,5.836,91.9,2.211,5,384,20.9,395.67,18.66,19.5
0.13117,0,8.56,"0",0.52,6.127,85.2,2.1224,5,384,20.9,387.69,14.09,20.4
0.12802,0,8.56,"0",0.52,6.474,97.1,2.4329,5,384,20.9,395.24,12.27,19.8
0.26363,0,8.56,"0",0.52,6.229,91.2,2.5451,5,384,20.9,391.23,15.55,19.4
0.10793,0,8.56,"0",0.52,6.195,54.4,2.7778,5,384,20.9,393.49,13,21.7
0.10084,0,10.01,"0",0.547,6.715,81.6,2.6775,6,432,17.8,395.59,10.16,22.8
0.12329,0,10.01,"0",0.547,5.913,92.9,2.3534,6,432,17.8,394.95,16.21,18.8
0.22212,0,10.01,"0",0.547,6.092,95.4,2.548,6,432,17.8,396.9,17.09,18.7
0.14231,0,10.01,"0",0.547,6.254,84.2,2.2565,6,432,17.8,388.74,10.45,18.5
0.17134,0,10.01,"0",0.547,5.928,88.2,2.4631,6,432,17.8,344.91,15.76,18.3
0.13158,0,10.01,"0",0.547,6.176,72.5,2.7301,6,432,17.8,393.3,12.04,21.2
0.15098,0,10.01,"0",0.547,6.021,82.6,2.7474,6,432,17.8,394.51,10.3,19.2
0.13058,0,10.01,"0",0.547,5.872,73.1,2.4775,6,432,17.8,338.63,15.37,20.4
0.14476,0,10.01,"0",0.547,5.731,65.2,2.7592,6,432,17.8,391.5,13.61,19.3
0.06899,0,25.65,"0",0.581,5.87,69.7,2.2577,2,188,19.1,389.15,14.37,22
0.07165,0,25.65,"0",0.581,6.004,84.1,2.1974,2,188,19.1,377.67,14.27,20.3
0.09299,0,25.65,"0",0.581,5.961,92.9,2.0869,2,188,19.1,378.09,17.93,20.5
0.15038,0,25.65,"0",0.581,5.856,97,1.9444,2,188,19.1,370.31,25.41,17.3
0.09849,0,25.65,"0",0.581,5.879,95.8,2.0063,2,188,19.1,379.38,17.58,18.8
0.16902,0,25.65,"0",0.581,5.986,88.4,1.9929,2,188,19.1,385.02,14.81,21.4
0.38735,0,25.65,"0",0.581,5.613,95.6,1.7572,2,188,19.1,359.29,27.26,15.7
0.25915,0,21.89,"0",0.624,5.693,96,1.7883,4,437,21.2,392.11,17.19,16.2
0.32543,0,21.89,"0",0.624,6.431,98.8,1.8125,4,437,21.2,396.9,15.39,18
0.88125,0,21.89,"0",0.624,5.637,94.7,1.9799,4,437,21.2,396.9,18.34,14.3
0.34006,0,21.89,"0",0.624,6.458,98.9,2.1185,4,437,21.2,395.04,12.6,19.2
1.19294,0,21.89,"0",0.624,6.326,97.7,2.271,4,437,21.2,396.9,12.26,19.6
0.59005,0,21.89,"0",0.624,6.372,97.9,2.3274,4,437,21.2,385.76,11.12,23
0.32982,0,21.89,"0",0.624,5.822,95.4,2.4699,4,437,21.2,388.69,15.03,18.4
0.97617,0,21.89,"0",0.624,5.757,98.4,2.346,4,437,21.2,262.76,17.31,15.6
0.55778,0,21.89,"0",0.624,6.335,98.2,2.1107,4,437,21.2,394.67,16.96,18.1
0.32264,0,21.89,"0",0.624,5.942,93.5,1.9669,4,437,21.2,378.25,16.9,17.4
0.35233,0,21.89,"0",0.624,6.454,98.4,1.8498,4,437,21.2,394.08,14.59,17.1
0.2498,0,21.89,"0",0.624,5.857,98.2,1.6686,4,437,21.2,392.04,21.32,13.3
0.54452,0,21.89,"0",0.624,6.151,97.9,1.6687,4,437,21.2,396.9,18.46,17.8
0.2909,0,21.89,"0",0.624,6.174,93.6,1.6119,4,437,21.2,388.08,24.16,14
1.62864,0,21.89,"0",0.624,5.019,100,1.4394,4,437,21.2,396.9,34.41,14.4
3.32105,0,19.58,"1",0.871,5.403,100,1.3216,5,403,14.7,396.9,26.82,13.4
4.0974,0,19.58,"0",0.871,5.468,100,1.4118,5,403,14.7,396.9,26.42,15.6
2.77974,0,19.58,"0",0.871,4.903,97.8,1.3459,5,403,14.7,396.9,29.29,11.8
2.37934,0,19.58,"0",0.871,6.13,100,1.4191,5,403,14.7,172.91,27.8,13.8
2.15505,0,19.58,"0",0.871,5.628,100,1.5166,5,403,14.7,169.27,16.65,15.6
2.36862,0,19.58,"0",0.871,4.926,95.7,1.4608,5,403,14.7,391.71,29.53,14.6
2.33099,0,19.58,"0",0.871,5.186,93.8,1.5296,5,403,14.7,356.99,28.32,17.8
2.73397,0,19.58,"0",0.871,5.597,94.9,1.5257,5,403,14.7,351.85,21.45,15.4
1.6566,0,19.58,"0",0.871,6.122,97.3,1.618,5,403,14.7,372.8,14.1,21.5
1.49632,0,19.58,"0",0.871,5.404,100,1.5916,5,403,14.7,341.6,13.28,19.6
1.12658,0,19.58,"1",0.871,5.012,88,1.6102,5,403,14.7,343.28,12.12,15.3
2.14918,0,19.58,"0",0.871,5.709,98.5,1.6232,5,403,14.7,261.95,15.79,19.4
1.41385,0,19.58,"1",0.871,6.129,96,1.7494,5,403,14.7,321.02,15.12,17
3.53501,0,19.58,"1",0.871,6.152,82.6,1.7455,5,403,14.7,88.01,15.02,15.6
2.44668,0,19.58,"0",0.871,5.272,94,1.7364,5,403,14.7,88.63,16.14,13.1
1.22358,0,19.58,"0",0.605,6.943,97.4,1.8773,5,403,14.7,363.43,4.59,41.3
1.34284,0,19.58,"0",0.605,6.066,100,1.7573,5,403,14.7,353.89,6.43,24.3
1.42502,0,19.58,"0",0.871,6.51,100,1.7659,5,403,14.7,364.31,7.39,23.3
1.27346,0,19.58,"1",0.605,6.25,92.6,1.7984,5,403,14.7,338.92,5.5,27
1.46336,0,19.58,"0",0.605,7.489,90.8,1.9709,5,403,14.7,374.43,1.73,50
1.83377,0,19.58,"1",0.605,7.802,98.2,2.0407,5,403,14.7,389.61,1.92,50
1.51902,0,19.58,"1",0.605,8.375,93.9,2.162,5,403,14.7,388.45,3.32,50
2.24236,0,19.58,"0",0.605,5.854,91.8,2.422,5,403,14.7,395.11,11.64,22.7
2.924,0,19.58,"0",0.605,6.101,93,2.2834,5,403,14.7,240.16,9.81,25
2.01019,0,19.58,"0",0.605,7.929,96.2,2.0459,5,403,14.7,369.3,3.7,50
1.80028,0,19.58,"0",0.605,5.877,79.2,2.4259,5,403,14.7,227.61,12.14,23.8
2.3004,0,19.58,"0",0.605,6.319,96.1,2.1,5,403,14.7,297.09,11.1,23.8
2.44953,0,19.58,"0",0.605,6.402,95.2,2.2625,5,403,14.7,330.04,11.32,22.3
1.20742,0,19.58,"0",0.605,5.875,94.6,2.4259,5,403,14.7,292.29,14.43,17.4
2.3139,0,19.58,"0",0.605,5.88,97.3,2.3887,5,403,14.7,348.13,12.03,19.1
0.13914,0,4.05,"0",0.51,5.572,88.5,2.5961,5,296,16.6,396.9,14.69,23.1
0.09178,0,4.05,"0",0.51,6.416,84.1,2.6463,5,296,16.6,395.5,9.04,23.6
0.08447,0,4.05,"0",0.51,5.859,68.7,2.7019,5,296,16.6,393.23,9.64,22.6
0.06664,0,4.05,"0",0.51,6.546,33.1,3.1323,5,296,16.6,390.96,5.33,29.4
0.07022,0,4.05,"0",0.51,6.02,47.2,3.5549,5,296,16.6,393.23,10.11,23.2
0.05425,0,4.05,"0",0.51,6.315,73.4,3.3175,5,296,16.6,395.6,6.29,24.6
0.06642,0,4.05,"0",0.51,6.86,74.4,2.9153,5,296,16.6,391.27,6.92,29.9
0.0578,0,2.46,"0",0.488,6.98,58.4,2.829,3,193,17.8,396.9,5.04,37.2
0.06588,0,2.46,"0",0.488,7.765,83.3,2.741,3,193,17.8,395.56,7.56,39.8
0.06888,0,2.46,"0",0.488,6.144,62.2,2.5979,3,193,17.8,396.9,9.45,36.2
0.09103,0,2.46,"0",0.488,7.155,92.2,2.7006,3,193,17.8,394.12,4.82,37.9
0.10008,0,2.46,"0",0.488,6.563,95.6,2.847,3,193,17.8,396.9,5.68,32.5
0.08308,0,2.46,"0",0.488,5.604,89.8,2.9879,3,193,17.8,391,13.98,26.4
0.06047,0,2.46,"0",0.488,6.153,68.8,3.2797,3,193,17.8,387.11,13.15,29.6
0.05602,0,2.46,"0",0.488,7.831,53.6,3.1992,3,193,17.8,392.63,4.45,50
0.07875,45,3.44,"0",0.437,6.782,41.1,3.7886,5,398,15.2,393.87,6.68,32
0.12579,45,3.44,"0",0.437,6.556,29.1,4.5667,5,398,15.2,382.84,4.56,29.8
0.0837,45,3.44,"0",0.437,7.185,38.9,4.5667,5,398,15.2,396.9,5.39,34.9
0.09068,45,3.44,"0",0.437,6.951,21.5,6.4798,5,398,15.2,377.68,5.1,37
0.06911,45,3.44,"0",0.437,6.739,30.8,6.4798,5,398,15.2,389.71,4.69,30.5
0.08664,45,3.44,"0",0.437,7.178,26.3,6.4798,5,398,15.2,390.49,2.87,36.4
0.02187,60,2.93,"0",0.401,6.8,9.9,6.2196,1,265,15.6,393.37,5.03,31.1
0.01439,60,2.93,"0",0.401,6.604,18.8,6.2196,1,265,15.6,376.7,4.38,29.1
0.01381,80,0.46,"0",0.422,7.875,32,5.6484,4,255,14.4,394.23,2.97,50
0.04011,80,1.52,"0",0.404,7.287,34.1,7.309,2,329,12.6,396.9,4.08,33.3
0.04666,80,1.52,"0",0.404,7.107,36.6,7.309,2,329,12.6,354.31,8.61,30.3
0.03768,80,1.52,"0",0.404,7.274,38.3,7.309,2,329,12.6,392.2,6.62,34.6
0.0315,95,1.47,"0",0.403,6.975,15.3,7.6534,3,402,17,396.9,4.56,34.9
0.01778,95,1.47,"0",0.403,7.135,13.9,7.6534,3,402,17,384.3,4.45,32.9
0.03445,82.5,2.03,"0",0.415,6.162,38.4,6.27,2,348,14.7,393.77,7.43,24.1
0.02177,82.5,2.03,"0",0.415,7.61,15.7,6.27,2,348,14.7,395.38,3.11,42.3
0.0351,95,2.68,"0",0.4161,7.853,33.2,5.118,4,224,14.7,392.78,3.81,48.5
0.02009,95,2.68,"0",0.4161,8.034,31.9,5.118,4,224,14.7,390.55,2.88,50
0.13642,0,10.59,"0",0.489,5.891,22.3,3.9454,4,277,18.6,396.9,10.87,22.6
0.22969,0,10.59,"0",0.489,6.326,52.5,4.3549,4,277,18.6,394.87,10.97,24.4
0.25199,0,10.59,"0",0.489,5.783,72.7,4.3549,4,277,18.6,389.43,18.06,22.5
0.13587,0,10.59,"1",0.489,6.064,59.1,4.2392,4,277,18.6,381.32,14.66,24.4
0.43571,0,10.59,"1",0.489,5.344,100,3.875,4,277,18.6,396.9,23.09,20
0.17446,0,10.59,"1",0.489,5.96,92.1,3.8771,4,277,18.6,393.25,17.27,21.7
0.37578,0,10.59,"1",0.489,5.404,88.6,3.665,4,277,18.6,395.24,23.98,19.3
0.21719,0,10.59,"1",0.489,5.807,53.8,3.6526,4,277,18.6,390.94,16.03,22.4
0.14052,0,10.59,"0",0.489,6.375,32.3,3.9454,4,277,18.6,385.81,9.38,28.1
0.28955,0,10.59,"0",0.489,5.412,9.8,3.5875,4,277,18.6,348.93,29.55,23.7
0.19802,0,10.59,"0",0.489,6.182,42.4,3.9454,4,277,18.6,393.63,9.47,25
0.0456,0,13.89,"1",0.55,5.888,56,3.1121,5,276,16.4,392.8,13.51,23.3
0.07013,0,13.89,"0",0.55,6.642,85.1,3.4211,5,276,16.4,392.78,9.69,28.7
0.11069,0,13.89,"1",0.55,5.951,93.8,2.8893,5,276,16.4,396.9,17.92,21.5
0.11425,0,13.89,"1",0.55,6.373,92.4,3.3633,5,276,16.4,393.74,10.5,23
0.35809,0,6.2,"1",0.507,6.951,88.5,2.8617,8,307,17.4,391.7,9.71,26.7
0.40771,0,6.2,"1",0.507,6.164,91.3,3.048,8,307,17.4,395.24,21.46,21.7
0.62356,0,6.2,"1",0.507,6.879,77.7,3.2721,8,307,17.4,390.39,9.93,27.5
0.6147,0,6.2,"0",0.507,6.618,80.8,3.2721,8,307,17.4,396.9,7.6,30.1
0.31533,0,6.2,"0",0.504,8.266,78.3,2.8944,8,307,17.4,385.05,4.14,44.8
0.52693,0,6.2,"0",0.504,8.725,83,2.8944,8,307,17.4,382,4.63,50
0.38214,0,6.2,"0",0.504,8.04,86.5,3.2157,8,307,17.4,387.38,3.13,37.6
0.41238,0,6.2,"0",0.504,7.163,79.9,3.2157,8,307,17.4,372.08,6.36,31.6
0.29819,0,6.2,"0",0.504,7.686,17,3.3751,8,307,17.4,377.51,3.92,46.7
0.44178,0,6.2,"0",0.504,6.552,21.4,3.3751,8,307,17.4,380.34,3.76,31.5
0.537,0,6.2,"0",0.504,5.981,68.1,3.6715,8,307,17.4,378.35,11.65,24.3
0.46296,0,6.2,"0",0.504,7.412,76.9,3.6715,8,307,17.4,376.14,5.25,31.7
0.57529,0,6.2,"0",0.507,8.337,73.3,3.8384,8,307,17.4,385.91,2.47,41.7
0.33147,0,6.2,"0",0.507,8.247,70.4,3.6519,8,307,17.4,378.95,3.95,48.3
0.44791,0,6.2,"1",0.507,6.726,66.5,3.6519,8,307,17.4,360.2,8.05,29
0.33045,0,6.2,"0",0.507,6.086,61.5,3.6519,8,307,17.4,376.75,10.88,24
0.52058,0,6.2,"1",0.507,6.631,76.5,4.148,8,307,17.4,388.45,9.54,25.1
0.51183,0,6.2,"0",0.507,7.358,71.6,4.148,8,307,17.4,390.07,4.73,31.5
0.08244,30,4.93,"0",0.428,6.481,18.5,6.1899,6,300,16.6,379.41,6.36,23.7
0.09252,30,4.93,"0",0.428,6.606,42.2,6.1899,6,300,16.6,383.78,7.37,23.3
0.11329,30,4.93,"0",0.428,6.897,54.3,6.3361,6,300,16.6,391.25,11.38,22
0.10612,30,4.93,"0",0.428,6.095,65.1,6.3361,6,300,16.6,394.62,12.4,20.1
0.1029,30,4.93,"0",0.428,6.358,52.9,7.0355,6,300,16.6,372.75,11.22,22.2
0.12757,30,4.93,"0",0.428,6.393,7.8,7.0355,6,300,16.6,374.71,5.19,23.7
0.20608,22,5.86,"0",0.431,5.593,76.5,7.9549,7,330,19.1,372.49,12.5,17.6
0.19133,22,5.86,"0",0.431,5.605,70.2,7.9549,7,330,19.1,389.13,18.46,18.5
0.33983,22,5.86,"0",0.431,6.108,34.9,8.0555,7,330,19.1,390.18,9.16,24.3
0.19657,22,5.86,"0",0.431,6.226,79.2,8.0555,7,330,19.1,376.14,10.15,20.5
0.16439,22,5.86,"0",0.431,6.433,49.1,7.8265,7,330,19.1,374.71,9.52,24.5
0.19073,22,5.86,"0",0.431,6.718,17.5,7.8265,7,330,19.1,393.74,6.56,26.2
0.1403,22,5.86,"0",0.431,6.487,13,7.3967,7,330,19.1,396.28,5.9,24.4
0.21409,22,5.86,"0",0.431,6.438,8.9,7.3967,7,330,19.1,377.07,3.59,24.8
0.08221,22,5.86,"0",0.431,6.957,6.8,8.9067,7,330,19.1,386.09,3.53,29.6
0.36894,22,5.86,"0",0.431,8.259,8.4,8.9067,7,330,19.1,396.9,3.54,42.8
0.04819,80,3.64,"0",0.392,6.108,32,9.2203,1,315,16.4,392.89,6.57,21.9
0.03548,80,3.64,"0",0.392,5.876,19.1,9.2203,1,315,16.4,395.18,9.25,20.9
0.01538,90,3.75,"0",0.394,7.454,34.2,6.3361,3,244,15.9,386.34,3.11,44
0.61154,20,3.97,"0",0.647,8.704,86.9,1.801,5,264,13,389.7,5.12,50
0.66351,20,3.97,"0",0.647,7.333,100,1.8946,5,264,13,383.29,7.79,36
0.65665,20,3.97,"0",0.647,6.842,100,2.0107,5,264,13,391.93,6.9,30.1
0.54011,20,3.97,"0",0.647,7.203,81.8,2.1121,5,264,13,392.8,9.59,33.8
0.53412,20,3.97,"0",0.647,7.52,89.4,2.1398,5,264,13,388.37,7.26,43.1
0.52014,20,3.97,"0",0.647,8.398,91.5,2.2885,5,264,13,386.86,5.91,48.8
0.82526,20,3.97,"0",0.647,7.327,94.5,2.0788,5,264,13,393.42,11.25,31
0.55007,20,3.97,"0",0.647,7.206,91.6,1.9301,5,264,13,387.89,8.1,36.5
0.76162,20,3.97,"0",0.647,5.56,62.8,1.9865,5,264,13,392.4,10.45,22.8
0.7857,20,3.97,"0",0.647,7.014,84.6,2.1329,5,264,13,384.07,14.79,30.7
0.57834,20,3.97,"0",0.575,8.297,67,2.4216,5,264,13,384.54,7.44,50
0.5405,20,3.97,"0",0.575,7.47,52.6,2.872,5,264,13,390.3,3.16,43.5
0.09065,20,6.96,"1",0.464,5.92,61.5,3.9175,3,223,18.6,391.34,13.65,20.7
0.29916,20,6.96,"0",0.464,5.856,42.1,4.429,3,223,18.6,388.65,13,21.1
0.16211,20,6.96,"0",0.464,6.24,16.3,4.429,3,223,18.6,396.9,6.59,25.2
0.1146,20,6.96,"0",0.464,6.538,58.7,3.9175,3,223,18.6,394.96,7.73,24.4
0.22188,20,6.96,"1",0.464,7.691,51.8,4.3665,3,223,18.6,390.77,6.58,35.2
0.05644,40,6.41,"1",0.447,6.758,32.9,4.0776,4,254,17.6,396.9,3.53,32.4
0.09604,40,6.41,"0",0.447,6.854,42.8,4.2673,4,254,17.6,396.9,2.98,32
0.10469,40,6.41,"1",0.447,7.267,49,4.7872,4,254,17.6,389.25,6.05,33.2
0.06127,40,6.41,"1",0.447,6.826,27.6,4.8628,4,254,17.6,393.45,4.16,33.1
0.07978,40,6.41,"0",0.447,6.482,32.1,4.1403,4,254,17.6,396.9,7.19,29.1
0.21038,20,3.33,"0",0.4429,6.812,32.2,4.1007,5,216,14.9,396.9,4.85,35.1
0.03578,20,3.33,"0",0.4429,7.82,64.5,4.6947,5,216,14.9,387.31,3.76,45.4
0.03705,20,3.33,"0",0.4429,6.968,37.2,5.2447,5,216,14.9,392.23,4.59,35.4
0.06129,20,3.33,"1",0.4429,7.645,49.7,5.2119,5,216,14.9,377.07,3.01,46
0.01501,90,1.21,"1",0.401,7.923,24.8,5.885,1,198,13.6,395.52,3.16,50
0.00906,90,2.97,"0",0.4,7.088,20.8,7.3073,1,285,15.3,394.72,7.85,32.2
0.01096,55,2.25,"0",0.389,6.453,31.9,7.3073,1,300,15.3,394.72,8.23,22
0.01965,80,1.76,"0",0.385,6.23,31.5,9.0892,1,241,18.2,341.6,12.93,20.1
0.03871,52.5,5.32,"0",0.405,6.209,31.3,7.3172,6,293,16.6,396.9,7.14,23.2
0.0459,52.5,5.32,"0",0.405,6.315,45.6,7.3172,6,293,16.6,396.9,7.6,22.3
0.04297,52.5,5.32,"0",0.405,6.565,22.9,7.3172,6,293,16.6,371.72,9.51,24.8
0.03502,80,4.95,"0",0.411,6.861,27.9,5.1167,4,245,19.2,396.9,3.33,28.5
0.07886,80,4.95,"0",0.411,7.148,27.7,5.1167,4,245,19.2,396.9,3.56,37.3
0.03615,80,4.95,"0",0.411,6.63,23.4,5.1167,4,245,19.2,396.9,4.7,27.9
0.08265,0,13.92,"0",0.437,6.127,18.4,5.5027,4,289,16,396.9,8.58,23.9
0.08199,0,13.92,"0",0.437,6.009,42.3,5.5027,4,289,16,396.9,10.4,21.7
0.12932,0,13.92,"0",0.437,6.678,31.1,5.9604,4,289,16,396.9,6.27,28.6
0.05372,0,13.92,"0",0.437,6.549,51,5.9604,4,289,16,392.85,7.39,27.1
0.14103,0,13.92,"0",0.437,5.79,58,6.32,4,289,16,396.9,15.84,20.3
0.06466,70,2.24,"0",0.4,6.345,20.1,7.8278,5,358,14.8,368.24,4.97,22.5
0.05561,70,2.24,"0",0.4,7.041,10,7.8278,5,358,14.8,371.58,4.74,29
0.04417,70,2.24,"0",0.4,6.871,47.4,7.8278,5,358,14.8,390.86,6.07,24.8
0.03537,34,6.09,"0",0.433,6.59,40.4,5.4917,7,329,16.1,395.75,9.5,22
0.09266,34,6.09,"0",0.433,6.495,18.4,5.4917,7,329,16.1,383.61,8.67,26.4
0.1,34,6.09,"0",0.433,6.982,17.7,5.4917,7,329,16.1,390.43,4.86,33.1
0.05515,33,2.18,"0",0.472,7.236,41.1,4.022,7,222,18.4,393.68,6.93,36.1
0.05479,33,2.18,"0",0.472,6.616,58.1,3.37,7,222,18.4,393.36,8.93,28.4
0.07503,33,2.18,"0",0.472,7.42,71.9,3.0992,7,222,18.4,396.9,6.47,33.4
0.04932,33,2.18,"0",0.472,6.849,70.3,3.1827,7,222,18.4,396.9,7.53,28.2
0.49298,0,9.9,"0",0.544,6.635,82.5,3.3175,4,304,18.4,396.9,4.54,22.8
0.3494,0,9.9,"0",0.544,5.972,76.7,3.1025,4,304,18.4,396.24,9.97,20.3
2.63548,0,9.9,"0",0.544,4.973,37.8,2.5194,4,304,18.4,350.45,12.64,16.1
0.79041,0,9.9,"0",0.544,6.122,52.8,2.6403,4,304,18.4,396.9,5.98,22.1
0.26169,0,9.9,"0",0.544,6.023,90.4,2.834,4,304,18.4,396.3,11.72,19.4
0.26938,0,9.9,"0",0.544,6.266,82.8,3.2628,4,304,18.4,393.39,7.9,21.6
0.3692,0,9.9,"0",0.544,6.567,87.3,3.6023,4,304,18.4,395.69,9.28,23.8
0.25356,0,9.9,"0",0.544,5.705,77.7,3.945,4,304,18.4,396.42,11.5,16.2
0.31827,0,9.9,"0",0.544,5.914,83.2,3.9986,4,304,18.4,390.7,18.33,17.8
0.24522,0,9.9,"0",0.544,5.782,71.7,4.0317,4,304,18.4,396.9,15.94,19.8
0.40202,0,9.9,"0",0.544,6.382,67.2,3.5325,4,304,18.4,395.21,10.36,23.1
0.47547,0,9.9,"0",0.544,6.113,58.8,4.0019,4,304,18.4,396.23,12.73,21
0.1676,0,7.38,"0",0.493,6.426,52.3,4.5404,5,287,19.6,396.9,7.2,23.8
0.18159,0,7.38,"0",0.493,6.376,54.3,4.5404,5,287,19.6,396.9,6.87,23.1
0.35114,0,7.38,"0",0.493,6.041,49.9,4.7211,5,287,19.6,396.9,7.7,20.4
0.28392,0,7.38,"0",0.493,5.708,74.3,4.7211,5,287,19.6,391.13,11.74,18.5
0.34109,0,7.38,"0",0.493,6.415,40.1,4.7211,5,287,19.6,396.9,6.12,25
0.19186,0,7.38,"0",0.493,6.431,14.7,5.4159,5,287,19.6,393.68,5.08,24.6
0.30347,0,7.38,"0",0.493,6.312,28.9,5.4159,5,287,19.6,396.9,6.15,23
0.24103,0,7.38,"0",0.493,6.083,43.7,5.4159,5,287,19.6,396.9,12.79,22.2
0.06617,0,3.24,"0",0.46,5.868,25.8,5.2146,4,430,16.9,382.44,9.97,19.3
0.06724,0,3.24,"0",0.46,6.333,17.2,5.2146,4,430,16.9,375.21,7.34,22.6
0.04544,0,3.24,"0",0.46,6.144,32.2,5.8736,4,430,16.9,368.57,9.09,19.8
0.05023,35,6.06,"0",0.4379,5.706,28.4,6.6407,1,304,16.9,394.02,12.43,17.1
0.03466,35,6.06,"0",0.4379,6.031,23.3,6.6407,1,304,16.9,362.25,7.83,19.4
0.05083,0,5.19,"0",0.515,6.316,38.1,6.4584,5,224,20.2,389.71,5.68,22.2
0.03738,0,5.19,"0",0.515,6.31,38.5,6.4584,5,224,20.2,389.4,6.75,20.7
0.03961,0,5.19,"0",0.515,6.037,34.5,5.9853,5,224,20.2,396.9,8.01,21.1
0.03427,0,5.19,"0",0.515,5.869,46.3,5.2311,5,224,20.2,396.9,9.8,19.5
0.03041,0,5.19,"0",0.515,5.895,59.6,5.615,5,224,20.2,394.81,10.56,18.5
0.03306,0,5.19,"0",0.515,6.059,37.3,4.8122,5,224,20.2,396.14,8.51,20.6
0.05497,0,5.19,"0",0.515,5.985,45.4,4.8122,5,224,20.2,396.9,9.74,19
0.06151,0,5.19,"0",0.515,5.968,58.5,4.8122,5,224,20.2,396.9,9.29,18.7
0.01301,35,1.52,"0",0.442,7.241,49.3,7.0379,1,284,15.5,394.74,5.49,32.7
0.02498,0,1.89,"0",0.518,6.54,59.7,6.2669,1,422,15.9,389.96,8.65,16.5
0.02543,55,3.78,"0",0.484,6.696,56.4,5.7321,5,370,17.6,396.9,7.18,23.9
0.03049,55,3.78,"0",0.484,6.874,28.1,6.4654,5,370,17.6,387.97,4.61,31.2
0.03113,0,4.39,"0",0.442,6.014,48.5,8.0136,3,352,18.8,385.64,10.53,17.5
0.06162,0,4.39,"0",0.442,5.898,52.3,8.0136,3,352,18.8,364.61,12.67,17.2
0.0187,85,4.15,"0",0.429,6.516,27.7,8.5353,4,351,17.9,392.43,6.36,23.1
0.01501,80,2.01,"0",0.435,6.635,29.7,8.344,4,280,17,390.94,5.99,24.5
0.02899,40,1.25,"0",0.429,6.939,34.5,8.7921,1,335,19.7,389.85,5.89,26.6
0.06211,40,1.25,"0",0.429,6.49,44.4,8.7921,1,335,19.7,396.9,5.98,22.9
0.0795,60,1.69,"0",0.411,6.579,35.9,10.7103,4,411,18.3,370.78,5.49,24.1
0.07244,60,1.69,"0",0.411,5.884,18.5,10.7103,4,411,18.3,392.33,7.79,18.6
0.01709,90,2.02,"0",0.41,6.728,36.1,12.1265,5,187,17,384.46,4.5,30.1
0.04301,80,1.91,"0",0.413,5.663,21.9,10.5857,4,334,22,382.8,8.05,18.2
0.10659,80,1.91,"0",0.413,5.936,19.5,10.5857,4,334,22,376.04,5.57,20.6
8.98296,0,18.1,"1",0.77,6.212,97.4,2.1222,24,666,20.2,377.73,17.6,17.8
3.8497,0,18.1,"1",0.77,6.395,91,2.5052,24,666,20.2,391.34,13.27,21.7
5.20177,0,18.1,"1",0.77,6.127,83.4,2.7227,24,666,20.2,395.43,11.48,22.7
4.26131,0,18.1,"0",0.77,6.112,81.3,2.5091,24,666,20.2,390.74,12.67,22.6
4.54192,0,18.1,"0",0.77,6.398,88,2.5182,24,666,20.2,374.56,7.79,25
3.83684,0,18.1,"0",0.77,6.251,91.1,2.2955,24,666,20.2,350.65,14.19,19.9
3.67822,0,18.1,"0",0.77,5.362,96.2,2.1036,24,666,20.2,380.79,10.19,20.8
4.22239,0,18.1,"1",0.77,5.803,89,1.9047,24,666,20.2,353.04,14.64,16.8
3.47428,0,18.1,"1",0.718,8.78,82.9,1.9047,24,666,20.2,354.55,5.29,21.9
4.55587,0,18.1,"0",0.718,3.561,87.9,1.6132,24,666,20.2,354.7,7.12,27.5
3.69695,0,18.1,"0",0.718,4.963,91.4,1.7523,24,666,20.2,316.03,14,21.9
13.5222,0,18.1,"0",0.631,3.863,100,1.5106,24,666,20.2,131.42,13.33,23.1
4.89822,0,18.1,"0",0.631,4.97,100,1.3325,24,666,20.2,375.52,3.26,50
5.66998,0,18.1,"1",0.631,6.683,96.8,1.3567,24,666,20.2,375.33,3.73,50
6.53876,0,18.1,"1",0.631,7.016,97.5,1.2024,24,666,20.2,392.05,2.96,50
9.2323,0,18.1,"0",0.631,6.216,100,1.1691,24,666,20.2,366.15,9.53,50
8.26725,0,18.1,"1",0.668,5.875,89.6,1.1296,24,666,20.2,347.88,8.88,50
11.1081,0,18.1,"0",0.668,4.906,100,1.1742,24,666,20.2,396.9,34.77,13.8
18.4982,0,18.1,"0",0.668,4.138,100,1.137,24,666,20.2,396.9,37.97,13.8
19.6091,0,18.1,"0",0.671,7.313,97.9,1.3163,24,666,20.2,396.9,13.44,15
15.288,0,18.1,"0",0.671,6.649,93.3,1.3449,24,666,20.2,363.02,23.24,13.9
9.82349,0,18.1,"0",0.671,6.794,98.8,1.358,24,666,20.2,396.9,21.24,13.3
23.6482,0,18.1,"0",0.671,6.38,96.2,1.3861,24,666,20.2,396.9,23.69,13.1
17.8667,0,18.1,"0",0.671,6.223,100,1.3861,24,666,20.2,393.74,21.78,10.2
88.9762,0,18.1,"0",0.671,6.968,91.9,1.4165,24,666,20.2,396.9,17.21,10.4
15.8744,0,18.1,"0",0.671,6.545,99.1,1.5192,24,666,20.2,396.9,21.08,10.9
9.18702,0,18.1,"0",0.7,5.536,100,1.5804,24,666,20.2,396.9,23.6,11.3
7.99248,0,18.1,"0",0.7,5.52,100,1.5331,24,666,20.2,396.9,24.56,12.3
20.0849,0,18.1,"0",0.7,4.368,91.2,1.4395,24,666,20.2,285.83,30.63,8.8
16.8118,0,18.1,"0",0.7,5.277,98.1,1.4261,24,666,20.2,396.9,30.81,7.2
24.3938,0,18.1,"0",0.7,4.652,100,1.4672,24,666,20.2,396.9,28.28,10.5
22.5971,0,18.1,"0",0.7,5,89.5,1.5184,24,666,20.2,396.9,31.99,7.4
14.3337,0,18.1,"0",0.7,4.88,100,1.5895,24,666,20.2,372.92,30.62,10.2
8.15174,0,18.1,"0",0.7,5.39,98.9,1.7281,24,666,20.2,396.9,20.85,11.5
6.96215,0,18.1,"0",0.7,5.713,97,1.9265,24,666,20.2,394.43,17.11,15.1
5.29305,0,18.1,"0",0.7,6.051,82.5,2.1678,24,666,20.2,378.38,18.76,23.2
11.5779,0,18.1,"0",0.7,5.036,97,1.77,24,666,20.2,396.9,25.68,9.7
8.64476,0,18.1,"0",0.693,6.193,92.6,1.7912,24,666,20.2,396.9,15.17,13.8
13.3598,0,18.1,"0",0.693,5.887,94.7,1.7821,24,666,20.2,396.9,16.35,12.7
8.71675,0,18.1,"0",0.693,6.471,98.8,1.7257,24,666,20.2,391.98,17.12,13.1
5.87205,0,18.1,"0",0.693,6.405,96,1.6768,24,666,20.2,396.9,19.37,12.5
7.67202,0,18.1,"0",0.693,5.747,98.9,1.6334,24,666,20.2,393.1,19.92,8.5
38.3518,0,18.1,"0",0.693,5.453,100,1.4896,24,666,20.2,396.9,30.59,5
9.91655,0,18.1,"0",0.693,5.852,77.8,1.5004,24,666,20.2,338.16,29.97,6.3
25.0461,0,18.1,"0",0.693,5.987,100,1.5888,24,666,20.2,396.9,26.77,5.6
14.2362,0,18.1,"0",0.693,6.343,100,1.5741,24,666,20.2,396.9,20.32,7.2
9.59571,0,18.1,"0",0.693,6.404,100,1.639,24,666,20.2,376.11,20.31,12.1
24.8017,0,18.1,"0",0.693,5.349,96,1.7028,24,666,20.2,396.9,19.77,8.3
41.5292,0,18.1,"0",0.693,5.531,85.4,1.6074,24,666,20.2,329.46,27.38,8.5
67.9208,0,18.1,"0",0.693,5.683,100,1.4254,24,666,20.2,384.97,22.98,5
20.7162,0,18.1,"0",0.659,4.138,100,1.1781,24,666,20.2,370.22,23.34,11.9
11.9511,0,18.1,"0",0.659,5.608,100,1.2852,24,666,20.2,332.09,12.13,27.9
7.40389,0,18.1,"0",0.597,5.617,97.9,1.4547,24,666,20.2,314.64,26.4,17.2
14.4383,0,18.1,"0",0.597,6.852,100,1.4655,24,666,20.2,179.36,19.78,27.5
51.1358,0,18.1,"0",0.597,5.757,100,1.413,24,666,20.2,2.6,10.11,15
14.0507,0,18.1,"0",0.597,6.657,100,1.5275,24,666,20.2,35.05,21.22,17.2
18.811,0,18.1,"0",0.597,4.628,100,1.5539,24,666,20.2,28.79,34.37,17.9
28.6558,0,18.1,"0",0.597,5.155,100,1.5894,24,666,20.2,210.97,20.08,16.3
45.7461,0,18.1,"0",0.693,4.519,100,1.6582,24,666,20.2,88.27,36.98,7
18.0846,0,18.1,"0",0.679,6.434,100,1.8347,24,666,20.2,27.25,29.05,7.2
10.8342,0,18.1,"0",0.679,6.782,90.8,1.8195,24,666,20.2,21.57,25.79,7.5
25.9406,0,18.1,"0",0.679,5.304,89.1,1.6475,24,666,20.2,127.36,26.64,10.4
73.5341,0,18.1,"0",0.679,5.957,100,1.8026,24,666,20.2,16.45,20.62,8.8
11.8123,0,18.1,"0",0.718,6.824,76.5,1.794,24,666,20.2,48.45,22.74,8.4
11.0874,0,18.1,"0",0.718,6.411,100,1.8589,24,666,20.2,318.75,15.02,16.7
7.02259,0,18.1,"0",0.718,6.006,95.3,1.8746,24,666,20.2,319.98,15.7,14.2
12.0482,0,18.1,"0",0.614,5.648,87.6,1.9512,24,666,20.2,291.55,14.1,20.8
7.05042,0,18.1,"0",0.614,6.103,85.1,2.0218,24,666,20.2,2.52,23.29,13.4
8.79212,0,18.1,"0",0.584,5.565,70.6,2.0635,24,666,20.2,3.65,17.16,11.7
15.8603,0,18.1,"0",0.679,5.896,95.4,1.9096,24,666,20.2,7.68,24.39,8.3
12.2472,0,18.1,"0",0.584,5.837,59.7,1.9976,24,666,20.2,24.65,15.69,10.2
37.6619,0,18.1,"0",0.679,6.202,78.7,1.8629,24,666,20.2,18.82,14.52,10.9
7.36711,0,18.1,"0",0.679,6.193,78.1,1.9356,24,666,20.2,96.73,21.52,11
9.33889,0,18.1,"0",0.679,6.38,95.6,1.9682,24,666,20.2,60.72,24.08,9.5
8.49213,0,18.1,"0",0.584,6.348,86.1,2.0527,24,666,20.2,83.45,17.64,14.5
10.0623,0,18.1,"0",0.584,6.833,94.3,2.0882,24,666,20.2,81.33,19.69,14.1
6.44405,0,18.1,"0",0.584,6.425,74.8,2.2004,24,666,20.2,97.95,12.03,16.1
5.58107,0,18.1,"0",0.713,6.436,87.9,2.3158,24,666,20.2,100.19,16.22,14.3
13.9134,0,18.1,"0",0.713,6.208,95,2.2222,24,666,20.2,100.63,15.17,11.7
11.1604,0,18.1,"0",0.74,6.629,94.6,2.1247,24,666,20.2,109.85,23.27,13.4
14.4208,0,18.1,"0",0.74,6.461,93.3,2.0026,24,666,20.2,27.49,18.05,9.6
15.1772,0,18.1,"0",0.74,6.152,100,1.9142,24,666,20.2,9.32,26.45,8.7
13.6781,0,18.1,"0",0.74,5.935,87.9,1.8206,24,666,20.2,68.95,34.02,8.4
9.39063,0,18.1,"0",0.74,5.627,93.9,1.8172,24,666,20.2,396.9,22.88,12.8
22.0511,0,18.1,"0",0.74,5.818,92.4,1.8662,24,666,20.2,391.45,22.11,10.5
9.72418,0,18.1,"0",0.74,6.406,97.2,2.0651,24,666,20.2,385.96,19.52,17.1
5.66637,0,18.1,"0",0.74,6.219,100,2.0048,24,666,20.2,395.69,16.59,18.4
9.96654,0,18.1,"0",0.74,6.485,100,1.9784,24,666,20.2,386.73,18.85,15.4
12.8023,0,18.1,"0",0.74,5.854,96.6,1.8956,24,666,20.2,240.52,23.79,10.8
10.6718,0,18.1,"0",0.74,6.459,94.8,1.9879,24,666,20.2,43.06,23.98,11.8
6.28807,0,18.1,"0",0.74,6.341,96.4,2.072,24,666,20.2,318.01,17.79,14.9
9.92485,0,18.1,"0",0.74,6.251,96.6,2.198,24,666,20.2,388.52,16.44,12.6
9.32909,0,18.1,"0",0.713,6.185,98.7,2.2616,24,666,20.2,396.9,18.13,14.1
7.52601,0,18.1,"0",0.713,6.417,98.3,2.185,24,666,20.2,304.21,19.31,13
6.71772,0,18.1,"0",0.713,6.749,92.6,2.3236,24,666,20.2,0.32,17.44,13.4
5.44114,0,18.1,"0",0.713,6.655,98.2,2.3552,24,666,20.2,355.29,17.73,15.2
5.09017,0,18.1,"0",0.713,6.297,91.8,2.3682,24,666,20.2,385.09,17.27,16.1
8.24809,0,18.1,"0",0.713,7.393,99.3,2.4527,24,666,20.2,375.87,16.74,17.8
9.51363,0,18.1,"0",0.713,6.728,94.1,2.4961,24,666,20.2,6.68,18.71,14.9
4.75237,0,18.1,"0",0.713,6.525,86.5,2.4358,24,666,20.2,50.92,18.13,14.1
4.66883,0,18.1,"0",0.713,5.976,87.9,2.5806,24,666,20.2,10.48,19.01,12.7
8.20058,0,18.1,"0",0.713,5.936,80.3,2.7792,24,666,20.2,3.5,16.94,13.5
7.75223,0,18.1,"0",0.713,6.301,83.7,2.7831,24,666,20.2,272.21,16.23,14.9
6.80117,0,18.1,"0",0.713,6.081,84.4,2.7175,24,666,20.2,396.9,14.7,20
4.81213,0,18.1,"0",0.713,6.701,90,2.5975,24,666,20.2,255.23,16.42,16.4
3.69311,0,18.1,"0",0.713,6.376,88.4,2.5671,24,666,20.2,391.43,14.65,17.7
6.65492,0,18.1,"0",0.713,6.317,83,2.7344,24,666,20.2,396.9,13.99,19.5
5.82115,0,18.1,"0",0.713,6.513,89.9,2.8016,24,666,20.2,393.82,10.29,20.2
7.83932,0,18.1,"0",0.655,6.209,65.4,2.9634,24,666,20.2,396.9,13.22,21.4
3.1636,0,18.1,"0",0.655,5.759,48.2,3.0665,24,666,20.2,334.4,14.13,19.9
3.77498,0,18.1,"0",0.655,5.952,84.7,2.8715,24,666,20.2,22.01,17.15,19
4.42228,0,18.1,"0",0.584,6.003,94.5,2.5403,24,666,20.2,331.29,21.32,19.1
15.5757,0,18.1,"0",0.58,5.926,71,2.9084,24,666,20.2,368.74,18.13,19.1
13.0751,0,18.1,"0",0.58,5.713,56.7,2.8237,24,666,20.2,396.9,14.76,20.1
4.34879,0,18.1,"0",0.58,6.167,84,3.0334,24,666,20.2,396.9,16.29,19.9
4.03841,0,18.1,"0",0.532,6.229,90.7,3.0993,24,666,20.2,395.33,12.87,19.6
3.56868,0,18.1,"0",0.58,6.437,75,2.8965,24,666,20.2,393.37,14.36,23.2
4.64689,0,18.1,"0",0.614,6.98,67.6,2.5329,24,666,20.2,374.68,11.66,29.8
8.05579,0,18.1,"0",0.584,5.427,95.4,2.4298,24,666,20.2,352.58,18.14,13.8
6.39312,0,18.1,"0",0.584,6.162,97.4,2.206,24,666,20.2,302.76,24.1,13.3
4.87141,0,18.1,"0",0.614,6.484,93.6,2.3053,24,666,20.2,396.21,18.68,16.7
15.0234,0,18.1,"0",0.614,5.304,97.3,2.1007,24,666,20.2,349.48,24.91,12
10.233,0,18.1,"0",0.614,6.185,96.7,2.1705,24,666,20.2,379.7,18.03,14.6
14.3337,0,18.1,"0",0.614,6.229,88,1.9512,24,666,20.2,383.32,13.11,21.4
5.82401,0,18.1,"0",0.532,6.242,64.7,3.4242,24,666,20.2,396.9,10.74,23
5.70818,0,18.1,"0",0.532,6.75,74.9,3.3317,24,666,20.2,393.07,7.74,23.7
5.73116,0,18.1,"0",0.532,7.061,77,3.4106,24,666,20.2,395.28,7.01,25
2.81838,0,18.1,"0",0.532,5.762,40.3,4.0983,24,666,20.2,392.92,10.42,21.8
2.37857,0,18.1,"0",0.583,5.871,41.9,3.724,24,666,20.2,370.73,13.34,20.6
3.67367,0,18.1,"0",0.583,6.312,51.9,3.9917,24,666,20.2,388.62,10.58,21.2
5.69175,0,18.1,"0",0.583,6.114,79.8,3.5459,24,666,20.2,392.68,14.98,19.1
4.83567,0,18.1,"0",0.583,5.905,53.2,3.1523,24,666,20.2,388.22,11.45,20.6
0.15086,0,27.74,"0",0.609,5.454,92.7,1.8209,4,711,20.1,395.09,18.06,15.2
0.18337,0,27.74,"0",0.609,5.414,98.3,1.7554,4,711,20.1,344.05,23.97,7
0.20746,0,27.74,"0",0.609,5.093,98,1.8226,4,711,20.1,318.43,29.68,8.1
0.10574,0,27.74,"0",0.609,5.983,98.8,1.8681,4,711,20.1,390.11,18.07,13.6
0.11132,0,27.74,"0",0.609,5.983,83.5,2.1099,4,711,20.1,396.9,13.35,20.1
0.17331,0,9.69,"0",0.585,5.707,54,2.3817,6,391,19.2,396.9,12.01,21.8
0.27957,0,9.69,"0",0.585,5.926,42.6,2.3817,6,391,19.2,396.9,13.59,24.5
0.17899,0,9.69,"0",0.585,5.67,28.8,2.7986,6,391,19.2,393.29,17.6,23.1
0.2896,0,9.69,"0",0.585,5.39,72.9,2.7986,6,391,19.2,396.9,21.14,19.7
0.26838,0,9.69,"0",0.585,5.794,70.6,2.8927,6,391,19.2,396.9,14.1,18.3
0.23912,0,9.69,"0",0.585,6.019,65.3,2.4091,6,391,19.2,396.9,12.92,21.2
0.17783,0,9.69,"0",0.585,5.569,73.5,2.3999,6,391,19.2,395.77,15.1,17.5
0.22438,0,9.69,"0",0.585,6.027,79.7,2.4982,6,391,19.2,396.9,14.33,16.8
0.06263,0,11.93,"0",0.573,6.593,69.1,2.4786,1,273,21,391.99,9.67,22.4
0.04527,0,11.93,"0",0.573,6.12,76.7,2.2875,1,273,21,396.9,9.08,20.6
0.06076,0,11.93,"0",0.573,6.976,91,2.1675,1,273,21,396.9,5.64,23.9
0.10959,0,11.93,"0",0.573,6.794,89.3,2.3889,1,273,21,393.45,6.48,22
0.04741,0,11.93,"0",0.573,6.03,80.8,2.505,1,273,21,396.9,7.88,11.9
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [GTS2] - CNN with GTSRB dataset - First convolutions
<!-- DESC --> Episode 2 : First convolutions and first results
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Recognizing traffic signs
- Understand the **principles** and **architecture** of a **convolutional neural network** for image classification
The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from about 40 classes.
The final aim is to recognise them !
Description is available there : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset
## What we're going to do :
- Read H5 dataset
- Build a model
- Train the model
- Evaluate the model
## Step 1 - Import and init
%% Cell type:code id: tags:
``` python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import TensorBoard
import numpy as np
import matplotlib.pyplot as plt
import h5py
import os,time,sys
from importlib import reload
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Cell type:markdown id: tags:
## Step 2 - Load dataset
We're going to retrieve a previously recorded dataset.
For example: set-24x24-L
%% Cell type:code id: tags:
``` python
%%time
def read_dataset(name):
'''Reads h5 dataset from ./data
Arguments: dataset name, without .h5
Returns: x_train,y_train,x_test,y_test data'''
# ---- Read dataset
filename='./data/'+name+'.h5'
with h5py.File(filename) as f:
x_train = f['x_train'][:]
y_train = f['y_train'][:]
x_test = f['x_test'][:]
y_test = f['y_test'][:]
# ---- done
print('Dataset "{}" is loaded. ({:.1f} Mo)\n'.format(name,os.path.getsize(filename)/(1024*1024)))
return x_train,y_train,x_test,y_test
x_train,y_train,x_test,y_test = read_dataset('set-24x24-L')
```
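%% Cell type:markdown id: tags:
If you are curious about what the HDF5 file actually contains, you can list its datasets directly. This is just a quick sketch, assuming `set-24x24-L.h5` is present in `./data` as above:
%% Cell type:code id: tags:
``` python
# Peek at the HDF5 file : one dataset per array, with its shape and dtype
with h5py.File('./data/set-24x24-L.h5','r') as f:
    for name, dataset in f.items():
        print('{:10s} {:18s} {}'.format(name, str(dataset.shape), dataset.dtype))
```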
%% Cell type:markdown id: tags:
## Step 3 - Have a look to the dataset
We take a quick look as we go by...
%% Cell type:code id: tags:
``` python
print("x_train : ", x_train.shape)
print("y_train : ", y_train.shape)
print("x_test : ", x_test.shape)
print("y_test : ", y_test.shape)
ooo.plot_images(x_train, y_train, range(12), columns=6, x_size=2, y_size=2)
ooo.plot_images(x_train, y_train, range(36), columns=12, x_size=1, y_size=1)
```
%% Cell type:markdown id: tags:
## Step 4 - Create model
We will now build a model and train it...
Some models :
%% Cell type:code id: tags:
``` python
# A basic model
#
def get_model_v1(lx,ly,lz):
model = keras.models.Sequential()
model.add( keras.layers.Conv2D(96, (3,3), activation='relu', input_shape=(lx,ly,lz)))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(192, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Flatten())
model.add( keras.layers.Dense(1500, activation='relu'))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Dense(43, activation='softmax'))
return model
# A more sophisticated model
#
def get_model_v2(lx,ly,lz):
model = keras.models.Sequential()
model.add( keras.layers.Conv2D(64, (3, 3), padding='same', input_shape=(lx,ly,lz), activation='relu'))
model.add( keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add( keras.layers.Conv2D(128, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(256, (3, 3), padding='same',activation='relu'))
model.add( keras.layers.Conv2D(256, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Flatten())
model.add( keras.layers.Dense(512, activation='relu'))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Dense(43, activation='softmax'))
return model
# My sophisticated model, but small and fast
#
def get_model_v3(lx,ly,lz):
model = keras.models.Sequential()
model.add( keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(lx,ly,lz)))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Conv2D(128, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Conv2D(256, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Flatten())
model.add( keras.layers.Dense(1152, activation='relu'))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Dense(43, activation='softmax'))
return model
```
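%% Cell type:markdown id: tags:
To get a feel for the relative size of these architectures, you can instantiate them and count their parameters. A minimal sketch, assuming 24x24 grayscale inputs as in the dataset loaded above (note that get_model_v3 chains four conv+pool blocks, so it needs larger images such as the 48x48 sets):
%% Cell type:code id: tags:
``` python
# Rough size comparison (24x24 grayscale input)
# get_model_v3 is omitted here : with 24x24 inputs its feature map collapses to 1x1 before its last convolution
for name, builder in [('v1', get_model_v1), ('v2', get_model_v2)]:
    m = builder(24, 24, 1)
    print('get_model_{} : {:,d} parameters'.format(name, m.count_params()))
```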
%% Cell type:markdown id: tags:
## Step 5 - Train the model
**Get the shape of my data :**
%% Cell type:code id: tags:
``` python
(n,lx,ly,lz) = x_train.shape
print("Images of the dataset have this folowing shape : ",(lx,ly,lz))
```
%% Cell type:markdown id: tags:
**Get and compile a model, with the data shape :**
%% Cell type:code id: tags:
``` python
model = get_model_v1(lx,ly,lz)
model.summary()
model.compile(optimizer = 'adam',
loss = 'sparse_categorical_crossentropy',
metrics = ['accuracy'])
```
%% Cell type:markdown id: tags:
**Train it :**
%% Cell type:code id: tags:
``` python
%%time
batch_size = 64
epochs = 5
# ---- Shuffle train data
x_train,y_train=ooo.shuffle_np_dataset(x_train,y_train)
# ---- Train
history = model.fit( x_train, y_train,
batch_size = batch_size,
epochs = epochs,
verbose = 1,
validation_data = (x_test, y_test))
```
%% Cell type:markdown id: tags:
**Evaluate it :**
%% Cell type:code id: tags:
``` python
max_val_accuracy = max(history.history["val_accuracy"])
print("Max validation accuracy is : {:.4f}".format(max_val_accuracy))
```
%% Cell type:code id: tags:
``` python
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss : {:5.4f}'.format(score[0]))
print('Test accuracy : {:5.4f}'.format(score[1]))
```
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [GTS3] - CNN with GTSRB dataset - Monitoring
<!-- DESC --> Episode 3: Monitoring and analysing training, managing checkpoints
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- **Understand** what happens during the **training** process
- Implement **monitoring**, **backup** and **recovery** solutions
The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from about 40 classes.
The final aim is to recognise them !
Description is available there : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset
## What we're going to do :
- Monitoring and understanding our model training
- Add recovery points
- Analyze the results
- Restore and run recovery points
## Step 1 - Import and init
%% Cell type:code id: tags:
``` python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import TensorBoard
import numpy as np
import h5py
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sn
import os, sys, time, random
from importlib import reload
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Cell type:markdown id: tags:
## Step 2 - Load dataset
The dataset is one of those saved previously: RGB25, RGB35, L25, L35, etc.
First of all, we're going to use a small dataset : **set-24x24-L**
(with a GPU, it only takes 35'' compared to more than 5' with a CPU !)
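%% Cell type:markdown id: tags:
Before loading, you can check which prepared datasets are actually available. A minimal sketch, assuming they were generated into `./data` by the previous episode:
%% Cell type:code id: tags:
``` python
import glob
# List the prepared datasets and their size on disk
for filename in sorted(glob.glob('./data/set-*.h5')):
    print('{:35s} {:7.1f} Mo'.format(filename, os.path.getsize(filename)/(1024*1024)))
```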
%% Cell type:code id: tags:
``` python
%%time
def read_dataset(name):
'''Reads h5 dataset from ./data
Arguments: dataset name, without .h5
Returns: x_train,y_train,x_test,y_test data'''
# ---- Read dataset
filename='./data/'+name+'.h5'
with h5py.File(filename) as f:
x_train = f['x_train'][:]
y_train = f['y_train'][:]
x_test = f['x_test'][:]
y_test = f['y_test'][:]
x_meta = f['x_meta'][:]
y_meta = f['y_meta'][:]
# ---- done
print('Dataset "{}" is loaded. ({:.1f} Mo)\n'.format(name,os.path.getsize(filename)/(1024*1024)))
return x_train,y_train,x_test,y_test,x_meta,y_meta
x_train,y_train,x_test,y_test,x_meta,y_meta = read_dataset('set-24x24-L')
```
%% Cell type:markdown id: tags:
## Step 3 - Have a look to the dataset
Note: Data must be reshaped for matplotlib (see the short sketch after the next cell)
%% Cell type:code id: tags:
``` python
print("x_train : ", x_train.shape)
print("y_train : ", y_train.shape)
print("x_test : ", x_test.shape)
print("y_test : ", y_test.shape)
ooo.plot_images(x_train, y_train, range(12), columns=6, x_size=2, y_size=2)
ooo.plot_images(x_train, y_train, range(36), columns=12, x_size=1, y_size=1)
```
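%% Cell type:markdown id: tags:
What does "reshaped for matplotlib" mean here ? The images are stored as (lx,ly,1) tensors, while plt.imshow expects (lx,ly) for a grayscale image. A minimal sketch:
%% Cell type:code id: tags:
``` python
# Show one image directly with matplotlib : drop the trailing channel dimension
img = x_train[0]
print('Stored shape :', img.shape)
plt.imshow(img.reshape(img.shape[0], img.shape[1]), cmap='binary')
plt.title('Class : {}'.format(y_train[0]))
plt.axis('off')
plt.show()
```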
%% Cell type:markdown id: tags:
## Step 4 - Create model
We will now build a model and train it...
Some models...
%% Cell type:code id: tags:
``` python
# A basic model
#
def get_model_v1(lx,ly,lz):
model = keras.models.Sequential()
model.add( keras.layers.Conv2D(96, (3,3), activation='relu', input_shape=(lx,ly,lz)))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(192, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Flatten())
model.add( keras.layers.Dense(1500, activation='relu'))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Dense(43, activation='softmax'))
return model
```
%% Cell type:markdown id: tags:
## Step 5 - Prepare callbacks
We will add 2 callbacks :
- **TensorBoard**
Training logs, which can be visualised with Tensorboard.
`#tensorboard --logdir ./run/logs`
IMPORTANT : Restart TensorBoard for each new run
- **Model backup**
It is possible to save the model every n epochs or at each improvement.
The model can be saved completely or only partially (weights only) - a weights-only variant is sketched after the callback cell below.
For the full format, we can use the HDF5 format.
%% Cell type:raw id: tags:
%%bash
# To clean old logs and saved model, run this cell
#
/bin/rm -r ./run/logs 2>/dev/null
/bin/rm -r ./run/models 2>/dev/null
/bin/mkdir -p -m 755 ./run/logs
/bin/mkdir -p -m 755 ./run/models
echo -e "Reset directories : ./run/logs and ./run/models ."
%% Cell type:code id: tags:
``` python
ooo.mkdir('./run/models')
ooo.mkdir('./run/logs')
# ---- Callback tensorboard
log_dir = "./run/logs/tb_" + ooo.tag_now()
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
# ---- Callback ModelCheckpoint - Save best model
save_dir = "./run/models/best-model.h5"
bestmodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, monitor='accuracy', save_best_only=True)
# ---- Callback ModelCheckpoint - Save the model periodically
save_dir = "./run/models/model-{epoch:04d}.h5"
savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, save_freq=2000*5)
```
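%% Cell type:markdown id: tags:
The checkpoints above save the full model as an HDF5 file. As mentioned earlier, ModelCheckpoint can also save the weights only, which gives smaller files but requires rebuilding the architecture before reloading. A minimal sketch (the paths are illustrative and this callback is not used below):
%% Cell type:code id: tags:
``` python
# Weights-only checkpoint : smaller files, but the architecture must be rebuilt before loading
weights_callback = tf.keras.callbacks.ModelCheckpoint(filepath='./run/models/weights-{epoch:04d}.h5',
                                                      verbose=0,
                                                      save_weights_only=True)
# To reload later : model = get_model_v1(lx,ly,lz) ; model.load_weights('./run/models/weights-0010.h5')
```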
%% Cell type:markdown id: tags:
## Step 6 - Train the model
**Get the shape of my data :**
%% Cell type:code id: tags:
``` python
(n,lx,ly,lz) = x_train.shape
print("Images of the dataset have this folowing shape : ",(lx,ly,lz))
```
%% Cell type:markdown id: tags:
**Get and compile a model, with the data shape :**
%% Cell type:code id: tags:
``` python
model = get_model_v1(lx,ly,lz)
# model.summary()
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
```
%% Cell type:markdown id: tags:
**Train it :**
Note: The training curve is visible in real time with Tensorboard :
`#tensorboard --logdir ./run/logs`
%% Cell type:code id: tags:
``` python
%%time
batch_size = 64
epochs = 30
# ---- Shuffle train data
x_train,y_train=ooo.shuffle_np_dataset(x_train,y_train)
# ---- Train
# Note: To be faster in our example, we can take only 2000 values
#
history = model.fit( x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test),
callbacks=[tensorboard_callback, bestmodel_callback, savemodel_callback] )
model.save('./run/models/last-model.h5')
```
%% Cell type:markdown id: tags:
**Evaluate it :**
%% Cell type:code id: tags:
``` python
max_val_accuracy = max(history.history["val_accuracy"])
print("Max validation accuracy is : {:.4f}".format(max_val_accuracy))
```
%% Cell type:code id: tags:
``` python
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss : {:5.4f}'.format(score[0]))
print('Test accuracy : {:5.4f}'.format(score[1]))
```
%% Cell type:markdown id: tags:
## Step 7 - History
model.fit() returns the training history (loss and metrics for each epoch)
%% Cell type:code id: tags:
``` python
ooo.plot_history(history)
```
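%% Cell type:markdown id: tags:
Under the hood, `history.history` is just a dict with one list of per-epoch values per metric, so you can also plot it directly. A minimal sketch:
%% Cell type:code id: tags:
``` python
# history.history maps each metric name to its per-epoch values
print(history.history.keys())
plt.plot(history.history['accuracy'],     label='train')
plt.plot(history.history['val_accuracy'], label='validation')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
```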
%% Cell type:markdown id: tags:
## Step 8 - Evaluation and confusion
%% Cell type:code id: tags:
``` python
y_pred = model.predict_classes(x_test)
conf_mat = confusion_matrix(y_test,y_pred, normalize="true", labels=range(43))
ooo.plot_confusion_matrix(conf_mat)
```
%% Cell type:markdown id: tags:
## Step 9 - Restore and evaluate
### 9.1 - List saved models :
%% Cell type:code id: tags:
``` python
!find ./run/models/
```
%% Cell type:markdown id: tags:
### 9.2 - Restore a model :
%% Cell type:code id: tags:
``` python
loaded_model = tf.keras.models.load_model('./run/models/best-model.h5')
# loaded_model.summary()
print("Loaded.")
```
%% Cell type:markdown id: tags:
### 9.3 - Evaluate it :
%% Cell type:code id: tags:
``` python
score = loaded_model.evaluate(x_test, y_test, verbose=0)
print('Test loss : {:5.4f}'.format(score[0]))
print('Test accuracy : {:5.4f}'.format(score[1]))
```
%% Cell type:markdown id: tags:
### 9.4 - Make a prediction :
%% Cell type:code id: tags:
``` python
# ---- Get a random image
#
i = random.randint(0,len(x_test)-1)
x,y = x_test[i], y_test[i]
# ---- Do prediction
#
predictions = loaded_model.predict( np.array([x]) )
# ---- A prediction is just the output layer
#
print("\nOutput layer from model is (x100) :\n")
with np.printoptions(precision=2, suppress=True, linewidth=95):
print(predictions*100)
# ---- Graphic visualisation
#
print("\nGraphically :\n")
plt.figure(figsize=(12,2))
plt.bar(range(43), predictions[0], align='center', alpha=0.5)
plt.ylabel('Probability')
plt.ylim((0,1))
plt.xlabel('Class')
plt.title('Traffic Sign prediction')
plt.show()
# ---- Predict class
#
p = np.argmax(predictions)
# ---- Show result
#
print("\nPrediction on the left, real stuff on the right :\n")
ooo.plot_images([x,x_meta[y]], [p,y], range(2), columns=3, x_size=3, y_size=2)
if p==y:
print("YEEES ! that's right!")
else:
print("oups, that's wrong ;-(")
```
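%% Cell type:markdown id: tags:
### 9.5 - Resume training from a recovery point
Since the periodic checkpoints contain the full model, including the optimizer state, training can be resumed from one of them. A minimal sketch, where the file name and epoch number are illustrative - adapt them to the files actually listed in 9.1:
%% Cell type:code id: tags:
``` python
# Restore a periodic checkpoint and continue training from where it stopped
# (file name and epoch number below are examples - see the list in 9.1)
resumed_model = tf.keras.models.load_model('./run/models/model-0005.h5')
resumed_model.fit( x_train, y_train,
                   batch_size      = 64,
                   initial_epoch   = 5,     # epoch reached by the checkpoint
                   epochs          = 10,    # train for 5 more epochs
                   validation_data = (x_test, y_test))
```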
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [GTS4] - CNN with GTSRB dataset - Data augmentation
<!-- DESC --> Episode 4: Improving the results with data augmentation
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Trying to improve training by **enhancing the data**
- Using Keras' **data augmentation utilities**, finding their limits...
The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from about 40 classes.
The final aim is to recognise them !
Description is available there : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset
## What we're going to do :
- Increase and improve the training dataset
- Identify the limits of these tools
## Step 1 - Import and init
%% Cell type:code id: tags:
``` python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import TensorBoard
import numpy as np
import h5py
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sn
import os, sys, time, random
from importlib import reload
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Cell type:markdown id: tags:
## Step 2 - Dataset loader
The dataset is one of those saved previously: RGB25, RGB35, L25, L35, etc.
First of all, we're going to use a small dataset : **set-24x24-L**
(with a GPU, it only takes 35'' compared to more than 5' with a CPU !)
%% Cell type:code id: tags:
``` python
%%time
def read_dataset(name):
'''Reads h5 dataset from ./data
Arguments: dataset name, without .h5
Returns: x_train,y_train,x_test,y_test data'''
# ---- Read dataset
filename='./data/'+name+'.h5'
with h5py.File(filename) as f:
x_train = f['x_train'][:]
y_train = f['y_train'][:]
x_test = f['x_test'][:]
y_test = f['y_test'][:]
# ---- done
print('Dataset "{}" is loaded. ({:.1f} Mo)\n'.format(name,os.path.getsize(filename)/(1024*1024)))
return x_train,y_train,x_test,y_test
```
%% Cell type:markdown id: tags:
## Step 3 - Models
We will now build a model and train it...
This is my model ;-)
%% Cell type:code id: tags:
``` python
# A basic model
#
def get_model_v1(lx,ly,lz):
model = keras.models.Sequential()
model.add( keras.layers.Conv2D(96, (3,3), activation='relu', input_shape=(lx,ly,lz)))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(192, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Flatten())
model.add( keras.layers.Dense(1500, activation='relu'))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Dense(43, activation='softmax'))
return model
```
%% Cell type:markdown id: tags:
## Step 4 - Callbacks
We prepare 2 kinds of callbacks : TensorBoard and model backup
%% Cell type:code id: tags:
``` python
%%bash
# To clean old logs and saved model, run this cell
#
/bin/rm -r ./run/logs 2>/dev/null
/bin/rm -r ./run/models 2>/dev/null
/bin/mkdir -p -m 755 ./run/logs
/bin/mkdir -p -m 755 ./run/models
echo -e "Reset directories : ./run/logs and ./run/models ."
```
%% Cell type:code id: tags:
``` python
# ---- Callback tensorboard
log_dir = "./run/logs/tb_" + ooo.tag_now()
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
# ---- Callback ModelCheckpoint - Save best model
save_dir = "./run/models/best-model.h5"
bestmodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, monitor='accuracy', save_best_only=True)
# ---- Callback ModelCheckpoint - Save the model periodically
save_dir = "./run/models/model-{epoch:04d}.h5"
savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, save_freq=2000*5)
```
%% Cell type:markdown id: tags:
## Step 5 - Load and prepare dataset
### 5.1 - Load
%% Cell type:code id: tags:
``` python
x_train,y_train,x_test,y_test = read_dataset('set-48x48-L-LHE')
```
%% Cell type:markdown id: tags:
### 5.2 - Data augmentation
%% Cell type:code id: tags:
``` python
datagen = keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,
featurewise_std_normalization=False,
width_shift_range=0.1,
height_shift_range=0.1,
zoom_range=0.2,
shear_range=0.1,
rotation_range=10.)
datagen.fit(x_train)
```
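%% Cell type:markdown id: tags:
To see, and sanity-check, what the generator produces, you can preview a few augmented samples - over-aggressive settings show up immediately. A minimal sketch:
%% Cell type:code id: tags:
``` python
# Preview a batch of augmented images (48x48 grayscale here)
x_batch, y_batch = next(datagen.flow(x_train, y_train, batch_size=8, shuffle=False))
plt.figure(figsize=(12,2))
for i in range(8):
    plt.subplot(1, 8, i+1)
    plt.imshow(x_batch[i].squeeze(), cmap='binary')
    plt.title('{}'.format(int(y_batch[i])))
    plt.axis('off')
plt.show()
```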
%% Cell type:markdown id: tags:
## Step 6 - Train the model
**Get the shape of my data :**
%% Cell type:code id: tags:
``` python
(n,lx,ly,lz) = x_train.shape
print("Images of the dataset have this folowing shape : ",(lx,ly,lz))
```
%% Cell type:markdown id: tags:
**Get and compile a model, with the data shape :**
%% Cell type:code id: tags:
``` python
model = get_model_v1(lx,ly,lz)     # only get_model_v1 is defined in this notebook (see Step 3)
# model.summary()
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
```
%% Cell type:markdown id: tags:
**Train it :**
Note : The training curve is visible in real time with Tensorboard :
`#tensorboard --logdir ./run/logs`
%% Cell type:code id: tags:
``` python
%%time
batch_size = 64
epochs = 30
# ---- Shuffle train data
#x_train,y_train=ooo.shuffle_np_dataset(x_train,y_train)
# ---- Train
#
history = model.fit( datagen.flow(x_train, y_train, batch_size=batch_size),
steps_per_epoch = int(x_train.shape[0]/batch_size),
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test),
callbacks=[tensorboard_callback, bestmodel_callback, savemodel_callback] )
model.save('./run/models/last-model.h5')
```
%% Cell type:markdown id: tags:
**Evaluate it :**
%% Cell type:code id: tags:
``` python
max_val_accuracy = max(history.history["val_accuracy"])
print("Max validation accuracy is : {:.4f}".format(max_val_accuracy))
```
%% Cell type:code id: tags:
``` python
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss : {:5.4f}'.format(score[0]))
print('Test accuracy : {:5.4f}'.format(score[1]))
```
%% Cell type:markdown id: tags:
## Step 7 - History
model.fit() returns the training history (loss and metrics for each epoch)
%% Cell type:code id: tags:
``` python
ooo.plot_history(history)
```
%% Cell type:markdown id: tags:
## Step 8 - Evaluate best model
%% Cell type:markdown id: tags:
### 8.1 - Restore best model :
%% Cell type:code id: tags:
``` python
loaded_model = tf.keras.models.load_model('./run/models/best-model.h5')
# loaded_model.summary()
print("Loaded.")
```
%% Cell type:markdown id: tags:
### 8.2 - Evaluate it :
%% Cell type:code id: tags:
``` python
score = loaded_model.evaluate(x_test, y_test, verbose=0)
print('Test loss : {:5.4f}'.format(score[0]))
print('Test accuracy : {:5.4f}'.format(score[1]))
```
%% Cell type:markdown id: tags:
**Plot confusion matrix**
%% Cell type:code id: tags:
``` python
y_pred = model.predict_classes(x_test)
conf_mat = confusion_matrix(y_test,y_pred, normalize="true", labels=range(43))
ooo.plot_confusion_matrix(conf_mat)
```
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [GTS5] - CNN with GTSRB dataset - Full convolutions
<!-- DESC --> Episode 5: A lot of models, a lot of datasets and a lot of results.
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Try multiple solutions
- Design a generic and batch-usable code
The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from about 40 classes.
The final aim is to recognise them !
Description is available there : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset
## What we're going to do :
Our main steps:
- Try n models with n datasets
- Save a Pandas/h5 report
- Write code that can also be run in batch mode
## Step 1 - Import
%% Cell type:code id: tags:
``` python
import tensorflow as tf
from tensorflow import keras
import numpy as np
import h5py
import os,time,json
import random
from IPython.display import display
VERSION='1.6'
```
%% Cell type:markdown id: tags:
## Step 2 - Init and start
%% Cell type:code id: tags:
``` python
# ---- Where am I ?
now = time.strftime("%A %d %B %Y - %Hh%Mm%Ss")
here = os.getcwd()
random.seed(time.time())
tag_id = '{:06}'.format(random.randint(0,99999))
# ---- Who am I ?
if 'OAR_JOB_ID' in os.environ:
oar_id=os.environ['OAR_JOB_ID']
else:
oar_id='???'
print('\nFull Convolutions Notebook')
print(' Version : {}'.format(VERSION))
print(' Now is : {}'.format(now))
print(' OAR id : {}'.format(oar_id))
print(' Tag id : {}'.format(tag_id))
print(' Working directory : {}'.format(here))
print(' TensorFlow version :',tf.__version__)
print(' Keras version :',tf.keras.__version__)
print(' for tensorboard : --logdir {}/run/logs_{}'.format(here,tag_id))
```
%% Cell type:markdown id: tags:
## Step 3 - Dataset loading
%% Cell type:code id: tags:
``` python
def read_dataset(name):
'''Reads h5 dataset from ./data
Arguments: dataset name, without .h5
Returns: x_train,y_train,x_test,y_test data'''
# ---- Read dataset
filename='./data/'+name+'.h5'
with h5py.File(filename,'r') as f:
x_train = f['x_train'][:]
y_train = f['y_train'][:]
x_test = f['x_test'][:]
y_test = f['y_test'][:]
return x_train,y_train,x_test,y_test
```
%% Cell type:markdown id: tags:
## Step 4 - Models collection
%% Cell type:code id: tags:
``` python
# A basic model
#
def get_model_v1(lx,ly,lz):
model = keras.models.Sequential()
model.add( keras.layers.Conv2D(96, (3,3), activation='relu', input_shape=(lx,ly,lz)))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(192, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D((2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Flatten())
model.add( keras.layers.Dense(1500, activation='relu'))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Dense(43, activation='softmax'))
return model
# A more sophisticated model
#
def get_model_v2(lx,ly,lz):
model = keras.models.Sequential()
model.add( keras.layers.Conv2D(64, (3, 3), padding='same', input_shape=(lx,ly,lz), activation='relu'))
model.add( keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add( keras.layers.Conv2D(128, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(256, (3, 3), padding='same',activation='relu'))
model.add( keras.layers.Conv2D(256, (3, 3), activation='relu'))
model.add( keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Flatten())
model.add( keras.layers.Dense(512, activation='relu'))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Dense(43, activation='softmax'))
return model
def get_model_v3(lx,ly,lz):
model = keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(32, (5, 5), padding='same', activation='relu', input_shape=(lx,ly,lz)))
model.add(tf.keras.layers.BatchNormalization(axis=-1))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Conv2D(64, (5, 5), padding='same', activation='relu'))
model.add(tf.keras.layers.BatchNormalization(axis=-1))
model.add(tf.keras.layers.Conv2D(128, (5, 5), padding='same', activation='relu'))
model.add(tf.keras.layers.BatchNormalization(axis=-1))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.4))
model.add(tf.keras.layers.Dense(43, activation='softmax'))
return model
```
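%% Cell type:markdown id: tags:
Before launching a long batch run, it can be worth checking that a model builds as expected. A minimal sketch, assuming 24x24 grayscale images (the `set-24x24-L` datasets, hence lz=1) :
%% Cell type:code id: tags:
``` python
# ---- Quick check (sketch) : build model v1 for 24x24x1 images and show its size
model = get_model_v1(24, 24, 1)
model.summary()
```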
%% Cell type:markdown id: tags:
## Step 5 - Multiple datasets, multiple models ;-)
%% Cell type:code id: tags:
``` python
def multi_run(datasets, models, datagen=None,
train_size=1, test_size=1, batch_size=64, epochs=16,
verbose=0, extension_dir='last'):
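    '''Trains each model of `models` on each dataset of `datasets`.
    Arguments : datasets (list of dataset names), models (dict name -> builder function),
                optional ImageDataGenerator, train/test size ratios, batch_size, epochs.
    Returns   : dict with, for each dataset, its name and size (MB), and for each model
                the best validation accuracy (%) and the training duration (s).'''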
# ---- Logs and models dir
#
os.makedirs('./run/logs_{}'.format(extension_dir), mode=0o750, exist_ok=True)
os.makedirs('./run/models_{}'.format(extension_dir), mode=0o750, exist_ok=True)
# ---- Columns of output
#
output={}
output['Dataset']=[]
output['Size'] =[]
for m in models:
output[m+'_Accuracy'] = []
output[m+'_Duration'] = []
# ---- Let's go
#
for d_name in datasets:
print("\nDataset : ",d_name)
# ---- Read dataset
x_train,y_train,x_test,y_test = read_dataset(d_name)
d_size=os.path.getsize('./data/'+d_name+'.h5')/(1024*1024)
output['Dataset'].append(d_name)
output['Size'].append(d_size)
# ---- Get the shape
(n,lx,ly,lz) = x_train.shape
n_train = int(x_train.shape[0]*train_size)
n_test = int(x_test.shape[0]*test_size)
# ---- For each model
for m_name,m_function in models.items():
print(" Run model {} : ".format(m_name), end='')
# ---- get model
try:
model=m_function(lx,ly,lz)
# ---- Compile it
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# ---- Callbacks tensorboard
log_dir = "./run/logs_{}/tb_{}_{}".format(extension_dir, d_name, m_name)
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
# ---- Callbacks bestmodel
save_dir = "./run/models_{}/model_{}_{}.h5".format(extension_dir, d_name, m_name)
bestmodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, monitor='accuracy', save_best_only=True)
# ---- Train
start_time = time.time()
                if datagen is None:
# ---- No data augmentation (datagen=None) --------------------------------------
history = model.fit(x_train[:n_train], y_train[:n_train],
batch_size = batch_size,
epochs = epochs,
verbose = verbose,
validation_data = (x_test[:n_test], y_test[:n_test]),
callbacks = [tensorboard_callback, bestmodel_callback])
else:
# ---- Data augmentation (datagen given) ----------------------------------------
datagen.fit(x_train)
history = model.fit(datagen.flow(x_train, y_train, batch_size=batch_size),
steps_per_epoch = int(n_train/batch_size),
epochs = epochs,
verbose = verbose,
validation_data = (x_test[:n_test], y_test[:n_test]),
callbacks = [tensorboard_callback, bestmodel_callback])
# ---- Result
end_time = time.time()
duration = end_time-start_time
accuracy = max(history.history["val_accuracy"])*100
#
output[m_name+'_Accuracy'].append(accuracy)
output[m_name+'_Duration'].append(duration)
print("Accuracy={:.2f} and Duration={:.2f})".format(accuracy,duration))
            except Exception:
output[m_name+'_Accuracy'].append('0')
output[m_name+'_Duration'].append('999')
print('-')
return output
```
%% Cell type:markdown id: tags:
## Step 6 - Run !
%% Cell type:code id: tags:
``` python
start_time = time.time()
print('\n---- Run','-'*50)
# --------- Datasets, models, and more.. -----------------------------------
#
# ---- For tests
# datasets = ['set-24x24-L', 'set-24x24-RGB']
# models = {'v1':get_model_v1, 'v4':get_model_v2}
# batch_size = 64
# epochs = 2
# train_size = 0.1
# test_size = 0.1
# with_datagen = False
# verbose = 0
#
# ---- All possibilities -> Run A
# datasets = ['set-24x24-L', 'set-24x24-RGB', 'set-48x48-L', 'set-48x48-RGB', 'set-24x24-L-LHE', 'set-24x24-RGB-HE', 'set-48x48-L-LHE', 'set-48x48-RGB-HE']
# models = {'v1':get_model_v1, 'v2':get_model_v2, 'v3':get_model_v3}
# batch_size = 64
# epochs = 16
# train_size = 1
# test_size = 1
# with_datagen = False
# verbose = 0
#
# ---- Data augmentation -> Run B
datasets = ['set-48x48-RGB']
models = {'v2':get_model_v2}
batch_size = 64
epochs = 20
train_size = 1
test_size = 1
with_datagen = True
verbose = 0
#
# ---------------------------------------------------------------------------
# ---- Data augmentation
#
if with_datagen :
datagen = keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,
featurewise_std_normalization=False,
width_shift_range=0.1,
height_shift_range=0.1,
zoom_range=0.2,
shear_range=0.1,
rotation_range=10.)
else:
datagen=None
# ---- Run
#
output = multi_run(datasets, models,
datagen=datagen,
train_size=train_size, test_size=test_size,
batch_size=batch_size, epochs=epochs,
verbose=verbose,
extension_dir=tag_id)
# ---- Save report
#
report={}
report['output']=output
report['description']='train_size={} test_size={} batch_size={} epochs={} data_aug={}'.format(train_size,test_size,batch_size,epochs,with_datagen)
report_name='./run/report_{}.json'.format(tag_id)
with open(report_name, 'w') as file:
json.dump(report, file)
print('\nReport saved as ',report_name)
end_time = time.time()
duration = end_time-start_time
print(f'Duration : {duration:.2f} s')
print('-'*59)
```
%% Cell type:markdown id: tags:
## Step 7 - That's all folks..
%% Cell type:code id: tags:
``` python
print('\n{}'.format(time.strftime("%A %-d %B %Y, %H:%M:%S")))
print("The work is done.\n")
```
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [GTS6] - CNN with GTSRB dataset - Full convolutions as a batch
<!-- DESC --> Episode 6 : Run Full convolution notebook as a batch
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Run a notebook code as a **job**
- Follow up with Tensorboard
The German Traffic Sign Recognition Benchmark (GTSRB) is a dataset with more than 50,000 photos of road signs from about 40 classes.
The final aim is to recognise them !
Description is available there : http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset
## What we're going to do :
Our main steps:
- Run Full-convolution.ipynb as a batch :
- Notebook mode
- Script mode
- Tensorboard follow up
## Step 1 - Run a notebook as a batch
To run a notebook in a command line :
```jupyter nbconvert (...) --to notebook --execute <notebook>```
%% Cell type:raw id: tags:
%%bash
# ---- This will execute and save a notebook
#
jupyter nbconvert --ExecutePreprocessor.timeout=-1 --to notebook --output='./run/full_convolutions' --execute '05-Full-convolutions.ipynb'
%% Cell type:markdown id: tags:
## Step 2 - Export as a script (What we're going to do !)
To export a notebook as a script :
```jupyter nbconvert --to script <notebook>```
To run the script :
```ipython <script>```
%% Cell type:code id: tags:
``` python
%%bash
# ---- This will convert a notebook to a notebook.py script
#
jupyter nbconvert --to script --output='./run/full_convolutions_B' '05-Full-convolutions.ipynb'
```
%% Output
[NbConvertApp] Converting notebook 05-Full-convolutions.ipynb to script
[NbConvertApp] Writing 11305 bytes to ./run/full_convolutions_B.py
%% Cell type:code id: tags:
``` python
!ls -l ./run/*.py
```
%% Output
-rw-r--r-- 1 pjluc pjluc 11305 Jan 21 00:13 ./run/full_convolutions_B.py
%% Cell type:markdown id: tags:
## Step 3 - Batch submission
Create batch script :
%% Cell type:code id: tags:
``` python
%%writefile "./run/batch_full_convolutions_B.sh"
#!/bin/bash
#OAR -n Full convolutions
#OAR -t gpu
#OAR -l /nodes=1/gpudevice=1,walltime=01:00:00
#OAR --stdout full_convolutions_%jobid%.out
#OAR --stderr full_convolutions_%jobid%.err
#OAR --project fidle
#---- With cpu
# use :
# OAR -l /nodes=1/core=32,walltime=01:00:00
# and add a 2>/dev/null to ipython xxx
# ----------------------------------
# _ _ _
# | |__ __ _| |_ ___| |__
# | '_ \ / _` | __/ __| '_ \
# | |_) | (_| | || (__| | | |
# |_.__/ \__,_|\__\___|_| |_|
# Full convolutions
# ----------------------------------
#
CONDA_ENV=deeplearning2
RUN_DIR=~/fidle/GTSRB
RUN_SCRIPT=./run/full_convolutions_B.py
# ---- Cuda Conda initialization
#
echo '------------------------------------------------------------'
echo "Start : $0"
echo '------------------------------------------------------------'
#
source /applis/environments/cuda_env.sh dahu 10.0
source /applis/environments/conda.sh
#
conda activate "$CONDA_ENV"
# ---- Run it...
#
cd $RUN_DIR
ipython $RUN_SCRIPT
```
%% Output
Writing ./run/batch_full_convolutions_B.sh
%% Cell type:code id: tags:
``` python
%%bash
chmod 755 ./run/*.sh
chmod 755 ./run/*.py
ls -l ./run/*full_convolutions*
```
%% Output
-rwxr-xr-x 1 pjluc pjluc 1045 Jan 21 00:15 ./run/batch_full_convolutions_B.sh
-rwxr-xr-x 1 pjluc pjluc 611 Jan 19 15:53 ./run/batch_full_convolutions.sh
-rwxr-xr-x 1 pjluc pjluc 11305 Jan 21 00:13 ./run/full_convolutions_B.py
%% Cell type:raw id: tags:
%%bash
./run/batch_full_convolutions.sh
oarsub (...)
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [GTS7] - Full convolutions Report
<!-- DESC --> Displaying the reports of the different jobs
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Compare the results of different dataset-model combinations
The reports (json format) are generated by the "Full convolutions" jobs [GTS5][GTS6]
## What we're going to do :
- Read json files and display results
## 1/ Python import
%% Cell type:code id: tags:
``` python
import pandas as pd
import os,glob,json
from pathlib import Path
from IPython.display import display, Markdown
```
%% Cell type:markdown id: tags:
## 2/ A nice function
%% Cell type:code id: tags:
``` python
def highlight_max(s):
is_max = (s == s.max())
return ['background-color: yellow' if v else '' for v in is_max]
def show_report(file):
# ---- Read json file
with open(file) as infile:
dict_report = json.load( infile )
output = dict_report['output']
description = dict_report['description']
# ---- about
print("\n\n\nReport : ",Path(file).stem)
print( "Desc. : ",description,'\n')
# ---- Create a pandas
report = pd.DataFrame (output)
col_accuracy = [ c for c in output.keys() if c.endswith('Accuracy')]
col_duration = [ c for c in output.keys() if c.endswith('Duration')]
# ---- Build formats
lambda_acc = lambda x : '{:.2f} %'.format(x) if (isinstance(x, float)) else '{:}'.format(x)
lambda_dur = lambda x : '{:.1f} s'.format(x) if (isinstance(x, float)) else '{:}'.format(x)
formats = {'Size':'{:.2f} Mo'}
for c in col_accuracy:
formats[c]=lambda_acc
for c in col_duration:
formats[c]=lambda_dur
t=report.style.highlight_max(subset=col_accuracy).format(formats).hide_index()
display(t)
```
%% Cell type:markdown id: tags:
## 3/ Reports display
%% Cell type:code id: tags:
``` python
for file in glob.glob("./run/*.json"):
show_report(file)
```
%% Output
Report : report_009557
Desc. : train_size=1 test_size=1 batch_size=64 epochs=16 data_aug=False
Report : report_020341
Desc. : train_size=1 test_size=1 batch_size=64 epochs=16 data_aug=False
Report : report_041040
Desc. : train_size=1 test_size=1 batch_size=64 epochs=20 data_aug=True
Report : report_088809
Desc. : train_size=1 test_size=1 batch_size=64 epochs=20 data_aug=True
Report : report_093384
Desc. : train_size=1 test_size=1 batch_size=64 epochs=16 data_aug=False
Report : report_094801
Desc. : train_size=1 test_size=1 batch_size=64 epochs=20 data_aug=True
Report : report_2020_01_20_17h22m23s
Desc. : train_size=1 test_size=1 batch_size=64 epochs=16 data_aug=False
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [TSB1] - Tensorboard with/from Jupyter
<!-- DESC --> 4 ways to use Tensorboard from the Jupyter environment
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Using Tensorboard
- ...and if possible, simply and easily !
About [Tensorboard](https://www.tensorflow.org/tensorboard/get_started)
## What we're going to do :
- Using Tensorboard
%% Cell type:markdown id: tags:
## Option 1 - From Jupyter
It's the easiest and most fun way: launch Tensorboard directly from Jupyter.
Unfortunately, this feature seems to be a bit capricious with recent versions of Jupyter...
It works on Jean-Zay (at **IDRIS**), but only with Jupyter Notebook.
%% Cell type:markdown id: tags:
## Option 2 - Shell command
That's what we're going to use in **GRICAD.**
In fact, this is like starting tensorboard from the command line.
More about it : `tensorboard --help`
%% Cell type:code id: tags:
``` python
%%bash
tensorboard_start --logdir ./run/logs
```
%% Cell type:code id: tags:
``` python
%%bash
tensorboard_status
```
%% Cell type:code id: tags:
``` python
%%bash
tensorboard_stop
```
%% Cell type:markdown id: tags:
## Option 3 - Magic command
**Start**
%% Cell type:code id: tags:
``` python
%load_ext tensorboard
```
%% Cell type:markdown id: tags:
For example for use on a GRICAD cluster :
%% Cell type:code id: tags:
``` python
%tensorboard --port 21277 --host 0.0.0.0 --logdir ./run/logs
```
%% Cell type:markdown id: tags:
**Stop**
No way... use bash method
## Option 4 - Tensorboard as a module
**Start**
%% Cell type:code id: tags:
``` python
import tensorboard.notebook as tsb
```
%% Cell type:code id: tags:
``` python
tsb.start('--port 21277 --host 0.0.0.0 --logdir ./run/logs')
```
%% Cell type:markdown id: tags:
**Check**
%% Cell type:code id: tags:
``` python
a=tsb.list()
```
%% Output
No known TensorBoard instances running.
%% Cell type:markdown id: tags:
**Stop**
No way... use bash method
%% Cell type:code id: tags:
``` python
!kill 214798
```
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [IMDB1] - Text embedding with IMDB
<!-- DESC --> A very classical example of word embedding for text classification (sentiment analysis)
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- The objective is to guess whether film reviews are **positive or negative** based on the analysis of the text.
- Understand the management of **textual data** and **sentiment analysis**
The original dataset can be found **[there](http://ai.stanford.edu/~amaas/data/sentiment/)**
Note that [IMDb.com](https://imdb.com) offers several easy-to-use [datasets](https://www.imdb.com/interfaces/)
For simplicity's sake, we'll use the dataset directly [embedded in Keras](https://www.tensorflow.org/api_docs/python/tf/keras/datasets)
## What we're going to do :
- Retrieve data
- Preparing the data
- Build a model
- Train the model
- Evaluate the result
%% Cell type:markdown id: tags:
## Step 1 - Init python stuff
%% Cell type:code id: tags:
``` python
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.datasets.imdb as imdb
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import os,sys,h5py,json
from importlib import reload
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Output
FIDLE 2020 - Practical Work Module
Version : 0.2.9
Run time : Wednesday 19 February 2020, 22:04:33
TensorFlow version : 2.0.0
Keras version : 2.2.4-tf
%% Cell type:markdown id: tags:
## Step 2 - Retrieve data
**From Keras :**
This IMDb dataset can be retrieved directly from [Keras datasets](https://www.tensorflow.org/api_docs/python/tf/keras/datasets)
Due to their nature, textual data can be somewhat complex.
### 2.1 - Data structure :
The dataset is composed of 2 parts: **reviews** and **opinions** (positive/negative), with a **dictionary**
- dataset = (reviews, opinions)
- reviews = \[ review_0, review_1, ...\]
- review_i = [ int1, int2, ...] where int_i is the index of the word in the dictionary.
- opinions = \[ int0, int1, ...\] where int_j == 0 if opinion is negative or 1 if opinion is positive.
- dictionary = { word1:int1, word2:int2, ... }
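A purely illustrative toy example of this structure (made-up indices, not the real IMDB encoding) :
``` python
# ---- Purely illustrative, made-up indices (not the real IMDB encoding)
reviews    = [ [1, 14, 22, 9, 2],        # review_0 : a list of word indices (2 = <unknown>)
               [1, 43, 17, 6] ]          # review_1
opinions   = [ 1, 0 ]                    # 1 = positive, 0 = negative
dictionary = { '<start>':1, 'great':14, 'movie':22 }    # word:index (excerpt)
```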
%% Cell type:markdown id: tags:
### 2.2 - Get dataset
For simplicity, we will use a pre-formatted dataset.
See : https://www.tensorflow.org/api_docs/python/tf/keras/datasets/imdb/load_data
However, Keras offers some useful tools for formatting textual data.
See : https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text
%% Cell type:code id: tags:
``` python
vocab_size = 10000
# ----- Retrieve x,y
#
(x_train, y_train), (x_test, y_test) = imdb.load_data( num_words = vocab_size,
skip_top = 0,
maxlen = None,
seed = 42,
start_char = 1,
oov_char = 2,
index_from = 3, )
```
%% Cell type:code id: tags:
``` python
print(" Max(x_train,x_test) : ", ooo.rmax([x_train,x_test]) )
print(" x_train : {} y_train : {}".format(x_train.shape, y_train.shape))
print(" x_test : {} y_test : {}".format(x_test.shape, y_test.shape))
print('\nReview example (x_train[12]) :\n\n',x_train[12])
```
%% Output
Max(x_train,x_test) : 9999
x_train : (25000,) y_train : (25000,)
x_test : (25000,) y_test : (25000,)
Review example (x_train[12]) :
[1, 14, 22, 1367, 53, 206, 159, 4, 636, 898, 74, 26, 11, 436, 363, 108, 7, 14, 432, 14, 22, 9, 1055, 34, 8599, 2, 5, 381, 3705, 4509, 14, 768, 47, 839, 25, 111, 1517, 2579, 1991, 438, 2663, 587, 4, 280, 725, 6, 58, 11, 2714, 201, 4, 206, 16, 702, 5, 5176, 19, 480, 5920, 157, 13, 64, 219, 4, 2, 11, 107, 665, 1212, 39, 4, 206, 4, 65, 410, 16, 565, 5, 24, 43, 343, 17, 5602, 8, 169, 101, 85, 206, 108, 8, 3008, 14, 25, 215, 168, 18, 6, 2579, 1991, 438, 2, 11, 129, 1609, 36, 26, 66, 290, 3303, 46, 5, 633, 115, 4363]
%% Cell type:markdown id: tags:
### 2.3 - Have a look for humans (optional)
When we loaded the dataset, we asked for \<start\> to be encoded as 1 and \<unknown word\> as 2.
The word indices were therefore shifted by 3, using the parameter index_from=3.
%% Cell type:code id: tags:
``` python
# ---- Retrieve dictionary {word:index}, and encode it in ascii
word_index = imdb.get_word_index()
# ---- Shift the dictionary from +3
word_index = {w:(i+3) for w,i in word_index.items()}
# ---- Add <pad>, <start> and unknown tags
word_index.update( {'<pad>':0, '<start>':1, '<unknown>':2} )
# ---- Create a reverse dictionary : {index:word}
index_word = {index:word for word,index in word_index.items()}
# ---- Add a nice function to transpose :
#
def dataset2text(review):
return ' '.join([index_word.get(i, '?') for i in review])
```
%% Cell type:code id: tags:
``` python
print('\nDictionary size : ', len(word_index))
print('\nReview example (x_train[12]) :\n\n',x_train[12])
print('\nIn real words :\n\n', dataset2text(x_train[12]))
```
%% Output
Dictionary size : 88587
Review example (x_train[12]) :
[1, 14, 22, 1367, 53, 206, 159, 4, 636, 898, 74, 26, 11, 436, 363, 108, 7, 14, 432, 14, 22, 9, 1055, 34, 8599, 2, 5, 381, 3705, 4509, 14, 768, 47, 839, 25, 111, 1517, 2579, 1991, 438, 2663, 587, 4, 280, 725, 6, 58, 11, 2714, 201, 4, 206, 16, 702, 5, 5176, 19, 480, 5920, 157, 13, 64, 219, 4, 2, 11, 107, 665, 1212, 39, 4, 206, 4, 65, 410, 16, 565, 5, 24, 43, 343, 17, 5602, 8, 169, 101, 85, 206, 108, 8, 3008, 14, 25, 215, 168, 18, 6, 2579, 1991, 438, 2, 11, 129, 1609, 36, 26, 66, 290, 3303, 46, 5, 633, 115, 4363]
In real words :
<start> this film contains more action before the opening credits than are in entire hollywood films of this sort this film is produced by tsui <unknown> and stars jet li this team has brought you many worthy hong kong cinema productions including the once upon a time in china series the action was fast and furious with amazing wire work i only saw the <unknown> in two shots aside from the action the story itself was strong and not just used as filler to find any other action films to rival this you must look for a hong kong cinema <unknown> in your area they are really worth checking out and usually never disappoint
%% Cell type:markdown id: tags:
### 2.4 - Have a look for neurons
%% Cell type:code id: tags:
``` python
plt.figure(figsize=(12, 6))
ax=sns.distplot([len(i) for i in x_train],bins=60)
ax.set_title('Distribution of reviews by size')
plt.xlabel("Review's sizes")
plt.ylabel('Density')
ax.set_xlim(0, 1500)
plt.show()
```
%% Output
%% Cell type:markdown id: tags:
## Step 3 - Preprocess the data
In order to be processed by a neural network, all entries must have the same length.
We chose a review length of **review_len**
and will therefore pad shorter reviews with \<pad\>.
%% Cell type:code id: tags:
``` python
review_len = 256
x_train = keras.preprocessing.sequence.pad_sequences(x_train,
value = 0,
padding = 'post',
maxlen = review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test,
value = 0 ,
padding = 'post',
maxlen = review_len)
print('\nReview example (x_train[12]) :\n\n',x_train[12])
print('\nIn real words :\n\n', dataset2text(x_train[12]))
```
%% Output
Review example (x_train[12]) :
[ 1 14 22 1367 53 206 159 4 636 898 74 26 11 436
363 108 7 14 432 14 22 9 1055 34 8599 2 5 381
3705 4509 14 768 47 839 25 111 1517 2579 1991 438 2663 587
4 280 725 6 58 11 2714 201 4 206 16 702 5 5176
19 480 5920 157 13 64 219 4 2 11 107 665 1212 39
4 206 4 65 410 16 565 5 24 43 343 17 5602 8
169 101 85 206 108 8 3008 14 25 215 168 18 6 2579
1991 438 2 11 129 1609 36 26 66 290 3303 46 5 633
115 4363 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0]
In real words :
<start> this film contains more action before the opening credits than are in entire hollywood films of this sort this film is produced by tsui <unknown> and stars jet li this team has brought you many worthy hong kong cinema productions including the once upon a time in china series the action was fast and furious with amazing wire work i only saw the <unknown> in two shots aside from the action the story itself was strong and not just used as filler to find any other action films to rival this you must look for a hong kong cinema <unknown> in your area they are really worth checking out and usually never disappoint <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad> <pad>
%% Cell type:markdown id: tags:
### Save dataset and dictionary (can be useful)
%% Cell type:code id: tags:
``` python
os.makedirs('./data', mode=0o750, exist_ok=True)
with h5py.File('./data/dataset_imdb.h5', 'w') as f:
f.create_dataset("x_train", data=x_train)
f.create_dataset("y_train", data=y_train)
f.create_dataset("x_test", data=x_test)
f.create_dataset("y_test", data=y_test)
with open('./data/word_index.json', 'w') as fp:
json.dump(word_index, fp)
with open('./data/index_word.json', 'w') as fp:
json.dump(index_word, fp)
print('Saved.')
```
%% Output
Saved.
%% Cell type:markdown id: tags:
## Step 4 - Build the model
A few remarks :
1. We'll choose a dense vector size for the embedding output with **dense_vector_size**
2. **GlobalAveragePooling1D** performs an average pooling over the sequence dimension : (None, lx, ly) -> (None, ly)
In other words: we average the set of word vectors of a sentence (a small sketch is shown below)
3. The Keras embedding layer is trained in a supervised way. It is a layer going from *vocab_size* inputs to *n_neurons* outputs, which maintains a table of vectors (the weights are the vectors). It does not compute an output the way ordinary layers do, but returns the values of the vectors : n words => n vectors (which are then stacked by the pooling)
See : https://stats.stackexchange.com/questions/324992/how-the-embedding-layer-is-trained-in-keras-embedding-layer
To go further : https://www.liip.ch/en/blog/sentiment-detection-with-keras-word-embeddings-and-lstm-deep-learning-networks
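A tiny sketch to visualize remark 2 (toy values, only the shapes matter) :
``` python
import numpy as np
import tensorflow as tf

# ---- A toy batch : 1 "sentence" of 4 word-vectors of size 3
x = np.arange(12, dtype=np.float32).reshape(1, 4, 3)
y = tf.keras.layers.GlobalAveragePooling1D()(x)
print(x.shape, '->', y.shape)     # (1, 4, 3) -> (1, 3)
print(y.numpy())                  # the mean of the 4 word-vectors
```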
### 4.1 - Build
More documentation about :
- [Embedding](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding)
- [GlobalAveragePooling1D](https://www.tensorflow.org/api_docs/python/tf/keras/layers/GlobalAveragePooling1D)
%% Cell type:code id: tags:
``` python
def get_model(dense_vector_size=32):
model = keras.Sequential()
model.add(keras.layers.Embedding(input_dim = vocab_size,
output_dim = dense_vector_size,
input_length = review_len))
model.add(keras.layers.GlobalAveragePooling1D())
model.add(keras.layers.Dense(dense_vector_size, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer = 'adam',
loss = 'binary_crossentropy',
metrics = ['accuracy'])
return model
```
%% Cell type:markdown id: tags:
## Step 5 - Train the model
### 5.1 - Get it
%% Cell type:code id: tags:
``` python
model = get_model(32)
model.summary()
```
%% Output
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (None, 256, 32) 320000
_________________________________________________________________
global_average_pooling1d (Gl (None, 32) 0
_________________________________________________________________
dense (Dense) (None, 32) 1056
_________________________________________________________________
dense_1 (Dense) (None, 1) 33
=================================================================
Total params: 321,089
Trainable params: 321,089
Non-trainable params: 0
_________________________________________________________________
%% Cell type:markdown id: tags:
### 5.2 - Add callback
%% Cell type:code id: tags:
``` python
os.makedirs('./run/models', mode=0o750, exist_ok=True)
save_dir = "./run/models/best_model.h5"
savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, save_best_only=True)
```
%% Cell type:markdown id: tags:
### 5.3 - Train it
%% Cell type:code id: tags:
``` python
%%time
n_epochs = 30
batch_size = 512
history = model.fit(x_train,
y_train,
epochs = n_epochs,
batch_size = batch_size,
validation_data = (x_test, y_test),
verbose = 1,
callbacks = [savemodel_callback])
```
%% Output
Train on 25000 samples, validate on 25000 samples
Epoch 1/30
25000/25000 [==============================] - 2s 60us/sample - loss: 0.6883 - accuracy: 0.6220 - val_loss: 0.6783 - val_accuracy: 0.7303
Epoch 2/30
25000/25000 [==============================] - 1s 32us/sample - loss: 0.6511 - accuracy: 0.7672 - val_loss: 0.6162 - val_accuracy: 0.7666
Epoch 3/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.5571 - accuracy: 0.8088 - val_loss: 0.5094 - val_accuracy: 0.8194
Epoch 4/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.4412 - accuracy: 0.8528 - val_loss: 0.4150 - val_accuracy: 0.8494
Epoch 5/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.3553 - accuracy: 0.8767 - val_loss: 0.3595 - val_accuracy: 0.8604
Epoch 6/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.3036 - accuracy: 0.8907 - val_loss: 0.3316 - val_accuracy: 0.8660
Epoch 7/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.2684 - accuracy: 0.9020 - val_loss: 0.3108 - val_accuracy: 0.8733
Epoch 8/30
25000/25000 [==============================] - 1s 31us/sample - loss: 0.2427 - accuracy: 0.9120 - val_loss: 0.2999 - val_accuracy: 0.8774
Epoch 9/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.2222 - accuracy: 0.9196 - val_loss: 0.2923 - val_accuracy: 0.8798
Epoch 10/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.2055 - accuracy: 0.9262 - val_loss: 0.2885 - val_accuracy: 0.8817
Epoch 11/30
25000/25000 [==============================] - 1s 31us/sample - loss: 0.1915 - accuracy: 0.9321 - val_loss: 0.2871 - val_accuracy: 0.8819
Epoch 12/30
25000/25000 [==============================] - 1s 32us/sample - loss: 0.1795 - accuracy: 0.9364 - val_loss: 0.2869 - val_accuracy: 0.8825
Epoch 13/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.1680 - accuracy: 0.9418 - val_loss: 0.2893 - val_accuracy: 0.8824
Epoch 14/30
25000/25000 [==============================] - 1s 31us/sample - loss: 0.1581 - accuracy: 0.9454 - val_loss: 0.2915 - val_accuracy: 0.8830
Epoch 15/30
25000/25000 [==============================] - 1s 31us/sample - loss: 0.1490 - accuracy: 0.9498 - val_loss: 0.2970 - val_accuracy: 0.8810
Epoch 16/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.1411 - accuracy: 0.9530 - val_loss: 0.3006 - val_accuracy: 0.8815
Epoch 17/30
25000/25000 [==============================] - 1s 29us/sample - loss: 0.1341 - accuracy: 0.9556 - val_loss: 0.3075 - val_accuracy: 0.8798
Epoch 18/30
25000/25000 [==============================] - 1s 29us/sample - loss: 0.1273 - accuracy: 0.9588 - val_loss: 0.3131 - val_accuracy: 0.8793
Epoch 19/30
25000/25000 [==============================] - 1s 29us/sample - loss: 0.1206 - accuracy: 0.9608 - val_loss: 0.3199 - val_accuracy: 0.8774
Epoch 20/30
25000/25000 [==============================] - 1s 29us/sample - loss: 0.1151 - accuracy: 0.9630 - val_loss: 0.3319 - val_accuracy: 0.8722
Epoch 21/30
25000/25000 [==============================] - 1s 29us/sample - loss: 0.1097 - accuracy: 0.9658 - val_loss: 0.3357 - val_accuracy: 0.8744
Epoch 22/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.1043 - accuracy: 0.9688 - val_loss: 0.3439 - val_accuracy: 0.8734
Epoch 23/30
25000/25000 [==============================] - 1s 32us/sample - loss: 0.0986 - accuracy: 0.9708 - val_loss: 0.3530 - val_accuracy: 0.8728
Epoch 24/30
25000/25000 [==============================] - 1s 31us/sample - loss: 0.0941 - accuracy: 0.9735 - val_loss: 0.3614 - val_accuracy: 0.8696
Epoch 25/30
25000/25000 [==============================] - 1s 32us/sample - loss: 0.0897 - accuracy: 0.9749 - val_loss: 0.3718 - val_accuracy: 0.8703
Epoch 26/30
25000/25000 [==============================] - 1s 29us/sample - loss: 0.0854 - accuracy: 0.9768 - val_loss: 0.3822 - val_accuracy: 0.8676
Epoch 27/30
25000/25000 [==============================] - 1s 29us/sample - loss: 0.0811 - accuracy: 0.9785 - val_loss: 0.3919 - val_accuracy: 0.8668
Epoch 28/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.0779 - accuracy: 0.9789 - val_loss: 0.4036 - val_accuracy: 0.8651
Epoch 29/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.0747 - accuracy: 0.9803 - val_loss: 0.4138 - val_accuracy: 0.8640
Epoch 30/30
25000/25000 [==============================] - 1s 30us/sample - loss: 0.0714 - accuracy: 0.9819 - val_loss: 0.4262 - val_accuracy: 0.8629
CPU times: user 1min 35s, sys: 4.59 s, total: 1min 40s
Wall time: 23.6 s
%% Cell type:markdown id: tags:
## Step 6 - Evaluate
### 6.1 - Training history
%% Cell type:code id: tags:
``` python
ooo.plot_history(history)
```
%% Output
%% Cell type:markdown id: tags:
### 6.2 - Reload and evaluate best model
%% Cell type:code id: tags:
``` python
model = keras.models.load_model('./run/models/best_model.h5')
# ---- Evaluate
reload(ooo)
score = model.evaluate(x_test, y_test, verbose=0)
print('x_test / loss : {:5.4f}'.format(score[0]))
print('x_test / accuracy : {:5.4f}'.format(score[1]))
values=[score[1], 1-score[1]]
ooo.plot_donut(values,["Accuracy","Errors"], title="#### Accuracy donut is :")
# ---- Confusion matrix
y_pred = model.predict_classes(x_test)
# ooo.display_confusion_matrix(y_test,y_pred,labels=range(2),color='orange',font_size='20pt')
ooo.display_confusion_matrix(y_test,y_pred,labels=range(2))
```
%% Output
x_test / loss : 0.2869
x_test / accuracy : 0.8825
#### Accuracy donut is :
#### Confusion matrix is :
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [IMDB2] - Text embedding with IMDB - Reloaded
<!-- DESC --> Example of reusing a previously saved model
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- The objective is to guess whether film reviews are **positive or negative** based on the analysis of the text.
- For this, we will use our **previously saved model**.
The original dataset can be found **[there](http://ai.stanford.edu/~amaas/data/sentiment/)**
Note that [IMDb.com](https://imdb.com) offers several easy-to-use [datasets](https://www.imdb.com/interfaces/)
For simplicity's sake, we'll use the dataset directly [embedded in Keras](https://www.tensorflow.org/api_docs/python/tf/keras/datasets)
## What we're going to do :
- Preparing the data
- Retrieve our saved model
- Evaluate the result
%% Cell type:markdown id: tags:
## Step 1 - Init python stuff
%% Cell type:code id: tags:
``` python
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.datasets.imdb as imdb
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import pandas as pd
import os,sys,h5py,json,re
from importlib import reload
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Output
FIDLE 2020 - Practical Work Module
Version : 0.2.9
Run time : Wednesday 19 February 2020, 22:08:28
TensorFlow version : 2.0.0
Keras version : 2.2.4-tf
%% Cell type:markdown id: tags:
## Step 2 : Preparing the data
### 2.1 - Our reviews :
%% Cell type:code id: tags:
``` python
reviews = [ "This film is particularly nice, a must see.",
"Some films are great classics and cannot be ignored.",
"This movie is just abominable and doesn't deserve to be seen!"]
```
%% Cell type:markdown id: tags:
### 2.2 - Retrieve dictionaries
%% Cell type:code id: tags:
``` python
with open('./data/word_index.json', 'r') as fp:
word_index = json.load(fp)
index_word = {index:word for word,index in word_index.items()}
```
%% Cell type:markdown id: tags:
### 2.3 - Clean, index and pad
%% Cell type:code id: tags:
``` python
max_len = 256
vocab_size = 10000
nb_reviews = len(reviews)
x_data = []
# ---- For all reviews
for review in reviews:
# ---- First index must be <start>
index_review=[1]
# ---- For all words
for w in review.split(' '):
# ---- Clean it
w_clean = re.sub(r"[^a-zA-Z0-9]", "", w)
# ---- Not empty ?
if len(w_clean)>0:
            # ---- Get the index of the cleaned word (2 = <unknown>)
            w_index = word_index.get(w_clean,2)
            # ---- Out-of-vocabulary indices also become <unknown>
            if w_index>=vocab_size : w_index=2
            # ---- Add the index
            index_review.append(w_index)
# ---- Add the indexed review
x_data.append(index_review)
# ---- Padding
x_data = keras.preprocessing.sequence.pad_sequences(x_data, value = 0, padding = 'post', maxlen = max_len)
```
%% Cell type:markdown id: tags:
### 2.4 - Have a look
%% Cell type:code id: tags:
``` python
def translate(x):
return ' '.join( [index_word.get(i,'?') for i in x] )
for i in range(nb_reviews):
imax=np.where(x_data[i]==0)[0][0]+5
print(f'\nText review :', reviews[i])
print( f'x_train[{i:}] :', list(x_data[i][:imax]), '(...)')
print( 'Translation :', translate(x_data[i][:imax]), '(...)')
```
%% Output
Text review : This film is particularly nice, a must see.
x_train[0] : [1, 2, 22, 9, 572, 2, 6, 215, 2, 0, 0, 0, 0, 0] (...)
Translation : <start> <unknown> film is particularly <unknown> a must <unknown> <pad> <pad> <pad> <pad> <pad> (...)
Text review : Some films are great classics and cannot be ignored.
x_train[1] : [1, 2, 108, 26, 87, 2239, 5, 566, 30, 2, 0, 0, 0, 0, 0] (...)
Translation : <start> <unknown> films are great classics and cannot be <unknown> <pad> <pad> <pad> <pad> <pad> (...)
Text review : This movie is just abominable and doesn't deserve to be seen!
x_train[2] : [1, 2, 20, 9, 43, 2, 5, 152, 1833, 8, 30, 2, 0, 0, 0, 0, 0] (...)
Translation : <start> <unknown> movie is just <unknown> and doesn't deserve to be <unknown> <pad> <pad> <pad> <pad> <pad> (...)
%% Cell type:markdown id: tags:
## Step 3 - Bring back the model
%% Cell type:code id: tags:
``` python
model = keras.models.load_model('./run/models/best_model.h5')
```
%% Cell type:markdown id: tags:
## Step 4 - Predict
%% Cell type:code id: tags:
``` python
y_pred = model.predict(x_data)
```
%% Cell type:markdown id: tags:
#### And the winner is :
%% Cell type:code id: tags:
``` python
for i in range(nb_reviews):
print(f'\n{reviews[i]:<70} =>',('NEGATIVE' if y_pred[i][0]<0.5 else 'POSITIVE'),f'({y_pred[i][0]:.2f})')
```
%% Output
This film is particularly nice, a must see. => POSITIVE (0.54)
Some films are great classics and cannot be ignored. => POSITIVE (0.61)
This movie is just abominable and doesn't deserve to be seen! => NEGATIVE (0.33)
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [IMDB3] - Text embedding/LSTM model with IMDB
<!-- DESC --> Still the same problem, but with a network combining embedding and LSTM
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- The objective is to guess whether film reviews are **positive or negative** based on the analysis of the text.
- Use of a model combining embedding and LSTM
The original dataset can be found **[there](http://ai.stanford.edu/~amaas/data/sentiment/)**
Note that [IMDb.com](https://imdb.com) offers several easy-to-use [datasets](https://www.imdb.com/interfaces/)
For simplicity's sake, we'll use the dataset directly [embedded in Keras](https://www.tensorflow.org/api_docs/python/tf/keras/datasets)
## What we're going to do :
- Retrieve data
- Preparing the data
- Build an Embedding/LSTM model
- Train the model
- Evaluate the result
%% Cell type:markdown id: tags:
## Step 1 - Init python stuff
%% Cell type:code id: tags:
``` python
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.datasets.imdb as imdb
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
import os,sys,h5py,json
from importlib import reload
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Cell type:markdown id: tags:
## Step 2 - Retrieve data
**From Keras :**
This IMDb dataset can be retrieved directly from [Keras datasets](https://www.tensorflow.org/api_docs/python/tf/keras/datasets)
Due to their nature, textual data can be somewhat complex.
### 2.1 - Data structure :
The dataset is composed of 2 parts: **reviews** and **opinions** (positive/negative), with a **dictionary**
- dataset = (reviews, opinions)
- reviews = \[ review_0, review_1, ...\]
- review_i = [ int1, int2, ...] where int_i is the index of the word in the dictionary.
- opinions = \[ int0, int1, ...\] where int_j == 0 if opinion is negative or 1 if opinion is positive.
- dictionary = { word1:int1, word2:int2, ... }
%% Cell type:markdown id: tags:
### 2.2 - Get dataset
For simplicity, we will use a pre-formatted dataset.
See : https://www.tensorflow.org/api_docs/python/tf/keras/datasets/imdb/load_data
However, Keras offers some useful tools for formatting textual data.
See : https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text
%% Cell type:code id: tags:
``` python
vocab_size = 10000
# ----- Retrieve x,y
#
(x_train, y_train), (x_test, y_test) = imdb.load_data( num_words = vocab_size,
skip_top = 0,
maxlen = None,
seed = 42,
start_char = 1,
oov_char = 2,
index_from = 3, )
```
%% Cell type:code id: tags:
``` python
print(" Max(x_train,x_test) : ", ooo.rmax([x_train,x_test]) )
print(" x_train : {} y_train : {}".format(x_train.shape, y_train.shape))
print(" x_test : {} y_test : {}".format(x_test.shape, y_test.shape))
print('\nReview example (x_train[12]) :\n\n',x_train[12])
```
%% Cell type:markdown id: tags:
### 2.3 - Have a look for humans (optional)
When we loaded the dataset, we asked for \<start\> to be encoded as 1 and \<unknown word\> as 2.
The word indices were therefore shifted by 3, using the parameter index_from=3.
%% Cell type:code id: tags:
``` python
# ---- Retrieve dictionary {word:index}, and encode it in ascii
word_index = imdb.get_word_index()
# ---- Shift the dictionary from +3
word_index = {w:(i+3) for w,i in word_index.items()}
# ---- Add <pad>, <start> and unknown tags
word_index.update( {'<pad>':0, '<start>':1, '<unknown>':2} )
# ---- Create a reverse dictionary : {index:word}
index_word = {index:word for word,index in word_index.items()}
# ---- Add a nice function to transpose :
#
def dataset2text(review):
return ' '.join([index_word.get(i, '?') for i in review])
```
%% Cell type:code id: tags:
``` python
print('\nDictionary size : ', len(word_index))
print('\nReview example (x_train[12]) :\n\n',x_train[12])
print('\nIn real words :\n\n', dataset2text(x_train[12]))
```
%% Cell type:markdown id: tags:
### 2.4 - Have a look for neurons
%% Cell type:code id: tags:
``` python
plt.figure(figsize=(12, 6))
ax=sns.distplot([len(i) for i in x_train],bins=60)
ax.set_title('Distribution of reviews by size')
plt.xlabel("Review's sizes")
plt.ylabel('Density')
ax.set_xlim(0, 1500)
plt.show()
```
%% Cell type:markdown id: tags:
## Step 3 - Preprocess the data
In order to be processed by a neural network, all entries must have the same length.
We chose a review length of **review_len**
and will therefore pad shorter reviews with \<pad\>.
%% Cell type:code id: tags:
``` python
review_len = 256
x_train = keras.preprocessing.sequence.pad_sequences(x_train,
value = 0,
padding = 'post',
maxlen = review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test,
value = 0 ,
padding = 'post',
maxlen = review_len)
print('\nReview example (x_train[12]) :\n\n',x_train[12])
print('\nIn real words :\n\n', dataset2text(x_train[12]))
```
%% Cell type:markdown id: tags:
### Save dataset and dictionary (can be useful)
%% Cell type:code id: tags:
``` python
os.makedirs('./data', mode=0o750, exist_ok=True)
with h5py.File('./data/dataset_imdb.h5', 'w') as f:
f.create_dataset("x_train", data=x_train)
f.create_dataset("y_train", data=y_train)
f.create_dataset("x_test", data=x_test)
f.create_dataset("y_test", data=y_test)
with open('./data/word_index.json', 'w') as fp:
json.dump(word_index, fp)
with open('./data/index_word.json', 'w') as fp:
json.dump(index_word, fp)
print('Saved.')
```
%% Cell type:markdown id: tags:
## Step 4 - Build the model
A few remarks :
1. We'll choose a dense vector size for the embedding output with **dense_vector_size**
2. **GlobalAveragePooling1D** performs an average pooling over the sequence dimension : (None, lx, ly) -> (None, ly)
In other words: we average the set of word vectors of a sentence
3. The Keras embedding layer is trained in a supervised way. It is a layer going from *vocab_size* inputs to *n_neurons* outputs, which maintains a table of vectors (the weights are the vectors). It does not compute an output the way ordinary layers do, but returns the values of the vectors : n words => n vectors (which are then consumed by the next layer)
See : https://stats.stackexchange.com/questions/324992/how-the-embedding-layer-is-trained-in-keras-embedding-layer
To go further : https://www.liip.ch/en/blog/sentiment-detection-with-keras-word-embeddings-and-lstm-deep-learning-networks
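Unlike the pooling used in the previous notebook, the LSTM used below reads the sequence step by step and returns only its final hidden state. A tiny sketch of the resulting shapes (toy values) :
``` python
import numpy as np
import tensorflow as tf

# ---- A toy batch : 2 sequences of 5 embedded "words", each vector of size 8
x = np.random.normal(size=(2, 5, 8)).astype(np.float32)
y = tf.keras.layers.LSTM(16)(x)   # without return_sequences, only the last hidden state is returned
print(x.shape, '->', y.shape)     # (2, 5, 8) -> (2, 16)
```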
### 4.1 - Build
More documentation about :
- [Embedding](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding)
- [GlobalAveragePooling1D](https://www.tensorflow.org/api_docs/python/tf/keras/layers/GlobalAveragePooling1D)
%% Cell type:code id: tags:
``` python
def get_model(dense_vector_size=128):
model = keras.Sequential()
model.add(keras.layers.Embedding(input_dim = vocab_size,
output_dim = dense_vector_size,
input_length = review_len))
model.add(keras.layers.LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer = 'adam',
loss = 'binary_crossentropy',
metrics = ['accuracy'])
return model
```
%% Cell type:markdown id: tags:
## Step 5 - Train the model
### 5.1 - Get it
%% Cell type:code id: tags:
``` python
model = get_model()
model.summary()
```
%% Cell type:markdown id: tags:
### 5.2 - Add callback
%% Cell type:code id: tags:
``` python
os.makedirs('./run/models', mode=0o750, exist_ok=True)
save_dir = "./run/models/best_model.h5"
savemodel_callback = tf.keras.callbacks.ModelCheckpoint(filepath=save_dir, verbose=0, save_best_only=True)
```
%% Cell type:markdown id: tags:
### 5.3 - Train it
For reference : on GPU, with batch_size=512, training takes about 305 s.
%% Cell type:code id: tags:
``` python
%%time
n_epochs = 10
batch_size = 32
history = model.fit(x_train,
y_train,
epochs = n_epochs,
batch_size = batch_size,
validation_data = (x_test, y_test),
verbose = 1,
callbacks = [savemodel_callback])
```
%% Cell type:markdown id: tags:
## Step 6 - Evaluate
### 6.1 - Training history
%% Cell type:code id: tags:
``` python
ooo.plot_history(history)
```
%% Cell type:markdown id: tags:
### 6.2 - Reload and evaluate best model
%% Cell type:code id: tags:
``` python
model = keras.models.load_model('./run/models/best_model.h5')
# ---- Evaluate
reload(ooo)
score = model.evaluate(x_test, y_test, verbose=0)
print('x_test / loss : {:5.4f}'.format(score[0]))
print('x_test / accuracy : {:5.4f}'.format(score[1]))
values=[score[1], 1-score[1]]
ooo.plot_donut(values,["Accuracy","Errors"], title="#### Accuracy donut is :")
# ---- Confusion matrix
y_pred = model.predict_classes(x_test)
ooo.display_confusion_matrix(y_test,y_pred,labels=range(2),color='orange',font_size='20pt')
```
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [LINR1] - Linear regression with direct resolution
<!-- DESC --> Direct determination of linear regression
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Just one, the illustration of a direct resolution :-)
## What we're going to do :
Equation : $ Y = X.\theta + N$
where $N$ is a noise vector
and $\theta = (a,b)$ is the parameter vector, such that $y = a.x + b$
%% Cell type:markdown id: tags:
## Step 1 - Import and init
%% Cell type:code id: tags:
``` python
import numpy as np
import math
import matplotlib
import matplotlib.pyplot as plt
import sys
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Output
FIDLE 2020 - Practical Work Module
Version : 0.2.9
Run time : Tuesday 18 February 2020, 16:02:39
TensorFlow version : 2.0.0
Keras version : 2.2.4-tf
%% Cell type:markdown id: tags:
## Step 2 - Retrieve a set of points
%% Cell type:code id: tags:
``` python
# ---- Parameters
nb    = 100       # Number of points
xmin  = 0         # Distribution / x
xmax  = 10
a     = 4         # Distribution / y
b     = 2         # y = a.x + b (+ noise)
noise = 7         # noise level
theta = np.array([[a],[b]])
# ---- Vector X : (1,x) x nb
#      the first column is set to 1 so that X.theta <=> 1.b + x.a
Xc1 = np.ones((nb,1))
Xc2 = np.random.uniform(xmin,xmax,(nb,1))
X = np.c_[ Xc1, Xc2 ]
# ---- Noise
# N = np.random.uniform(-noise,noise,(nb,1))
N = noise * np.random.normal(0,1,(nb,1))
# ---- Vector Y
Y = (X @ theta) + N
# print("X:\n",X,"\nY:\n ",Y)
```
%% Cell type:markdown id: tags:
### Show it
%% Cell type:code id: tags:
``` python
width = 12
height = 6
fig, ax = plt.subplots()
fig.set_size_inches(width,height)
ax.plot(X[:,1], Y, ".")
ax.tick_params(axis='both', which='both', bottom=False, left=False, labelbottom=False, labelleft=False)
ax.set_xlabel('x axis')
ax.set_ylabel('y axis')
plt.show()
```
%% Output
%% Cell type:markdown id: tags:
## Step 3 - Direct calculation of the normal equation
We'll try to find an optimal value of $\theta$, minimizing a cost function.
The cost function classically used for linear regression is the **root mean square error** :
$RMSE(X,h_\theta)=\sqrt{\frac1n\sum_{i=1}^n\left[h_\theta(X^{(i)})-Y^{(i)}\right]^2}$
With the simplified variant : $MSE(X,h_\theta)=\frac1n\sum_{i=1}^n\left[h_\theta(X^{(i)})-Y^{(i)}\right]^2$
The optimal value of $\theta$ is given by the **normal equation** : $ \hat{\theta} = (X^T X)^{-1} X^T Y$
Derivation : https://eli.thegreenplace.net/2014/derivation-of-the-normal-equation-for-linear-regression
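In a nutshell (a sketch of the derivation linked above) : the gradient of the MSE vanishes at the optimum, which gives the normal equation :
$\nabla_\theta MSE(\theta)=\frac2n X^T(X\theta-Y)=0 \;\Rightarrow\; X^TX\hat{\theta}=X^TY \;\Rightarrow\; \hat{\theta}=(X^TX)^{-1}X^TY$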
%% Cell type:code id: tags:
``` python
theta_hat = np.linalg.inv(X.T @ X) @ X.T @ Y
print("Theta :\n",theta,"\n\ntheta hat :\n",theta_hat)
```
%% Output
Theta :
[[4]
[2]]
theta hat :
[[6.81242007]
[1.56836316]]
%% Cell type:markdown id: tags:
### Show it
%% Cell type:code id: tags:
``` python
Xd = np.array([[1,xmin], [1,xmax]])
Yd = Xd @ theta_hat
fig, ax = plt.subplots()
fig.set_size_inches(width,height)
ax.plot(X[:,1], Y, ".")
ax.plot(Xd[:,1], Yd, "-")
ax.tick_params(axis='both', which='both', bottom=False, left=False, labelbottom=False, labelleft=False)
ax.set_xlabel('x axis')
ax.set_ylabel('y axis')
plt.show()
```
%% Output
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
# <!-- TITLE --> [FIT1] - Complexity Syndrome
<!-- DESC --> Illustration of the problem of complexity with the polynomial regression
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Visualizing and understanding under and overfitting
## What we're going to do :
We are looking for a polynomial function to approximate the observed series :
$ y = a_n\cdot x^n + \dots + a_i\cdot x^i + \dots + a_1\cdot x + b $
## Step 1 - Import and init
%% Cell type:code id: tags:
``` python
import numpy as np
import math
import random
import matplotlib
import matplotlib.pyplot as plt
import sys
sys.path.append('..')
import fidle.pwk as ooo
ooo.init()
```
%% Output
FIDLE 2020 - Practical Work Module
Version : 0.2.9
Run time : Tuesday 18 February 2020, 17:23:05
TensorFlow version : 2.0.0
Keras version : 2.2.4-tf
%% Cell type:markdown id: tags:
## Step 2 - Preparation of learning data :
%% Cell type:code id: tags:
``` python
# ---- Parameters
n = 100
xob_min = -5
xob_max = 5
deg = 7
a_min = -2
a_max = 2
noise = 2000
# ---- Train data
# X,Y : data
# X_norm,Y_norm : normalized data
X = np.random.uniform(xob_min,xob_max,(n,1))
# N = np.random.uniform(-noise,noise,(n,1))
N = noise * np.random.normal(0,1,(n,1))
a = np.random.uniform(a_min,a_max, (deg,))
fy = np.poly1d( a )
Y = fy(X) + N
# ---- Data normalization
#
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
Y_norm = (Y - Y.mean(axis=0)) / Y.std(axis=0)
# ---- Data visualization
width = 12
height = 6
nb_viz = min(2000,n)
def vector_infos(name,V):
m=V.mean(axis=0).item()
s=V.std(axis=0).item()
print("{:8} : mean={:+12.4f} std={:+12.4f} min={:+12.4f} max={:+12.4f}".format(name,m,s,V.min(),V.max()))
print("Nombre de points : {} a={} deg={} bruit={}".format(n,a,deg,noise))
ooo.display_md('#### Before normalization :')
print("\nDonnées d'aprentissage brute :")
print("({} points visibles sur {})".format(nb_viz,n))
plt.figure(figsize=(width, height))
plt.plot(X[:nb_viz], Y[:nb_viz], '.')
plt.tick_params(axis='both', which='both', bottom=False, left=False, labelbottom=False, labelleft=False)
plt.xlabel('x axis')
plt.ylabel('y axis')
plt.show()
vector_infos('X',X)
vector_infos('Y',Y)
ooo.display_md('#### After normalization :')
print("\nDonnées d'aprentissage normalisées :")
print("({} points visibles sur {})".format(nb_viz,n))
plt.figure(figsize=(width, height))
plt.plot(X_norm[:nb_viz], Y_norm[:nb_viz], '.')
plt.tick_params(axis='both', which='both', bottom=False, left=False, labelbottom=False, labelleft=False)
plt.xlabel('x axis')
plt.ylabel('y axis')
plt.show()
vector_infos('X_norm',X_norm)
vector_infos('Y_norm',Y_norm)
```
%% Output
Number of points : 100 a=[-1.40023862 1.64009905 1.89987647 1.24972783 1.17765272 1.90935391
1.11259327] deg=7 noise=2000
#### Before normalization :
Raw training data :
(100 points shown out of 100)
X : mean= +0.2539 std= +2.9283 min= -4.9332 max= +4.9177
Y : mean= -2914.8537 std= +5532.3607 min= -23848.6013 max= +5139.0627
#### After normalization :
Normalized training data :
(100 points shown out of 100)
X_norm : mean= +0.0000 std= +1.0000 min= -1.7714 max= +1.5927
Y_norm : mean= -0.0000 std= +1.0000 min= -3.7839 max= +1.4558
%% Cell type:markdown id: tags:
## Step 3 - Polynomial regression with NumPy
### 3.1 - Underfitting
%% Cell type:code id: tags:
``` python
def draw_reg(X_norm, Y_norm, fy_hat, size):
    plt.figure(figsize=size)
    plt.plot(X_norm, Y_norm, '.')
    x_hat = np.linspace(X_norm.min(), X_norm.max(), 100)
    plt.plot(x_hat, fy_hat(x_hat))
    plt.tick_params(axis='both', which='both', bottom=False, left=False, labelbottom=False, labelleft=False)
    plt.xlabel('x axis')
    plt.ylabel('y axis')
    plt.show()
```
%% Cell type:code id: tags:
``` python
reg_deg=1
a_hat = np.polyfit(X_norm.reshape(-1,), Y_norm.reshape(-1,), reg_deg)
fy_hat = np.poly1d( a_hat )
print("Nombre de degrés : {} a_hat={}".format(reg_deg, a_hat))
draw_reg(X_norm[:nb_viz],Y_norm[:nb_viz], x_hat,fy_hat, (width,height))
```
%% Output
Degree : 1 a_hat=[ 2.15635737e-01 -1.26046371e-16]
%% Cell type:markdown id: tags:
### 3.2 - Good fitting
%% Cell type:code id: tags:
``` python
reg_deg=5
a_hat = np.polyfit(X_norm.reshape(-1,), Y_norm.reshape(-1,), reg_deg)
fy_hat = np.poly1d( a_hat )
print("Nombre de degrés : {} a_hat={}".format(reg_deg, a_hat))
draw_reg(X_norm[:nb_viz],Y_norm[:nb_viz], x_hat,fy_hat, (width,height))
```
%% Output
Degree : 5 a_hat=[ 0.09676506 -0.49102546 -0.19674074 0.41539174 -0.00888844 0.51187173]
%% Cell type:markdown id: tags:
### 3.3 - Overfitting
%% Cell type:code id: tags:
``` python
reg_deg=24
a_hat = np.polyfit(X_norm.reshape(-1,), Y_norm.reshape(-1,), reg_deg)
fy_hat = np.poly1d( a_hat )
print("Nombre de degrés : {} a_hat={}".format(reg_deg, a_hat))
draw_reg(X_norm[:nb_viz],Y_norm[:nb_viz], x_hat,fy_hat, (width,height))
```
%% Output
Degree : 24 a_hat=[-3.39583761e-01 3.66524802e+00 1.35968152e+01 -5.33709389e+01
-1.54708597e+02 3.43661072e+02 8.76139324e+02 -1.29075459e+03
-2.93637308e+03 3.13537269e+03 6.25956959e+03 -5.14495063e+03
-8.72124478e+03 5.75688179e+03 7.93008239e+03 -4.30734159e+03
-4.57884398e+03 2.04404969e+03 1.58025869e+03 -5.55090657e+02
-2.89374022e+02 7.05282858e+01 2.12619645e+01 -2.67799255e+00
3.95718335e-01]
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>