Compare revisions

Changes are shown as if the source revision was being merged into the target revision.

Showing with 4048 additions and 2168 deletions
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [POLR1] - Complexity Syndrome
<!-- DESC --> Illustration of the problem of complexity with polynomial regression
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Visualizing and understanding underfitting and overfitting
## What we're going to do :
We are looking for a polynomial function to approximate the observed series :
$ y = a_n\cdot x^n + \dots + a_i\cdot x^i + \dots + a_1\cdot x + b $
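For reference, NumPy represents such a polynomial by its coefficient vector, highest degree first. A minimal illustration (not a cell of the original notebook):

``` python
# Illustrative only: a degree-3 polynomial built from its coefficients
# [2, -1, 0.5, 3], i.e. y = 2x^3 - x^2 + 0.5x + 3
import numpy as np
p = np.poly1d([2.0, -1.0, 0.5, 3.0])
print(p(0.0), p(1.0))   # 3.0  4.5
```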
## Step 1 - Import and init
%% Cell type:code id: tags:
``` python
import numpy as np
import math
import random
import matplotlib
import matplotlib.pyplot as plt

# Init Fidle environment
import fidle

run_id, run_dir, datasets_dir = fidle.init('POLR1')
```
%% Output
**FIDLE 2020 - Practical Work Module**
Version : 0.6.1 DEV
Notebook id : POLR1
Run time : Wednesday 16 December 2020, 17:48:01
TensorFlow version : 2.0.0
Keras version : 2.2.4-tf
Datasets dir : ~/datasets/fidle
Update keras cache : False
Save figs : True
Path figs : ./run/figs
%% Cell type:markdown id: tags:
## Step 2 - Dataset generation
%% Cell type:code id: tags:
``` python
# ---- Parameters
n = 100
xob_min = -5
xob_max = 5
deg = 7
a_min = -2
a_max = 2
noise = 2000
# ---- Train data
# X,Y : data
# X_norm,Y_norm : normalized data
X = np.random.uniform(xob_min,xob_max,(n,1))
# N = np.random.uniform(-noise,noise,(n,1))
N = noise * np.random.normal(0,1,(n,1))
a = np.random.uniform(a_min,a_max, (deg,))
fy = np.poly1d( a )
Y = fy(X) + N
# ---- Data normalization
#
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)
Y_norm = (Y - Y.mean(axis=0)) / Y.std(axis=0)
# ---- Data visualization
width = 12
height = 6
nb_viz = min(2000,n)
def vector_infos(name,V):
    m = V.mean(axis=0).item()
    s = V.std(axis=0).item()
    print("{:8} : mean={:+12.4f}  std={:+12.4f}  min={:+12.4f}  max={:+12.4f}".format(name,m,s,V.min(),V.max()))
fidle.utils.display_md('#### Generator :')
print(f"Number of points={n}  deg={deg}  noise={noise}")
fidle.utils.display_md('#### Datasets :')
print(f"{nb_viz} points visible out of {n}")
plt.figure(figsize=(width, height))
plt.plot(X[:nb_viz], Y[:nb_viz], '.')
plt.tick_params(axis='both', which='both', bottom=False, left=False, labelbottom=False, labelleft=False)
plt.xlabel('x axis')
plt.ylabel('y axis')
fidle.scrawler.save_fig("01-dataset")
plt.show()
fidle.utils.display_md('#### Before normalization :')
vector_infos('X',X)
vector_infos('Y',Y)
fidle.utils.display_md('#### After normalization :')
vector_infos('X_norm',X_norm)
vector_infos('Y_norm',Y_norm)
```
%% Output
#### Generator :
Number of points=100  deg=7  noise=2000
#### Datasets :
100 points visible out of 100
#### Before normalization :
X : mean= +0.3946 std= +2.7857 min= -4.7119 max= +4.9613
Y : mean= +944.8316 std= +2687.1875 min= -4221.2450 max= +11553.9320
#### After normalization :
X_norm : mean= -0.0000 std= +1.0000 min= -1.8331 max= +1.6393
Y_norm : mean= +0.0000 std= +1.0000 min= -1.9225 max= +3.9480
%% Cell type:markdown id: tags:
## Step 3 - Polynomial regression with NumPy
### 3.1 - Underfitting
%% Cell type:code id: tags:
``` python
def draw_reg(X_norm, Y_norm, x_hat, fy_hat, size, save_as):
    plt.figure(figsize=size)
    plt.plot(X_norm, Y_norm, '.')
    x_hat = np.linspace(X_norm.min(), X_norm.max(), 100)
    plt.plot(x_hat, fy_hat(x_hat))
    plt.tick_params(axis='both', which='both', bottom=False, left=False, labelbottom=False, labelleft=False)
    plt.xlabel('x axis')
    plt.ylabel('y axis')
    fidle.scrawler.save_fig(save_as)
    plt.show()
```
%% Cell type:code id: tags:
``` python
reg_deg=1
a_hat = np.polyfit(X_norm.reshape(-1,), Y_norm.reshape(-1,), reg_deg)
fy_hat = np.poly1d( a_hat )
print(f'Polynomial degree : {reg_deg}')
draw_reg(X_norm[:nb_viz],Y_norm[:nb_viz], X_norm,fy_hat, (width,height), save_as='02-underfitting')
```
%% Output
Polynomial degree : 1
%% Cell type:markdown id: tags:
### 3.2 - Good fitting
%% Cell type:code id: tags:
``` python
reg_deg=5
a_hat = np.polyfit(X_norm.reshape(-1,), Y_norm.reshape(-1,), reg_deg)
fy_hat = np.poly1d( a_hat )
print(f'Polynomial degree : {reg_deg}')
draw_reg(X_norm[:nb_viz],Y_norm[:nb_viz], X_norm,fy_hat, (width,height), save_as='03-good_fitting')
```
%% Output
Polynomial degree : 5
%% Cell type:markdown id: tags:
### 3.3 - Overfitting
%% Cell type:code id: tags:
``` python
reg_deg=24
a_hat = np.polyfit(X_norm.reshape(-1,), Y_norm.reshape(-1,), reg_deg)
fy_hat = np.poly1d( a_hat )
print(f'Polynomial degree : {reg_deg}')
draw_reg(X_norm[:nb_viz],Y_norm[:nb_viz], X_norm,fy_hat, (width,height), save_as='04-over_fitting')
```
%% Output
Polynomial degree : 24
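%% Cell type:markdown id: tags:
### 3.4 - Training error vs degree
A minimal sketch, not part of the original notebook: reusing `X_norm` and `Y_norm` from Step 2, it shows that the training error keeps shrinking as the degree grows, even though the high-degree fit above is clearly overfitting. The training error alone is therefore not enough to choose the model complexity.
%% Cell type:code id: tags:
``` python
# Extra experiment (sketch): mean squared error on the training points
# for several polynomial degrees (low degree = underfit, high degree = overfit).
for d in [1, 3, 5, 9, 15, 24]:
    a_hat  = np.polyfit(X_norm.reshape(-1,), Y_norm.reshape(-1,), d)
    fy_hat = np.poly1d(a_hat)
    mse    = np.mean( (fy_hat(X_norm) - Y_norm)**2 )
    print(f'degree={d:3d}   train MSE={mse:.4f}')
```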
%% Cell type:code id: tags:
``` python
pwk.end()
fidle.end()
```
%% Output
End time is : Wednesday 16 December 2020, 17:48:03
Duration is : 00:00:01 491ms
This notebook ends here
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
Source diff could not be displayed: it is too large.
@@ -20,19 +20,17 @@ import matplotlib
 import matplotlib.pyplot as plt
 from IPython.display import display,Markdown,HTML
-sys.path.append('..')
-import fidle.pwk as pwk

 class RegressionCooker():

-    pwk     = None
+    fidle   = None
     version = '0.1'

-    def __init__(self, pwk):
-        self.pwk = pwk
-        pwk.subtitle('FIDLE 2020 - Regression Cooker')
+    def __init__(self, fidle):
+        self.fidle = fidle
+        fidle.utils.subtitle('FIDLE 2020 - Regression Cooker')
         print('Version :', self.version)
-        print('Run time : {}'.format(time.strftime("%A %-d %B %Y, %H:%M:%S")))
+        print('Run time : {}'.format(time.strftime("%A %d %B %Y, %H:%M:%S")))

     @classmethod
@@ -101,7 +99,7 @@ class RegressionCooker():
         print(f"X shape : {X.shape}  Y shape : {Y.shape}  plot : {nb_viz} points")
         plt.figure(figsize=(width, height))
         plt.plot(X[:nb_viz], Y[:nb_viz], '.')
-        self.pwk.save_fig('01-dataset')
+        self.fidle.scrawler.save_fig('01-dataset')
         plt.show()
         self.vector_infos('X',X)
         self.vector_infos('Y',Y)
@@ -189,13 +187,13 @@ class RegressionCooker():

         # ---- Visualization

-        pwk.subtitle('Visualization :')
-        self.pwk.save_fig('02-basic_descent')
+        self.fidle.utils.subtitle('Visualization :')
+        self.fidle.scrawler.save_fig('02-basic_descent')
         plt.show()

-        pwk.subtitle('Loss :')
+        self.fidle.utils.subtitle('Loss :')
         self.__plot_loss(loss)
-        self.pwk.save_fig('03-basic_descent_loss')
+        self.fidle.scrawler.save_fig('03-basic_descent_loss')
         plt.show()

         return theta
@@ -268,14 +266,14 @@ class RegressionCooker():

         # ---- Visualization

-        pwk.subtitle('Visualization :')
-        self.pwk.save_fig('04-minibatch_descent')
+        self.fidle.utils.subtitle('Visualization :')
+        self.fidle.scrawler.save_fig('04-minibatch_descent')
         plt.show()

-        pwk.subtitle('Loss :')
+        self.fidle.utils.subtitle('Loss :')
         self.__plot_loss(loss)
-        self.pwk.save_fig('05-minibatch_descent_loss')
+        self.fidle.scrawler.save_fig('05-minibatch_descent_loss')
         plt.show()

-        return theta
\ No newline at end of file
+        return theta
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [K3MNIST1] - Simple classification with DNN
<!-- DESC --> An example of classification using a dense neural network for the famous MNIST dataset
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Recognizing handwritten numbers
- Understanding the principle of a classifier DNN network
- Implementation with Keras
The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) is a must for Deep Learning.
It consists of 60,000 small images of handwritten numbers for learning and 10,000 for testing.
## What we're going to do :
- Retrieve data
- Preparing the data
- Create a model
- Train the model
- Evaluate the result
%% Cell type:markdown id: tags:
## Step 1 - Init python stuff
%% Cell type:code id: tags:
``` python
import os
os.environ['KERAS_BACKEND'] = 'torch'
import keras
import numpy as np
import matplotlib.pyplot as plt
import sys,os
from importlib import reload
# Init Fidle environment
import fidle
run_id, run_dir, datasets_dir = fidle.init('K3MNIST1')
```
%% Cell type:markdown id: tags:
Verbosity during training : 0 = silent, 1 = progress bar, 2 = one line per epoch
%% Cell type:code id: tags:
``` python
fit_verbosity = 1
```
%% Cell type:markdown id: tags:
Override parameters (batch mode) - Just forget this cell
%% Cell type:code id: tags:
``` python
fidle.override('fit_verbosity')
```
%% Cell type:markdown id: tags:
## Step 2 - Retrieve data
MNIST is one of the most famous historical datasets.
It is included in the [Keras datasets](https://keras.io/datasets).
%% Cell type:code id: tags:
``` python
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
print("x_train : ",x_train.shape)
print("y_train : ",y_train.shape)
print("x_test : ",x_test.shape)
print("y_test : ",y_test.shape)
```
%% Cell type:markdown id: tags:
## Step 3 - Preparing the data
%% Cell type:code id: tags:
``` python
print('Before normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))
xmax=x_train.max()
x_train = x_train / xmax
x_test = x_test / xmax
print('After normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))
```
%% Cell type:markdown id: tags:
### Have a look
%% Cell type:code id: tags:
``` python
fidle.scrawler.images(x_train, y_train, [27], x_size=5,y_size=5, colorbar=True, save_as='01-one-digit')
fidle.scrawler.images(x_train, y_train, range(5,41), columns=12, save_as='02-many-digits')
```
%% Cell type:markdown id: tags:
## Step 4 - Create model
More information about:
- [Optimizer](https://keras.io/api/optimizers)
- [Activation](https://keras.io/api/layers/activations)
- [Loss](https://keras.io/api/losses)
- [Metrics](https://keras.io/api/metrics)
%% Cell type:code id: tags:
``` python
hidden1 = 100
hidden2 = 100
model = keras.Sequential([
keras.layers.Input((28,28)),
keras.layers.Flatten(),
keras.layers.Dense( hidden1, activation='relu'),
keras.layers.Dense( hidden2, activation='relu'),
keras.layers.Dense( 10, activation='softmax')
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
```
%% Cell type:markdown id: tags:
## Step 5 - Train the model
%% Cell type:code id: tags:
``` python
batch_size = 512
epochs = 16
history = model.fit( x_train, y_train,
batch_size = batch_size,
epochs = epochs,
verbose = fit_verbosity,
validation_data = (x_test, y_test))
```
%% Cell type:markdown id: tags:
## Step 6 - Evaluate
### 6.1 - Final loss and accuracy
%% Cell type:code id: tags:
``` python
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss :', score[0])
print('Test accuracy :', score[1])
```
%% Cell type:markdown id: tags:
### 6.2 - Plot history
%% Cell type:code id: tags:
``` python
fidle.scrawler.history(history, figsize=(6,4), save_as='03-history')
```
%% Cell type:markdown id: tags:
### 6.3 - Plot results
%% Cell type:code id: tags:
``` python
# y_pred = model.predict_classes(x_test)      # Deprecated since 2021-01-01
y_softmax = model.predict(x_test, verbose=fit_verbosity)
y_pred    = np.argmax(y_softmax, axis=-1)
fidle.scrawler.images(x_test, y_test, range(0,200), columns=12, x_size=1, y_size=1, y_pred=y_pred, save_as='04-predictions')
```
%% Cell type:markdown id: tags:
### 6.4 - Plot some errors
%% Cell type:code id: tags:
``` python
errors=[ i for i in range(len(x_test)) if y_pred[i]!=y_test[i] ]
errors=errors[:min(24,len(errors))]
fidle.scrawler.images(x_test, y_test, errors[:15], columns=6, x_size=2, y_size=2, y_pred=y_pred, save_as='05-some-errors')
```
%% Cell type:code id: tags:
``` python
fidle.scrawler.confusion_matrix(y_test,y_pred,range(10),normalize=True, save_as='06-confusion-matrix')
```
%% Cell type:code id: tags:
``` python
fidle.end()
```
%% Cell type:markdown id: tags:
<div class="todo">
A few things you can do for fun:
<ul>
<li>Changing the network architecture (layers, number of neurons, etc.)</li>
<li>Display a summary of the network</li>
<li>Retrieve and display the softmax output of the network, to evaluate its "doubts".</li>
</ul>
</div>
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [K3MNIST2] - Simple classification with CNN
<!-- DESC --> An example of classification using a convolutional neural network for the famous MNIST dataset
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Recognizing handwritten numbers
- Understanding the principle of a classifier CNN network
- Implementation with Keras
The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) is a must for Deep Learning.
It consists of 60,000 small images of handwritten numbers for learning and 10,000 for testing.
## What we're going to do :
- Retrieve data
- Preparing the data
- Create a model
- Train the model
- Evaluate the result
%% Cell type:markdown id: tags:
## Step 1 - Init python stuff
%% Cell type:code id: tags:
``` python
import os
os.environ['KERAS_BACKEND'] = 'torch'
import keras
import numpy as np
import matplotlib.pyplot as plt
import sys,os
from importlib import reload
# Init Fidle environment
import fidle
run_id, run_dir, datasets_dir = fidle.init('K3MNIST2')
```
%% Cell type:markdown id: tags:
Verbosity during training : 0 = silent, 1 = progress bar, 2 = one line per epoch
%% Cell type:code id: tags:
``` python
fit_verbosity = 1
```
%% Cell type:markdown id: tags:
Override parameters (batch mode) - Just forget this cell
%% Cell type:code id: tags:
``` python
fidle.override('fit_verbosity')
```
%% Cell type:markdown id: tags:
## Step 2 - Retrieve data
MNIST is one of the most famous historical datasets.
It is included in the [Keras datasets](https://keras.io/datasets).
%% Cell type:code id: tags:
``` python
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1,28,28,1)
x_test = x_test.reshape(-1,28,28,1)
print("x_train : ",x_train.shape)
print("y_train : ",y_train.shape)
print("x_test : ",x_test.shape)
print("y_test : ",y_test.shape)
```
%% Cell type:markdown id: tags:
## Step 3 - Preparing the data
%% Cell type:code id: tags:
``` python
print('Before normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))
xmax=x_train.max()
x_train = x_train / xmax
x_test = x_test / xmax
print('After normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))
```
%% Cell type:markdown id: tags:
### Have a look
%% Cell type:code id: tags:
``` python
fidle.scrawler.images(x_train, y_train, [27], x_size=5,y_size=5, colorbar=True, save_as='01-one-digit')
fidle.scrawler.images(x_train, y_train, range(5,41), columns=12, save_as='02-many-digits')
```
%% Cell type:markdown id: tags:
## Step 4 - Create model
More information about:
- [Optimizer](https://keras.io/api/optimizers)
- [Activation](https://keras.io/api/layers/activations)
- [Loss](https://keras.io/api/losses)
- [Metrics](https://keras.io/api/metrics)
%% Cell type:code id: tags:
``` python
model = keras.models.Sequential()
model.add( keras.layers.Input((28,28,1)) )
model.add( keras.layers.Conv2D(8, (3,3), activation='relu') )
model.add( keras.layers.MaxPooling2D((2,2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Conv2D(16, (3,3), activation='relu') )
model.add( keras.layers.MaxPooling2D((2,2)))
model.add( keras.layers.Dropout(0.2))
model.add( keras.layers.Flatten())
model.add( keras.layers.Dense(100, activation='relu'))
model.add( keras.layers.Dropout(0.5))
model.add( keras.layers.Dense(10, activation='softmax'))
```
%% Cell type:code id: tags:
``` python
model.summary()
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
```
%% Cell type:markdown id: tags:
## Step 5 - Train the model
%% Cell type:code id: tags:
``` python
batch_size = 512
epochs = 16
history = model.fit( x_train, y_train,
batch_size = batch_size,
epochs = epochs,
verbose = fit_verbosity,
validation_data = (x_test, y_test))
```
%% Cell type:markdown id: tags:
## Step 6 - Evaluate
### 6.1 - Final loss and accuracy
Note : with a DNN, we obtained an accuracy of about 97.7%
%% Cell type:code id: tags:
``` python
score = model.evaluate(x_test, y_test, verbose=0)
print(f'Test loss : {score[0]:4.4f}')
print(f'Test accuracy : {score[1]:4.4f}')
```
%% Cell type:markdown id: tags:
### 6.2 - Plot history
%% Cell type:code id: tags:
``` python
fidle.scrawler.history(history, figsize=(6,4), save_as='03-history')
```
%% Cell type:markdown id: tags:
### 6.3 - Plot results
%% Cell type:code id: tags:
``` python
# y_pred = model.predict_classes(x_test)      # Deprecated since 2021-01-01
y_softmax = model.predict(x_test, verbose=fit_verbosity)
y_pred    = np.argmax(y_softmax, axis=-1)
fidle.scrawler.images(x_test, y_test, range(0,200), columns=12, x_size=1, y_size=1, y_pred=y_pred, save_as='04-predictions')
```
%% Cell type:markdown id: tags:
### 6.4 - Plot some errors
%% Cell type:code id: tags:
``` python
errors=[ i for i in range(len(x_test)) if y_pred[i]!=y_test[i] ]
errors=errors[:min(24,len(errors))]
fidle.scrawler.images(x_test, y_test, errors[:15], columns=6, x_size=2, y_size=2, y_pred=y_pred, save_as='05-some-errors')
```
%% Cell type:code id: tags:
``` python
fidle.scrawler.confusion_matrix(y_test,y_pred,range(10),normalize=True, save_as='06-confusion-matrix')
```
%% Cell type:code id: tags:
``` python
fidle.end()
```
%% Cell type:markdown id: tags:
<div class="todo">
A few things you can do for fun:
<ul>
<li>Changing the network architecture (layers, number of neurons, etc.)</li>
<li>Display a summary of the network</li>
<li>Retrieve and display the softmax output of the network, to evaluate its "doubts".</li>
</ul>
</div>
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
%% Cell type:markdown id:86fe2213-fb44-4bd4-a371-a541cba6a744 tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [LMNIST1] - Simple classification with DNN
<!-- DESC --> An example of classification using a dense neural network for the famous MNIST dataset, using PyTorch Lightning
<!-- AUTHOR : MBOGOL Touye Achille (AI/ML Engineer EFELIA-MIAI/SIMAP Lab) -->
## Objectives :
- Recognizing handwritten numbers
- Understanding the principle of a classifier DNN network
- Implementation with pytorch lightning
The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) is a must for Deep Learning.
It consists of 60,000 small images of handwritten numbers for learning and 10,000 for testing.
## What we're going to do :
- Retrieve data
- Preparing the data
- Create a model
- Train the model
- Evaluate the result
%% Cell type:markdown id:7f16101a-6612-4e02-93e9-c45ce1ac911d tags:
## Step 1 - Init python stuff
%% Cell type:code id:743c77d3-0983-491c-90be-ef2219861a47 tags:
``` python
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import lightning.pytorch as pl
import torch.nn.functional as F
import torchvision.transforms as T
import sys,os
import multiprocessing
from torchvision import datasets
from torchmetrics.functional import accuracy
from torch.utils.data import Dataset, DataLoader
from lightning.pytorch import loggers as pl_loggers
from modules.progressbar import CustomTrainProgressBar
from lightning.pytorch.loggers.tensorboard import TensorBoardLogger
# Init Fidle environment
import fidle
run_id, run_dir, datasets_dir = fidle.init('LMNIST1')
```
%% Cell type:markdown id:df10dcda-aa63-476b-8665-9b1610fe51c6 tags:
## Step 2 - Retrieve data
MNIST is one of the most famous historical datasets, included in the torchvision datasets. `torchvision` provides many built-in datasets in `torchvision.datasets`.
%% Cell type:code id:6668e50c-f0c6-43cf-b733-9ac29d6a3900 tags:
``` python
# Load data sets
train_dataset = datasets.MNIST(root="data", train=True, download=True, transform=None)
test_dataset = datasets.MNIST(root="data", train=False, download=True, transform=None)
```
%% Cell type:code id:b543b885-6336-461d-abbe-6d3171405771 tags:
``` python
# print info for train data
print(train_dataset)
print()
# print info for test data
print(test_dataset)
```
%% Cell type:code id:44a489f5-3e53-4a2b-8069-f265b2814dc0 tags:
``` python
# See the shape of train data and test data
print("x_train : ",train_dataset.data.shape)
print("y_train : ",train_dataset.targets.shape)
print()
print("x_test : ",test_dataset.data.shape)
print("y_test : ",test_dataset.targets.shape)
print()
# print the number of target classes and their values
print("Number of Targets :",len(np.unique(train_dataset.targets)))
print("Targets Values :", np.unique(train_dataset.targets))
print("\nRemark that we work with torch tensors, not numpy arrays or tensorflow tensors")
print(" -> x_train.dtype = ",train_dataset.data.dtype)
print(" -> y_train.dtype = ",train_dataset.targets.dtype)
```
%% Cell type:markdown id:b418adb7-33ea-450c-9793-3cdce5d5fa8c tags:
## Step 3 - Preparing your data for training with DataLoaders
The Dataset retrieves our dataset’s features and labels one sample at a time. While training a model, we typically want to pass samples in `minibatches`, reshuffle the data at every epoch to reduce model overfitting, and use Python’s multiprocessing to speed up data retrieval. DataLoader is an iterable that abstracts this complexity for us in an easy API.
%% Cell type:code id:8af0bc4c-acb3-46d9-aae2-143b0327d970 tags:
``` python
# Before normalization:
x_train=train_dataset.data
print('Before normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))
# After normalization:
## T.Compose creates a pipeline where the provided transformations are run in sequence
transforms = T.Compose(
[
# This transform takes a np.array or a PIL image of integers
# in the range 0-255 and transforms it to a float tensor in the
# range 0.0 - 1.0
T.ToTensor()
]
)
train_dataset = datasets.MNIST(root="data", train=True, download=True, transform=transforms)
test_dataset = datasets.MNIST(root="data", train=False, download=True, transform=transforms)
# print image and label After normalization.
## iter() followed by next() is used to get some images and label.
image,label=next(iter(train_dataset))
print('After normalization : Min={}, max={}'.format(image.min(),image.max()))
```
%% Cell type:markdown id:35d50a57-8274-4660-8765-d0f2bf7214bd tags:
### Have a look
%% Cell type:code id:a172ebc5-8858-4f30-8e2c-1e9c123ae0ee tags:
``` python
x_train=train_dataset.data
y_train=train_dataset.targets
```
%% Cell type:code id:5a487760-b43a-4f7c-bfd8-1ce2c9652769 tags:
``` python
fidle.scrawler.images(x_train, y_train, [27], x_size=5, y_size=5, colorbar=True, save_as='01-one-digit')
fidle.scrawler.images(x_train, y_train, range(5,41), columns=12, save_as='02-many-digits')
```
%% Cell type:code id:ca0a63ae-e6d6-4940-b8ff-9b11cb2737bb tags:
``` python
# train batch data
train_loader= DataLoader(
dataset=train_dataset,
shuffle=True,
batch_size=512,
num_workers=2
)
# test batch data
test_loader= DataLoader(
dataset=test_dataset,
shuffle=False,
batch_size=512,
num_workers=2
)
# print the shape of the first batch of images and labels
image, label = next(iter(train_loader))
print('Shape of the first training batch from the PyTorch DataLoader :\nbatch images = {} \nbatch labels = {}'.format(image.shape,label.shape))
```
%% Cell type:markdown id:51bf21ee-76ca-42fa-b67f-066dbd239a72 tags:
## Step 4 - Create Model
More information about:
- [Optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)
- [Activation](https://www.tensorflow.org/api_docs/python/tf/keras/activations)
- [Loss](https://www.tensorflow.org/api_docs/python/tf/keras/losses)
- [Metrics](https://www.tensorflow.org/api_docs/python/tf/keras/metrics)
`Note :` PyTorch provides losses such as the cross-entropy loss (`nn.CrossEntropyLoss`), commonly used for classification problems. We use a softmax output to predict class probabilities, and with a softmax output cross-entropy is the natural loss. To compute the loss, however, we pass the raw output of the network (usually called the *logits* or *scores*) to the loss function, not the output of the softmax: in PyTorch, the cross-entropy loss already applies the softmax internally.
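A minimal illustration of this point (not a cell of the original notebook): `F.cross_entropy` applied to raw logits gives the same value as applying `log_softmax` by hand followed by the negative log-likelihood loss.

``` python
# Illustrative only: cross_entropy(logits) == nll_loss(log_softmax(logits))
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)            # raw network outputs, no softmax applied
target = torch.tensor([3, 7, 0, 1])    # target classes

loss_a = F.cross_entropy(logits, target)                    # softmax + NLL in one call
loss_b = F.nll_loss(F.log_softmax(logits, dim=1), target)   # same thing, step by step
print(loss_a.item(), loss_b.item())                         # identical values
```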
%% Cell type:code id:16701119-71eb-4f59-a50a-f153b07a74ae tags:
``` python
class MyNet(nn.Module):

    def __init__(self, num_class=10):
        super().__init__()
        self.num_class = num_class
        self.model = nn.Sequential(
            # Input vector:
            nn.Flatten(),        # convert each 2D 28x28 image into a contiguous array of 784 pixel values
            # first hidden layer
            nn.Linear(in_features=1*28*28, out_features=100),
            nn.ReLU(),
            nn.Dropout1d(0.1),   # Combat overfitting
            # second hidden layer
            nn.Linear(in_features=100, out_features=100),
            nn.ReLU(),
            nn.Dropout1d(0.1),   # Combat overfitting
            # logits output
            nn.Linear(100, num_class)
        )

    # forward pass
    def forward(self, x):
        return self.model(x)
```
%% Cell type:code id:37abf99b-f8ec-4048-a65d-f173ee18b234 tags:
``` python
class LitModel(pl.LightningModule):

    def __init__(self, MyNet):
        super().__init__()
        self.MyNet = MyNet

    # forward pass
    def forward(self, x):
        return self.MyNet(x)

    def configure_optimizers(self):
        # optimizer
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        return optimizer

    def training_step(self, batch, batch_idx):
        # defines the train loop
        x, y = batch
        # forward pass
        y_hat = self.MyNet(x)
        # computes the cross entropy loss between input logits and target
        loss = F.cross_entropy(y_hat, y)
        # accuracy metrics
        acc = accuracy(y_hat, y, task="multiclass", num_classes=10)
        metrics = {"train_loss": loss,
                   "train_acc" : acc}
        # logs metrics for each training_step
        self.log_dict(metrics,
                      on_step  = False,
                      on_epoch = True,
                      prog_bar = True,
                      logger   = True)
        return loss

    def validation_step(self, batch, batch_idx):
        # defines the validation loop
        x, y = batch
        # forward pass
        y_hat = self.MyNet(x)
        # computes the cross entropy loss between input logits and target
        loss = F.cross_entropy(y_hat, y)
        # accuracy metrics
        acc = accuracy(y_hat, y, task="multiclass", num_classes=10)
        metrics = {"test_loss": loss,
                   "test_acc" : acc}
        # logs metrics for each validation_step
        self.log_dict(metrics,
                      on_step  = False,
                      on_epoch = True,
                      prog_bar = True,
                      logger   = True)
        return metrics

    def predict_step(self, batch, batch_idx):
        # defines the predict loop
        x, y = batch
        # forward pass
        y_hat = self.MyNet(x)
        return y_hat
```
%% Cell type:code id:7546b27e-d492-420a-8d5d-109201b47830 tags:
``` python
# print summary model
model=LitModel(MyNet())
print(model)
```
%% Cell type:markdown id:fb32e85d-bd92-4ca5-a3dc-ddb5ed50ba6b tags:
## Step 5 - Train Model
%% Cell type:code id:96f0e087-f21a-4afc-85c5-3a3c0c353fe1 tags:
``` python
# loggers data
os.makedirs(f'{run_dir}/logs', mode=0o750, exist_ok=True)
logger= TensorBoardLogger(save_dir=f'{run_dir}/logs',name="DNN_logs")
```
%% Cell type:code id:ce975c03-d05d-40c4-92ff-0cc90699c13e tags:
``` python
# train model
# trainer= pl.Trainer(accelerator='auto',
# max_epochs=20,
# logger=logger,
# num_sanity_val_steps=0,
# callbacks=[CustomTrainProgressBar()]
# )
trainer= pl.Trainer(accelerator='auto',
max_epochs=20,
logger=logger,
num_sanity_val_steps=0
)
trainer.fit(model=model, train_dataloaders=train_loader, val_dataloaders=test_loader,)
```
%% Cell type:markdown id:a1191f05-4454-415c-a5ed-e63d9ae56651 tags:
## Step 6 - Evaluate
### 6.1 - Final loss and accuracy
%% Cell type:code id:9f45316e-0d2d-4fc1-b9a8-5fb8aaf5586a tags:
``` python
score=trainer.validate(model=model,dataloaders=test_loader, verbose=False)
print('x_test / acc : {:5.4f}'.format(score[0]['test_acc']))
print('x_test / loss : {:5.4f}'.format(score[0]['test_loss']))
```
%% Cell type:markdown id:e352e48d-b473-4162-a1aa-72d6d4f7aa38 tags:
### 6.2 - Plot history
To access logs with tensorboard :
- Under **Docker**, from a terminal launched via the jupyterlab launcher, use the following command:<br>
```tensorboard --logdir <path-to-logs> --host 0.0.0.0```
- If you're **not using Docker**, from a terminal :<br>
```tensorboard --logdir <path-to-logs>```
**Note:** One tensorboard instance can be used simultaneously.
%% Cell type:markdown id:f00ded6b-a7db-4c5d-b1b2-72264db20bdb tags:
### 6.3 - Plot results
%% Cell type:code id:e387a70d-9c23-4d16-8ef7-879aec7791e2 tags:
``` python
# logits output, batch by batch
y_logits=trainer.predict(model=model,dataloaders=test_loader)
# Concat into single tensor
y_logits=torch.cat(y_logits)
# output probabilities values
y_pred_values=F.softmax(y_logits,dim=1)
# Returns the indices of the maximum output probabilities values
y_pred=torch.argmax(y_pred_values,dim=-1).numpy()
```
%% Cell type:code id:fb2b2eeb-fcd8-453c-93ef-59a960a8bbd5 tags:
``` python
x_test=test_dataset.data
y_test=test_dataset.targets
```
%% Cell type:code id:71187fa9-2ad3-4b23-94b9-1846045bd070 tags:
``` python
fidle.scrawler.images(x_test, y_test, range(0,200), columns=12, x_size=1, y_size=1, y_pred=y_pred, save_as='04-predictions')
```
%% Cell type:markdown id:2fc7b2b9-9115-4848-9aae-2798bf7aa79a tags:
### 6.4 - Plot some errors
%% Cell type:code id:e55f17c4-fce7-423a-9adf-f2511c534ef5 tags:
``` python
errors=[ i for i in range(len(x_test)) if y_pred[i]!=y_test[i] ]
errors=errors[:min(24,len(errors))]
fidle.scrawler.images(x_test, y_test, errors[:15], columns=6, x_size=2, y_size=2, y_pred=y_pred, save_as='05-some-errors')
```
%% Cell type:code id:fea1b396-70ca-4b00-851d-0538a4b347fb tags:
``` python
fidle.scrawler.confusion_matrix(y_test,y_pred,range(10),normalize=True, save_as='06-confusion-matrix')
```
%% Cell type:code id:e982c032-cce8-4c71-8cdc-2af4b31b2914 tags:
``` python
fidle.end()
```
%% Cell type:markdown id:233838c2-c97f-4489-8c79-9247d7b7456b tags:
<div class="todo">
A few things you can do for fun:
<ul>
<li>Changing the network architecture (layers, number of neurons, etc.)</li>
<li>Display a summary of the network</li>
<li>Retrieve and display the softmax output of the network, to evaluate its "doubts".</li>
</ul>
</div>
%% Cell type:markdown id:51b87aa0-d4e9-48bb-8205-4b583f4b0b61 tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
%% Cell type:markdown id:86fe2213-fb44-4bd4-a371-a541cba6a744 tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [LMNIST2] - Simple classification with CNN
<!-- DESC --> An example of classification using a convolutional neural network for the famous MNIST dataset, using PyTorch Lightning
<!-- AUTHOR : MBOGOL Touye Achille (AI/ML Engineer MIAI/SIMaP) -->
## Objectives :
- Recognizing handwritten numbers
- Understanding the principle of a classifier CNN network
- Implementation with pytorch lightning
The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) is a must for Deep Learning.
It consists of 60,000 small images of handwritten numbers for learning and 10,000 for testing.
## What we're going to do :
- Retrieve data
- Preparing the data
- Create a model
- Train the model
- Evaluate the result
%% Cell type:markdown id:7f16101a-6612-4e02-93e9-c45ce1ac911d tags:
## Step 1 - Init python stuff
%% Cell type:code id:743c77d3-0983-491c-90be-ef2219861a47 tags:
``` python
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import lightning.pytorch as pl
import torch.nn.functional as F
import torchvision.transforms as T
import sys,os
import multiprocessing
import matplotlib.pyplot as plt
from torchvision import datasets
from torchmetrics.functional import accuracy
from torch.utils.data import Dataset, DataLoader
from modules.progressbar import CustomTrainProgressBar
from lightning.pytorch.loggers import TensorBoardLogger
# Init Fidle environment
import fidle
run_id, run_dir, datasets_dir = fidle.init('LMNIST2')
```
%% Cell type:markdown id:df10dcda-aa63-476b-8665-9b1610fe51c6 tags:
## Step 2 - Retrieve data
MNIST is one of the most famous historical datasets, included in the torchvision datasets. `torchvision` provides many built-in datasets in `torchvision.datasets`.
%% Cell type:code id:6668e50c-f0c6-43cf-b733-9ac29d6a3900 tags:
``` python
# Load data sets
train_dataset = datasets.MNIST(root="data", train=True, download=True, transform=None)
test_dataset= datasets.MNIST(root="data", train=False, download=False, transform=None)
```
%% Cell type:code id:a14d6fc2-b913-4eaa-9cde-5ca6785bfa12 tags:
``` python
# print info for train data
print(train_dataset)
print()
# print info for test data
print(test_dataset)
```
%% Cell type:code id:44a489f5-3e53-4a2b-8069-f265b2814dc0 tags:
``` python
# See the shape of train data and test data
print("x_train : ",train_dataset.data.shape)
print("y_train : ",train_dataset.targets.shape)
print()
print("x_test : ",test_dataset.data.shape)
print("y_test : ",test_dataset.targets.shape)
print()
# print the number of target classes and their values
print("Number of Targets :",len(np.unique(train_dataset.targets)))
print("Targets Values :", np.unique(train_dataset.targets))
print()
print("Remark that we work with torch tensors, not numpy arrays or tensorflow tensors")
print(" -> x_train.dtype = ",train_dataset.data.dtype)
print(" -> y_train.dtype = ",train_dataset.targets.dtype)
```
%% Cell type:markdown id:b418adb7-33ea-450c-9793-3cdce5d5fa8c tags:
## Step 3 - Preparing your data for training with DataLoaders
The Dataset retrieves our dataset’s features and labels one sample at a time. While training a model, we typically want to pass samples in `minibatches`, reshuffle the data at every epoch to reduce model overfitting, and use Python’s multiprocessing to speed up data retrieval. DataLoader is an iterable that abstracts this complexity for us in an easy API.
%% Cell type:code id:8af0bc4c-acb3-46d9-aae2-143b0327d970 tags:
``` python
# Before normalization:
x_train=train_dataset.data
print('Before normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))
# After normalization:
## T.Compose creates a pipeline where the provided transformations are run in sequence
transforms = T.Compose(
[
# This transform takes a np.array or a PIL image of integers
# in the range 0-255 and transforms it to a float tensor in the
# range 0.0 - 1.0
T.ToTensor(),
]
)
train_dataset = datasets.MNIST(root="data", train=True, download=True, transform=transforms)
test_dataset= datasets.MNIST(root="data", train=False, download=True, transform=transforms)
# print image and label After normalization.
# iter() followed by next() is used to get some images and label
image,label=next(iter(train_dataset))
print('After normalization : Min={}, max={}'.format(image.min(),image.max()))
```
%% Cell type:markdown id:35d50a57-8274-4660-8765-d0f2bf7214bd tags:
### Have a look
%% Cell type:code id:a172ebc5-8858-4f30-8e2c-1e9c123ae0ee tags:
``` python
x_train=train_dataset.data
y_train=train_dataset.targets
```
%% Cell type:code id:5a487760-b43a-4f7c-bfd8-1ce2c9652769 tags:
``` python
fidle.scrawler.images(x_train, y_train, [27], x_size=5,y_size=5, colorbar=True, save_as='01-one-digit')
fidle.scrawler.images(x_train, y_train, range(5,41), columns=12, save_as='02-many-digits')
```
%% Cell type:code id:ca0a63ae-e6d6-4940-b8ff-9b11cb2737bb tags:
``` python
# train batch data
train_loader= DataLoader(
dataset=train_dataset,
shuffle=True,
batch_size=512,
num_workers=2
)
# test batch data
test_loader= DataLoader(
dataset=test_dataset,
shuffle=False,
batch_size=512,
num_workers=2
)
# print the shape of the first batch of images and labels
image, label = next(iter(train_loader))
print('Shape of the first training batch from the PyTorch DataLoader :\nbatch images = {} \nbatch labels = {}'.format(image.shape,label.shape))
```
%% Cell type:markdown id:51bf21ee-76ca-42fa-b67f-066dbd239a72 tags:
## Step 4 - Create Model
More information about:
- [Optimizer](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)
- [Activation](https://www.tensorflow.org/api_docs/python/tf/keras/activations)
- [Loss](https://www.tensorflow.org/api_docs/python/tf/keras/losses)
- [Metrics](https://www.tensorflow.org/api_docs/python/tf/keras/metrics)
`Note :` PyTorch provides losses such as the cross-entropy loss (`nn.CrossEntropyLoss`), commonly used for classification problems. We use a softmax output to predict class probabilities, and with a softmax output cross-entropy is the natural loss. To compute the loss, however, we pass the raw output of the network (usually called the *logits* or *scores*) to the loss function, not the output of the softmax: in PyTorch, the cross-entropy loss already applies the softmax internally.
%% Cell type:code id:16701119-71eb-4f59-a50a-f153b07a74ae tags:
``` python
class MyNet(nn.Module):

    def __init__(self, num_class=10):
        super().__init__()
        self.num_class = num_class
        self.model = nn.Sequential(
            # first convolution
            nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d((2,2)),
            nn.Dropout2d(0.1),   # Combat overfitting
            # second convolution
            nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d((2,2)),
            nn.Dropout2d(0.1),   # Combat overfitting
            nn.Flatten(),        # convert feature maps into feature vectors
            # MLP network
            nn.Linear(16*5*5, 100),
            nn.ReLU(),
            nn.Dropout1d(0.1),   # Combat overfitting
            nn.Linear(100, num_class)   # logits output
        )

    def forward(self, x):
        x = self.model(x)   # forward pass
        return x
```
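%% Cell type:markdown id: tags:
Where does `16*5*5` come from? With 28x28 inputs and no padding, the first 3x3 convolution gives 26x26 feature maps, the 2x2 max-pooling reduces them to 13x13, the second 3x3 convolution gives 11x11, and the second pooling 5x5. The flattened feature vector therefore contains 16 channels x 5 x 5 = 400 values, which is the `in_features` of the first `nn.Linear` layer.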
%% Cell type:code id:37abf99b-f8ec-4048-a65d-f173ee18b234 tags:
``` python
class LitModel(pl.LightningModule):

    def __init__(self, MyNet):
        super().__init__()
        self.MyNet = MyNet

    # forward pass
    def forward(self, x):
        return self.MyNet(x)

    # optimizer
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        return optimizer

    def training_step(self, batch, batch_idx):
        # defines the train loop
        x, y = batch
        # forward pass
        y_hat = self.MyNet(x)
        # cross entropy loss
        loss = F.cross_entropy(y_hat, y)
        # accuracy metric
        acc = accuracy(y_hat, y, task="multiclass", num_classes=10)
        metrics = {"train_loss": loss, "train_acc": acc}
        # logs metrics for each training_step
        self.log_dict(metrics,
                      on_step  = False,
                      on_epoch = True,
                      prog_bar = True,
                      logger   = True)
        return loss

    def validation_step(self, batch, batch_idx):
        # defines the validation loop
        x, y = batch
        # forward pass
        y_hat = self.MyNet(x)
        # cross entropy loss
        loss = F.cross_entropy(y_hat, y)
        # accuracy metric
        acc = accuracy(y_hat, y, task="multiclass", num_classes=10)
        metrics = {"test_loss": loss, "test_acc": acc}
        # logs metrics for each validation_step
        self.log_dict(metrics,
                      on_step  = False,
                      on_epoch = True,
                      prog_bar = True,
                      logger   = True)
        return metrics

    def predict_step(self, batch, batch_idx):
        # defines the predict loop
        x, y = batch
        # forward pass
        y_hat = self.MyNet(x)
        return y_hat
```
%% Cell type:code id:489af62f-8f7c-4d1b-a6d0-5a0417e79869 tags:
``` python
# print summary model
model=LitModel(MyNet())
print(model)
```
%% Cell type:markdown id:fb32e85d-bd92-4ca5-a3dc-ddb5ed50ba6b tags:
## Step 5 - Train Model
%% Cell type:code id:756d5e19-6a10-42b8-8971-411389f7d19c tags:
``` python
# loggers data
os.makedirs(f'{run_dir}/logs', mode=0o750, exist_ok=True)
logger= TensorBoardLogger(save_dir=f'{run_dir}/logs',name="CNN_logs")
```
%% Cell type:code id:ce975c03-d05d-40c4-92ff-0cc90699c13e tags:
``` python
# train model
# trainer = pl.Trainer(accelerator='auto',
# max_epochs=16,
# logger=logger,
# num_sanity_val_steps=0,
# callbacks=[CustomTrainProgressBar()]
# )
trainer = pl.Trainer(accelerator='auto',
max_epochs=16,
logger=logger,
num_sanity_val_steps=0
)
trainer.fit(model=model, train_dataloaders=train_loader, val_dataloaders=test_loader)
```
%% Cell type:markdown id:a1191f05-4454-415c-a5ed-e63d9ae56651 tags:
## Step 6 - Evaluate
### 6.1 - Final loss and accuracy
Note : with a DNN, we obtained an accuracy of about 97.7%
%% Cell type:code id:9f45316e-0d2d-4fc1-b9a8-5fb8aaf5586a tags:
``` python
# evaluate your model
score=trainer.validate(model=model,dataloaders=test_loader, verbose=False)
print('x_test / acc : {:5.4f}'.format(score[0]['test_acc']))
print('x_test / loss : {:5.4f}'.format(score[0]['test_loss']))
```
%% Cell type:code id:5cfe9bd6-654b-42e0-b430-5f3b816526b0 tags:
``` python
score=trainer.validate(model=model,dataloaders=test_loader, verbose=False)
```
%% Cell type:markdown id:e352e48d-b473-4162-a1aa-72d6d4f7aa38 tags:
### 6.2 - Plot history
To access logs with tensorboard :
- Under **Docker**, from a terminal launched via the jupyterlab launcher, use the following command:<br>
```tensorboard --logdir <path-to-logs> --host 0.0.0.0```
- If you're **not using Docker**, from a terminal :<br>
```tensorboard --logdir <path-to-logs>```
**Note:** One tensorboard instance can be used simultaneously.
%% Cell type:markdown id:f00ded6b-a7db-4c5d-b1b2-72264db20bdb tags:
### 6.3 - Plot results
%% Cell type:code id:e387a70d-9c23-4d16-8ef7-879aec7791e2 tags:
``` python
# logits output, batch by batch
y_logits= trainer.predict(model=model,dataloaders=test_loader)
# Concat into single tensor
y_logits= torch.cat(y_logits)
# output probabilities values
y_pred_values=F.softmax(y_logits,dim=1)
# Returns the indices of the maximum output probabilities values
y_pred=torch.argmax(y_pred_values,dim=-1)
```
%% Cell type:code id:fb2b2eeb-fcd8-453c-93ef-59a960a8bbd5 tags:
``` python
x_test=test_dataset.data
y_test=test_dataset.targets
```
%% Cell type:code id:71187fa9-2ad3-4b23-94b9-1846045bd070 tags:
``` python
fidle.scrawler.images(x_test, y_test, range(0,200), columns=12, x_size=1, y_size=1, y_pred=y_pred, save_as='04-predictions')
```
%% Cell type:markdown id:2fc7b2b9-9115-4848-9aae-2798bf7aa79a tags:
### 6.4 - Plot some errors
%% Cell type:code id:e55f17c4-fce7-423a-9adf-f2511c534ef5 tags:
``` python
errors=[ i for i in range(len(x_test)) if y_pred[i]!=y_test[i] ]
errors=errors[:min(24,len(errors))]
fidle.scrawler.images(x_test, y_test, errors[:15], columns=6, x_size=2, y_size=2, y_pred=y_pred, save_as='05-some-errors')
```
%% Cell type:code id:fea1b396-70ca-4b00-851d-0538a4b347fb tags:
``` python
fidle.scrawler.confusion_matrix(y_test,y_pred,range(10),normalize=True, save_as='06-confusion-matrix')
```
%% Cell type:code id:e982c032-cce8-4c71-8cdc-2af4b31b2914 tags:
``` python
fidle.end()
```
%% Cell type:markdown id:233838c2-c97f-4489-8c79-9247d7b7456b tags:
<div class="todo">
A few things you can do for fun:
<ul>
<li>Changing the network architecture (layers, number of neurons, etc.)</li>
<li>Display a summary of the network</li>
<li>Retrieve and display the softmax output of the network, to evaluate its "doubts".</li>
</ul>
</div>
%% Cell type:markdown id:51b87aa0-d4e9-48bb-8205-4b583f4b0b61 tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
# ------------------------------------------------------------------
# _____ _ _ _
# | ___(_) __| | | ___
# | |_ | |/ _` | |/ _ \
# | _| | | (_| | | __/
# |_| |_|\__,_|_|\___|
# ------------------------------------------------------------------
# Formation Introduction au Deep Learning (FIDLE)
# CNRS/SARI/DEVLOG 2023
# ------------------------------------------------------------------
# 2.0 version by Achille Mbogol Touye (EFELIA-MIAI/SIMAP), sep 2023

from tqdm import tqdm as _tqdm
from lightning.pytorch.callbacks import TQDMProgressBar

# Progress-bar callback used to display the training metrics
class CustomTrainProgressBar(TQDMProgressBar):

    def __init__(self):
        super().__init__()
        self._val_progress_bar     = _tqdm()
        self._predict_progress_bar = _tqdm()

    def init_predict_tqdm(self):
        bar = super().init_test_tqdm()
        bar.set_description("Predicting")
        return bar

    def init_train_tqdm(self):
        bar = super().init_train_tqdm()
        bar.set_description("Training")
        return bar

    @property
    def val_progress_bar(self):
        if self._val_progress_bar is None:
            raise ValueError("The `_val_progress_bar` reference has not been set yet.")
        return self._val_progress_bar

    @property
    def predict_progress_bar(self) -> _tqdm:
        if self._predict_progress_bar is None:
            raise TypeError(f"The `{self.__class__.__name__}._predict_progress_bar` reference has not been set yet.")
        return self._predict_progress_bar

    def on_validation_start(self, trainer, pl_module):
        # Disable the validation progress bar display
        self.val_progress_bar.disable = True

    def on_predict_start(self, trainer, pl_module):
        # Disable the prediction progress bar display
        self.predict_progress_bar.disable = True
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [PMNIST1] - Simple classification with DNN
<!-- DESC --> Example of classification with a fully connected neural network, using PyTorch
<!-- AUTHOR : Laurent Risser (CNRS/IMT) -->
## Objectives :
- Recognizing handwritten numbers
- Understanding the principle of a classifier DNN network
- Implementation with PyTorch
The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) is a must for Deep Learning.
It consists of 60,000 small images of handwritten numbers for learning and 10,000 for testing.
## What we're going to do :
- Retrieve data
- Preparing the data
- Create a model
- Train the model
- Evaluate the result
%% Cell type:markdown id: tags:
## Step 1 - Init python stuff
%% Cell type:code id: tags:
``` python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torchvision #to get the MNIST dataset
import numpy as np
import matplotlib.pyplot as plt
import sys,os
import fidle
from modules.fidle_pwk_additional import convergence_history_CrossEntropyLoss
# Init Fidle environment
run_id, run_dir, datasets_dir = fidle.init('PMNIST1')
```
%% Cell type:markdown id: tags:
## Step 2 - Retrieve data
MNIST is one of the most famous historical datasets.
It is included in the [torchvision datasets](https://pytorch.org/vision/stable/datasets.html).
%% Cell type:code id: tags:
``` python
#get and format the training set
mnist_trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=None)
x_train=mnist_trainset.data.type(torch.DoubleTensor)
y_train=mnist_trainset.targets
#get and format the test set
mnist_testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=None)
x_test=mnist_testset.data.type(torch.DoubleTensor)
y_test=mnist_testset.targets
#check data shape and format
print("Size of the train and test observations")
print(" -> x_train : ",x_train.shape)
print(" -> y_train : ",y_train.shape)
print(" -> x_test : ",x_test.shape)
print(" -> y_test : ",y_test.shape)
print("\nRemark that we work with torch tensors and not numpy arrays:")
print(" -> x_train.dtype = ",x_train.dtype)
print(" -> y_train.dtype = ",y_train.dtype)
```
%% Cell type:markdown id: tags:
## Step 3 - Preparing the data
%% Cell type:code id: tags:
``` python
print('Before normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))
xmax=x_train.max()
x_train = x_train / xmax
x_test = x_test / xmax
print('After normalization : Min={}, max={}'.format(x_train.min(),x_train.max()))
```
%% Cell type:markdown id: tags:
### Have a look
%% Cell type:code id: tags:
``` python
np_x_train=x_train.numpy().astype(np.float64)
np_y_train=y_train.numpy().astype(np.uint8)
fidle.scrawler.images(np_x_train,np_y_train , [27], x_size=5,y_size=5, colorbar=True)
fidle.scrawler.images(np_x_train,np_y_train, range(5,41), columns=12)
```
%% Cell type:markdown id: tags:
## Step 4 - Create model
More information about:
- [Optimizer](https://pytorch.org/docs/stable/optim.html)
- [Basic neural-network blocks](https://pytorch.org/docs/stable/nn.html)
- [Loss](https://pytorch.org/docs/stable/nn.html#loss-functions)
%% Cell type:code id: tags:
``` python
class MyModel(nn.Module):
    """
    Basic fully connected neural-network
    """
    def __init__(self):
        hidden1 = 100
        hidden2 = 100
        super(MyModel, self).__init__()
        self.hidden1 = nn.Linear(784, hidden1)
        self.hidden2 = nn.Linear(hidden1, hidden2)
        self.hidden3 = nn.Linear(hidden2, 10)

    def forward(self, x):
        x = x.view(-1,784)   # flatten the images before using fully-connected layers
        x = self.hidden1(x)
        x = F.relu(x)
        x = self.hidden2(x)
        x = F.relu(x)
        x = self.hidden3(x)
        x = F.softmax(x, dim=1)   # softmax over the 10 classes (dim=1), not over the batch
        return x

model = MyModel()
```
%% Cell type:markdown id: tags:
## Step 5 - Train the model
### 5.1 - Stochastic gradient descent strategy to fit the model
%% Cell type:code id: tags:
``` python
def fit(model, X_train, Y_train, X_test, Y_test, EPOCHS=5, BATCH_SIZE=32):

    loss = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # lr is the learning rate
    model.train()

    history = convergence_history_CrossEntropyLoss()
    history.update(model, X_train, Y_train, X_test, Y_test)

    n = X_train.shape[0]   # number of observations in the training data

    # stochastic gradient descent
    for epoch in range(EPOCHS):

        batch_start = 0
        epoch_shuffler = np.arange(n)
        np.random.shuffle(epoch_shuffler)   # remark that 'utilsData.DataLoader' could be used instead

        while batch_start + BATCH_SIZE < n:
            # get mini-batch observations
            mini_batch_observations = epoch_shuffler[batch_start:batch_start+BATCH_SIZE]
            var_X_batch = Variable(X_train[mini_batch_observations,:,:]).float()   # mini-batch images (flattened later in forward)
            var_Y_batch = Variable(Y_train[mini_batch_observations])

            # gradient descent step
            optimizer.zero_grad()                         # set the parameters gradients to 0
            Y_pred_batch = model(var_X_batch)             # predict y with the current NN parameters
            curr_loss = loss(Y_pred_batch, var_Y_batch)   # compute the current loss
            curr_loss.backward()                          # compute the loss gradient w.r.t. all NN parameters
            optimizer.step()                              # update the NN parameters

            # prepare the next mini-batch of the epoch
            batch_start += BATCH_SIZE

        history.update(model, X_train, Y_train, X_test, Y_test)

    return history
```
%% Cell type:markdown id: tags:
### 5.2 - Fit the model
%% Cell type:code id: tags:
``` python
batch_size = 512
epochs = 128
history=fit(model,x_train,y_train,x_test,y_test,EPOCHS=epochs,BATCH_SIZE = batch_size)
```
%% Cell type:markdown id: tags:
## Step 6 - Evaluate
### 6.1 - Final loss and accuracy
%% Cell type:code id: tags:
``` python
var_x_test = Variable(x_test[:,:,:]).float()
var_y_test = Variable(y_test[:])
y_pred = model(var_x_test)
loss = nn.CrossEntropyLoss()
curr_loss = loss(y_pred, var_y_test)
val_loss = curr_loss.item()
val_accuracy = float( (torch.argmax(y_pred, dim= 1) == var_y_test).float().mean() )
print('Test loss :', val_loss)
print('Test accuracy :', val_accuracy)
```
%% Cell type:markdown id: tags:
### 6.2 - Plot history
%% Cell type:code id: tags:
``` python
fidle.scrawler.history(history, figsize=(6,4))
```
%% Cell type:markdown id: tags:
### 6.3 - Plot results
%% Cell type:code id: tags:
``` python
y_pred = model(var_x_test)
np_y_pred_label = torch.argmax(y_pred, dim= 1).numpy().astype(np.uint8)
np_x_test=x_test.numpy().astype(np.float64)
np_y_test=y_test.numpy().astype(np.uint8)
fidle.scrawler.images(np_x_test, np_y_test, range(0,60), columns=12, x_size=1, y_size=1, y_pred=np_y_pred_label)
```
%% Cell type:markdown id: tags:
### 6.4 - Plot some errors
%% Cell type:code id: tags:
``` python
errors=[ i for i in range(len(np_y_test)) if np_y_pred_label[i]!=np_y_test[i] ]
errors=errors[:min(24,len(errors))]
fidle.scrawler.images(np_x_test, np_y_test, errors[:15], columns=6, x_size=2, y_size=2, y_pred=np_y_pred_label)
```
%% Cell type:code id: tags:
``` python
fidle.scrawler.confusion_matrix(np_y_test,np_y_pred_label, range(10))
```
%% Cell type:markdown id: tags:
<div class="todo">
A few things you can do for fun:
<ul>
<li>Changing the network architecture (layers, number of neurons, etc.)</li>
<li>Display a summary of the network</li>
<li>Retrieve and display the softmax output of the network, to evaluate its "doubts".</li>
</ul>
</div>
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
Source diff could not be displayed: it is too large.
Source diff could not be displayed: it is too large.
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [NP1] - A short introduction to Numpy
<!-- DESC --> NumPy is an essential tool for scientific Python.
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Understand the main principles of NumPy and its potential
Note : This notebook is strongly inspired by the UGA Python Introduction Course
See : **https://gricad-gitlab.univ-grenoble-alpes.fr/python-uga/py-training-2017**
%% Cell type:markdown id: tags:
## Step 1 - Numpy the beginning
Code using `numpy` usually starts with the import statement
%% Cell type:code id: tags:
``` python
import numpy as np
```
%% Cell type:markdown id: tags:
NumPy provides the type `np.ndarray`. Such arrays are multidimensional sequences of homogeneous elements. They can be created, for example, with the following commands:
%% Cell type:code id: tags:
``` python
# from a list
l = [10.0, 12.5, 15.0, 17.5, 20.0]
np.array(l)
```
%% Output
array([10. , 12.5, 15. , 17.5, 20. ])
%% Cell type:code id: tags:
``` python
# fast but the values can be anything
np.empty(4)
```
%% Output
array([ 6.93990061e-310, 6.95333088e-310, -1.90019324e+120,
6.93987701e-310])
%% Cell type:code id: tags:
``` python
# slower than np.empty but the values are all 0.
np.zeros([2, 6])
```
%% Output
array([[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]])
%% Cell type:code id: tags:
``` python
# multidimensional array
a = np.ones([2, 3, 4])
print(a.shape, a.size, a.dtype)
a
```
%% Output
(2, 3, 4) 24 float64
array([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])
%% Cell type:code id: tags:
``` python
# like range but produces a 1D numpy array
np.arange(4)
```
%% Output
array([0, 1, 2, 3])
%% Cell type:code id: tags:
``` python
# np.arange can produce arrays of floats
np.arange(4.)
```
%% Output
array([0., 1., 2., 3.])
%% Cell type:code id: tags:
``` python
# another convenient function to generate 1D arrays
np.linspace(10, 20, 5)
```
%% Output
array([10. , 12.5, 15. , 17.5, 20. ])
%% Cell type:markdown id: tags:
A NumPy array can be easily converted to a Python list.
%% Cell type:code id: tags:
``` python
a = np.linspace(10, 20 ,5)
list(a)
```
%% Output
[10.0, 12.5, 15.0, 17.5, 20.0]
%% Cell type:code id: tags:
``` python
# Or even better
a.tolist()
```
%% Output
[10.0, 12.5, 15.0, 17.5, 20.0]
%% Cell type:markdown id: tags:
## Step 2 - Access elements
Elements in a `numpy` array can be accessed using indexing and slicing in any dimension. It also offers the same functionalities as Fortran or Matlab.
### 2.1 - Indexes and slices
For example, we can create an array `A` and perform any kind of selection operations on it.
%% Cell type:code id: tags:
``` python
A = np.random.random([4, 5])
A
```
%% Output
array([[0.14726334, 0.90799321, 0.67130094, 0.23978162, 0.96444415],
[0.26039418, 0.06135763, 0.35856793, 0.73366941, 0.50698925],
[0.39557097, 0.55950866, 0.70056205, 0.65344863, 0.90891062],
[0.19049184, 0.56355734, 0.71701494, 0.66035889, 0.06400119]])
%% Cell type:code id: tags:
``` python
# Get the element from the second row, first column
A[1, 0]
```
%% Output
0.26039417830707656
%% Cell type:code id: tags:
``` python
# Get the first two rows
A[:2]
```
%% Output
array([[0.14726334, 0.90799321, 0.67130094, 0.23978162, 0.96444415],
[0.26039418, 0.06135763, 0.35856793, 0.73366941, 0.50698925]])
%% Cell type:code id: tags:
``` python
# Get the last column
A[:, -1]
```
%% Output
array([0.96444415, 0.50698925, 0.90891062, 0.06400119])
%% Cell type:code id: tags:
``` python
# Get the first two rows and the columns with an even index
A[:2, ::2]
```
%% Output
array([[0.14726334, 0.67130094, 0.96444415],
[0.26039418, 0.35856793, 0.50698925]])
%% Cell type:markdown id: tags:
### 2.2 - Using a mask to select elements satisfying a condition:
%% Cell type:code id: tags:
``` python
cond = A > 0.5
print(cond)
print(A[cond])
```
%% Output
[[False True True False True]
[False False False True True]
[False True True True True]
[False True True True False]]
[0.90799321 0.67130094 0.96444415 0.73366941 0.50698925 0.55950866
0.70056205 0.65344863 0.90891062 0.56355734 0.71701494 0.66035889]
%% Cell type:markdown id: tags:
The mask is in fact a particular case of the advanced indexing capabilities provided by NumPy. For example, it is even possible to use lists for indexing:
%% Cell type:code id: tags:
``` python
# Selecting only particular columns
print(A)
A[:, [0, 1, 4]]
```
%% Output
[[0.14726334 0.90799321 0.67130094 0.23978162 0.96444415]
[0.26039418 0.06135763 0.35856793 0.73366941 0.50698925]
[0.39557097 0.55950866 0.70056205 0.65344863 0.90891062]
[0.19049184 0.56355734 0.71701494 0.66035889 0.06400119]]
array([[0.14726334, 0.90799321, 0.96444415],
[0.26039418, 0.06135763, 0.50698925],
[0.39557097, 0.55950866, 0.90891062],
[0.19049184, 0.56355734, 0.06400119]])
%% Cell type:markdown id: tags:
## Step 3 - Perform array manipulations
### 3.1 - Apply arithmetic operations to whole arrays (element-wise):
%% Cell type:code id: tags:
``` python
(A+5)**2
```
%% Output
array([[26.49431985, 34.90438372, 32.16365436, 27.45531142, 35.57459401],
[27.67174691, 25.61734103, 28.71425022, 32.87496493, 30.32693058],
[29.11218606, 30.9081365 , 32.49640767, 31.96148136, 34.91522467],
[26.94120557, 30.95317031, 32.68425986, 32.03966276, 25.6441081 ]])
%% Cell type:markdown id: tags:
### 3.2 - Apply functions element-wise:
%% Cell type:code id: tags:
``` python
np.exp(A) # With numpy arrays, use the functions from numpy !
```
%% Output
array([[1.15865904, 2.47934201, 1.95678132, 1.27097157, 2.62332907],
[1.29744141, 1.0632791 , 1.43127825, 2.08270892, 1.66028496],
[1.48523197, 1.74981253, 2.01488485, 1.92215822, 2.48161763],
[1.2098445 , 1.75691133, 2.04830976, 1.93548684, 1.06609367]])
%% Cell type:markdown id: tags:
### 3.3 - Setting parts of arrays
%% Cell type:code id: tags:
``` python
A[:, 0] = 0.
print(A)
```
%% Output
[[0. 0.90799321 0.67130094 0.23978162 0.96444415]
[0. 0.06135763 0.35856793 0.73366941 0.50698925]
[0. 0.55950866 0.70056205 0.65344863 0.90891062]
[0. 0.56355734 0.71701494 0.66035889 0.06400119]]
%% Cell type:code id: tags:
``` python
# BONUS: Safe element-wise inverse with masks
cond = (A != 0)
A[cond] = 1./A[cond]
print(A)
```
%% Output
[[ 0. 1.10132983 1.48964487 4.17046144 1.03686668]
[ 0. 16.29789234 2.78887186 1.36301171 1.97242842]
[ 0. 1.78728245 1.42742531 1.53034219 1.1002182 ]
[ 0. 1.77444232 1.39467107 1.51432807 15.62470834]]
%% Cell type:markdown id: tags:
## Step 4 - Attributes and methods of `np.ndarray` (see the [doc](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html#numpy.ndarray))
%% Cell type:code id: tags:
``` python
for i,v in enumerate([s for s in dir(A) if not s.startswith('__')]):
print(f'{v:16}', end='')
if (i+1) % 6 == 0 :print('')
```
%% Output
T all any argmax argmin argpartition
argsort astype base byteswap choose clip
compress conj conjugate copy ctypes cumprod
cumsum data diagonal dot dtype dump
dumps fill flags flat flatten getfield
imag item itemset itemsize max mean
min nbytes ndim newbyteorder nonzero partition
prod ptp put ravel real repeat
reshape resize round searchsorted setfield setflags
shape size sort squeeze std strides
sum swapaxes take tobytes tofile tolist
tostring trace transpose var view
%% Cell type:code id: tags:
``` python
# Ex1: Get the mean through different dimensions
print(A)
print('Mean value', A.mean())
print('Mean line', A.mean(axis=0))
print('Mean column', A.mean(axis=1))
```
%% Output
[[ 0. 1.10132983 1.48964487 4.17046144 1.03686668]
[ 0. 16.29789234 2.78887186 1.36301171 1.97242842]
[ 0. 1.78728245 1.42742531 1.53034219 1.1002182 ]
[ 0. 1.77444232 1.39467107 1.51432807 15.62470834]]
Mean value 2.818696254398785
Mean line [0. 5.24023674 1.77515328 2.14453585 4.93355541]
Mean column [1.55966056 4.48444087 1.16905363 4.06162996]
%% Cell type:code id: tags:
``` python
# Ex2: Convert a 2D array in 1D keeping all elements
print(A)
print(A.shape)
A_flat = A.flatten()
print(A_flat, A_flat.shape)
```
%% Output
[[ 0. 1.10132983 1.48964487 4.17046144 1.03686668]
[ 0. 16.29789234 2.78887186 1.36301171 1.97242842]
[ 0. 1.78728245 1.42742531 1.53034219 1.1002182 ]
[ 0. 1.77444232 1.39467107 1.51432807 15.62470834]]
(4, 5)
[ 0. 1.10132983 1.48964487 4.17046144 1.03686668 0.
16.29789234 2.78887186 1.36301171 1.97242842 0. 1.78728245
1.42742531 1.53034219 1.1002182 0. 1.77444232 1.39467107
1.51432807 15.62470834] (20,)
%% Cell type:markdown id: tags:
### 4.1 - Remark: dot product
%% Cell type:code id: tags:
``` python
b = np.linspace(0, 10, 11)
c = b @ b
# before 3.5:
# c = b.dot(b)
print(b)
print(c)
```
%% Output
[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]
385.0
%% Cell type:markdown id: tags:
### 4.2 - For Matlab users
| ` ` | Matlab | Numpy |
| ------------- | ------ | ----- |
| element wise | `.*` | `*` |
| dot product | `*` | `@` |
%% Cell type:markdown id: tags:
`numpy` arrays can also be sorted, even when they are composed of complex data if the type of the columns are explicitly stated with `dtypes`.
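For instance, here is a minimal sketch with a hypothetical structured array sorted by one of its fields:
```python
import numpy as np

# Hypothetical structured array: one record per person, with named, typed columns
dt = np.dtype([('name', 'U10'), ('age', 'i4')])
people = np.array([('Alice', 35), ('Bob', 22), ('Carol', 29)], dtype=dt)

# Sort the records according to the 'age' field
print(np.sort(people, order='age'))
```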
%% Cell type:markdown id: tags:
### 4.3 - NumPy and SciPy sub-packages:
We already saw `numpy.random` to generate `numpy` arrays filled with random values. This submodule also provides functions related to distributions (Poisson, gaussian, etc.) and permutations.
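For example, a small sketch of a few of these functions:
```python
import numpy as np

print(np.random.normal(loc=0.0, scale=1.0, size=3))  # samples from a Gaussian distribution
print(np.random.poisson(lam=4.0, size=3))            # samples from a Poisson distribution
print(np.random.permutation(5))                      # a random permutation of [0, 1, 2, 3, 4]
```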
%% Cell type:markdown id: tags:
To perform linear algebra with dense matrices, we can use the submodule `numpy.linalg`. For instance, in order to compute the determinant of a random matrix, we use the method `det`
%% Cell type:code id: tags:
``` python
A = np.random.random([5,5])
print(A)
np.linalg.det(A)
```
%% Output
[[0.33277412 0.18065847 0.10352574 0.48095553 0.97748505]
[0.20756676 0.33166777 0.00808192 0.18868636 0.1722338 ]
[0.94092977 0.21755657 0.52045179 0.45008315 0.1751413 ]
[0.27404121 0.53531168 0.41209088 0.22503687 0.50026306]
[0.23077516 0.99886616 0.74286904 0.40849416 0.57970741]]
-0.026288777656342802
%% Cell type:code id: tags:
``` python
squared_subA = A[1:3, 1:3]
print(squared_subA)
np.linalg.inv(squared_subA)
```
%% Output
[[0.33166777 0.00808192]
[0.21755657 0.52045179]]
array([[ 3.0460928 , -0.04730175],
[-1.27331197, 1.94118039]])
%% Cell type:markdown id: tags:
### 4.4 - Introduction to Pandas: Python Data Analysis Library
Pandas is an open source library providing high-performance, easy-to-use data structures and data analysis tools for Python.
[Pandas tutorial](https://pandas.pydata.org/pandas-docs/stable/10min.html)
[Grenoble Python Working Session](https://github.com/iutzeler/Pres_Pandas/)
[Pandas for SQL Users](http://sergilehkyi.com/translating-sql-to-pandas/)
[Pandas Introduction Training HPC Python@UGA](https://gricad-gitlab.univ-grenoble-alpes.fr/python-uga/training-hpc/-/blob/master/ipynb/11_pandas.ipynb)
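As a very small sketch of the central object, the `DataFrame`:
```python
import numpy as np
import pandas as pd

# A tiny, hypothetical DataFrame built from a NumPy array
df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=['A', 'B', 'C'])
print(df.head())        # first rows
print(df.describe())    # basic statistics per column
```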
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
......
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [ACTF1] - Activation functions
<!-- DESC --> Some activation functions, with their derivatives.
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- View the main activation functions
Activation functions in Keras:
https://www.tensorflow.org/api_docs/python/tf/keras/activations
## What we're going to do :
- Just visualize the main activation functions
%% Cell type:code id: tags:
``` python
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import math
from math import erfc, sqrt, exp
from math import pi as PI
from math import e as E
import sys
import fidle
# Init Fidle environment
run_id, run_dir, datasets_dir = fidle.init('ACTF1')
```
%% Cell type:code id: tags:
``` python
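# SELU constants: SELU_A is alpha (~1.6733) and SELU_L is lambda (~1.0507), from self-normalizing networks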
SELU_A = -sqrt(2/PI)/(erfc(1/sqrt(2))*exp(1/2)-1)
SELU_L = (1-erfc(1/sqrt(2))*sqrt(E))*sqrt(2*PI) / (2*erfc(sqrt(2))*E*E+PI*erfc(1/sqrt(2))**2*E-2*(2+PI)*erfc(1/sqrt(2))*sqrt(E)+PI+2)**0.5
def heaviside(z):
return np.where(z<0,0,1)
def sign(z):
return np.where(z<0,-1,1)
# return np.sign(z)
def sigmoid(z):
return 1 / (1 + np.exp(-z))
def tanh(z):
return np.tanh(z)
def relu(z):
return np.maximum(0, z)
def leaky_relu(z,a=0.05):
return np.maximum(a*z, z)
def elu(z,a=1):
#y=z.copy()
y=a*(np.exp(z)-1)
y[z>0]=z[z>0]
return y
def selu(z):
return SELU_L*elu(z,a=SELU_A)
def derivative(f, z, eps=0.000001):
return (f(z + eps) - f(z - eps))/(2 * eps)
```
%% Cell type:code id: tags:
``` python
pw=5
ph=5
z = np.linspace(-5, 5, 200)
# ------ Heaviside
#
fig, ax = plt.subplots(1, 1)
fig.set_size_inches(pw,ph)
ax.set_xlim(-5, 5)
ax.set_ylim(-2, 2)
ax.axhline(y=0, linewidth=1, linestyle='--', color='lightgray')
ax.axvline(x=0, linewidth=1, linestyle='--', color='lightgray')
ax.plot(0, 0, "rx", markersize=10)
ax.plot(z, heaviside(z), linestyle='-', label="Heaviside")
ax.plot(z, derivative(heaviside, z), linewidth=3, alpha=0.6, label="dHeaviside/dx")
# ax.plot(z, sign(z), label="Heaviside")
ax.set_title("Heaviside")
fidle.scrawler.save_fig('Heaviside')
plt.show()
# ----- Logit/Sigmoid
#
fig, ax = plt.subplots(1, 1)
fig.set_size_inches(pw,ph)
ax.set_xlim(-5, 5)
ax.set_ylim(-2, 2)
ax.axhline(y=0, linewidth=1, linestyle='--', color='lightgray')
ax.axvline(x=0, linewidth=1, linestyle='--', color='lightgray')
ax.plot(z, sigmoid(z), label="Sigmoid")
ax.plot(z, derivative(sigmoid, z), linewidth=3, alpha=0.6, label="dSigmoid/dx")
ax.set_title("Logit")
fidle.scrawler.save_fig('Logit')
plt.show()
# ----- Tanh
#
fig, ax = plt.subplots(1, 1)
fig.set_size_inches(pw,ph)
ax.set_xlim(-5, 5)
ax.set_ylim(-2, 2)
ax.axhline(y=0, linewidth=1, linestyle='--', color='lightgray')
ax.axvline(x=0, linewidth=1, linestyle='--', color='lightgray')
ax.plot(z, tanh(z), label="Tanh")
ax.plot(z, derivative(tanh, z), linewidth=3, alpha=0.6, label="dTanh/dx")
ax.set_title("Tanh")
fidle.scrawler.save_fig('Tanh')
plt.show()
# ----- Relu
#
fig, ax = plt.subplots(1, 1)
fig.set_size_inches(pw,ph)
ax.set_xlim(-5, 5)
ax.set_ylim(-2, 2)
ax.axhline(y=0, linewidth=1, linestyle='--', color='lightgray')
ax.axvline(x=0, linewidth=1, linestyle='--', color='lightgray')
ax.plot(z, relu(z), label="ReLU")
ax.plot(z, derivative(relu, z), linewidth=3, alpha=0.6, label="dReLU/dx")
ax.set_title("ReLU")
fidle.scrawler.save_fig('ReLU')
plt.show()
# ----- Leaky Relu
#
fig, ax = plt.subplots(1, 1)
fig.set_size_inches(pw,ph)
ax.set_xlim(-5, 5)
ax.set_ylim(-2, 2)
ax.axhline(y=0, linewidth=1, linestyle='--', color='lightgray')
ax.axvline(x=0, linewidth=1, linestyle='--', color='lightgray')
ax.plot(z, leaky_relu(z), label="Leaky ReLU")
ax.plot(z, derivative( leaky_relu, z), linewidth=3, alpha=0.6, label="dLeakyReLU/dx")
ax.set_title("Leaky ReLU (α=0.05)")
fidle.scrawler.save_fig('LeakyReLU')
plt.show()
# ----- Elu
#
fig, ax = plt.subplots(1, 1)
fig.set_size_inches(pw,ph)
ax.set_xlim(-5, 5)
ax.set_ylim(-2, 2)
ax.axhline(y=0, linewidth=1, linestyle='--', color='lightgray')
ax.axvline(x=0, linewidth=1, linestyle='--', color='lightgray')
ax.plot(z, elu(z), label="ELU")
ax.plot(z, derivative( elu, z), linewidth=3, alpha=0.6, label="dELU/dx")
ax.set_title("ELU (α=1)")
fidle.scrawler.save_fig('ELU')
plt.show()
# ----- Selu
#
fig, ax = plt.subplots(1, 1)
fig.set_size_inches(pw,ph)
ax.set_xlim(-5, 5)
ax.set_ylim(-2, 2)
ax.axhline(y=0, linewidth=1, linestyle='--', color='lightgray')
ax.axvline(x=0, linewidth=1, linestyle='--', color='lightgray')
ax.plot(z, selu(z), label="SeLU")
ax.plot(z, derivative( selu, z), linewidth=3, alpha=0.6, label="dSeLU/dx")
ax.set_title("ELU (SELU)")
fidle.scrawler.save_fig('SeLU')
plt.show()
```
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [PANDAS1] - A few examples with Pandas
<!-- DESC --> Pandas is another essential tool for Scientific Python.
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Understand how to slice a dataset
%% Cell type:markdown id: tags:
## Step 1 - A little cooking with datasets
%% Cell type:code id: tags:
``` python
import pandas as pd
import numpy as np
```
%% Cell type:code id: tags:
``` python
# Get some data
a = np.arange(50).reshape(10,5)
print('Starting data: \n',a)
```
%% Cell type:code id: tags:
``` python
# Create a DataFrame
df_all = pd.DataFrame(a, columns=['A','B','C','D','E'])
print('\nDataFrame :')
display(df_all)
```
%% Cell type:code id: tags:
``` python
# Shuffle data
df_all = df_all.sample(frac=1, axis=0)
print('\nDataFrame randomly shuffled :')
display(df_all)
```
%% Cell type:code id: tags:
``` python
# Get a train part
df_train = df_all.sample(frac=0.8, axis=0)
print('\nTrain set (80%) :')
display(df_train)
```
%% Cell type:code id: tags:
``` python
# Get test set as all - train
df_test = df_all.drop(df_train.index)
print('\nTest set (all - train) :')
display(df_test)
```
%% Cell type:code id: tags:
``` python
x_train = df_train.drop('E', axis=1)
y_train = df_train['E']
x_test = df_test.drop('E', axis=1)
y_test = df_test['E']
display(x_train)
display(y_train)
```
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
%% Cell type:markdown id:51be1de8 tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [PYTORCH1] - Practical Lab : PyTorch
<!-- DESC --> PyTorch is one of the main frameworks used in Deep Learning
<!-- AUTHOR : Kamel Guerda (CNRS/IDRIS) -->
## Objectives :
- Understand PyTorch
%% Cell type:markdown id:1959d3d5-388e-4c43-8318-342f08e6b024 tags:
## **Introduction**
%% Cell type:markdown id:a6da1305-551a-4549-abed-641415823a33 tags:
**PyTorch** is an open-source machine learning library developed by Facebook's AI Research lab. It offers an imperative and dynamic computational model, making it particularly easy and intuitive for researchers. Its primary feature is the tensor, a multi-dimensional array similar to NumPy's ndarray, but with GPU acceleration.
%% Cell type:markdown id:54c79dfb-a061-4b72-afe3-c97c28071e5c tags:
### **Installation and usage**
%% Cell type:markdown id:20852981-c289-4c4e-8099-2c5efef58e3b tags:
Whether you're working on the supercomputer Jean Zay or your own machine, getting your environment ready is the first step. Here's how to proceed:
%% Cell type:markdown id:a88f32bd-37f6-4e99-97e0-62283a146a1f tags:
#### **On Jean Zay**
%% Cell type:markdown id:8421a9f0-130d-40ef-8a7a-066bf9147066 tags:
For those accessing the Jean Zay supercomputer (you should already be at step 3):
1. **Access JupyterHub**: Go to [https://jupyterhub.idris.fr](https://jupyterhub.idris.fr). The login credentials are the same as those used to access the Jean Zay machine. Ensure your IP address is whitelisted (add a new IP via the account management form if needed).
2. **Create a JupyterLab Instance**: Choose to create the instance either on a frontend node (e.g., for internet access) or on a compute node by reserving resources via Slurm. Select the appropriate options such as workspace, allocated resources, billing, etc.
3. **Choose the Kernel**: IDRIS provides kernels based on modules installed on Jean Zay. This includes various versions of Python, Tensorflow, and PyTorch. Create a new notebook with the desired kernel through the launcher or change the kernel on an existing notebook by clicking the kernel name at the top right of the screen.
4. For advanced features like Tensorboard, MLFlow, custom kernel creation, etc., refer to the [JupyterHub technical documentation](https://jupyterhub.idris.fr/services/documentation/).
%% Cell type:markdown id:a168594c-cf18-4ed8-babf-242b56b3e0b7 tags:
> **Task:** Verifying Your Kernel in the top right corner
> - In JupyterLab, at the top right of your notebook, you should see the name of your current kernel.
> - Ensure it matches "PyTorch 2.0" or a similar name indicating the PyTorch version.
> - If it doesn't, click on the kernel name and select the appropriate kernel from the list.
%% Cell type:markdown id:0aaadeee-5115-48d0-aa57-20a0a63d5054 tags:
#### **Elsewhere**
%% Cell type:markdown id:5d34951e-1b7b-4776-9449-eff57a9385f4 tags:
For users on other platforms:
1. Install PyTorch by following the official [installation guide](https://pytorch.org/get-started/locally/).
2. If you have a GPU, ensure you've installed the necessary CUDA toolkit and cuDNN libraries.
3. Launch your preferred Python environment, whether it's Jupyter notebook, an IDE like PyCharm, or just the terminal.
Once your setup is complete, you're ready to dive in. Let's explore the fascinating world of deep learning!
%% Cell type:markdown id:7552d5ac-eb8c-48e0-9e61-3b056d560f7b tags:
### **Version**
%% Cell type:code id:272e492f-35c5-4293-b504-8e8632da1b73 tags:
``` python
# Importing PyTorch
import torch
# TODO: Print the version of PyTorch being used
```
%% Cell type:markdown id:9fdbe225-4e06-4ad0-abca-4325457dc0e1 tags:
<details>
<summary>Hint (click to reveal)</summary>
To print the version of PyTorch you're using, you can access the <code>__version__</code> attribute of the <code>torch</code> module.
```python
print(torch.__version__)
```
</details>
%% Cell type:markdown id:72752068-02fe-4e44-8c27-40e8f66680c9 tags:
**Why PyTorch 2.0 is a Game-Changer**
PyTorch 2.0 represents a major step in the evolution of this popular deep learning library. As part of the transition to the 2-series, let's highlight some reasons why this version is pivotal:
1. **Performance**: With PyTorch 2.0, performance has been supercharged at the compiler level, offering faster execution and support for Dynamic Shapes and Distributed systems.
2. **torch.compile**: This introduces a more Pythonic approach, moving some parts of PyTorch from C++ back to Python. Notably, across a test set of 163 open-source models, the use of `torch.compile` resulted in a 43% speed increase during training on an NVIDIA A100 GPU.
3. **Innovative Technologies**: Technologies like TorchDynamo and TorchInductor, both written in Python, make PyTorch more flexible and developer-friendly.
4. **Staying Pythonic**: PyTorch 2.0 emphasizes Python-centric development, reducing barriers for developers and vendors.
As we progress in this lab, we'll dive deeper into some of these features, giving you hands-on experience with the power and flexibility of PyTorch 2.0.
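As a quick, minimal sketch (assuming a PyTorch 2.x install with a working compile backend), `torch.compile` simply wraps an existing model and returns an optimized drop-in replacement:
```python
import torch
import torch.nn as nn

# A hypothetical tiny model, just to illustrate the call
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))

# torch.compile returns a compiled module with the same call interface
compiled_model = torch.compile(model)

x = torch.randn(4, 10)
print(compiled_model(x).shape)   # used exactly like the original model
```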
%% Cell type:markdown id:bc215c02-1f16-48be-88f9-5080fd2be9ed tags:
## **Pytorch Fundamentals**
%% Cell type:markdown id:bcd7f0fc-a714-495e-9307-e48964abd85b tags:
### **Tensors**
%% Cell type:markdown id:6e185bf6-3d3c-4a43-b425-e6aa3da5d5dd tags:
A **tensor** is a generalization of vectors and matrices and is easily understood as a multi-dimensional array. In the context of PyTorch:
- A 0-dimensional tensor is a scalar (a single number).
- A 1-dimensional tensor is a vector.
- A 2-dimensional tensor is a matrix.
- ... and so on for higher dimensions.
Tensors are fundamental to PyTorch not just as data containers but also for their compatibility with GPU acceleration, making operations on them extremely fast. This acceleration is vital for training large neural networks.
Let's start our journey with tensors by examining how PyTorch handles scalars.
%% Cell type:markdown id:fa90e399-3955-4417-a4a3-c0c812ebb1d9 tags:
#### **Scalars in PyTorch**
A scalar, being a 0-dimensional tensor, is simply a single number. While it might seem trivial, understanding scalars in PyTorch lays the foundation for grasping more complex tensor structures. Familiarize yourself with the `torch.tensor()` function from the [official documentation](https://pytorch.org/docs/stable/generated/torch.tensor.html) before proceeding.
> **Task**: Create a scalar tensor in PyTorch and examine its properties.
%% Cell type:code id:b6db1841-0fab-4df0-b699-058d5a477ca6 tags:
``` python
# TODO: Create a scalar tensor with the value 7.5
scalar_tensor = # Your code here
# Print the scalar tensor
print("Scalar Tensor:", scalar_tensor)
# TODO: Print its dimension, shape, and type
```
%% Output
Cell In[2], line 2
scalar_tensor = # Your code here
^
SyntaxError: invalid syntax
%% Cell type:markdown id:c9bc265c-9a7f-4588-8586-562b390d63d9 tags:
<details>
<summary>Hint (click to reveal)</summary>
To create a scalar tensor, use the <code>torch.tensor()</code> function. To retrieve its dimension, shape, and type, you can use the <code>.dim()</code>, <code>.shape</code>, and <code>.dtype</code> attributes respectively.
Here's how you can achieve that:
```python
scalar_tensor = torch.tensor(7.5)
print("Scalar Tensor:", scalar_tensor)
print("Dimension:", scalar_tensor.dim())
print("Shape:", scalar_tensor.shape)
print("Type:", scalar_tensor.dtype)
```
</details>
%% Cell type:markdown id:fc240c26-5866-4080-bbb9-d5cde1500300 tags:
#### **Vectors in PyTorch**
A vector in PyTorch is a 1-dimensional tensor. It's essentially a list of numbers that can represent anything from a sequence of data points to the weights of a neural network layer.
In this section, we'll see how to create and manipulate vectors using PyTorch. We'll also look at some basic operations you can perform on them.
> **Task**: Create a 1-dimensional tensor (vector) with values `[1.5, 2.3, 3.1, 4.8, 5.2]` and print its dimension, shape, and type.
Start by referring to the `torch.tensor()` function in the [official documentation](https://pytorch.org/docs/stable/generated/torch.tensor.html) to understand how to create tensors of varying dimensions.
%% Cell type:code id:e9503b49-38d1-45d9-910f-761da82cfbd0 tags:
``` python
# TODO: Create a 1-dimensional tensor (vector) with values [1.5, 2.3, 3.1, 4.8, 5.2]
vector_tensor = # Your code here
# Print the vector tensor
print("Vector Tensor:", vector_tensor)
# TODO: Print its dimension, shape, and type
```
%% Output
Cell In[3], line 2
vector_tensor = # Your code here
^
SyntaxError: invalid syntax
%% Cell type:markdown id:13252d1f-004f-42e0-aec9-56322b43ab72 tags:
<details>
<summary>Hint (click to reveal)</summary>
Creating a 1-dimensional tensor is similar to creating a scalar. Instead of a single number, you pass a list of numbers to the <code>torch.tensor()</code> function. The <code>.dim()</code>, <code>.shape</code>, and <code>.dtype</code> attributes will help you retrieve its properties.
```python
vector_tensor = torch.tensor([1.5, 2.3, 3.1, 4.8, 5.2])
print("Vector Tensor:", vector_tensor)
print("Dimension:", vector_tensor.dim())
print("Shape:", vector_tensor.shape)
print("Type:", vector_tensor.dtype)
```
</details>
%% Cell type:markdown id:7bfc47a8-e99d-4683-ac36-287f35a76fd0 tags:
#### **Vector Operations**
Vectors are not just static entities; we often perform various operations on them, especially in the context of neural networks. This includes addition, subtraction, scalar multiplication, dot products, etc.
> **Task**: Using the previously defined `vector_tensor`, perform the following operations:
> 1. Add 5 to all the elements of the vector.
> 2. Multiply all the elements of the vector by 2.
> 3. Compute the dot product of the vector with itself.
%% Cell type:code id:86182e1c-5491-4743-a7c8-10b9effd8194 tags:
``` python
# TODO: Add 5 to all elements
vector_added = # Your code here
# TODO: Multiply all elements by 2
vector_multiplied = # Your code here
# TODO: Compute the dot product with itself
dot_product = # Your code here
# Print the results
print("Vector after addition:", vector_added)
print("Vector after multiplication:", vector_multiplied)
print("Dot Product:", dot_product)
```
%% Output
Cell In[4], line 2
vector_added = # Your code here
^
SyntaxError: invalid syntax
%% Cell type:markdown id:75773a02-3ab4-4325-99fb-7a742e997f21 tags:
<details>
<summary>Hint (click to reveal)</summary>
PyTorch tensors support regular arithmetic operations. For the dot product, you can use the <code>torch.dot()</code> function.
```python
vector_added = vector_tensor + 5
vector_multiplied = vector_tensor * 2
dot_product = torch.dot(vector_tensor, vector_tensor)
print("Vector after addition:", vector_added)
print("Vector after multiplication:", vector_multiplied)
print("Dot Product:", dot_product)
```
</details>
%% Cell type:markdown id:2b4766ba-ef9a-4f24-ba43-7358097a7b61 tags:
#### **Matrices in PyTorch**
A matrix in PyTorch is represented as a 2D tensor. Just as vectors are generalizations of scalars, matrices are generalizations of vectors, providing an additional dimension. Matrices are crucial for a range of operations in deep learning, including representing datasets, transformations, and more.
%% Cell type:markdown id:2ec7544d-ef87-4773-88d8-cee731d1c43c tags:
##### **Creating Matrices**
Before diving into manual matrix creation, it's beneficial to know some utility functions PyTorch provides:
- `torch.rand()`: Generates a matrix with random values between 0 and 1.
- `torch.eye()`: Creates an identity matrix.
- `torch.zeros()`: Generates a matrix filled with zeros.
- `torch.ones()`: Generates a matrix filled with ones.
You can explore more about these functions in the [official documentation](https://pytorch.org/docs/stable/tensors.html).
> **Task**: Using the above functions, create the following matrices:
> 1. A 3x3 matrix with random values.
> 2. A 5x5 identity matrix.
> 3. A 2x4 matrix filled with zeros.
> 4. A 4x2 matrix filled with ones.
%% Cell type:code id:5014b564-6bf5-4f00-a513-578ca72d94a8 tags:
``` python
# Your code for creating the matrices goes here
```
%% Cell type:markdown id:86b2708c-45c6-4b2c-b526-41491fcafa08 tags:
<details>
<summary>Hint (click to reveal)</summary>
To create these matrices, make use of the following functions:
1. `torch.rand(size)`: Use this function and specify the size as `(3, 3)` to create a 3x3 matrix with random values.
2. `torch.eye(n, m)`: Use this to generate an identity matrix. For a square matrix like 5x5, n and m would both be 5.
3. `torch.zeros(m, n)`: For a 2x4 matrix filled with zeros, specify m=2 and n=4.
4. `torch.ones(m, n)`: Similar to the `zeros` function but fills the matrix with ones.
```python
# 1. 3x3 matrix with random values
random_matrix = torch.rand(3, 3)
print(random_matrix)
# 2. 5x5 identity matrix
identity_matrix = torch.eye(5, 5)
print(identity_matrix)
# 3. 2x4 matrix filled with zeros
zero_matrix = torch.zeros(2, 4)
print(zero_matrix)
# 4. 4x2 matrix filled with ones
one_matrix = torch.ones(4, 2)
print(one_matrix)
```
</details>
%% Cell type:markdown id:60ff5e51-699e-46a1-8cc7-1d5fc9a4d078 tags:
#### **Matrix Operations in PyTorch**
Just like vectors, matrices can undergo a variety of operations. Some of the basic ones include matrix addition, subtraction, and multiplication. More advanced operations include matrix inversion, transposition, and determinant calculation.
%% Cell type:markdown id:c6bdb9d9-b299-4d63-b92f-7c4b8c32a1b7 tags:
##### **Basic Matrix Operations**
> **Task**: Perform the following operations on matrices:
> 1. Create two 3x3 matrices with random values.
> 2. Add the two matrices.
> 3. Subtract the second matrix from the first one.
> 4. Multiply the two matrices element-wise.
Remember, for true matrix multiplication you'd use `torch.mm` or `@`, but for element-wise multiplication, you use `*`.
Here's the [official documentation](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.matmul) on matrix operations for your reference.
%% Cell type:code id:6be8c647-c455-4d3b-8a21-c4b7102ffa75 tags:
``` python
# Your code for creating the matrices and performing the operations goes here
```
%% Cell type:markdown id:0020b26b-b2bb-4efa-9bf3-3f037acd050e tags:
<details>
<summary>Hint (click to reveal)</summary>
Here's how you can perform the given matrix operations:
```python
# 1. Create two 3x3 matrices with random values
matrix1 = torch.rand(3, 3)
matrix2 = torch.rand(3, 3)
print("Matrix 1:\n", matrix1)
print("\nMatrix 2:\n", matrix2)
# 2. Add the two matrices
sum_matrix = matrix1 + matrix2
print("\nSum of matrices:\n", sum_matrix)
# 3. Subtract the second matrix from the first one
difference_matrix = matrix1 - matrix2
print("\nDifference of matrices:\n", difference_matrix)
# 4. Multiply the two matrices element-wise
product_matrix = matrix1 * matrix2
print("\nElement-wise product of matrices:\n", product_matrix)
```
</details>
%% Cell type:markdown id:07f57464-76e2-4670-8332-3fcec2e162bd tags:
#### **Higher-Dimensional Tensors in PyTorch**
While scalars, vectors, and matrices cover 0D, 1D, and 2D tensors respectively, in deep learning, especially in tasks like image processing, you often encounter tensors with more than two dimensions.
For instance, a colored image is often represented as a 3D tensor: height x width x channels (e.g., RGB channels). A batch of such images would then be a 4D tensor: batch_size x height x width x channels.
Let's get our hands dirty with some higher-dimensional tensors!
%% Cell type:markdown id:3dd1fea7-d290-49fe-ac1f-5a8387e3d386 tags:
##### **Creating a 3D Tensor**
> **Task**: Create a 3D tensor representing 2 images of size 4x4 with 3 channels (like RGB) filled with random values.
Use the `torch.rand` function, and remember to specify the dimensions correctly.
Here's the [official documentation](https://pytorch.org/docs/stable/tensors.html#creation-ops) for tensor creation.
%% Cell type:code id:e7c8ac6e-f870-4b5d-ac2c-05be1d0cc9f1 tags:
``` python
# Your code for creating the 3D tensor goes here
```
%% Cell type:markdown id:efe61750-a91f-428a-b4e2-7df0cc2a782b tags:
<details>
<summary>Hint (click to reveal)</summary>
Creating a 3D tensor with the given specifications can be achieved using the `torch.rand` function. Here's how:
```python
# Create a 3D tensor representing 2 images of size 4x4 with 3 channels
image_tensor = torch.rand(2, 4, 4, 3)
print(image_tensor)
```
</details>
%% Cell type:markdown id:8cfbcaa0-a0f6-4869-ba94-65d4439a60ca tags:
#### **Reshaping Tensors**
In deep learning, we often need to reshape our tensors. For instance, an image represented as a 3D tensor might need to be reshaped into a 1D tensor before passing it through a fully connected layer. PyTorch provides methods to make this easy.
The most commonly used method for reshaping tensors in PyTorch is the `view()` method. Another method that offers more flexibility (especially when you're unsure about the size of one dimension) is `reshape()`.
>[Task]: Using the official documentation, find out how to use the [`view()`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.view) and [`reshape()`](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.reshape) methods. Create a 2x3 tensor using `torch.tensor()` and then reshape it into a 3x2 tensor.
%% Cell type:code id:e6758ba7-aa35-42f0-87c1-86b88de64238 tags:
``` python
# Create a 2x3 tensor
# Reshape it into a 3x2 tensor
```
%% Cell type:markdown id:fea31255-c2fe-47b2-b03b-c2b35953e05a tags:
<details>
<summary>Hint (click to reveal)</summary>
To reshape a tensor using <code>view()</code> method:
```python
tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])
reshaped_tensor = tensor.view(3, 2)
```
<br>
Alternatively, using the <code>reshape()</code> method:
```python
reshaped_tensor = tensor.reshape(3, 2)
```
</details>
%% Cell type:markdown id:c580dbca-b75a-4b97-a24a-6a19c7cdf8d1 tags:
#### **Broadcasting**
Broadcasting is a powerful feature in PyTorch that allows you to perform operations between tensors of different shapes. When possible, PyTorch will automatically reshape the tensors in a way that makes the operation valid. This can significantly reduce manual reshaping and is efficient in memory usage.
However, it's essential to understand the rules and nuances of broadcasting to use it effectively and avoid unexpected behaviors.
>[Task]: Given a tensor `A` of shape (4, 1) and another tensor `B` of shape (1, 4), use PyTorch operations to produce a result tensor of shape (4, 4). Check the [official documentation on broadcasting](https://pytorch.org/docs/stable/notes/broadcasting.html) for guidance.
%% Cell type:code id:44566fb7-87ed-41ef-a86e-db32a1cf2179 tags:
``` python
# Define tensor A of shape (4, 1) and tensor B of shape (1, 4)
# Perform an operation to get a result tensor of shape (4, 4)
```
%% Cell type:markdown id:2602f2c4-f507-4a9a-8e8d-dee5e95efc61 tags:
<details>
<summary>Hint (click to reveal)</summary>
You can simply use addition, subtraction, multiplication, or any other element-wise operation. When you do this operation, PyTorch will automatically broadcast the tensors to a compatible shape. For example:
```python
A = torch.tensor([[1], [2], [3], [4]])
B = torch.tensor([[1, 2, 3, 4]])
result = A * B
print(result)
```
</details>
%% Cell type:markdown id:ba2cc439-8ecc-4d92-b78f-39ef762678f8 tags:
### **GPU Support with CUDA**
%% Cell type:markdown id:575536c5-87a7-4781-8557-558627f14c0a tags:
PyTorch seamlessly supports operations on Graphics Processing Units (GPUs) through CUDA, an API developed by NVIDIA for their GPUs. If you have a compatible NVIDIA GPU on your machine, PyTorch can utilize it to speed up tensor operations which can be orders of magnitude faster than on a CPU.
To verify if your PyTorch installation can use CUDA, you can check the attribute `torch.cuda.is_available()`. This returns `True` if CUDA is available and PyTorch can use GPUs, otherwise it returns `False`.
>[Task]: Print whether CUDA support is available on your system. The [CUDA documentation](https://pytorch.org/docs/stable/cuda.html) might be useful for this task.
%% Cell type:code id:38e84bb7-5026-4262-8b78-b368c55a1450 tags:
``` python
# Check and print if CUDA is available
cuda_available = None # Replace None with the appropriate code
print("CUDA available:", cuda_availablez
```
%% Cell type:markdown id:646b5660-5131-4ce0-9592-0fd14608c6df tags:
<details>
<summary>Hint (click to reveal)</summary>
To check if CUDA is available, you can utilize the torch.cuda.is_available() function.
```python
cuda_available = torch.cuda.is_available()
print("CUDA available:", cuda_available)
```
</details>
%% Cell type:markdown id:86c8d7ed-0931-4874-bb27-e796ae1a1d7a tags:
When developing deep learning models in PyTorch, it's a good habit to write device-agnostic code. This means your code can automatically use a GPU if available, or fall back to using the CPU if not. The `torch.device` object allows you to specify the device (either CPU or GPU) where you'd like your tensors to be allocated.
To dynamically determine the device, a common pattern is to check `torch.cuda.is_available()`, and set the device accordingly. This is particularly useful when you want your code to be flexible, regardless of the underlying hardware.
>[Task]: Define a `device` variable that is set to 'cuda:0' if CUDA is available and 'cpu' otherwise. Create a tensor on this device. The [documentation about torch.device](https://pytorch.org/docs/stable/tensor_attributes.html#torch-device) might be handy.
%% Cell type:code id:91e05e75-03ad-44cb-9842-89e2017ee709 tags:
``` python
# Define the device
device = None # Replace None with the appropriate code
# Create a tensor on the specified device
tensor_on_device = torch.tensor([1, 2, 3, 4, 5], device=device)
```
%% Cell type:markdown id:3b80406b-b1cc-4831-a6ba-8e6385703755 tags:
<details>
<summary>Hint (click to reveal)</summary>
To define the device variable dynamically:
```python
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```
<br>
After setting the device, you can create tensors on it directly using the device argument.
</details>
%% Cell type:markdown id:574a2192-cc09-4d2c-8f01-97b051b7ffc8 tags:
### **Automatic Differentiation with Autograd**
%% Cell type:markdown id:7f5406f6-e295-4f70-a815-9eef18352390 tags:
PyTorch's `autograd` module provides the tools for automatically computing the gradients for tensors. This feature is a cornerstone for neural network training, as gradients are essential for optimization algorithms like gradient descent.
When we create a tensor, `requires_grad` is set to `False` by default, meaning it won't track operations. However, if we set `requires_grad=True`, PyTorch will start to track all operations on the tensor.
Let's start with a simple example:
>**Task:** Create a tensor that holds a single value, let's say 2, and set `requires_grad=True`. Then, define a simple operation like squaring the tensor. Finally, inspect the resulting tensor. The [documentation for requires_grad](https://pytorch.org/docs/stable/autograd.html#torch.Tensor.requires_grad) might be handy.
%% Cell type:code id:fe63ab93-55be-434d-822f-8fd9cd727941 tags:
``` python
# TODO: Create a tensor, perform a simple operation, and print its data and grad_fn separately.
```
%% Cell type:markdown id:fa7ee20c-c2d6-4dcf-bb37-9eda580b5dc5 tags:
<details>
<summary>Hint (click to reveal)</summary>
To create a tensor with requires_grad=True and square it:
```python
# TODO: Create a tensor, perform a simple operation, and print its data and grad_fn separately.
x = torch.tensor([2.0], requires_grad=True)
y = x ** 2
print("Data:", y.data)
print("grad_fn:", y.grad_fn)
```
</details>
%% Cell type:markdown id:c14dde16-a6be-4151-94cb-96ae98f0648a tags:
Once the operation is executed on a tensor, a new attribute `grad_fn` is created. This attribute references the function that created the tensor. In our example, since we squared the tensor, `grad_fn` will be of type `PowBackward0`.
This `grad_fn` attribute provides a link to the computational history of the tensor, allowing PyTorch to backpropagate errors and compute gradients when training neural networks.
%% Cell type:markdown id:0965e79e-558a-45a9-8ab2-614c503e59c0 tags:
#### **Computing Gradients**
%% Cell type:markdown id:36fb6c5b-9b39-4a2f-a767-61032b1b4ffc tags:
Now, let's compute the gradients of `y` with respect to `x`. To do this, we'll call the `backward()` method on the tensor `y`.
>[Task]: Compute the gradients of `y` by calling the `backward()` method on it. Afterwards, print the gradients of `x`. The [documentation for backward()](https://pytorch.org/docs/stable/autograd.html#torch.autograd.backward) may be useful.
%% Cell type:code id:83685760-bde9-4327-88f7-cfe02bdb3309 tags:
``` python
# TODO: Compute the gradient and print it.
```
%% Cell type:markdown id:9b1d104b-efef-4fff-869d-8dde1131868e tags:
<details>
<summary>Hint (click to reveal)</summary>
To compute the gradient:
```python
y.backward()
print(x.grad)
```
</details>
%% Cell type:markdown id:d7f5aecb-8623-481f-a5cf-f8b6dd0c9a37 tags:
#### **Gradient Accumulation**
%% Cell type:markdown id:1a4df0a1-12a0-4129-a258-915fa8440193 tags:
In PyTorch, the gradients of tensors are accumulated into the `.grad` attribute each time you call `.backward()`. This means that if you call `.backward()` multiple times, the gradients will add up.
However, by default, calling `.backward()` consumes the computational graph to save memory. If you intend to call `.backward()` multiple times on the same graph, you need to specify `retain_graph=True` during all but the last call.
>[Task]: Create a tensor, perform an operation on it, and then call `backward()` twice. Use `retain_graph=True` in the first call to retain the computational graph. Observe the `.grad` attribute after each call.
%% Cell type:code id:50a04095-9d7e-48ba-90ed-06718cd379f0 tags:
``` python
# Create a tensor
w = torch.tensor([1.0], requires_grad=True)
# Operation
result = w * 2
# TODO: Call backward twice (using retain_graph=True for the first call) and print the grad after each call
# ...
```
%% Cell type:markdown id:d699e58d-d479-466a-b592-cbf68d185c3b tags:
<details>
<summary>Hint (click to reveal)</summary>
```python
result.backward(retain_graph=True)
print(w.grad) # This should print 2
result.backward()
print(w.grad) # This should print 4, as gradients get accumulated
```
</details>
%% Cell type:markdown id:88d30f87-2469-4289-ad8a-51a25a2e8b82 tags:
#### **Zeroing Gradients**
%% Cell type:markdown id:2ea93580-9a35-4f5d-8f29-0a324d28d28a tags:
In neural network training, we typically want to update our weights with the gradients after each forward and backward pass. This means that we don't want the gradients to accumulate across multiple passes. Hence, it's common to zero out the gradients at the start of a new iteration.
>[Task]: Using the tensor from the previous cell, zero out its gradients and verify that it has been set to zero.
%% Cell type:code id:9cb03a91-d1df-4bbf-a0d2-b5580c643e12 tags:
``` python
# TODO: Zero out the gradients of w and print
```
%% Cell type:markdown id:4a89ff66-b1ef-413a-a41c-847e8c832e4b tags:
<details>
<summary>Hint (click to reveal)</summary>
```python
w.grad.zero_()
print(w.grad)
```
</details>
%% Cell type:markdown id:85f75515-3d89-4249-b00a-03c13cca92d4 tags:
#### **Non-Scalar Backward**
%% Cell type:markdown id:86a54a2c-e8c1-4278-a3fe-ed60564ebd07 tags:
When dealing with non-scalar tensors, `backward` requires an additional argument: the gradient of some scalar (usually a loss) with respect to the tensor.
>[Task]: Create a tensor of shape (2, 2) with `requires_grad=True`. Compute a non-scalar result by multiplying the tensor with itself. Then, compute backward with a gradient argument. You can consult the [backward documentation](https://pytorch.org/docs/stable/autograd.html#torch.autograd.backward) for reference.
%% Cell type:code id:cc0e4271-c356-4a4e-9a3a-5df1403a4211 tags:
``` python
# TODO: Create a tensor, perform an operation, and compute backward with a gradient argument
```
%% Cell type:markdown id:e7ee72f3-f51c-4849-b41d-136028029185 tags:
<details>
<summary>Hint (click to reveal)</summary>
```python
v = torch.tensor([[2.0, 3.0], [4.0, 5.0]], requires_grad=True)
result = v * v
grads = torch.tensor([[1.0, 1.0], [1.0, 1.0]])
result.backward(grads)
```
</details>
%% Cell type:markdown id:2e403021-4854-4e97-9898-82ed355293e7 tags:
#### **Stopping Gradient Tracking**
%% Cell type:markdown id:ba644253-8523-480d-8318-a87047671a21 tags:
There are scenarios where we don't want to track the gradients for certain operations. This can be achieved in two main ways:
1. **Using `torch.no_grad()`**: This context manager ensures that the enclosed operations are excluded from gradient tracking.
2. **Using `.detach()`**: Creates a tensor that shares the same storage but does not require gradients.
>[Task]: Create a tensor with `requires_grad=True`. Then, demonstrate both methods above to prevent gradient computation.
%% Cell type:code id:1feb2f9b-0c5f-4e9d-b042-e74052bc83a9 tags:
``` python
# TODO: Demonstrate operations without gradient tracking
```
%% Cell type:markdown id:a5eff82b-bfbd-4be7-afa3-dc00f5341568 tags:
<details>
<summary>Hint (click to reveal)</summary>
```python
# Using torch.no_grad()
with torch.no_grad():
result_no_grad = v * v
print(result_no_grad.requires_grad)
# Using .detach()
detached_tensor = v.detach()
result_detach = detached_tensor * detached_tensor
print(result_detach.requires_grad)
```
</details>
%% Cell type:markdown id:efe66a5d-ac63-4623-8182-3b5aff58abbe tags:
## **Building a Simple Neural Network with PyTorch**
%% Cell type:markdown id:aa4b7630-fc1e-4f7b-b86b-3c0d233cdc49 tags:
Neural networks are the cornerstone of deep learning. They are organized as a series of interconnected nodes or "neurons" that are structured into layers: an input layer, several hidden layers, and an output layer. Data flows through this network, undergoing transformations at each node, until it emerges at the output.
With PyTorch's `torch.nn` module, constructing these neural networks becomes straightforward. Let's dive into its main components:
%% Cell type:markdown id:8e98f379-5580-477c-8b7b-c641f5edf710 tags:
### **nn.Module: The Base Class for Neural Networks**
%% Cell type:markdown id:15d72ea2-c846-44f5-85d5-bd1990c154bc tags:
Every neural network in PyTorch is derived from the `nn.Module` class. This class offers:
- Organization and management of the layers.
- Capabilities for GPU acceleration.
- Implementation of the forward pass.
When we inherit from `nn.Module`, our custom neural network class benefits from these functionalities.
For more details, you can refer to the official [documentation](https://pytorch.org/docs/stable/generated/torch.nn.Module.html).
>**Task:** Familiarize yourself with the structure of a simple neural network provided below. Later, you'll be enriching it.
%% Cell type:code id:425abefe-54b9-4944-bc6e-cc78de892c66 tags:
``` python
import torch.nn as nn
class SimpleNet(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(SimpleNet, self).__init__()
# Define layers here
def forward(self, x):
# Call the layers in the correct order here
return x
```
%% Cell type:markdown id:892e3b55-097b-436e-bbf8-a380fd7d9e35 tags:
### **Linear Layers: Making Connections**
%% Cell type:markdown id:564c17bb-543f-42f6-8c5d-b855ccaf71e6 tags:
In PyTorch, a linear layer performs an affine transformation. It has both weights and biases which get updated during training. The transformation it performs can be described as:
$ y = xA^T + b $
Where:
- $x$ is the input
- $A$ represents the weights
- $b$ is the bias
The `nn.Linear` class in PyTorch creates such a layer.
[Documentation Link for nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)
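As a quick illustration (a minimal sketch, separate from the task below), an `nn.Linear(3, 2)` layer holds a weight matrix of shape (2, 3) and a bias of size 2:
```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)      # affine map from R^3 to R^2
print(layer.weight.shape)    # torch.Size([2, 3]) -> the matrix A
print(layer.bias.shape)      # torch.Size([2])    -> the bias b

x = torch.randn(1, 3)
print(layer(x))              # computes y = x A^T + b
```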
> **Task:** Add an input layer and an output layer to the `SimpleNet` class.
>
> - The input layer should transform from `input_size` to `hidden_size`.
> - The output layer should transform from `hidden_size` to `output_size`.
> - After defining the layers in the `__init__` method, call them in the `forward` method to perform the transformations.
%% Cell type:code id:daa8829a-05e9-474e-b6e6-c7f749e22295 tags:
``` python
# Modify the below code by adding input and output linear layers in the appropriate places
class SimpleNet(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(SimpleNet, self).__init__()
# Define layers here
def forward(self, x):
# Call the layers in the correct order here
return x
```
%% Cell type:markdown id:c5038840-2713-4492-b7ab-c70469a2e96e tags:
<details>
<summary>Hint (click to reveal)</summary>
To define the input and output linear layers, use the `nn.Linear` class in the `__init__` method:
Then, in the `forward` method, pass the input through the defined layers.
```python
class SimpleNet(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(SimpleNet, self).__init__()
self.input_layer = nn.Linear(input_size, hidden_size)
self.output_layer = nn.Linear(hidden_size, output_size)
def forward(self, x):
x = self.input_layer(x)
x = self.output_layer(x)
return x
```
</details>
%% Cell type:markdown id:c2bb82c9-8949-4472-84fe-def36c514150 tags:
### **Activation Functions: Introducing Non-Linearity**
%% Cell type:markdown id:d989e2d8-5530-45f3-8664-e0d1b9eb627a tags:
Activation functions are critical components in neural networks, introducing non-linearity between layers. This non-linearity is what allows the network to model relationships that go beyond simple linear combinations of its inputs, which is essential for learning complex patterns.
In PyTorch, many activation functions are available as part of the `torch.nn` module, such as ReLU, Sigmoid, and Tanh.
For our `SimpleNet` model, we'll use the ReLU (Rectified Linear Unit) activation function after the input layer. The ReLU function is defined as $f(x) = \max(0, x)$.
Learn more about [ReLU and other activation functions in the official documentation](https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity).
> **Task**: Update your `SimpleNet` class to include the ReLU activation function after the input layer. For this, you'll need to both define the activation function in `__init__` and apply it in the `forward` method.
%% Cell type:code id:9e426301-5a55-46a2-8305-241b8f1ca4bf tags:
``` python
# Copy the previous SimpleNet definition and modify the code to include the ReLU activation function.
```
%% Cell type:markdown id:212ef244-f7bf-49a2-b4c9-b1b90af315de tags:
<details>
<summary>Hint (click to reveal)</summary>
To include the ReLU activation in your neural network:
1. Define the ReLU activation function in the `__init__` method.
2. Apply the activation function in the `forward` method after passing through the `input_layer`.
```python
class SimpleNet(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(SimpleNet, self).__init__()
self.input_layer = nn.Linear(input_size, hidden_size)
self.relu = nn.ReLU() # Defining the ReLU activation function
self.output_layer = nn.Linear(hidden_size, output_size)
def forward(self, x):
x = self.input_layer(x)
x = self.relu(x) # Applying the ReLU activation function
x = self.output_layer(x)
return x
```
</details>
%% Cell type:markdown id:640ef2f4-6816-4c5e-955c-c14c33349512 tags:
#### **Adjusting the Network: Adding Dropout**
%% Cell type:markdown id:e5596abf-b262-461d-ad5f-6a3488a79a42 tags:
[Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html) is a regularization technique that can improve generalization in neural networks. It works by randomly setting a fraction of input units to 0 at each update during training time.
> **Task**: Modify the `SimpleNet` class to include a dropout layer with a dropout probability of 0.5 between the input layer and the output layer. Don't forget to call this layer in the forward method.
>
> Remember, after modifying the class structure, you'll need to re-instantiate your model object.
%% Cell type:code id:1c68ffd4-1de6-4d77-a15f-705b24c924af tags:
``` python
# Add a dropout layer to your previous code
```
%% Cell type:markdown id:d78c2dab-95c1-441c-b661-80bfba9a2dfd tags:
<details>
<summary>Hint (click to reveal)</summary>
Here's how you can modify the SimpleNet class to include dropout:
```python
class SimpleNet(nn.Module):
def __init__(self, input_size, hidden_size, output_size):
super(SimpleNet, self).__init__()
self.input_layer = nn.Linear(input_size, hidden_size)
self.dropout = nn.Dropout(0.5)
self.output_layer = nn.Linear(hidden_size, output_size)
def forward(self, x):
x = self.input_layer(x)
x = self.dropout(x)
return self.output_layer(x)
model = SimpleNet(input_size, hidden_size, output_size).to(device)
```
Don't forget to create a new instance of your model: model = SimpleNet(input_size, hidden_size, output_size).to(device)
</details>
%% Cell type:markdown id:ce1cb22c-8288-4c69-9dcb-56896de49794 tags:
### **Utilizing the Neural Network**
%% Cell type:markdown id:255c3bf2-419d-4d14-82d6-7959e9280670 tags:
Once our neural network is defined, it's time to put it to use. This section will cover:
1. Instantiating the network
2. Transferring the network to GPU (if available)
3. Making predictions using the network (forward pass)
4. Understanding training and evaluation modes
5. Performing a backward pass to compute gradients
%% Cell type:markdown id:9f28cee5-c7a0-48c5-8341-6da6fae516c5 tags:
#### **1. Instantiating the Network**
%% Cell type:markdown id:0760bef6-d77a-4b7b-b5c7-18b208d93b98 tags:
To use our `SimpleNet`, we first need to create an instance of it. While creating an instance, the network's weights are also initialized.
> **Task**: Instantiate the `SimpleNet` class. Use `input_size=5`, `hidden_size=3`, and `output_size=1` as parameters.
%% Cell type:code id:ae9bfc87-5b09-476c-b32b-92c09f992fe3 tags:
``` python
# Your code here: Instantiate the model
```
%% Cell type:markdown id:f951e5d2-e0b4-451d-9a9b-44256f8a224c tags:
<details>
<summary>Hint (click to reveal)</summary>
To instantiate the SimpleNet class:
```python
model = SimpleNet(input_size=5, hidden_size=3, output_size=1)
print(model)
```
</details>
%% Cell type:markdown id:35567e41-6de6-429b-be4b-a14598313aca tags:
#### **2. Transferring the Network to GPU**
%% Cell type:markdown id:b3f3b3c3-4d7a-46db-9634-1e14b277c808 tags:
PyTorch makes it very straightforward to transfer our model to a GPU if one is available. This is done using the .to() method.
> **Task**: Check if GPU (CUDA) is available. If it is, transfer the model to the GPU.
%% Cell type:code id:91cb61a0-d890-4697-88d9-7749ea2bf144 tags:
``` python
# Check for GPU availability and transfer the model to GPU if available.
```
%% Cell type:markdown id:8a405f2d-3d8d-4e4c-90d1-54a05ff08b90 tags:
<details>
<summary>Hint (click to reveal)</summary>
To transfer the model to the GPU if it's available:
```python
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
```
</details>
%% Cell type:markdown id:175ab7cc-cddf-4460-ab01-f0193c2908d7 tags:
#### **3. Making Predictions using the Network (Forward Pass)**
%% Cell type:markdown id:e3724444-e0a6-48b0-8872-0b53b000a3bd tags:
With our model instantiated and potentially on a GPU, we can use it to make predictions. This involves passing some input data through the model, which is commonly referred to as a forward pass.
> **Task**: Create a tensor of size [1, 5] (representing one sample with five features) with random values. Transfer this tensor to the same device as your model (GPU or CPU). Then, pass this tensor through your model to get the prediction.
%% Cell type:code id:00e818ee-72e0-4960-a87e-a27b771d58eb tags:
``` python
# Create a tensor, transfer it to the right device, and perform a forward pass.
```
%% Cell type:markdown id:8bc38fde-0c14-45a6-b237-76ec7beab7f0 tags:
<details>
<summary>Hint (click to reveal)</summary>
To make predictions using your model:
```python
# Create a tensor with random values
input_tensor = torch.randn(1, 5).to(device)
# Pass the tensor through the model
output = model(input_tensor)
print(output)
```
</details>
%% Cell type:markdown id:fad9f46f-b591-4a2f-b2bf-3b4cf54cf961 tags:
#### **4. Understanding Training and Evaluation Modes**
%% Cell type:markdown id:2f197278-8d74-4a69-8da9-caf3f952e7bc tags:
Every PyTorch model has two modes:
- `train` mode: In this mode, certain layers like dropout or batch normalization behave differently than during evaluation. For instance, dropout will randomly set a fraction of input units to 0 at each update during training.
- `eval` mode: Here, the model behaves in a deterministic manner. Dropout layers don't drop activations, and batch normalization uses the running statistics accumulated during training instead of the current mini-batch's statistics.
Setting the model to the correct mode is crucial. Let's demonstrate this.
> **Task**: Set your model to `train` mode, then perform a forward pass using the same input tensor multiple times and observe the outputs. Then, set your model to `eval` mode and repeat. Notice any differences?
%% Cell type:code id:4c2d921d-d409-4ae6-8ee4-8376fc9a209d tags:
``` python
# Perform the forward passes multiple times with the same input in both modes and observe the outputs.
```
%% Cell type:markdown id:0dbd65fa-b86b-4516-9fb1-aceae0c9d8a3 tags:
<details>
<summary>Hint (click to reveal)</summary>
Here's how you can demonstrate the difference:
```python
# Set to train mode
model.train()
# Forward pass multiple times
print("Train mode:")
for i in range(5):
print(model(input_tensor))
# Set to eval mode
model.eval()
print("Eval mode:")
# Forward pass multiple times
for i in range(5):
print(model(input_tensor))
```
If there were layers like dropout in your model, you'd notice that the outputs in training mode might differ on each pass, while in evaluation mode, they remain consistent.
</details>
%% Cell type:markdown id:e8c55be3-71f7-45e7-91d1-c556e8108fef tags:
## **The Training Procedure in PyTorch**
%% Cell type:markdown id:eac54af7-c8db-4a19-861b-2eecf68fb44e tags:
Training a neural network involves several key components: defining a loss function to measure errors, selecting an optimization method to adjust the model's weights, and iterating over the dataset multiple times. In this section, we will break down these components step by step, starting with the basics and moving towards more complex tasks.
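As a preview, here is a minimal sketch of how these pieces fit together (a hypothetical one-feature regression; each component is detailed in the next sections):
```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)                                    # a tiny model
criterion = nn.MSELoss()                                   # loss function: measures the error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # optimization method: adjusts the weights

x = torch.randn(16, 1)
y = 2.0 * x + 1.0                                          # hypothetical targets

for epoch in range(3):                                     # iterate over the data several times
    optimizer.zero_grad()                                  # reset accumulated gradients
    loss = criterion(model(x), y)                          # forward pass + error measurement
    loss.backward()                                        # backward pass: compute gradients
    optimizer.step()                                       # update the weights
    print(epoch, loss.item())
```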
%% Cell type:markdown id:3e9231a9-105c-4aed-bfa5-846ddc07245f tags:
### **Datasets and DataLoaders: Handling and Batching Data**
%% Cell type:markdown id:8dbc3fcf-5a29-4fd8-9e82-3eaae4c8dc90 tags:
In PyTorch, the `torch.utils.data.Dataset` class is used to represent a dataset. This abstract class requires the implementation of two primary methods: `__len__` (to return the number of items) and `__getitem__` (to return the item at a given index). However, PyTorch provides a utility class, `TensorDataset`, that wraps tensors in the dataset format, making it easier to use with the `DataLoader`.
The `torch.utils.data.DataLoader` class is a more powerful tool, responsible for:
- Batching the data
- Shuffling the data
- Loading the data in parallel using multiprocessing workers
Let's wrap some data in a Dataset and use a DataLoader to handle batching and shuffling.
> **Task**: Convert the input and target tensors into a dataset and dataloader. For this exercise, set the batch size to 32.
Below, we define synthetic data with a simple, learnable relationship.
We are essentially modeling the relationship $y = mx + c + \text{noise}$, where:
- $y$ is the target or output.
- $m$ is the slope of the line.
- $c$ is the y-intercept.
- $x$ is the input.
- $noise$ is a small random value added to each point to make the data more realistic.
%% Cell type:code id:f8335e62-e0c0-4381-9c20-1ca8ed78516c tags:
``` python
num_samples = 1000
# Define the relationship
m = 2.0
c = 1.0
noise_factor = 0.05
# Generate input tensor
input_tensor = torch.linspace(-10, 10, num_samples).view(-1, 1)
# Generate target tensor based on the relationship
target_tensor = m * input_tensor + c + noise_factor * torch.randn(num_samples, 1)
import matplotlib.pyplot as plt
plt.figure(figsize=(10,6))
plt.scatter(input_tensor.numpy(), target_tensor.numpy(), color='blue', marker='o')
plt.title("Synthetic Data Visualization")
plt.xlabel("Input")
plt.ylabel("Target")
plt.grid(True)
plt.show()
```
%% Cell type:code id:9535ad7e-6534-491b-b38d-b61cdd60b39d tags:
``` python
# Convert our data into a dataset
# ...
# Create a data loader for mini-batch training
# ...
```
%% Cell type:markdown id:da99866e-ebd0-403d-8159-8a36d601bf09 tags:
<details>
<summary>Hint (click to reveal)</summary>
Use the TensorDataset class from torch.utils.data to wrap your tensors in a dataset format. After defining your dataset, you can use the DataLoader class to create an iterator that will return batches of data.
```python
from torch.utils.data import DataLoader, TensorDataset
# Convert our data into a dataset
dataset = TensorDataset(input_tensor, target_tensor)
# Create a data loader for mini-batch training
batch_size = 32
data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
```
</details>
%% Cell type:markdown id:ea5aee0c-6c8a-485f-b099-9844a28bafa3 tags:
> **Task**: Explore the `dataset` and `data_loader`:
> 1. Print the total number of samples in the dataset and DataLoader.
> 2. Iterate once over each of them and print the shape of the items you retrieve.
%% Cell type:code id:244a8198-60c5-4154-93ab-3d96fbf3488a tags:
``` python
# Total number of samples
# ...
# Dataset elements
# ...
# DataLoader elements
# ...
```
%% Cell type:markdown id:882438f7-3cc7-4a20-a223-41ede7856ef4 tags:
<details>
<summary>Hint (click to reveal)</summary>
When you iterate over the dataset, each item you get is a tuple of (input, target), so you should retrieve two tensors, each of length 1 (one feature each).
On the other hand, when you iterate over the data_loader, each item you get is a mini-batch of data. Thus, the length of each batch should correspond to the batch size you've set (i.e., 32 in our case), except possibly for the last batch if the dataset size isn't a perfect multiple of the batch size.
```python
# Total number of samples
print(f"Total samples in dataset: {len(dataset)}")
print(f"Total batches in DataLoader: {len(data_loader)}")
# Dataset elements
(index, (data, target)) = next(enumerate(dataset))
print(f"Sample {index}: Data shape {data.shape}, Target shape {target.shape}")
# DataLoader elements
(index, (batch_data, batch_target)) = next(enumerate(data_loader))
print(f"Batch {index}: Data shape {batch_data.shape}, Target shape {batch_target.shape}")
```
</details>
%% Cell type:markdown id:8dc08bb3-e5b2-4a7d-be10-6adc496a812d tags:
### **Splitting the Dataset: Training, Validation, and Testing Sets**
%% Cell type:markdown id:659a4899-cb14-4a47-b990-ea1a77592102 tags:
When training neural networks, it's common to split the dataset into at least two sets:
1. **Training Set**: This set is used to train the model, i.e., adjust the weights using gradient descent.
2. **Validation Set** (optional, but often used): This set is used to evaluate the model during training, allowing for hyperparameter tuning without overfitting.
3. **Test Set**: This set is used to evaluate the model's performance after training, providing an unbiased assessment of its performance on new, unseen data.
In PyTorch, we can use the `random_split` function from `torch.utils.data` to easily split datasets.
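As a small, self-contained illustration (a sketch on a toy dataset, not the solution to the task below), `random_split` takes a dataset and a list of lengths; an optional generator makes the split reproducible:
```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy dataset of 10 samples, split 8/2 with a fixed seed for reproducibility
toy = TensorDataset(torch.arange(10).float().view(-1, 1))
train_part, val_part = random_split(toy, [8, 2], generator=torch.Generator().manual_seed(0))
print(len(train_part), len(val_part))   # 8 2
```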
First, let's define the lengths for each split:
%% Cell type:code id:32202871-2911-44e6-8ad6-6d848cb3ede0 tags:
``` python
total_samples = len(dataset)
train_size = int(0.8 * total_samples)
val_size = total_samples - train_size
```
%% Cell type:markdown id:a1f7a839-8ee0-460f-bef0-87ca30f7409e tags:
> **Task**: Using the random_split function, split the dataset into a training set and a validation set using the sizes provided above.
[Here's the documentation for random_split](https://pytorch.org/docs/stable/data.html#torch.utils.data.random_split).
> **Task**: Create the train_loader and val_loader
%% Cell type:code id:50a80fc9-ef6e-4118-ad6a-3dea9d16e94f tags:
``` python
# Splitting the dataset
```
%% Cell type:markdown id:b01bb0d7-17c0-4edd-a2b6-17e4ca74b2aa tags:
<details>
<summary>Hint (click to reveal)</summary>
```python
# Splitting the dataset
from torch.utils.data import random_split
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
```
</details>
%% Cell type:markdown id:e2729431-701c-4451-931c-2ae0ed58dbb5 tags:
> **Task**: Now, using the provided training and validation datasets, print out the number of samples in each set. Also, fetch one sample from each set and print its shape.
%% Cell type:code id:770c42f6-7a52-4856-a4fe-23a60666389a tags:
``` python
# Your code here
```
%% Cell type:markdown id:583948e8-898a-4336-92c6-aaddef6adbcf tags:
<details>
<summary>Hint (click to reveal)</summary>
```python
# Print number of samples in each set
print(f"Number of training samples: {len(train_dataset)}")
print(f"Number of validation samples: {len(val_dataset)}")
# Fetching one sample from each set and printing its shape
train_sample, train_target = train_dataset[0]
print(f"Training sample shape: {train_sample.shape}, Target shape: {train_target.shape}")
val_sample, val_target = val_dataset[0]
print(f"Validation sample shape: {val_sample.shape}, Target shape: {val_target.shape}")
```
</details>
%% Cell type:markdown id:0fdec6d6-9b32-457d-b8e6-d94d8e020e4f tags:
### **Loss Functions: Measuring Model Errors**
%% Cell type:markdown id:899ce66c-e878-4f6a-b37c-34cdeae438a1 tags:
Every training process needs a metric to determine how well the model's predictions align with the actual data. This metric is called the loss function (or cost function). Different problems call for different loss functions, and PyTorch provides a wide range of [loss functions](https://pytorch.org/docs/stable/nn.html#loss-functions) suited to different tasks. For instance:
- **Mean Squared Error (MSE)**: Commonly used for regression tasks.
- **Cross-Entropy Loss**: Suited for classification tasks.
For a simple regression task, a common choice is the Mean Squared Error (MSE) loss.
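For intuition, MSE is simply $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2$. A minimal sketch on hand-picked tensors (independent of the exercise below):
```python
import torch
import torch.nn as nn

mse = nn.MSELoss()
pred   = torch.tensor([2.5, 0.0, 2.0])
target = torch.tensor([3.0, -0.5, 2.0])

# mean((pred - target)^2) = (0.25 + 0.25 + 0.0) / 3 ≈ 0.1667
print(mse(pred, target))
print(((pred - target) ** 2).mean())   # same value, computed by hand
```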
> **Task**: Familiarize yourself with the [MSE loss documentation](https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html). You will soon use it in the training loop.
> **Task**: Instantiate the Mean Squared Error (MSE) loss provided by PyTorch for our current neural network.
%% Cell type:code id:692e83d7-7382-4ab2-9caf-daa3a77bfd4d tags:
``` python
# Define the loss function.
```
%% Cell type:markdown id:7fe8dcb5-8a43-4561-88a0-a4a2a2d1bf53 tags:
<details>
<summary>Hint (click to reveal)</summary>
To define the MSE loss in PyTorch, you can use:
```python
criterion = nn.MSELoss()
```
</details>
%% Cell type:markdown id:e957d999-0a56-4320-808a-05d1af6b81c7 tags:
### **Optimizers: Adjusting Weights**
%% Cell type:markdown id:d3d4a09d-8838-4fd3-9e16-bfdc5018abde tags:
Optimizers adjust the weights of the network based on the gradients computed during backpropagation. Different optimizers update the weights in different ways. For example, the popular **Stochastic Gradient Descent (SGD)** optimizer simply moves the weights in the direction of the negative gradients, while **Adam** and **RMSprop** are more advanced optimizers that incorporate mechanisms such as momentum and per-parameter adaptive learning rates.
PyTorch offers a wide range of [optimizers](https://pytorch.org/docs/stable/optim.html).
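For a single parameter $\theta$, plain SGD performs the update $\theta \leftarrow \theta - \eta \, \nabla_\theta L$, where $\eta$ is the learning rate. Here is a minimal sketch on a toy loss, comparing a manual update with `torch.optim.SGD` (illustrative only, not part of the exercise):
```python
import torch

# A single parameter and a toy loss
theta = torch.tensor([1.0], requires_grad=True)
loss = (theta * 3.0 - 6.0).pow(2).mean()
loss.backward()                               # theta.grad now holds dL/dtheta

lr = 0.01
with torch.no_grad():
    manual_update = theta - lr * theta.grad   # theta - lr * gradient, by hand

# The same single step performed by torch.optim.SGD
optimizer = torch.optim.SGD([theta], lr=lr)
optimizer.step()
print(theta, manual_update)                   # both give the same updated value
```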
> **Task**: Review the [SGD optimizer documentation](https://pytorch.org/docs/stable/optim.html#torch.optim.SGD). It will be pivotal in the training loop you'll construct.
> **Task**: For this exercise, let's use the SGD optimizer. Instantiate it, setting our neural network parameters as the ones to be optimized and choosing a learning rate of 0.01.
%% Cell type:code id:39c8dfa8-7ea0-44e4-9429-118a6333bfe1 tags:
``` python
# Define the optimizer.
```
%% Cell type:markdown id:05e37f67-519a-4c49-97b3-2fafb7176de1 tags:
<details>
<summary>Hint (click to reveal)</summary>
To define the SGD optimizer in PyTorch, you can use:
```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.0001)
```
Note that although the task suggests a learning rate of 0.01, the inputs here are unnormalized (they range from -10 to 10), so you will probably need a much smaller learning rate for training to remain stable.
</details>
%% Cell type:markdown id:13b2fb3e-5391-4e66-ba83-55e66935d2aa tags:
### **Setting Up the Basic Training Loop Function**
%% Cell type:markdown id:7a364925-b4d9-4ffd-b3f8-be30a5bb1613 tags:
Having a training loop within a function allows us to reuse the same code structure for different models, datasets, or other training parameters without redundancy. This modular approach also promotes code clarity and maintainability.
Let's define the training loop function which takes the model, data (inputs and targets), loss function, optimizer, and the number of epochs as parameters. The function should return the history of the loss after each epoch.
A typical training loop consists of:
1. Sending the input through the model (forward pass).
2. Calculating the loss.
3. Propagating the loss backward through the model to compute gradients (backward pass).
4. Updating the weights using the optimizer.
5. Repeating the steps for several epochs.
Training with the entire dataset as one batch can be memory-intensive and sometimes not as effective. Hence, in practice, we usually divide our dataset into smaller chunks or mini-batches and update our weights after each mini-batch.
> **Task**: Create a function named `train_model` that encapsulates the training loop for the `SimpleNet` model. The function should follow the signature shown in the next code cell:
%% Cell type:code id:734864fe-46b6-4435-b58d-19b085ebd3f9 tags:
``` python
def train_model(model, dataloader, loss_function, optimizer, epochs):
# Your code here
pass
```
%% Cell type:markdown id:a6fee8dc-59da-4d48-918e-d6e093e997e5 tags:
<details>
<summary>Hint (click to reveal)</summary>
Here's how the train_model function might look:
```python
def train_model(model, dataloader, loss_function, optimizer, epochs):
# Store the loss values at each epoch
loss_history = []
for epoch in range(epochs):
for inputs, targets in dataloader:
# Ensure that data is on the right device
inputs, targets = inputs.to(device), targets.to(device)
# Reset the gradients to zero
optimizer.zero_grad()
# Execute a forward pass
outputs = model(inputs)
# Calculate the loss
loss = loss_function(outputs, targets)
# Conduct a backward pass
loss.backward()
# Update the weights
optimizer.step()
# Append the loss to the history
loss_history.append(loss.item())
print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss_history[-1]:.4f}")
return loss_history
```
</details>
%% Cell type:markdown id:c4e4b485-ffa6-487d-8dbc-b0b0590a796a tags:
### **Training the Neural Network**
%% Cell type:markdown id:15ba6b07-728f-4444-a3a9-af8cfeb884e1 tags:
With all the components defined in the previous sections, it's now time to integrate everything and set the training process in motion.
> **Task**: Combine all the previously defined elements to initiate the training procedure for your neural network model.
> 1. Don't forget to move your model and your data to the same device (GPU or CPU).
> 2. Train the model using the `train_loader` and `val_loader`.
%% Cell type:code id:90d043f7-213d-42a7-a14b-e6b716003b70 tags:
``` python
# Your code here to initiate the training process
```
%% Cell type:markdown id:398aaeec-5d6d-4ef6-bd24-27d51b32c148 tags:
<details>
<summary>Hint (click to reveal)</summary>
To train the model, you need to integrate all the previously defined components:
```python
# Moving the model to the device
model = SimpleNet(input_size=1, hidden_size=10, output_size=1).to(device)
# Re-create the optimizer so that it tracks the parameters of this new model instance
optimizer = torch.optim.SGD(model.parameters(), lr=0.0001)
# Training the model using the train_loader
loss_history = train_model(model, train_loader, criterion, optimizer, epochs=50)
```
Make sure you have defined the loss_function, optimizer, and epochs in the previous sections.
</details>
%% Cell type:code id:c7cf3df1-9fe2-4eee-a5bf-386f77b257f1 tags:
``` python
import matplotlib.pyplot as plt
# Plotting the loss curve
plt.figure(figsize=(10,6))
plt.plot(loss_history, label='Training Loss')
plt.title("Loss Curve")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.grid(True)
plt.show()
```
%% Cell type:markdown id:2b7f9d87-c172-427c-a2f4-1090b1120148 tags:
## **Conclusion: Moving Beyond the Basics**
%% Cell type:markdown id:6074877c-c149-4af9-8503-153455edd42a tags:
You've now built and trained a simple neural network using PyTorch, and you might be wondering: why aren't my results as good as I expected?
While you've certainly made strides, the journey of mastering deep learning and neural networks is filled with nuance, challenges, and constant learning. Here are some reasons why your results might not be optimal and what you'll discover in your next steps:
1. **Hyperparameters Tuning**: So far, we've set values like learning rate and batch size somewhat arbitrarily. These values are critical and often require careful tuning specific to each problem.
2. **Learning Rate Scheduling**: A fixed learning rate might not always be the best strategy. Reducing the learning rate during training, known as learning rate annealing or scheduling, often leads to better convergence.
3. **Model Architecture**: The neural network we built is basic. There's an entire world of architectures out there, designed for specific types of data and tasks. The right architecture can make a significant difference.
4. **Regularization**: To prevent overfitting, techniques like dropout, weight decay, and early stopping can be applied. We haven't touched upon these, but they're crucial for ensuring your model generalizes well to unseen data.
5. **Data Quality and Quantity**: While we used synthetic data for simplicity, real-world data is messy. Cleaning and preprocessing data, augmenting it, and ensuring it's representative can have a significant impact on performance.
6. **Optimization Techniques**: There are advanced optimization algorithms and techniques that can speed up training and lead to better convergence. Techniques like momentum, adaptive learning rates (e.g., Adam, RMSprop) can play a crucial role.
7. **Evaluation Metrics**: We've looked at loss values, but in real-world scenarios, understanding and selecting the right evaluation metrics for the task (accuracy, F1-score, AUC-ROC, etc.) is vital.
8. **Training Dynamics**: Understanding how models train, visualizing the activations, weights, and gradients, and knowing when and why a model is struggling can offer insights into how to improve performance.
Remember, while the mechanics of building and training a neural network are essential, the art of deep learning lies in understanding the nuances and iterating based on insights and knowledge. The next steps in your learning, focusing on methodology, will provide the tools and knowledge to navigate these complexities and achieve better results.
Keep learning, experimenting, and iterating! The world of deep learning is vast, and there's always something new to discover.
%% Cell type:markdown id:ca6048e4-f3cf-40eb-bd50-c95f281f0554 tags:
## **Extra for the Fast Movers: Diving Deeper**
%% Cell type:markdown id:46a25dfd-1cc9-444d-98d6-966e7cc9da07 tags:
To further enhance your understanding and capability with PyTorch, this section introduces additional topics that cater to more advanced use-cases. These tools and techniques can be essential when dealing with larger and more complex projects, providing valuable insights into optimization and performance.
%% Cell type:markdown id:30edeed8-321b-4b1f-ace6-0decd8a167e5 tags:
### **Profiling with PyTorch Profiler in TensorBoard**
%% Cell type:markdown id:256bd4a2-aa6f-4a50-9c5d-854ca25293de tags:
PyTorch, starting from version 1.9.0, incorporates the PyTorch Profiler as a TensorBoard plugin. This integration allows users to profile their PyTorch code and visualize the results directly within TensorBoard.
Below, we will instrument the PyTorch code for TensorBoard profiling.
Use this [documentation](http://www.idris.fr/jean-zay/pre-post/profiler_pt.html) to complete the next tasks.
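As a warm-up (a sketch independent of the TensorBoard integration asked for below, assuming `model` and `device` are already defined in the previous sections), the profiler can also be used directly to print a summary table:
```python
import torch
from torch.profiler import profile, ProfilerActivity

x = torch.randn(32, 1).to(device)
with profile(activities=[ProfilerActivity.CPU]) as prof:   # add ProfilerActivity.CUDA on a GPU node
    model(x)

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```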
> **Task:** Before instrumenting your PyTorch code, you'll need to import the necessary modules for profiling.
> **Task:** Modify the training loop to invoke the profiler.
%% Cell type:code id:86b471a6-7de6-40f0-af58-c41e8e8acbae tags:
``` python
# Your imports here
# Your code here
def train_model_with_profiling(model, train_loader, criterion, optimizer, epochs, profiler_dir='./profiler'):
# Your code here
pass
```
%% Cell type:markdown id:f389816a-fa2a-4668-9f0b-07d2a5abf5e1 tags:
<details>
<summary>Hint (click to reveal)</summary>
```python
from torch.profiler import profile, tensorboard_trace_handler, ProfilerActivity, schedule
def train_model_with_profiling(model, dataloader, loss_function, optimizer, epochs, profiler_dir='./profiler'):
# Store the loss values at each epoch
loss_history = []
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
schedule=schedule(wait=1, warmup=1, active=12, repeat=1),
on_trace_ready=tensorboard_trace_handler(profiler_dir)) as prof:
for epoch in range(epochs):
for inputs, targets in dataloader:
# Ensure that data is on the right device
inputs, targets = inputs.to(device), targets.to(device)
# Reset the gradients to zero
optimizer.zero_grad()
# Execute a forward pass
outputs = model(inputs)
# Calculate the loss
loss = loss_function(outputs, targets)
# Conduct a backward pass
loss.backward()
# Update the weights
optimizer.step()
# Append the loss to the history
loss_history.append(loss.item())
# Notify profiler of step boundary
prof.step()
print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss_history[-1]:.4f}")
return loss_history
```
Make sure you have defined the loss_function, optimizer, and epochs in the previous sections.
</details>
%% Cell type:code id:cb82f0a9-522f-4746-87f9-ba7b7952d863 tags:
``` python
# Training the model using the train_loader
loss_history = train_model_with_profiling(model, train_loader, criterion, optimizer, 10, profiler_dir='./profiler')
```
%% Cell type:markdown id:313e4f40-521a-4beb-a278-c1ca9502b499 tags:
> **Task:** To visualize the profiling results, open a TensorBoard interface using the blue button in the top left corner.
>
> **Make sure to specify the logdir with "--logdir=/path/to/profiler_folder".**
%% Cell type:markdown id:06f86768-3b78-4874-b083-64bc365080fb tags:
### **Learning Rate Scheduling**
%% Cell type:markdown id:44721444-ba4a-44d0-9b65-16890dd4f097 tags:
One of the key hyperparameters to tune during neural network training is the learning rate. While it's possible to set a static learning rate for the entire training process, in practice, dynamically adjusting the learning rate often leads to better convergence and overall performance. This dynamic adjustment is often referred to as learning rate scheduling or annealing.
**Concept of Learning Rate Scheduling**
The learning rate determines the step size at each iteration while moving towards a minimum of the loss function. If it's too large, the optimization might overshoot the minimum. Conversely, if it's too small, the training might get stuck, or convergence could be very slow.
A learning rate scheduler changes the learning rate during training based on the provided scheduling policy. By adjusting the learning rate during training, you can achieve faster convergence and better final results.
**Using Learning Rate Schedulers in PyTorch**
PyTorch provides a variety of learning rate schedulers through the `torch.optim.lr_scheduler` module. Some of the popular ones are:
- `StepLR`: Decays the learning rate of each parameter group by `gamma` every `step_size` epochs.
- `ExponentialLR`: Decays the learning rate of each parameter group by `gamma` every epoch.
- `ReduceLROnPlateau`: Reduces the learning rate when a metric has stopped improving.
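To see the effect of a scheduler without any training, here is a minimal sketch that only watches the learning rate decay with `StepLR` (the toy parameter is just there so the optimizer has something to hold):
```python
import torch
from torch.optim.lr_scheduler import StepLR

param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = StepLR(optimizer, step_size=2, gamma=0.5)

for epoch in range(6):
    optimizer.step()                        # (no real training here, just the correct call order)
    scheduler.step()                        # one scheduler step per epoch
    print(epoch, scheduler.get_last_lr())   # the learning rate is halved every 2 epochs
```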
> **Task:** Take a look at the [documentation](https://pytorch.org/docs/stable/optim.html) (or click on the hint in the following cell), then integrate an LR scheduler into the training code you wrote earlier.
%% Cell type:markdown id:0c79a170-35d0-438f-b01b-a3f236f8b724 tags:
<details>
<summary>Hint (click to reveal)</summary>
Below, you have a typical training loop with a learning rate scheduler.
```python
from torch.optim.lr_scheduler import StepLR
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = StepLR(optimizer, step_size=10, gamma=0.1)
for epoch in range(epochs):
for input, target in data:
optimizer.zero_grad()
output = model(input)
loss = loss_fn(output, target)
loss.backward()
optimizer.step()
# Step the learning rate scheduler
scheduler.step()
```
</details>
%% Cell type:markdown id:33f99f6e-3120-495a-a25b-8b9f3d14deb2 tags:
### **Automatic Mixed Precision**
%% Cell type:markdown id:217a7249-6655-4587-92b8-72dea7de8c9d tags:
Training deep neural networks can be both time-consuming and resource-intensive. One way to address this problem is by leveraging mixed precision training. In essence, mixed precision training uses both 16-bit and 32-bit floating-point types to represent numbers in the model, which can speed up training without sacrificing the accuracy of the final model.
**Overview of AMP (Automatic Mixed Precision)**
AMP (Automatic Mixed Precision) is a set of utilities provided by PyTorch to enable mixed precision training more effortlessly. The main advantages of AMP are:
- Faster Training: By using reduced precision, the model requires less memory bandwidth, resulting in faster data transfers and faster matrix multiplication.
- Reduced GPU Memory Usage: This enables training of larger models or utilization of larger batch sizes.
PyTorch has integrated the AMP utilities starting from version 1.6.
> **Task**: Set up AMP in the training function by checking the [documentation](http://www.idris.fr/eng/ia/mixed-precision-eng.html). You will need to add the necessary imports, initialize the GradScaler, and modify the training loop by wrapping the forward pass and loss computation in `with autocast():`.
%% Cell type:code id:ad131b4b-02ba-472d-af78-a048868e3efc tags:
``` python
# Your code here
```
%% Cell type:markdown id:de38cb30-7b24-48cb-b804-ed296e38e3fb tags:
<details>
<summary>Hint (click to reveal)</summary>
Below, you have a typical training loop with autocast.
```python
from torch.cuda.amp import autocast, GradScaler
scaler = GradScaler()
for epoch in range(epochs):
for input, target in data:
optimizer.zero_grad()
with autocast():
output = model(input)
loss = loss_fn(output, target)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```
</details>
%% Cell type:markdown id:a3f7818a-fea1-4a12-b52a-cd83e0ae2ffe tags:
### **Pytorch Compiler**
%% Cell type:markdown id:dbb5f69b-009e-40b3-94f0-5a420afbd003 tags:
**For this section, you will need PyTorch version 2.0 or higher.**
PyTorch, a widely adopted deep learning framework, has consistently evolved to offer users better performance and ease of use. One such advancement is the introduction of the PyTorch Compiler. This cutting-edge feature accelerates PyTorch code execution by JIT-compiling it into optimized kernels. What's even more impressive is its ability to enhance performance with minimal modifications to the original codebase.
Historically, PyTorch has introduced compiler solutions like TorchScript and FX Tracing. However, the introduction of torch.compile with PyTorch 2.0 has taken performance optimization to a new level. It provides a seamless experience, enabling you to transform typical PyTorch functions and even torch.nn.Module instances into their faster, compiled counterparts.
For those eager to dive deep into its workings and benefits, detailed documentation and tutorials have been made available:
- [torch.compile Tutorial](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html)
- [PyTorch 2.0 Release Notes](https://pytorch.org/get-started/pytorch-2.0/)
> **Task:** Your task is to make your existing PyTorch model take advantage of the performance benefits offered by torch.compile. This will not only make your model run faster but also give you hands-on experience with one of the latest features in PyTorch.
%% Cell type:markdown id:8d5236bc-08e4-4142-8c9c-fd7007474ff2 tags:
<details>
<summary>Hint (click to reveal)</summary>
1. **Ensure Dependencies**:
- Ensure that you have the required dependencies, especially PyTorch version 2.0 or higher.
2. **Check for GPU Compatibility**:
- For optimal performance, it's recommended to use a modern NVIDIA GPU (H100, A100, or V100).
3. **Compile Functions**:
- You can optimize arbitrary Python functions as shown in the example:
```python
def your_function(x, y):
# ... Your PyTorch code here ...
opt_function = torch.compile(your_function)
```
- Alternatively, use the decorator approach:
```python
@torch.compile
def opt_function(x, y):
# ... Your PyTorch code here ...
```
4. **Compile Modules**:
- If you have a PyTorch module (a class derived from `torch.nn.Module`), you can compile it similarly:
```python
class YourModule(torch.nn.Module):
# ... Your module definition here ...
model = YourModule()
opt_model = torch.compile(model)
```
</details>
%% Cell type:markdown id:bd4066a6-3f24-4b63-b2be-da0350ec6145 tags:
Remember, while torch.compile optimizes performance, the underlying logic remains the same. Ensure to test and validate your compiled model's outputs against the original to confirm consistent behavior.
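A minimal sketch of such a consistency check (assuming `model` and `device` are already defined in the previous sections, and PyTorch >= 2.0):
```python
import torch

opt_model = torch.compile(model)

x = torch.randn(4, 1).to(device)
with torch.no_grad():
    out_ref = model(x)       # original model
    out_opt = opt_model(x)   # compiled model

print(torch.allclose(out_ref, out_opt, atol=1e-5))   # should print True
```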
%% Cell type:markdown id:4340d5df tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [TSB1] - Tensorboard with/from Jupyter
<!-- DESC --> 4 ways to use Tensorboard from the Jupyter environment
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
## Objectives :
- Using [**Tensorboard**](https://www.tensorflow.org/tensorboard/get_started)
## What we're going to do :
- Using Tensorboard
%% Cell type:markdown id: tags:
## In the Fidle environment :
To access logs with tensorboard :
- Under **Docker**, from a terminal launched via the jupyterlab launcher, use the following command:<br>
```tensorboard --logdir <path-to-logs> --host 0.0.0.0```
- If you're **not using Docker**, from a terminal :<br>
```tensorboard --logdir <path-to-logs>```
**Note:** Only one Tensorboard instance can be used at a time.
%% Cell type:markdown id: tags:
## Otherwise, in the real world, from Jupyter (***)
It's the easiest and the best way!
Launch Tensorboard directly from Jupyter.
Works very well on Jean-Zay (at IDRIS) :-)
%% Cell type:markdown id: tags:
## Otherwise, in the real world, Tensorboard as a magic command (**)
Tensorboard can be run from Jupyter with a magic command.
See [documentation](https://www.tensorflow.org/tensorboard/tensorboard_in_notebooks)
Load the extension : ```%load_ext tensorboard```
Start tensorboard : ```%tensorboard --logdir logs```
%% Cell type:raw id: tags:
%load_ext tensorboard
%tensorboard --logdir logs
%% Cell type:markdown id: tags:
## Otherwise, in the real world, Option 2 - Shell command (*)
Basic way, from a shell
More about it : `# tensorboard --help`
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>
%% Cell type:markdown id: tags:
<img width="800px" src="../fidle/img/header.svg"></img>
# <!-- TITLE --> [K3LSTM1] - Basic Keras LSTM Layer
<!-- DESC --> A small example of an LSTM layer in Keras
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->
%% Cell type:code id: tags:
``` python
import os
os.environ['KERAS_BACKEND'] = 'torch'
import keras
import numpy as np
```
%% Cell type:code id: tags:
``` python
# A batch of 32 sequences of 20 timesteps, each with 8 features
input = keras.random.normal( [32, 20, 8] )
# An LSTM layer with 16 units: by default it returns only the last hidden state, of shape (32, 16)
lstm = keras.layers.LSTM(16)
output = lstm(input)
print('input shape is : ',input.shape)
print('output shape is : ',output.shape)
```
%% Cell type:code id: tags:
``` python
# Same input, but this time we also ask for the full sequence of outputs and the final states
input = keras.random.normal( [32, 20, 8] )
lstm = keras.layers.LSTM(18, return_sequences=True, return_state=True)
output, memory_state, carry_state = lstm(input)
print('input shape : ',input.shape)
print('output shape : ',output.shape)
print('memory_state : ', memory_state.shape)
print('carry_state : ', carry_state.shape)
```
%% Cell type:markdown id: tags:
---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>