%% Cell type:markdown id: tags:
# TensorFlow advanced techniques cheat sheet
%% Cell type:code id: tags:
``` python
import tensorflow as tf
import numpy as np
from tensorflow import keras
```
%% Cell type:markdown id: tags:
## Custom loss functions, layers and models
%% Cell type:markdown id: tags:
In this section, I gathered recipes for customizing TensorFlow/Keras losses, layers and models.
%% Cell type:markdown id: tags:
### Custom loss
%% Cell type:markdown id: tags:
Let's try to implement the Huber loss:
$$L_{\delta}(y, f(x)) = \begin{cases} \frac{1}{2}(y - f(x))^2 & \text{for } |y - f(x)| \leq \delta, \\ \delta\,\left(|y - f(x)| - \frac{1}{2}\delta\right) & \text{otherwise.} \end{cases}$$
%% Cell type:markdown id: tags:
<img src="Huber_loss.svg.png">
%% Cell type:code id: tags:
``` python
def my_huber_loss(y_true, y_pred):
    threshold = 1
    error = y_true - y_pred
    is_small_error = tf.abs(error) <= threshold
    small_error_loss = tf.square(error) / 2
    big_error_loss = threshold * (tf.abs(error) - (0.5 * threshold))
    return tf.where(is_small_error, small_error_loss, big_error_loss)
```
%% Cell type:markdown id: tags:
Let's try it on sample vectors: `yp1` is a perfect prediction, while `yp2` is far off.
%% Cell type:code id: tags:
``` python
yt = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
yp1 = yt            # perfect predictions
yp2 = yt * 2 + 3    # predictions with large errors
```
%% Cell type:code id: tags:
``` python
my_huber_loss(yt,yp1)
```
%%%% Output: execute_result
<tf.Tensor: shape=(6,), dtype=float64, numpy=array([0., 0., 0., 0., 0., 0.])>
%% Cell type:code id: tags:
``` python
my_huber_loss(yt,yp2)
```
%%%% Output: execute_result
<tf.Tensor: shape=(6,), dtype=float64, numpy=array([1.5, 2.5, 3.5, 4.5, 5.5, 6.5])>
%% Cell type:markdown id: tags:
To use your custom loss in a model, simply pass it as the `loss` argument of `model.compile`:
%% Cell type:code id: tags:
``` python
model.compile(optimizer='sgd', loss=my_huber_loss) ##DO NOT RUN
```
%% Cell type:markdown id: tags:
### Custom loss with hyperparameters
%% Cell type:markdown id: tags:
To customize a loss function with a specified hyperparameter, use a wrapper function that returns the loss function with the chosen value of the hyperparameter baked in:
%% Cell type:code id: tags:
``` python
def customize_huber_loss(threshold):
    def my_huber_loss(y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = threshold * (tf.abs(error) - (0.5 * threshold))
        return tf.where(is_small_error, small_error_loss, big_error_loss)
    return my_huber_loss
```
%% Cell type:code id: tags:
``` python
custom_loss=customize_huber_loss(0.5)
np.mean(custom_loss(yt,yp2))
```
%%%% Output: execute_result
2.125
%% Cell type:markdown id: tags:
To use it in your model:
%% Cell type:code id: tags:
``` python
model.compile(optimizer='sgd', loss=customize_huber_loss(0.5)) ##DO NOT RUN
```
%% Cell type:markdown id: tags:
Another possibility is to use a loss object instead of a function, by subclassing `tensorflow.keras.losses.Loss`, as follows:
%% Cell type:code id: tags:
``` python
from tensorflow.keras.losses import Loss
class MyHuberLoss(Loss):
    def __init__(self, threshold=1):  ## default value of the hyperparameter
        super().__init__()
        self.threshold = threshold

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= self.threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = self.threshold * (tf.abs(error) - (0.5 * self.threshold))
        return tf.where(is_small_error, small_error_loss, big_error_loss)
```
%% Cell type:code id: tags:
``` python
custom_loss=MyHuberLoss(threshold=0.5)
custom_loss(yt,yp2)
```
%%%% Output: execute_result
<tf.Tensor: shape=(), dtype=float64, numpy=2.125>
%% Cell type:markdown id: tags:
Use it in the model as follows:
%% Cell type:code id: tags:
``` python
model.compile(optimizer='sgd', loss=MyHuberLoss(threshold=0.5)) ##DO NOT RUN
```
%% Cell type:markdown id: tags:
### Contrastive loss
%% Cell type:markdown id: tags:
It is so called because it contrasts two inputs and applies a different penalty depending on whether they are similar or not.
The goal is to learn an embedding such that similar inputs are close in the output space.
It was proposed by Hadsell, Chopra and LeCun (2006): http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
Here:
- $Y = 1$ if the inputs are similar and 0 otherwise
- $\hat{Y}$ is the distance between the two inputs in the embedding space
%% Cell type:markdown id: tags:
$$L_{contrastive} = y \cdot \hat{y}^2 + (1 - y) \cdot \max(m - \hat{y},\, 0)^2$$
where $m$ is the margin.
%% Cell type:code id: tags:
``` python
from tensorflow.keras import backend as K

def contrastive_loss_with_margin(margin):
    def contrastive_loss(y_true, y_pred):
        '''Contrastive loss from Hadsell-et-al.'06
        http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
        '''
        square_pred = K.square(y_pred)
        margin_square = K.square(K.maximum(margin - y_pred, 0))
        return K.mean(y_true * square_pred + (1 - y_true) * margin_square)
    return contrastive_loss
```
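%% Cell type:markdown id: tags:
A quick sanity check (the labels and distances below are made up for illustration): similar pairs ($y = 1$) are penalized by their squared distance, while dissimilar pairs ($y = 0$) are penalized only when they fall inside the margin.
%% Cell type:code id: tags:
``` python
closs = contrastive_loss_with_margin(margin=1.0)
y_true = np.array([1.0, 1.0, 0.0, 0.0])  # 1 = similar pair, 0 = dissimilar pair
y_pred = np.array([0.1, 0.9, 0.2, 1.5])  # distances in the embedding space
closs(y_true, y_pred)
```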
%% Cell type:markdown id: tags:
### Lambda layers
%% Cell type:code id: tags:
``` python
from tensorflow.keras import backend as K

def my_relu(x):
    return K.maximum(0.0, x)

## define your lambda layer with the custom computation defined previously
tf.keras.layers.Lambda(my_relu)
tf.keras.layers.Lambda(lambda x: tf.abs(x))
```
%%%% Output: execute_result
<tensorflow.python.keras.layers.core.Lambda at 0x1d68b6a4f70>
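%% Cell type:markdown id: tags:
A Lambda layer can be dropped into a model like any other layer. A minimal sketch (the architecture below is made up for illustration):
%% Cell type:code id: tags:
``` python
lambda_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, input_shape=(10,)),
    tf.keras.layers.Lambda(my_relu),  # custom activation applied as a layer
    tf.keras.layers.Dense(1)
])
lambda_model.summary()
```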
%% Cell type:markdown id: tags:
### Custom layers
%% Cell type:markdown id: tags:
Common layers in TensorFlow include:
<img src="layers.png">
%% Cell type:markdown id: tags:
The structure of a layer is as follows:<br/>
<img src="layer.png">
%% Cell type:code id: tags:
``` python
# inherit from this base class
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):
    def __init__(self, units=32, activation=None):
        '''Initializes the instance attributes'''
        super(SimpleDense, self).__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        '''Create the state of the layer (weights)'''
        # initialize the weights
        w_init = tf.random_uniform_initializer()
        self.w = tf.Variable(name="kernel",
                             initial_value=w_init(shape=(input_shape[-1], self.units),
                                                  dtype='float32'),
                             trainable=True)
        # initialize the biases
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)
        super().build(input_shape)

    def call(self, inputs):
        '''Defines the computation from inputs to outputs'''
        return self.activation(tf.matmul(inputs, self.w) + self.b)
```
%% Cell type:markdown id: tags:
The `activation` argument is resolved through `tf.keras.activations.get` in `__init__`, so you can pass any built-in activation name (e.g. `'relu'`) or a callable when instantiating the layer.
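%% Cell type:markdown id: tags:
A quick check (the input values below are made up for illustration): calling the layer on a batch triggers `build`, creates the weights, and applies the activation.
%% Cell type:code id: tags:
``` python
simple_dense = SimpleDense(units=3, activation='relu')
x = tf.constant([[1.0, -2.0]])  # one sample with two features
simple_dense(x)                 # first call triggers build(), then computes relu(xW + b)
simple_dense.variables          # inspect the kernel and bias created in build()
```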
%% Cell type:markdown id: tags:
## Custom Models
%% Cell type:markdown id: tags:
You'll define all the layers in one method, `__init__`, and connect the layers together in another method, `call`.
%% Cell type:code id: tags:
``` python
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input, concatenate

# inherit from the Model base class
class WideAndDeepModel(Model):
    def __init__(self, units=30, activation='relu', **kwargs):
        '''initializes the instance attributes'''
        super().__init__(**kwargs)
        self.hidden1 = Dense(units, activation=activation)
        self.hidden2 = Dense(units, activation=activation)
        self.main_output = Dense(1)
        self.aux_output = Dense(1)

    def call(self, inputs):
        '''defines the network architecture'''
        # the wide path (input_A) skips the deep stack and joins at the concatenation
        input_A, input_B = inputs
        hidden1 = self.hidden1(input_B)
        hidden2 = self.hidden2(hidden1)
        concat = concatenate([input_A, hidden2])
        main_output = self.main_output(concat)
        aux_output = self.aux_output(hidden2)
        return main_output, aux_output
```
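%% Cell type:markdown id: tags:
A quick check (the shapes below are made up for illustration): call the model on a pair of dummy inputs and inspect the two outputs.
%% Cell type:code id: tags:
``` python
wd_model = WideAndDeepModel(units=30)
input_A = tf.random.normal((8, 5))   # wide input: batch of 8 samples, 5 features
input_B = tf.random.normal((8, 6))   # deep input: batch of 8 samples, 6 features
main_out, aux_out = wd_model((input_A, input_B))
main_out.shape, aux_out.shape        # both (8, 1)
```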
%% Cell type:markdown id: tags:
### ResNet model
%% Cell type:markdown id: tags:
Here's a picture of the model we'd like to build:<br/>
<img src="miniresnet.JPG">
%% Cell type:markdown id: tags:
We notice that the following blocks are repeated, so we build a submodel with the corresponding layers:<br/>
<img src="identitiyblocks.JPG">
%% Cell type:code id: tags:
``` python
class IdentityBlock(tf.keras.Model):
    def __init__(self, filters, kernel_size):
        super(IdentityBlock, self).__init__(name='')
        self.conv1 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.conv2 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
        self.bn2 = tf.keras.layers.BatchNormalization()
        self.act = tf.keras.layers.Activation('relu')
        self.add = tf.keras.layers.Add()

    def call(self, input_tensor):
        x = self.conv1(input_tensor)
        x = self.bn1(x)
        x = self.act(x)
        x = self.conv2(x)
        x = self.bn2(x)
        # skip connection: add the block input back before the final activation
        x = self.add([x, input_tensor])
        x = self.act(x)
        return x
```
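%% Cell type:markdown id: tags:
Since the block adds its input back to its output, it preserves the input shape, which we can verify on a dummy tensor (shape made up for illustration):
%% Cell type:code id: tags:
``` python
block = IdentityBlock(filters=64, kernel_size=3)
dummy = tf.random.normal((1, 28, 28, 64))  # channel count must match `filters`
block(dummy).shape                         # TensorShape([1, 28, 28, 64])
```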
%% Cell type:markdown id: tags:
Then, we define the architecture of the main model using the previous blocks and other layers:
%% Cell type:code id: tags:
``` python
class ResNet(tf.keras.Model):
    def __init__(self, num_classes):
        super(ResNet, self).__init__()
        self.conv = tf.keras.layers.Conv2D(64, 7, padding='same')
        self.bn = tf.keras.layers.BatchNormalization()
        self.act = tf.keras.layers.Activation('relu')
        self.max_pool = tf.keras.layers.MaxPool2D((3, 3))
        # Use the Identity blocks that you just defined
        self.id1a = IdentityBlock(64, 3)
        self.id1b = IdentityBlock(64, 3)
        self.global_pool = tf.keras.layers.GlobalAveragePooling2D()
        self.classifier = tf.keras.layers.Dense(num_classes, activation='softmax')

    def call(self, inputs):
        x = self.conv(inputs)
        x = self.bn(x)
        x = self.act(x)
        x = self.max_pool(x)
        # insert the identity blocks in the middle of the network
        x = self.id1a(x)
        x = self.id1b(x)
        x = self.global_pool(x)
        return self.classifier(x)
```
%% Cell type:markdown id: tags:
Now we can instantiate the model according to the problem dimension (number of classes) and train it, for instance on MNIST. Upload the notebook to Colab to run the following.
%% Cell type:code id: tags:
``` python
import tensorflow_datasets as tfds

# utility function to normalize the images and return (image, label) pairs.
def preprocess(features):
    return tf.cast(features['image'], tf.float32) / 255., features['label']

# create a ResNet instance with 10 output units for MNIST
resnet = ResNet(10)
resnet.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# load and preprocess the dataset
dataset = tfds.load('mnist', split=tfds.Split.TRAIN)
dataset = dataset.map(preprocess).batch(32)

# train the model
resnet.fit(dataset, epochs=1)
```
%% Cell type:markdown id: tags:
### VGG
%% Cell type:markdown id: tags:
Here's an illustration of the architecture we want to implement:<br/>
<img src="VGG.png">
%% Cell type:markdown id: tags:
We notice a recurring structure: a variable number of Conv2D layers with a given number of filters, followed by a MaxPool2D layer. So we define it as a generic block.
%% Cell type:code id: tags:
``` python
class Block(tf.keras.Model):
    def __init__(self, filters, kernel_size, repetitions, pool_size=2, strides=2):
        super(Block, self).__init__()
        self.filters = filters
        self.kernel_size = kernel_size
        self.repetitions = repetitions
        # Define a conv2D_0, conv2D_1, etc. based on the number of repetitions;
        # vars(self) writes each layer into the instance's attribute dictionary
        for i in range(self.repetitions):
            # Define a Conv2D layer, specifying filters, kernel_size, activation and padding
            vars(self)[f'conv2D_{i}'] = tf.keras.layers.Conv2D(filters=self.filters,
                                                               kernel_size=self.kernel_size,
                                                               activation='relu',
                                                               padding='same')
        # Define the max pool layer that will be added after the Conv2D layers
        self.max_pool = tf.keras.layers.MaxPool2D(pool_size=(pool_size, pool_size),
                                                  strides=(strides, strides))

    def call(self, inputs):
        # connect the first conv layer to the inputs
        x = vars(self)['conv2D_0'](inputs)
        # chain the remaining conv2D_i layers (1 to `repetitions`) to the previous one
        for i in range(1, self.repetitions):
            x = vars(self)[f'conv2D_{i}'](x)
        # finally, apply the max_pool layer
        return self.max_pool(x)
```
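%% Cell type:markdown id: tags:
A quick check (the input shape below is made up for illustration): the `'same'`-padded convolutions preserve the spatial dimensions and the 2x2 max pool halves them.
%% Cell type:code id: tags:
``` python
block_a = Block(filters=64, kernel_size=3, repetitions=2)
block_a(tf.random.normal((1, 224, 224, 3))).shape  # TensorShape([1, 112, 112, 64])
```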
%% Cell type:markdown id: tags:
Next, we can define the full VGG model using the previous information.
%% Cell type:code id: tags:
``` python
class MyVGG(tf.keras.Model):
    def __init__(self, num_classes):
        super(MyVGG, self).__init__()
        # Creating blocks of VGG with the following
        # (filters, kernel_size, repetitions) configurations
        self.block_a = Block(filters=64, kernel_size=3, repetitions=2, pool_size=2, strides=2)
        self.block_b = Block(filters=128, kernel_size=3, repetitions=2, pool_size=2, strides=2)
        self.block_c = Block(filters=256, kernel_size=3, repetitions=3, pool_size=2, strides=2)
        self.block_d = Block(filters=512, kernel_size=3, repetitions=3, pool_size=2, strides=2)
        self.block_e = Block(filters=512, kernel_size=3, repetitions=3, pool_size=2, strides=2)
        # Classification head: flatten, then a Dense layer with 256 units and ReLU,
        # and finally the softmax classifier
        self.flatten = tf.keras.layers.Flatten()
        self.fc = tf.keras.layers.Dense(units=256, activation='relu')
        self.classifier = tf.keras.layers.Dense(units=num_classes, activation='softmax')

    def call(self, inputs):
        # Chain all the layers one after the other
        x = self.block_a(inputs)
        x = self.block_b(x)
        x = self.block_c(x)
        x = self.block_d(x)
        x = self.block_e(x)
        x = self.flatten(x)
        x = self.fc(x)
        x = self.classifier(x)
        return x
```
%% Cell type:markdown id: tags:
Here, we can instantiate the VGG model with the desired number of classes:
%% Cell type:code id: tags:
``` python
vgg=MyVGG(num_classes=10)
```
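%% Cell type:markdown id: tags:
As with the ResNet above, training only requires compiling and fitting. A minimal sketch, assuming a `tf.data` dataset of (image, label) pairs like the one built in the MNIST example:
%% Cell type:code id: tags:
``` python
##DO NOT RUN
vgg.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
vgg.fit(dataset, epochs=10)
```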
%% Cell type:markdown id: tags:
## Callbacks
%% Cell type:markdown id: tags:
Callbacks are a useful piece of functionality in TensorFlow that gives you control over the training process. They are useful for visualizing the internal state of the model, as well as intermediate statistics of the loss and metrics.
%% Cell type:markdown id: tags:
### Common built-in callbacks
%% Cell type:code id: tags:
``` python
from tensorflow.keras.callbacks import TensorBoard, ModelCheckpoint, EarlyStopping, CSVLogger
tb = TensorBoard(log_dir='logdir')
```
%% Cell type:markdown id: tags:
In Colab, first load the TensorBoard extension:
`%load_ext tensorboard`
%% Cell type:code id: tags:
``` python
chkpt = ModelCheckpoint(filepath='weights.{epoch:02d}-{val_loss:.2f}.h5',
                        save_weights_only=True, verbose=1,
                        monitor='val_loss', save_best_only=True)
```
%% Cell type:code id: tags:
``` python
es=EarlyStopping(patience=3,monitor='val_loss',mode='min',verbose=1,baseline=0.8,min_delta=0.001)
```
%% Cell type:code id: tags:
``` python
csvlog=CSVLogger('log_file.csv')
```
%% Cell type:code id: tags:
``` python
##DO NOT RUN
model.fit(...,callbacks=[chkpt,tb,es,csvlog])
```
%% Cell type:markdown id: tags:
### Custom callback
%% Cell type:markdown id: tags:
<img src="callbacks.JPG">
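%% Cell type:markdown id: tags:
A minimal sketch of a custom callback (the stopping rule below is made up for illustration): subclass `tf.keras.callbacks.Callback` and override any of the hooks shown above, e.g. `on_epoch_end`.
%% Cell type:code id: tags:
``` python
class MyCustomCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        '''called once per epoch; `logs` holds the loss and metric values'''
        logs = logs or {}
        loss = logs.get('loss')
        print(f"Epoch {epoch}: loss = {loss}")
        # example: stop training once the loss drops below a chosen threshold
        if loss is not None and loss < 0.01:
            self.model.stop_training = True

##DO NOT RUN
model.fit(..., callbacks=[MyCustomCallback()])
```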