<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>

# <!-- TITLE --> [VAE6] - Variational AutoEncoder (VAE) with CelebA (small)
<!-- DESC --> VAE with a more fun and realistic dataset - small resolution and batchable
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->

## Objectives :
 - Build and train a VAE model on a large dataset at small resolution (>70 GB)
 - Understand a more advanced programming model, using a data generator

The [CelebFaces Attributes Dataset (CelebA)](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) contains about 200,000 images, with shape (202599, 218, 178, 3): 202,599 RGB images of 218 x 178 pixels.
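
To see why this dataset has to be processed in batches, its memory footprint can be estimated from the shape alone. This is only a rough sketch: the 128 x 128 resolution is the `input_shape` used later in this notebook, and the element sizes (1 byte for raw `uint8` pixels, 8 bytes for `float64`) are assumptions, since the storage dtype is not stated here. The helper `footprint_bytes` is introduced for illustration.

```python
import math

def footprint_bytes(shape, bytes_per_value):
    """Return the size in bytes of a dense array with the given shape."""
    return math.prod(shape) * bytes_per_value

# Original images, assuming 1 byte per value (uint8)
raw = footprint_bytes((202599, 218, 178, 3), 1)
# Images resized to 128 x 128, assuming 8 bytes per value (float64)
small = footprint_bytes((202599, 128, 128, 3), 8)

print(f'Original, uint8   : {raw / 1e9:.1f} GB')
print(f'128x128,  float64 : {small / 1e9:.1f} GB')
```

Under these assumptions, the resized dataset comes to roughly 80 GB, which is the same order of magnitude as the ">70 GB" mentioned in the objectives.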

## What we're going to do :

 - Define a VAE model
 - Build the model
 - Train it
 - Follow the learning process with Tensorboard


## Step 1 - Setup environment
### 1.1 - Python stuff

In [None]:
import tensorflow as tf
import numpy as np
import os,sys
from importlib import reload

import modules.vae
import modules.data_generator

reload(modules.data_generator)
reload(modules.vae)

from modules.vae import VariationalAutoencoder
from modules.data_generator import DataGenerator

sys.path.append('..')
import fidle.pwk as ooo
reload(ooo)

ooo.init()

VariationalAutoencoder.about()
DataGenerator.about()

### 1.2 - The good place

In [None]:
place, dataset_dir = ooo.good_place( { 'GRICAD' : f'{os.getenv("SCRATCH_DIR","")}/PROJECTS/pr-fidle/datasets/celeba',
 'IDRIS' : f'{os.getenv("WORK","")}/datasets/celeba',
 'HOME' : f'{os.getenv("HOME","")}/datasets/celeba'} )

# ---- train/test datasets

train_dir = f'{dataset_dir}/clusters.train'
test_dir = f'{dataset_dir}/clusters.test'

## Step 2 - DataGenerator and validation data
OK, everything is in place. Now let's instantiate our generator for the entire training set, and load one test cluster to use as validation data.

In [None]:
data_gen = DataGenerator(train_dir, 32, k_size=1)
x_test = np.load(f'{test_dir}/images-000.npy')

print(f'Data generator : {len(data_gen)} batches of {data_gen.batch_size} images, or {data_gen.dataset_size} images')
print(f'x_test : {len(x_test)} images')
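
The idea behind a data generator can be sketched with a toy class. This is not the Fidle `DataGenerator`, whose code is not shown here; `ToyBatchGenerator` is a hypothetical stand-in that serves batches from an in-memory array, where a real generator would presumably read them from disk. In Keras, such a class would typically subclass `tf.keras.utils.Sequence`, which relies on the same `__len__` / `__getitem__` interface.

```python
import math
import numpy as np

class ToyBatchGenerator:
    """Hypothetical batch generator, for illustration only."""

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size
        self.dataset_size = len(data)

    def __len__(self):
        # Number of batches per epoch; the last one may be smaller
        return math.ceil(self.dataset_size / self.batch_size)

    def __getitem__(self, index):
        # Return batch number `index` as an (inputs, targets) pair.
        # For an autoencoder, the target is the input itself.
        start = index * self.batch_size
        batch = self.data[start:start + self.batch_size]
        return batch, batch

# Small random "images" standing in for the real dataset
toy_data = np.random.rand(100, 8, 8, 3)
toy_gen = ToyBatchGenerator(toy_data, batch_size=32)

print(f'{len(toy_gen)} batches of up to {toy_gen.batch_size} images')
x, y = toy_gen[3]
print(f'Last batch shape : {x.shape}')
```

Only one batch needs to be in memory at a time, which is what makes a dataset larger than the available RAM trainable.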

## Step 3 - Get VAE model

In [None]:
tag = f'CelebA.006-S.{os.getenv("SLURM_JOB_ID","unknown")}'

input_shape = (128, 128, 3)
z_dim = 200
verbose = 1

encoder= [ {'type':'Conv2D', 'filters':32, 'kernel_size':(3,3), 'strides':2, 'padding':'same', 'activation':'relu'},
 {'type':'Dropout', 'rate':0.25},
 {'type':'Conv2D', 'filters':64, 'kernel_size':(3,3), 'strides':2, 'padding':'same', 'activation':'relu'},
 {'type':'Dropout', 'rate':0.25},
 {'type':'Conv2D', 'filters':64, 'kernel_size':(3,3), 'strides':2, 'padding':'same', 'activation':'relu'},
 {'type':'Dropout', 'rate':0.25},
 {'type':'Conv2D', 'filters':64, 'kernel_size':(3,3), 'strides':2, 'padding':'same', 'activation':'relu'},
 {'type':'Dropout', 'rate':0.25},
 ]

decoder= [ {'type':'Conv2DTranspose', 'filters':64, 'kernel_size':(3,3), 'strides':2, 'padding':'same', 'activation':'relu'},
 {'type':'Dropout', 'rate':0.25},
 {'type':'Conv2DTranspose', 'filters':64, 'kernel_size':(3,3), 'strides':2, 'padding':'same', 'activation':'relu'},
 {'type':'Dropout', 'rate':0.25},
 {'type':'Conv2DTranspose', 'filters':32, 'kernel_size':(3,3), 'strides':2, 'padding':'same', 'activation':'relu'},
 {'type':'Dropout', 'rate':0.25},
 {'type':'Conv2DTranspose', 'filters':3, 'kernel_size':(3,3), 'strides':2, 'padding':'same', 'activation':'sigmoid'}
 ]

vae = modules.vae.VariationalAutoencoder(input_shape = input_shape, 
 encoder_layers = encoder, 
 decoder_layers = decoder,
 z_dim = z_dim, 
 verbose = verbose,
 run_tag = tag)
vae.save(model=None)
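
As a sanity check on the architecture above, the spatial size of the feature maps can be traced by hand. With `padding='same'`, a Keras `Conv2D` layer of stride `s` produces an output of size `ceil(input / s)`. The sketch below applies this rule to the four encoder layers. That the result is then flattened before the latent layers is an assumption, since the internals of `VariationalAutoencoder` are not shown here.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a Conv2D layer with padding='same'."""
    return math.ceil(size / stride)

size = 128                    # height and width from input_shape
strides = [2, 2, 2, 2]        # the four Conv2D layers of the encoder
filters = [32, 64, 64, 64]

for s, f in zip(strides, filters):
    size = same_padding_output(size, s)
    print(f'Conv2D -> ({size}, {size}, {f})')

# Size of the last feature map once flattened (assumed step)
flat = size * size * filters[-1]
print(f'Flattened size : {flat}')
```

The decoder mirrors this: each `Conv2DTranspose` with stride 2 and `padding='same'` doubles the spatial size, so four of them bring an 8 x 8 map back to 128 x 128, with 3 output channels.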

## Step 4 - Compile it

In [None]:
optimizer = tf.keras.optimizers.Adam(1e-4)
# optimizer = 'adam'
r_loss_factor = 10000

vae.compile(optimizer, r_loss_factor)
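
The role of `r_loss_factor` can be illustrated with a common formulation of the VAE loss: a reconstruction term, weighted by this factor, plus a Kullback-Leibler term. The NumPy sketch below is an assumption about how such a loss is usually written, not the code of `modules.vae`; in particular, the use of a mean squared error for the reconstruction term and the function name `vae_loss` are illustrative choices.

```python
import numpy as np

def vae_loss(x, x_hat, mu, log_var, r_loss_factor):
    """Per-sample VAE loss: weighted reconstruction error + KL divergence."""
    # Mean squared error over height, width and channels (assumed choice)
    r_loss = np.mean(np.square(x - x_hat), axis=(1, 2, 3))
    # KL divergence between N(mu, exp(log_var)) and the standard normal N(0, I)
    kl_loss = -0.5 * np.sum(1 + log_var - np.square(mu) - np.exp(log_var), axis=1)
    return r_loss_factor * r_loss + kl_loss

# Random stand-ins for a batch of 4 images, their reconstructions,
# and the encoder outputs for a latent space of dimension 200
rng = np.random.default_rng(0)
x = rng.random((4, 128, 128, 3))
x_hat = rng.random((4, 128, 128, 3))
mu = rng.normal(size=(4, 200))
log_var = rng.normal(size=(4, 200))

print(vae_loss(x, x_hat, mu, log_var, r_loss_factor=10000))
```

A large factor makes the model favor faithful reconstructions; a small one gives more weight to keeping the latent distribution close to a standard normal.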

## Step 5 - Train
For 10 epochs, with the Adam optimizer :
- Run time at IDRIS : 1299.77 sec. (0:21:39)
- Run time at GRICAD : 2092.77 sec. (0:34:52)

In [None]:
epochs = 10
initial_epoch = 0

In [None]:
vae.train(data_generator = data_gen,
 x_test = x_test,
 epochs = epochs,
 initial_epoch = initial_epoch
 )

---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>