{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img width=\"800px\" src=\"../fidle/img/00-Fidle-header-01.svg\"></img>\n",
"\n",
"# <!-- TITLE --> [VAE8] - Variational AutoEncoder (VAE) with CelebA (small)\n",
"<!-- DESC --> Variational AutoEncoder (VAE) with CelebA (small res. 128x128)\n",
"<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->\n",
"\n",
"## Objectives :\n",
" - Build and train a VAE model with a large dataset in **small resolution(>70 GB)**\n",
" - Understanding a more advanced programming model with **data generator**\n",
"\n",
"The [CelebFaces Attributes Dataset (CelebA)](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) contains about 200,000 images (202599,218,178,3). \n",
"\n",
"## What we're going to do :\n",
"\n",
" - Defining a VAE model\n",
" - Build the model\n",
" - Train it\n",
" - Follow the learning process with Tensorboard\n",
"\n",
"## Acknowledgements :\n",
"As before, thanks to **François Chollet** who is at the base of this example. \n",
"See : https://keras.io/examples/generative/vae\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1 - Init python stuff"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style>\n",
"\n",
"div.warn { \n",
" background-color: #fcf2f2;\n",
" border-color: #dFb5b4;\n",
" border-left: 5px solid #dfb5b4;\n",
" padding: 0.5em;\n",
" font-weight: bold;\n",
" font-size: 1.1em;;\n",
" }\n",
"\n",
"\n",
"\n",
"div.nota { \n",
" background-color: #DAFFDE;\n",
" border-left: 5px solid #92CC99;\n",
" padding: 0.5em;\n",
" }\n",
"\n",
"div.todo:before { content:url(data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSI1My44OTEyIiBoZWlnaHQ9IjE0My4zOTAyIiB2aWV3Qm94PSIwIDAgNTMuODkxMiAxNDMuMzkwMiI+PHRpdGxlPjAwLUJvYi10b2RvPC90aXRsZT48cGF0aCBkPSJNMjMuNDU2OCwxMTQuMzAxNmExLjgwNjMsMS44MDYzLDAsMSwxLDEuODE1NywxLjgyNEExLjgyMDksMS44MjA5LDAsMCwxLDIzLjQ1NjgsMTE0LjMwMTZabS0xMC42NjEyLDEuODIyQTEuODI3MiwxLjgyNzIsMCwxLDAsMTAuOTgsMTE0LjMsMS44MiwxLjgyLDAsMCwwLDEyLjc5NTYsMTE2LjEyMzZabS03LjcwNyw0LjU4NzR2LTVzLjQ4NjMtOS4xMjIzLDguMDIxNS0xMS45Njc1YTE5LjIwODIsMTkuMjA4MiwwLDAsMSw2LjA0ODYtMS4yNDU0LDE5LjE3NzgsMTkuMTc3OCwwLDAsMSw2LjA0ODcsMS4yNDc1YzcuNTM1MSwyLjgzNDcsOC4wMTc0LDExLjk2NzQsOC4wMTc0LDExLjk2NzR2NS4wMjM0bC4wMDQyLDcuNjgydjIuNGMuMDE2Ny4xOTkyLjAzMzYuMzkyMS4wMzM2LjU4NzEsMCwuMjEzOC0uMDE2OC40MTA5LS4wMzM2LjYzMzJ2LjA1ODdoLS4wMDg0YTguMzcxOSw4LjM3MTksMCwwLDEtNy4zNzM4LDcuNjU0N3MtLjk5NTMsMy42MzgtNi42OTMzLDMuNjM4LTYuNjkzNC0zLjYzOC02LjY5MzQtMy42MzhhOC4zNyw4LjM3LDAsMCwxLTcuMzcxNi03LjY1NDdINS4wODQzdi0uMDU4N2MtLjAxODktLjIyLS4wMjk0LS40MTk0LS4wMjk0LS42MzMyLDAtLjE5MjkuMDE2Ny0uMzgzNy4wMjk0LS41ODcxdi0yLjRtMTguMDkzNy00LjA0YTEuMTU2NSwxLjE1NjUsMCwxLDAtMi4zMTI2LDAsMS4xNTY0LDEuMTU2NCwwLDEsMCwyLjMxMjYsMFptNC4wODM0LDBhMS4xNTk1LDEuMTU5NSwwLDEsMC0xLjE2MzYsMS4xN0ExLjE3NSwxLjE3NSwwLDAsMCwyNy4yNjE0LDEyNC4zNzc5Wk05LjM3MzksMTE0LjYzNWMwLDMuMTA5MywyLjQxMzIsMy4zMSwyLjQxMzIsMy4zMWExMzMuOTI0MywxMzMuOTI0MywwLDAsMCwxNC43MzQ4LDBzMi40MTExLS4xOTI5LDIuNDExMS0zLjMxYTguMDc3Myw4LjA3NzMsMCwwLDAtMi40MTExLTUuNTUxOWMtNC41LTMuNTAzMy05LjkxMjYtMy41MDMzLTE0Ljc0MTEsMEE4LjA4NTEsOC4wODUxLDAsMCwwLDkuMzczOSwxMTQuNjM1WiIgc3R5bGU9ImZpbGw6IzAxMDEwMSIvPjxjaXJjbGUgY3g9IjMzLjE0MzYiIGN5PSIxMjQuNTM0IiByPSIzLjgzNjMiIHN0eWxlPSJmaWxsOiMwMTAxMDEiLz48cmVjdCB4PSIzNS42NjU5IiB5PSIxMTIuOTYyNSIgd2lkdGg9IjIuMDc3IiBoZWlnaHQ9IjEwLjU0NTgiIHRyYW5zZm9ybT0idHJhbnNsYXRlKDIxLjYgMjQxLjExMjEpIHJvdGF0ZSgtMTU1Ljc0NikiIHN0eWxlPSJmaWxsOiMwMTAxMDEiLz48Y2lyY2xlIGN4PSIzOC44NzA0IiBjeT0iMTEzLjQyNzkiIHI9IjIuNDA4NSIgc3R5bGU9ImZpbGw6IzAxMDEwMSIvPjxjaXJjbGUgY3g9IjUuMjI0OCIgY3k9IjEyNC41MzQiIHI9IjMuODM2MyIgc3R5bGU9ImZpbGw6IzAxMDEwMSIvPjxyZWN0IHg9IjEuNDE2NCIgeT0iMTI0LjYzMDEiIHdpZHRoPSIyLjA3NyIgaGVpZ2h0PSIxMC41NDU4IiB0cmFuc2Zvcm09InRyYW5zbGF0ZSg0LjkwOTcgMjU5LjgwNikgcm90YXRlKC0xODApIiBzdHlsZT0iZmlsbDojMDEwMTAxIi8+PGNpcmNsZSBjeD0iMi40MDkxIiBjeT0iMTM3LjA5OTYiIHI9IjIuNDA4NSIgc3R5bGU9ImZpbGw6IzAxMDEwMSIvPjxwYXRoIGQ9Ik0xOC4wNTExLDEwMC4xMDY2aC0uMDE0NlYxMDIuNjFoMi4zdi0yLjQyNzlhMi40MjI5LDIuNDIyOSwwLDEsMC0yLjI4NTQtLjA3NTVaIiBzdHlsZT0iZmlsbDojMDEwMTAxIi8+PHBhdGggZD0iTTM5LjQyMTQsMjcuMjU4djEuMDVBMTEuOTQ1MiwxMS45NDUyLDAsMCwwLDQ0LjU5NTQsNS43OWEuMjQ0OS4yNDQ5LDAsMCwxLS4wMjM1LS40MjI3TDQ2Ljc1LDMuOTUxNWEuMzg5Mi4zODkyLDAsMCwxLC40MjYyLDAsMTQuODQ0MiwxNC44NDQyLDAsMCwxLTcuNzU0MywyNy4yNTkxdjEuMDY3YS40NS40NSwwLDAsMS0uNzA0Ny4zNzU4bC0zLjg0MTktMi41MWEuNDUuNDUsMCwwLDEsMC0uNzUxNmwzLjg0MTktMi41MWEuNDUuNDUsMCwwLDEsLjY5NDYuMzc1OFpNNDMuMjMsMi41ODkyLDM5LjM4NzguMDc5NGEuNDUuNDUsMCwwLDAtLjcwNDYuMzc1OHYxLjA2N2ExNC44NDQyLDE0Ljg0NDIsMCwwLDAtNy43NTQzLDI3LjI1OTEuMzg5LjM4OSwwLDAsMCwuNDI2MSwwbDIuMTc3Ny0xLjQxOTNhLjI0NS4yNDUsMCwwLDAtLjAyMzUtLjQyMjgsMTEuOTQ1MSwxMS45NDUxLDAsMCwxLDUuMTc0LTIyLjUxNDZ2MS4wNWEuNDUuNDUsMCwwLDAsLjcwNDYuMzc1OGwzLjg1NTMtMi41MWEuNDUuNDUsMCwwLDAsMC0uNzUxNlpNMzkuMDUyMywxNC4yNDU4YTIuMTIwNiwyLjEyMDYsMCwxLDAsMi4xMjA2LDIuMTIwNmgwQTIuMTI0LDIuMTI0LDAsMCwwLDM5LjA1MjMsMTQuMjQ1OFptNi4wNzMyLTQuNzc4MS44MjU0LjgyNTVhMS4wNTY4LDEuMDU2OCwwLDAsMSwuMTE3NSwxLjM0MjFsLS44MDIsMS4xNDQyYTcuMTAxOCw3LjEwMTgsMCwwLDEsLjcxMTQsMS43MTEybDEuMzc1Ny4yNDE2YTEuMDU2OSwxLjA1NjksMCwwLDEsLjg3NTcsMS4wNHYxLjE2NDNhMS4wNTY5LDEuMDU2OSwwLDAsMS0uODc1NywxLjA0bC0xLjM3MjQuMjQxNkE3LjExLDcuMTEsMC
wwLDEsNDUuMjcsMTkuOTNsLjgwMTksMS4xNDQyYTEuMDU3LDEuMDU3LDAsMCwxLS4xMTc0LDEuMzQyMmwtLjgyODguODQ4OWExLjA1NywxLjA1NywwLDAsMS0xLjM0MjEuMTE3NGwtMS4xNDQyLS44MDE5YTcuMTMzOCw3LjEzMzgsMCwwLDEtMS43MTEzLjcxMTNsLS4yNDE2LDEuMzcyNGExLjA1NjgsMS4wNTY4LDAsMCwxLTEuMDQuODc1N0gzOC40Njg0YTEuMDU2OCwxLjA1NjgsMCwwLDEtMS4wNC0uODc1N2wtLjI0MTYtMS4zNzI0YTcuMTM1NSw3LjEzNTUsMCwwLDEtMS43MTEzLS43MTEzbC0xLjE0NDEuODAxOWExLjA1NzEsMS4wNTcxLDAsMCwxLTEuMzQyMi0uMTE3NGwtLjgzNTUtLjgyNTVhMS4wNTcsMS4wNTcsMCwwLDEtLjExNzQtMS4zNDIxbC44MDE5LTEuMTQ0MmE3LjEyMSw3LjEyMSwwLDAsMS0uNzExMy0xLjcxMTJsLTEuMzcyNC0uMjQxNmExLjA1NjksMS4wNTY5LDAsMCwxLS44NzU3LTEuMDRWMTUuNzgyNmExLjA1NjksMS4wNTY5LDAsMCwxLC44NzU3LTEuMDRsMS4zNzU3LS4yNDE2YTcuMTEsNy4xMSwwLDAsMSwuNzExNC0xLjcxMTJsLS44MDItMS4xNDQyYTEuMDU3LDEuMDU3LDAsMCwxLC4xMTc1LTEuMzQyMmwuODI1NC0uODI1NEExLjA1NjgsMS4wNTY4LDAsMCwxLDM0LjMyNDUsOS4zNmwxLjE0NDIuODAxOUE3LjEzNTUsNy4xMzU1LDAsMCwxLDM3LjE4LDkuNDUxbC4yNDE2LTEuMzcyNGExLjA1NjgsMS4wNTY4LDAsMCwxLDEuMDQtLjg3NTdoMS4xNjc3YTEuMDU2OSwxLjA1NjksMCwwLDEsMS4wNC44NzU3bC4yNDE2LDEuMzcyNGE3LjEyNSw3LjEyNSwwLDAsMSwxLjcxMTIuNzExM0w0My43NjY2LDkuMzZBMS4wNTY5LDEuMDU2OSwwLDAsMSw0NS4xMjU1LDkuNDY3N1ptLTIuMDMsNi44OTg3QTQuMDQzMyw0LjA0MzMsMCwxLDAsMzkuMDUyMywyMC40MWgwQTQuMDQ2NSw0LjA0NjUsMCwwLDAsNDMuMDk1NSwxNi4zNjY0WiIgc3R5bGU9ImZpbGw6I2UxMjIyOSIvPjxwb2x5Z29uIHBvaW50cz0iMzkuNDEzIDM0Ljc1NyAzOS41MzcgMzQuNzU3IDM5LjY3NSAzNC43NTcgMzkuNjc1IDEwOS41MSAzOS41MzcgMTA5LjUxIDM5LjQxMyAxMDkuNTEgMzkuNDEzIDM0Ljc1NyAzOS40MTMgMzQuNzU3IiBzdHlsZT0iZmlsbDpub25lO3N0cm9rZTojOTk5O3N0cm9rZS1saW5lY2FwOnJvdW5kO3N0cm9rZS1taXRlcmxpbWl0OjEwO3N0cm9rZS13aWR0aDowLjMwODg1NDQ1MDU2MDE2MThweDtmaWxsLXJ1bGU6ZXZlbm9kZCIvPjwvc3ZnPg==);\n",
" float:left;\n",
" margin-right:20px;\n",
" margin-top:-20px;\n",
" margin-bottom:20px;\n",
"}\n",
"div.todo{\n",
" font-weight: bold;\n",
" font-size: 1.1em;\n",
" margin-top:40px;\n",
"}\n",
"div.todo ul{\n",
" margin: 0.2em;\n",
"}\n",
"div.todo li{\n",
" margin-left:60px;\n",
" margin-top:0;\n",
" margin-bottom:0;\n",
"}\n",
"\n",
"div .comment{\n",
" font-size:0.8em;\n",
" color:#696969;\n",
"}\n",
"\n",
"\n",
"\n",
"</style>\n",
"\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Override : Attribute [run_dir=./run/CelebA.001] with [./run/test-VAE8-3370]\n"
]
},
{
"data": {
"text/markdown": [
"**FIDLE 2020 - Practical Work Module**"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Version : 0.6.1 DEV\n",
"Notebook id : VAE8\n",
"Run time : Wednesday 6 January 2021, 19:47:34\n",
"TensorFlow version : 2.2.0\n",
"Keras version : 2.3.0-tf\n",
"Datasets dir : /home/pjluc/datasets/fidle\n",
"Run dir : ./run/test-VAE8-3370\n",
"Update keras cache : False\n",
"Save figs : True\n",
"Path figs : ./run/test-VAE8-3370/figs\n"
]
},
{
"data": {
"text/markdown": [
"<br>**FIDLE 2021 - VAE**"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Version : 1.2\n",
"TensorFlow version : 2.2.0\n",
"Keras version : 2.3.0-tf\n"
]
},
{
"data": {
"text/markdown": [
"<br>**FIDLE 2020 - DataGenerator**"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Version : 0.4.1\n",
"TensorFlow version : 2.2.0\n",
"Keras version : 2.3.0-tf\n"
]
}
],
"source": [
"import numpy as np\n",
"from skimage import io\n",
"\n",
"import tensorflow as tf\n",
"from tensorflow import keras\n",
"from tensorflow.keras import layers\n",
"from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard\n",
"\n",
"import os,sys,json,time,datetime\n",
"from IPython.display import display,Image,Markdown,HTML\n",
"\n",
"from modules.data_generator import DataGenerator\n",
"from modules.VAE import VAE, Sampling\n",
"from modules.callbacks import ImagesCallback, BestModelCallback\n",
"\n",
"sys.path.append('..')\n",
"import fidle.pwk as pwk\n",
"\n",
"run_dir = './run/CelebA.001' # Output directory\n",
"datasets_dir = pwk.init('VAE8', run_dir)\n",
"\n",
"VAE.about()\n",
"DataGenerator.about()"
]
},
{
"cell_type": "code",
"source": [
"# To clean run_dir, uncomment and run this next line\n",
"# ! rm -r \"$run_dir\"/images-* \"$run_dir\"/logs \"$run_dir\"/figs \"$run_dir\"/models ; rmdir \"$run_dir\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 2 - Get some data\n",
"Let's instantiate our generator for the entire dataset."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1.1 - Parameters\n",
"Uncomment the right lines according to the data you want to use"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"# ---- For tests\n",
"scale = 0.3\n",
"image_size = (128,128)\n",
"enhanced_dir = './data'\n",
"latent_dim = 300\n",
"r_loss_factor = 0.6\n",
"\n",
"# ---- Training with a full dataset\n",
"# scale = 1.\n",
"# image_size = (128,128)\n",
"# enhanced_dir = f'{datasets_dir}/celeba/enhanced'\n",
"# latent_dim = 300\n",
"# r_loss_factor = 0.6\n",
"\n",
"# ---- Training with a full dataset of large images\n",
"# scale = 1.\n",
"# image_size = (192,160)\n",
"# enhanced_dir = f'{datasets_dir}/celeba/enhanced'\n",
"# latent_dim = 300\n",
"# r_loss_factor = 0.6\n",
"# batch_size = 64\n",
"# epochs = 15"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1.2 - Finding the right place"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Train directory is : ./data/clusters-128x128\n"
"# ---- Override parameters (batch mode) - Just forget this line\n",
"pwk.override('scale', 'image_size', 'enhanced_dir', 'latent_dim', 'r_loss_factor', 'batch_size', 'epochs')\n",
"\n",
"# ---- the place of the clusters files\n",
"#\n",
"lx,ly = image_size\n",
"train_dir = f'{enhanced_dir}/clusters-{lx}x{ly}'\n",
"print('Train directory is :',train_dir)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1.2 - Get a DataGenerator"
]
},
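{
"cell_type": "markdown",
"metadata": {},
"source": [
"Rather than loading 70+ GB into memory, training reads the dataset cluster by cluster through a generator. Below is a minimal sketch of the idea behind such a generator, written as a `keras.utils.Sequence`. This is an assumption about the general shape of `modules.data_generator.DataGenerator` (the class name and in-memory data here are hypothetical; the real version reads cluster files from disk):\n",
"\n",
"```python\n",
"class SimpleDataGenerator(tf.keras.utils.Sequence):    # hypothetical, for illustration\n",
"    def __init__(self, images, batch_size=32):\n",
"        self.images, self.batch_size = images, batch_size\n",
"    def __len__(self):\n",
"        # number of batches per epoch\n",
"        return len(self.images) // self.batch_size\n",
"    def __getitem__(self, i):\n",
"        batch = self.images[i*self.batch_size:(i+1)*self.batch_size]\n",
"        return batch, batch    # for a VAE, the target is the input itself\n",
"```\n",
"\n",
"Keras then calls `__getitem__` batch by batch during `fit()`, so only one batch needs to live in memory at a time."
]
},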
{
"cell_type": "code",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Data generator is ready with : 379 batchs of 32 images, or 12155 images\n"
]
}
],
"source": [
"data_gen = DataGenerator(train_dir, 32, k_size=scale)\n",
"\n",
"print(f'Data generator is ready with : {len(data_gen)} batchs of {data_gen.batch_size} images, or {data_gen.dataset_size} images')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 3 - Build model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Encoder"
]
},
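{
"cell_type": "markdown",
"metadata": {},
"source": [
"The encoder ends with the `Sampling` layer imported from `modules.VAE`. A minimal sketch of what such a layer typically does (an assumption, in line with the Keras VAE example credited above): the reparameterization trick, z = mean + exp(0.5·log_var)·eps.\n",
"\n",
"```python\n",
"class Sampling(layers.Layer):          # sketch; the real one lives in modules/VAE.py\n",
"    def call(self, inputs):\n",
"        z_mean, z_log_var = inputs\n",
"        eps = tf.random.normal(shape=tf.shape(z_mean))\n",
"        return z_mean + tf.exp(0.5 * z_log_var) * eps\n",
"```\n",
"\n",
"Drawing `eps` outside the network keeps the sampling step differentiable with respect to `z_mean` and `z_log_var`."
]
},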
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(lx, ly, 3))\n",
"x = layers.Conv2D(32, 3, strides=2, padding=\"same\", activation=\"relu\")(inputs)\n",
"x = layers.Conv2D(64, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n",
"x = layers.Conv2D(64, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n",
"x = layers.Conv2D(64, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n",
"\n",
"shape_before_flattening = keras.backend.int_shape(x)[1:]\n",
"\n",
"x = layers.Flatten()(x)\n",
"x = layers.Dense(512, activation=\"relu\")(x)\n",
"\n",
"z_mean = layers.Dense(latent_dim, name=\"z_mean\")(x)\n",
"z_log_var = layers.Dense(latent_dim, name=\"z_log_var\")(x)\n",
"z = Sampling()([z_mean, z_log_var])\n",
"\n",
"encoder = keras.Model(inputs, [z_mean, z_log_var, z], name=\"encoder\")\n",
"encoder.compile()\n",
"# encoder.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Decoder"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"inputs = keras.Input(shape=(latent_dim,))\n",
"\n",
"x = layers.Dense(np.prod(shape_before_flattening))(inputs)\n",
"x = layers.Reshape(shape_before_flattening)(x)\n",
"\n",
"x = layers.Conv2DTranspose(64, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n",
"x = layers.Conv2DTranspose(64, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n",
"x = layers.Conv2DTranspose(64, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n",
"x = layers.Conv2DTranspose(32, 3, strides=2, padding=\"same\", activation=\"relu\")(x)\n",
"outputs = layers.Conv2DTranspose(3, 3, padding=\"same\", activation=\"sigmoid\")(x)\n",
"\n",
"decoder = keras.Model(inputs, outputs, name=\"decoder\")\n",
"decoder.compile()\n",
"# decoder.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### VAE\n",
"Our loss function is the weighted sum of two values. \n",
"`reconstruction_loss` which measures the loss during reconstruction. \n",
"`kl_loss` which measures the dispersion. \n",
"\n",
"The weights are defined by: `r_loss_factor` : \n",
"`total_loss = r_loss_factor*reconstruction_loss + (1-r_loss_factor)*kl_loss`\n",
"\n",
"if `r_loss_factor = 1`, the loss function includes only `reconstruction_loss` \n",
"if `r_loss_factor = 0`, the loss function includes only `kl_loss` \n",
"In practice, a value arround 0.5 gives good results here.\n"
]
},
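{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below, a minimal sketch of how a custom `train_step` can combine these two terms. This is an assumption about what `modules.VAE` does, in the spirit of the Keras VAE example credited above; the choice of `binary_crossentropy` as reconstruction loss is part of the assumption.\n",
"\n",
"```python\n",
"def train_step(self, x):                  # sketch only, not the actual modules.VAE code\n",
"    with tf.GradientTape() as tape:\n",
"        z_mean, z_log_var, z = self.encoder(x)\n",
"        x_hat   = self.decoder(z)\n",
"        r_loss  = tf.reduce_mean(keras.losses.binary_crossentropy(x, x_hat))\n",
"        kl_loss = -0.5 * tf.reduce_mean(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))\n",
"        total_loss = self.r_loss_factor * r_loss + (1 - self.r_loss_factor) * kl_loss\n",
"    grads = tape.gradient(total_loss, self.trainable_weights)\n",
"    self.optimizer.apply_gradients(zip(grads, self.trainable_weights))\n",
"    return {'loss': total_loss, 'r_loss': r_loss, 'kl_loss': kl_loss}\n",
"```\n",
"\n",
"The three returned metrics match the curves plotted in step 5.1."
]
},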
{
"cell_type": "code",
"execution_count": null,
"vae = VAE(encoder, decoder, r_loss_factor)\n",
"\n",
"vae.compile(optimizer=keras.optimizers.Adam())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 4 - Train\n",
"20' on a CPU \n",
"1'12 on a GPU (V100, IDRIS)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4.1 - Callbacks"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x_draw,_ = data_gen[0]\n",
"data_gen.rewind()\n",
"\n",
"# ---- Callback : Images encoded\n",
"pwk.mkdir(run_dir + '/images-encoded')\n",
"filename = run_dir + '/images-encoded/image-{epoch:03d}-{i:02d}.jpg'\n",
"callback_images1 = ImagesCallback(filename, x=x_draw[:5], encoder=encoder,decoder=decoder)\n",
"\n",
"# ---- Callback : Images generated\n",
"pwk.mkdir(run_dir + '/images-generated')\n",
"filename = run_dir + '/images-generated/image-{epoch:03d}-{i:02d}.jpg'\n",
"callback_images2 = ImagesCallback(filename, x=None, nb_images=5, z_dim=latent_dim, encoder=encoder,decoder=decoder) \n",
"\n",
"# ---- Callback : Best model\n",
"pwk.mkdir(run_dir + '/models')\n",
"filename = run_dir + '/models/best_model'\n",
"callback_bestmodel = BestModelCallback(filename)\n",
"\n",
"# ---- Callback tensorboard\n",
"dirname = run_dir + '/logs'\n",
"callback_tensorboard = TensorBoard(log_dir=dirname, histogram_freq=1)\n",
"\n",
"callbacks_list = [callback_images1, callback_images2, callback_bestmodel, callback_tensorboard]\n",
"callbacks_list = [callback_images1, callback_images2, callback_bestmodel]"
]
},
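{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the Tensorboard callback active, the training curves can be followed live by pointing Tensorboard at the log directory, e.g. by running `tensorboard --logdir ./run/CelebA.001/logs` in a terminal (adjust the path to your `run_dir`)."
]
},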
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4.2 - Train it"
]
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"pwk.chrono_start()\n",
"\n",
"history = vae.fit(data_gen, epochs=epochs, batch_size=batch_size, callbacks=callbacks_list)\n",
"\n",
"pwk.chrono_show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 5 - About our training session\n",
"### 5.1 - History"
]
},
{
"cell_type": "code",
"execution_count": null,
"source": [
"pwk.plot_history(history, plot={\"Loss\":['loss','r_loss', 'kl_loss']}, save_as='01-history')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5.2 - Reconstruction (input -> encoder -> decoder)"
]
},
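{
"cell_type": "markdown",
"metadata": {},
"source": [
"The images replayed below were saved epoch by epoch by `ImagesCallback`. For reference, a direct reconstruction can also be done with standard Keras calls on the two sub-models (a sketch; `x_draw` comes from the callbacks cell above):\n",
"\n",
"```python\n",
"z_mean, z_log_var, z = encoder.predict(x_draw[:5])\n",
"x_hat = decoder.predict(z_mean)   # use the mean for a deterministic reconstruction\n",
"```"
]
},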
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"imgs=[]\n",
"labels=[]\n",
"for epoch in range(1,epochs,1):\n",
" for i in range(5):\n",
" filename = f'{run_dir}/images-encoded/image-{epoch:03d}-{i:02d}.jpg'.format(epoch=epoch, i=i)\n",
" img = io.imread(filename)\n",
" imgs.append(img)\n",
" \n",
"\n",
"pwk.subtitle('Original images :')\n",
"pwk.plot_images(x_draw[:5], None, indices='all', columns=5, x_size=2,y_size=2, save_as='02-original')\n",
"\n",
"pwk.subtitle('Encoded/decoded images')\n",
"pwk.plot_images(imgs, None, indices='all', columns=5, x_size=2,y_size=2, save_as='03-reconstruct')\n",
"\n",
"pwk.subtitle('Original images :')\n",
"pwk.plot_images(x_draw[:5], None, indices='all', columns=5, x_size=2,y_size=2, save_as=None)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5.3 Generation (latent -> decoder)"
]
},
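{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same idea, done directly: draw random points in the latent space and decode them (a sketch; sampling from a standard normal matches the prior enforced by `kl_loss`):\n",
"\n",
"```python\n",
"z = np.random.normal(size=(5, latent_dim))\n",
"generated = decoder.predict(z)   # 5 brand-new faces\n",
"```"
]
},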
{
"cell_type": "code",
"execution_count": null,
"source": [
"imgs=[]\n",
"labels=[]\n",
"for epoch in range(1,epochs,1):\n",
" for i in range(5):\n",
" filename = f'{run_dir}/images-generated/image-{epoch:03d}-{i:02d}.jpg'.format(epoch=epoch, i=i)\n",
" img = io.imread(filename)\n",
" imgs.append(img)\n",
" \n",
"pwk.subtitle('Generated images from latent space')\n",
"pwk.plot_images(imgs, None, indices='all', columns=5, x_size=2,y_size=2, save_as='04-encoded')\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pwk.end()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"<img width=\"80px\" src=\"../fidle/img/00-Fidle-logo-01.svg\"></img>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}