<img width="800px" src="../fidle/img/header.svg"></img>

# <!-- TITLE --> [K2AE1] - Prepare a noisy MNIST dataset
<!-- DESC --> Episode 1: Preparation of a noisy MNIST dataset, using Keras 2 and Tensorflow (obsolete)

<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->

## Objectives :
 - Prepare a noisy MNIST dataset, usable with our denoising autoencoder (duration : <50s)

## What we're going to do :

 - Load the original MNIST dataset
 - Add noise, a lot !
 - Save it :-)

## Step 1 - Init and set parameters
### 1.1 - Init python

In [None]:
import numpy as np
import sys

from skimage import io
from skimage.util import random_noise

import modules.MNIST
from modules.MNIST import MNIST

import fidle

# Init Fidle environment
run_id, run_dir, datasets_dir = fidle.init('K2AE1')

### 1.2 - Parameters
`prepared_dataset` : Filename of the future prepared dataset (example : ./data/mnist-noisy.h5)\
`scale` : Dataset scale. 1 means 100% of the dataset - set 0.1 for tests\
`progress_verbosity`: Verbosity of progress bar: 0=silent, 1=progress bar, 2=One line

In [None]:
prepared_dataset = './data/mnist-noisy.h5'
scale = 1
progress_verbosity = 1

Override parameters (batch mode) - You can ignore this cell

In [None]:
fidle.override('prepared_dataset', 'scale', 'progress_verbosity')

## Step 2 - Get original dataset
We load :\
`clean_data` : Original, clean images - This is what we want to obtain at the **output** of the autoencoder (AE)\
`class_data` : Image classes - Not needed here, because the training will be unsupervised\
We'll build :\
`noisy_data` : Noisy images - These are the images we will give as **input** to our AE


In [None]:
clean_data, class_data = MNIST.get_origine(scale=scale)
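
What `scale` does inside `MNIST.get_origine` is not shown in this notebook. As a rough idea, a scale factor can be applied by keeping only the first fraction of the samples. The sketch below assumes this behaviour; `rescale` is a hypothetical helper, not part of the `modules.MNIST` API, and the arrays are random stand-ins with MNIST-like shapes.

```python
import numpy as np

def rescale(x, y, scale=1.0):
    """Keep the first `scale` fraction of a dataset (hypothetical helper)."""
    n = int(len(x) * scale)
    return x[:n], y[:n]

# Stand-in arrays with MNIST-like shapes (random values, not real digits)
x = np.random.rand(1000, 28, 28)
y = np.random.randint(0, 10, size=1000)

x_small, y_small = rescale(x, y, scale=0.1)
print(x_small.shape, y_small.shape)
```

With `scale=0.1`, only a tenth of the samples remain, which is why a small value is convenient for quick tests.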

## Step 3 - Add noise
We add noise to the original images (`clean_data`) to obtain noisy images (`noisy_data`)\
This takes about 30-40 seconds

In [None]:
def noise_it(data):
    # Work on a copy, so that the clean images stay untouched
    new_data = np.copy(data)
    for i,image in enumerate(new_data):
        fidle.utils.update_progress('Add noise : ',i+1,len(data),verbosity=progress_verbosity)
        # Stack four kinds of noise on each image
        image=random_noise(image, mode='gaussian', mean=0, var=0.3)
        image=random_noise(image, mode='s&p',      amount=0.2, salt_vs_pepper=0.5)
        image=random_noise(image, mode='poisson')
        image=random_noise(image, mode='speckle',  mean=0, var=0.1)
        new_data[i]=image
    print('Done.')
    return new_data

# ---- Add noise to the clean images to get the AE input data
#
noisy_data = noise_it(clean_data)
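
To get a feel for what two of these noise modes do, here is a minimal NumPy-only sketch of additive Gaussian noise followed by salt-and-pepper noise on a toy image. It is an approximation for illustration, not the `skimage.util.random_noise` implementation; `add_gaussian` and `add_salt_and_pepper` are hypothetical names, and the image is assumed to hold floats in [0, 1].

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian(image, var=0.3):
    """Add zero-mean Gaussian noise, then clip back to [0, 1]."""
    noisy = image + rng.normal(0.0, np.sqrt(var), image.shape)
    return np.clip(noisy, 0.0, 1.0)

def add_salt_and_pepper(image, amount=0.2, salt_vs_pepper=0.5):
    """Force roughly `amount` of the pixels to 1 (salt) or 0 (pepper)."""
    noisy = image.copy()
    flipped = rng.random(image.shape) < amount
    salted = rng.random(image.shape) < salt_vs_pepper
    noisy[flipped & salted] = 1.0
    noisy[flipped & ~salted] = 0.0
    return noisy

# Toy 28x28 "image" : a bright square on a dark background
image = np.zeros((28, 28))
image[8:20, 8:20] = 1.0

noisy = add_salt_and_pepper(add_gaussian(image))
print(noisy.shape, noisy.min(), noisy.max())
```

Both functions return a new array, so the clean image is left unchanged, in the same spirit as the `np.copy` in `noise_it`.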


## Step 4 - Have a look

In [None]:
print('Clean dataset (clean_data) : ',clean_data.shape)
print('Noisy dataset (noisy_data) : ',noisy_data.shape)

fidle.utils.subtitle("Noisy images we'll have in input (or x)")
fidle.scrawler.images(noisy_data[:5], None, indices='all', columns=5, x_size=3,y_size=3, interpolation=None, save_as='01-noisy')
fidle.utils.subtitle('Clean images we want to obtain (or y)')
fidle.scrawler.images(clean_data[:5], None, indices='all', columns=5, x_size=3,y_size=3, interpolation=None, save_as='02-original')
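
Besides looking at the images, the amount of noise can also be quantified. The sketch below computes the peak signal-to-noise ratio (PSNR) between a clean array and a noisy one. It uses random stand-in data and plain Gaussian noise rather than the real `clean_data` and `noisy_data`, and assumes pixel values in [0, 1]; `psnr` is a hypothetical helper.

```python
import numpy as np

def psnr(clean, noisy, max_value=1.0):
    """Peak signal-to-noise ratio in dB (higher means less noise)."""
    mse = np.mean((clean - noisy) ** 2)
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10(max_value ** 2 / mse)

rng = np.random.default_rng(0)
clean = rng.random((5, 28, 28))
noisy = np.clip(clean + rng.normal(0.0, 0.5, clean.shape), 0.0, 1.0)

print(f'PSNR clean vs noisy : {psnr(clean, noisy):.1f} dB')
print(f'PSNR clean vs clean : {psnr(clean, clean)} dB')
```

The same function could be applied to a few images of the real dataset to check that the noise level is as heavy as intended.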


## Step 5 - Shuffle dataset

In [None]:
p = np.random.permutation(len(clean_data))
clean_data, noisy_data, class_data = clean_data[p], noisy_data[p], class_data[p]
print('Shuffled.')
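
Using the same permutation `p` for all three arrays is what keeps each noisy image aligned with its clean version and its class. A minimal check of this property, on small stand-in arrays rather than the real dataset:

```python
import numpy as np

n = 10
clean = np.arange(n)     # stand-in for clean_data
noisy = clean + 0.5      # stand-in for noisy_data, paired with clean
labels = clean % 3       # stand-in for class_data

# One permutation, applied to all three arrays
p = np.random.permutation(n)
clean, noisy, labels = clean[p], noisy[p], labels[p]

# Every pair is still aligned after shuffling
assert np.all(noisy == clean + 0.5)
assert np.all(labels == clean % 3)
print('Order :', clean)
```

Shuffling each array with its own permutation would break this pairing, and the autoencoder would then be trained on mismatched inputs and targets.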

## Step 6 - Save our prepared dataset

In [None]:
MNIST.save_prepared_dataset( clean_data, noisy_data, class_data, filename=prepared_dataset )

In [None]:
fidle.end()

---
<img width="80px" src="../fidle/img/logo-paysage.svg"></img>