{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img width=\"800px\" src=\"../fidle/img/header.svg\"></img>\n",
    "\n",
    "# <!-- TITLE --> [K2AE1] - Prepare a noisy MNIST dataset\n",
    "<!-- DESC --> Episode 1: Preparation of a noisy MNIST dataset, using Keras 2 and Tensorflow (obsolete)\n",
    "\n",
    "<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->\n",
    "\n",
    "## Objectives :\n",
    " - Prepare a MNIST noisy dataset, usable with our denoiser autoencoder (duration : <50s)\n",
    "\n",
    "## What we're going to do :\n",
    "\n",
    " - Load original MNIST dataset\n",
    " - Adding noise, a lot !\n",
    " - Save it :-)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1 - Init and set parameters\n",
    "### 1.1 - Init python"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import sys\n",
    "\n",
    "from skimage import io\n",
    "from skimage.util import random_noise\n",
    "\n",
    "import modules.MNIST\n",
    "from modules.MNIST     import MNIST\n",
    "\n",
    "import fidle\n",
    "\n",
    "# Init Fidle environment\n",
    "run_id, run_dir, datasets_dir = fidle.init('K2AE1')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1.2 - Parameters\n",
    "`prepared_dataset` : Filename of the future prepared dataset (example : ./data/mnist-noisy.h5)\\\n",
    "`scale` : Dataset scale. 1 mean 100% of the dataset - set 0.1 for tests\\\n",
    "`progress_verbosity`: Verbosity of progress bar: 0=silent, 1=progress bar, 2=One line"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "prepared_dataset   = './data/mnist-noisy.h5'\n",
    "scale              = 1\n",
    "progress_verbosity = 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Override parameters (batch mode) - Just forget this cell"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fidle.override('prepared_dataset', 'scale', 'progress_verbosity')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2 - Get original dataset\n",
    "We load :  \n",
    "`clean_data` : Original and clean images - This is what we will want to ontain at the **output** of the AE  \n",
    "`class_data` : Image classes - Useless, because the training will be unsupervised  \n",
    "We'll build :  \n",
    "`noisy_data` : Noisy images - These are the images that we will give as **input** to our AE\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "clean_data, class_data = MNIST.get_origine(scale=scale)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 3 - Add noise\n",
    "We add noise to the original images (clean_data) to obtain noisy images (noisy_data)  \n",
    "Need 30-40 seconds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def noise_it(data):\n",
    "    new_data = np.copy(data)\n",
    "    for i,image in enumerate(new_data):\n",
    "        fidle.utils.update_progress('Add noise : ',i+1,len(data),verbosity=progress_verbosity)\n",
    "        image=random_noise(image, mode='gaussian', mean=0, var=0.3)\n",
    "        image=random_noise(image, mode='s&p',      amount=0.2, salt_vs_pepper=0.5)\n",
    "        image=random_noise(image, mode='poisson') \n",
    "        image=random_noise(image, mode='speckle',  mean=0, var=0.1)\n",
    "        new_data[i]=image\n",
    "    print('Done.')\n",
    "    return new_data\n",
    "\n",
    "# ---- Add noise to input data : x_data\n",
    "#\n",
    "noisy_data = noise_it(clean_data)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 4 - Have a look"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Clean dataset (clean_data) : ',clean_data.shape)\n",
    "print('Noisy dataset (noisy_data) : ',noisy_data.shape)\n",
    "\n",
    "fidle.utils.subtitle(\"Noisy images we'll have in input (or x)\")\n",
    "fidle.scrawler.images(noisy_data[:5], None, indices='all', columns=5, x_size=3,y_size=3, interpolation=None, save_as='01-noisy')\n",
    "fidle.utils.subtitle('Clean images we want to obtain (or y)')\n",
    "fidle.scrawler.images(clean_data[:5], None, indices='all', columns=5, x_size=3,y_size=3, interpolation=None, save_as='02-original')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 5 - Shuffle dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "p = np.random.permutation(len(clean_data))\n",
    "clean_data, noisy_data, class_data = clean_data[p], noisy_data[p], class_data[p]\n",
    "print('Shuffled.')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 6 - Save our prepared dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "MNIST.save_prepared_dataset( clean_data, noisy_data, class_data, filename=prepared_dataset )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fidle.end()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "<img width=\"80px\" src=\"../fidle/img/logo-paysage.svg\"></img>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.9.2 ('fidle-env')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.2"
  },
  "vscode": {
   "interpreter": {
    "hash": "b3929042cc22c1274d74e3e946c52b845b57cb6d84f2d591ffe0519b38e4896d"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}