<img width="800px" src="../fidle/img/00-Fidle-header-01.svg"></img>

# <!-- TITLE --> [IMDB3] - Reload and reuse a saved model
<!-- DESC --> Retrieving a saved model to perform a sentiment analysis (movie review)
<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->

## Objectives :
 - The objective is to guess whether our personal film reviews are **positive or negative** based on the analysis of the text. 
 - For this, we will use our **previously saved model**.

## What we're going to do :

 - Prepare our data
 - Retrieve our saved model
 - Evaluate the result


## Step 1 - Init python stuff

In [None]:
import numpy as np

import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.datasets.imdb as imdb

import matplotlib.pyplot as plt
import matplotlib
import pandas as pd

import os,sys,h5py,json,re

from importlib import reload

import fidle

# Init Fidle environment
run_id, run_dir, datasets_dir = fidle.init('IMDB3')

### 1.2 - Parameters
The words in the vocabulary are ranked from the most frequent to the rarest.  
`vocab_size` is the number of words we keep in our vocabulary (all other words are treated as unknown).  
`review_len` is the fixed length, in words, of each encoded review.  
`saved_models` is the directory where our models were previously saved.  
`dictionaries_dir` is the directory from which we will load our dictionaries (./data is a good choice).

In [None]:
vocab_size = 10000
review_len = 256

saved_models = './run/IMDB2'
dictionaries_dir = './data'

Override parameters (batch mode) - Just forget this cell

In [None]:
fidle.override('vocab_size', 'review_len', 'saved_models', 'dictionaries_dir')

## Step 2 : Preparing the data
### 2.1 - Our reviews :

In [None]:
reviews = [ "This film is particularly nice, a must see.",
 "This film is a great classic that cannot be ignored.",
 "I don't remember ever having seen such a movie...",
 "This movie is just abominable and doesn't deserve to be seen!"]

### 2.2 - Retrieve dictionaries
Note : This dictionary is generated by [01-Embedding-Keras](01-Embedding-Keras.ipynb) notebook.

In [None]:
with open(f'{dictionaries_dir}/word_index.json', 'r') as fp:
    word_index = json.load(fp)

# ---- word -> index
word_index = { w:int(i) for w,i in word_index.items() }
print('Loaded. ', len(word_index), 'entries in word_index' )

# ---- index -> word
index_word = { i:w for w,i in word_index.items() }
print('Loaded. ', len(index_word), 'entries in index_word' )
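
As a minimal sketch of what this cell does, the example below uses a toy dictionary in place of the real `word_index.json` (the words and indices are made up for illustration). It also suggests one reason for rebuilding `index_word` in memory: JSON object keys are always strings, so an index-to-word mapping written directly to JSON would come back with string keys.

```python
import json

# Toy vocabulary (hypothetical values, not the real IMDB dictionary)
toy_word_index = {'<pad>': 0, '<start>': 1, '<unknown>': 2, 'film': 22, 'great': 87}

# Round trip through JSON, as if written to and read back from a file
loaded = json.loads(json.dumps(toy_word_index))

# Values come back as integers; int() is only a safeguard
word_index_toy = {w: int(i) for w, i in loaded.items()}

# Invert the mapping to get index -> word, with integer keys
index_word_toy = {i: w for w, i in word_index_toy.items()}

# By contrast, a dict with integer keys comes back from JSON with string keys
reloaded_inverse = json.loads(json.dumps(index_word_toy))

print(index_word_toy[22])
print(type(next(iter(reloaded_inverse))))
```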

### 2.3 - Clean, index and pad
Sentences are split into words, punctuation is removed, each word is replaced by its index, the review length is limited and padding is added...  
**Note** : 0 is padding, 1 is "Start" and 2 is "unknown"

In [None]:
nb_reviews = len(reviews)
x_data = []

# ---- For all reviews
for review in reviews:
    print('Words are : ', end='')
    # ---- First index must be <start>
    index_review=[1]
    print('1 ', end='')
    # ---- For all words
    for w in review.split(' '):
        # ---- Clean it (drop punctuation; lowercase, assuming a lowercase dictionary)
        w_clean = re.sub(r"[^a-zA-Z0-9]", "", w).lower()
        # ---- Not empty ?
        if len(w_clean)>0:
            # ---- Get the index of the cleaned word (2 if unknown)
            w_index = word_index.get(w_clean,2)
            # ---- Out of vocabulary => unknown
            if w_index>=vocab_size : w_index=2
            # ---- Add the index
            index_review.append(w_index)
            print(f'{w_index} ', end='')
    # ---- Add the indexed review
    x_data.append(index_review)
    print()

# ---- Padding
x_data = keras.preprocessing.sequence.pad_sequences(x_data, value = 0, padding = 'post', maxlen = review_len)
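
The cleaning, indexing and padding steps can also be sketched as one small pure-Python function, which is easier to test on its own. This is only an illustration: `encode_review` and the toy dictionary are hypothetical names and values, the dictionary is assumed to store lowercase words, and truncation keeps the end of an over-long review, which mirrors the default `truncating='pre'` behaviour of `pad_sequences`.

```python
import re

def encode_review(review, word_index, vocab_size, review_len,
                  start=1, unknown=2, pad=0):
    """Turn a raw review into a fixed-length list of word indices."""
    indices = [start]
    for w in review.split(' '):
        # Drop punctuation and lowercase (assumes a lowercase dictionary)
        w_clean = re.sub(r"[^a-zA-Z0-9]", "", w).lower()
        if not w_clean:
            continue
        # Unknown words and words outside the vocabulary both map to `unknown`
        w_index = word_index.get(w_clean, unknown)
        if w_index >= vocab_size:
            w_index = unknown
        indices.append(w_index)
    # Keep the last `review_len` indices, then pad at the end
    indices = indices[-review_len:]
    return indices + [pad] * (review_len - len(indices))

# Toy dictionary (made-up indices, for illustration only)
toy_word_index = {'this': 14, 'film': 22, 'is': 9, 'great': 87, 'abominable': 50000}

encoded = encode_review("This film is great, not abominable!",
                        toy_word_index, vocab_size=10000, review_len=10)
print(encoded)
```

Here "not" is absent from the toy dictionary and "abominable" has an index beyond `vocab_size`, so both are encoded as 2.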

### 2.4 - Have a look

In [None]:
def translate(x):
    return ' '.join( [index_word.get(i,'?') for i in x] )

for i in range(nb_reviews):
    # ---- Show the review up to the first padding value, plus a few zeros
    imax=np.where(x_data[i]==0)[0][0]+5
    print('\nText review :', reviews[i])
    print( f'x_data[{i}]   :', list(x_data[i][:imax]), '(...)')
    print( 'Translation :', translate(x_data[i][:imax]), '(...)')

## Step 3 - Bring back the model

In [None]:
model = keras.models.load_model(f'{saved_models}/models/best_model.h5')

## Step 4 - Predict

In [None]:
y_pred = model.predict(x_data)

#### And the winner is :

In [None]:
for i,review in enumerate(reviews):
    rate    = y_pred[i][0]
    opinion = 'NEGATIVE :-(' if rate<0.5 else 'POSITIVE :-)'
    print(f'{review:<70} => {rate:.2f} - {opinion}')
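
The decision rule used above can be isolated in a small helper, sketched below. It assumes, as the cell above suggests, that each prediction is a single score between 0 and 1; `label_opinion` is a hypothetical name and the scores are made up, not real model output.

```python
def label_opinion(rate, threshold=0.5):
    """Map a score in [0, 1] to a label; scores below the threshold are negative."""
    return 'NEGATIVE :-(' if rate < threshold else 'POSITIVE :-)'

# Made-up scores, for illustration only
toy_rates = [0.93, 0.12, 0.50]

for rate in toy_rates:
    print(f'{rate:.2f} - {label_opinion(rate)}')
```

A score exactly equal to the threshold is labelled positive, which matches the `rate<0.5` test in the cell above.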

In [None]:
fidle.end()

---
<img width="80px" src="../fidle/img/00-Fidle-logo-01.svg"></img>