pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
%% Cell type:markdown id: tags:
# Dataframe
A dictionary of Series in which the keys are the column names.
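As a minimal sketch of this idea (the column names and values below are made up for illustration), a `DataFrame` can be built from a dictionary of `Series`, and each column can be retrieved as a `Series`:

```python
import pandas as pd

# Hypothetical sample data: two Series sharing the same index
heights = pd.Series([1.62, 1.80, 1.75], index=["ana", "bob", "eve"])
ages = pd.Series([31, 45, 27], index=["ana", "bob", "eve"])

# The dictionary keys become the column names
df = pd.DataFrame({"height": heights, "age": ages})

print(df.columns.tolist())  # names of the columns
print(type(df["age"]))      # each column is itself a Series
```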
<matplotlib.axes._subplots.AxesSubplot at 0x7febf9fdf9b0>
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
# DIY
%% Cell type:markdown id: tags:
## Goals : Compute light statistics on IMDB Movies files
The goal of this session is to end up with a script that computes some simple statistics from IMDB Movies files. The file was modified and reduced for this exercise.
Material
The data are in 2 files, in a directory named "files"
- name.tsv
This file contains the actors. The separator is the tab character '\t', and the first line is the header.
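One possible way to read such a file is sketched below with pandas. The real content of `name.tsv` is not shown here, so the column names and rows are invented stand-ins, read from an in-memory string instead of the actual file:

```python
import io
import pandas as pd

# Hypothetical stand-in for name.tsv: column names and rows are invented
sample = "id\tname\tbirth_year\n1\tActor One\t1950\n2\tActor Two\t1972\n"

# sep="\t" selects the tab separator; header=0 uses the first line as header
names = pd.read_csv(io.StringIO(sample), sep="\t", header=0)

print(names.columns.tolist())
print(len(names))
```

With the real file, the `io.StringIO(sample)` argument would be replaced by the path to `name.tsv`.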
"For consistent figure changes, define your own stylesheets that are basically a list of parameters to tune the aspect of the figure elements.\n",
"See https://matplotlib.org/tutorials/introductory/customizing.html for more info."
%% Cell type:markdown id: tags:
# Python training UGA 2017
**A training to acquire a strong foundation in Python and use it efficiently**
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Oliver Henriot (GRICAD), Christophe Picard (LJK), Loïc Huder (ISTerre)
# Python scientific ecosystem
# A short introduction to Matplotlib ([gallery](http://matplotlib.org/gallery.html))
%% Cell type:markdown id: tags:
The default library to plot data is `Matplotlib`.
It can create publication-ready graphs, with functionality similar to that of Matlab.
%% Cell type:code id: tags:
``` python
# these ipython commands load special backend for notebooks
# (do not use "notebook" outside jupyter)
# %matplotlib notebook
# for jupyter-lab:
# %matplotlib ipympl
%matplotlib inline
```
%% Cell type:markdown id: tags:
When running code using matplotlib, it is highly recommended to start ipython with the option `--matplotlib` (or to use the magic ipython command `%matplotlib`).
%% Cell type:code id: tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
```
%% Cell type:code id: tags:
``` python
A = np.random.random([5,5])
```
%% Cell type:markdown id: tags:
You can plot any kind of numerical data.
%% Cell type:code id: tags:
``` python
lines = plt.plot(A)
```
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
In scripts, the `plt.show` function must be called at the end of the script to display the figures.
%% Cell type:markdown id: tags:
We can plot data by giving specific coordinates.
%% Cell type:code id: tags:
``` python
x = np.linspace(0, 2, 20)
y = x**2
```
%% Cell type:code id: tags:
``` python
plt.figure()
plt.plot(x,y, label='Square function')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
```
%%%% Output: execute_result
<matplotlib.legend.Legend at 0x7f39e3136d68>
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
We can associate the plot with a figure object. This object allows us to add labels and subplots, modify the axes or save the figure as an image.
%% Cell type:code id: tags:
``` python
fig = plt.figure()
ax = fig.add_subplot(111)
res = ax.plot(x, y, color="red", linestyle='dashed', linewidth=3, marker='o',
              markerfacecolor='blue', markersize=5)
ax.set_xlabel('$Re$')
ax.set_ylabel(r'$\Pi / \epsilon$')  # raw string: keeps the LaTeX backslashes
```
%%%% Output: execute_result
Text(0, 0.5, '$\\Pi / \\epsilon$')
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
We can also recover the plotted matplotlib object to get info on it.
%% Cell type:code id: tags:
``` python
line_object = res[0]
print(type(line_object))
print('Color of the line is', line_object.get_color())
print('X data of the plot:', line_object.get_xdata())
```
%%%% Output: stream
<class 'matplotlib.lines.Line2D'>
Color of the line is red
X data of the plot: [0. 0.10526316 0.21052632 0.31578947 0.42105263 0.52631579
%% Cell type:code id: tags:
``` python
fig = plt.figure()  # start from a new figure
ax1 = fig.add_subplot(211)  # 2 rows, 1 column, subplot number 1
ax2 = fig.add_subplot(212, sharex=ax1)  # It is possible to share axes between subplots
X = np.arange(0, 2*np.pi, 0.1)
ax1.plot(X, np.cos(2*X), color="red")
ax2.plot(X, np.sin(2*X), color="magenta")
ax2.set_xlabel('Angle (rad)')
```
%%%% Output: execute_result
Text(0.5, 0, 'Angle (rad)')
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
## Anatomy of a Matplotlib figure


For consistent figure changes, define your own stylesheets that are basically a list of parameters to tune the aspect of the figure elements.
See https://matplotlib.org/tutorials/introductory/customizing.html for more info.
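As a small sketch of this mechanism, the parameters can be grouped in a dictionary and applied temporarily with `plt.rc_context`. The dictionary name `my_style` and the chosen values are arbitrary examples:

```python
import matplotlib.pyplot as plt

# Hypothetical set of parameters; any valid rcParams key can be used
my_style = {"lines.linewidth": 3, "axes.grid": True, "font.size": 14}

default_width = plt.rcParams["lines.linewidth"]

# The parameters only apply inside the "with" block
with plt.rc_context(my_style):
    fig, ax = plt.subplots()
    (line,) = ax.plot([0, 1, 2], [0, 1, 4])

print(line.get_linewidth())                              # width taken from my_style
print(plt.rcParams["lines.linewidth"] == default_width)  # defaults are restored
```

The same key/value pairs could also be stored in a `.mplstyle` file and loaded with `plt.style.use`.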
%% Cell type:markdown id: tags:
We can also plot 2D data arrays.
%% Cell type:code id: tags:
``` python
noise = np.random.random((256,256))
plt.figure()
plt.imshow(noise)
```
%%%% Output: execute_result
<matplotlib.image.AxesImage at 0x7f39e2ef9438>
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
We can also add a colorbar and adjust the colormap.
%% Cell type:code id: tags:
``` python
plt.figure()
plt.imshow(noise, cmap=plt.cm.gray)
plt.colorbar()
```
%%%% Output: execute_result
<matplotlib.colorbar.Colorbar at 0x7f39e167e780>
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
#### Choose your colormaps wisely !
When doing such colorplots, it is easy to lose the interesting features by setting a colormap that is not adapted to the data.
Also, when producing scientific figures, think about how your plot will look to colorblind people or in greyscale (as can happen in printed articles...).
See the interesting discussion on matplotlib website: https://matplotlib.org/users/colormaps.html.
%% Cell type:markdown id: tags:
## Other plot types
Matplotlib can also plot:
- Histograms
- Plots with error bars
- Box plots
- Contours
- 3D plots
- ...
See the [gallery](http://matplotlib.org/gallery.html) to see what suits you the most.
%% Cell type:markdown id: tags:
## Do it yourself:
With miscellaneous routines of scipy we can get an example image:
%% Cell type:code id: tags:
``` python
import scipy.misc
# Note: scipy.misc.face was removed in SciPy 1.12; newer versions provide scipy.datasets.face
raccoon = np.array(scipy.misc.face())
```
%% Cell type:markdown id: tags:
Write a script to print the shape and dtype of the raccoon image. Next, plot the image using matplotlib.
%% Cell type:code id: tags:
``` python
print("shape of raccoon = ", raccoon.shape)
print("dtype of raccoon = ", raccoon.dtype)
```
%%%% Output: stream
shape of raccoon = (768, 1024, 3)
dtype of raccoon = uint8
%% Cell type:code id: tags:
``` python
plt.imshow(raccoon)
```
%%%% Output: execute_result
<matplotlib.image.AxesImage at 0x7f39e00aba58>
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
0. Write a script to generate a border around the raccoon image (for example a 20-pixel-wide black border; the black color code is (0, 0, 0)).
1. Do it again without losing pixels, and then generate a raccoon1 array/image.
2. 1. Mask the face of the raccoon with a grey circle (radius 240, centered at location (690, 260) of the raccoon1 image; the grey color code is for example (120, 120, 120)).
   2. Mask the face of the raccoon with a grey square by using NumPy broadcasting capabilities (height and width 480, same center as before).
3. We propose to smooth the image: the value of a pixel of the smoothed image is the average of the values of its neighborhood (i.e. the 8 neighbors + itself).
%% Cell type:markdown id: tags:
### Solution 0
Write a script to generate a border around the raccoon image (for example a 20 pixel size black border; black color code is 0 0 0)
%% Cell type:code id: tags:
``` python
raccoon[0:20, :, :] = 0
raccoon[-20:, :, :] = 0  # "-20:" includes the last row ("-20:-1" would skip it)
raccoon[:, 0:20, :] = 0
raccoon[:, -20:, :] = 0  # same for the last column
plt.imshow(raccoon)
```
%%%% Output: execute_result
<matplotlib.image.AxesImage at 0x7f39ce79dbe0>
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
### Solution 1
Do it again without losing pixels, and then generate a raccoon1 array/image
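A possible approach, sketched here under the assumption that a 20-pixel border is wanted, is to allocate a larger black array and copy the original image into its centre. A random stand-in array replaces the raccoon image so that the block runs on its own; in the notebook, the `raccoon` array loaded earlier would be used instead:

```python
import numpy as np

# Stand-in for the raccoon image (same shape and dtype as printed above),
# used only so that this sketch is self-contained
raccoon = np.random.randint(0, 256, size=(768, 1024, 3), dtype=np.uint8)

border = 20
n0, n1, n2 = raccoon.shape

# Allocate a larger black image, then copy the original into its centre
raccoon1 = np.zeros((n0 + 2 * border, n1 + 2 * border, n2), dtype=raccoon.dtype)
raccoon1[border:-border, border:-border, :] = raccoon

print(raccoon1.shape)
```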
%% Cell type:markdown id: tags:
### Solution 2.A
Mask the face of the raccoon with a grey circle (radius 240, centered at location (690, 260) of the raccoon1 image; the grey color code is for example (120, 120, 120))
%% Cell type:code id: tags:
``` python
raccoon2A = raccoon1.copy()
x_center = 260
y_center = 690
radius = 240
x_max, y_max, z = raccoon2A.shape
for i in range(x_max):
    for j in range(y_max):
        if ((j - y_center)**2 + (i-x_center)**2) <= radius**2:
            raccoon2A[i, j, :] = 120
plt.imshow(raccoon2A)
```
%%%% Output: execute_result
<matplotlib.image.AxesImage at 0x7f39ce6e65f8>
%%%% Output: display_data
[Hidden Image Output]
%% Cell type:markdown id: tags:
### Solution 2.B
Mask the face of the raccoon with a grey square by using NumPy broadcasting capabilities (height and width 480 and same center as before)
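The square mask can be sketched with a single slice assignment, in which NumPy broadcasts the grey colour over the whole region. The `raccoon1` array below is a random stand-in (with the shape expected after adding a 20-pixel border) so that the block runs on its own:

```python
import numpy as np

# Stand-in for raccoon1 (raccoon image plus its 20-pixel border), invented
# here only so that the sketch is self-contained
raccoon1 = np.random.randint(0, 256, size=(808, 1064, 3), dtype=np.uint8)

raccoon2B = raccoon1.copy()
x_center, y_center = 260, 690  # same centre as for the circle (row, column)
half = 480 // 2

grey = np.array([120, 120, 120], dtype=raccoon2B.dtype)

# The (3,) grey colour is broadcast over the whole (480, 480, 3) slice
raccoon2B[x_center - half:x_center + half,
          y_center - half:y_center + half, :] = grey
```

Compared with the double loop used for the circle, no explicit Python iteration is needed.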
...
...
%% Cell type:markdown id: tags:
# Advanced matplotlib
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)
%% Cell type:markdown id: tags:
## Introduction
This is the second part of the introductory presentation given in the [Python initiation training](https://gricad-gitlab.univ-grenoble-alpes.fr/python-uga/py-training-2017/blob/master/ipynb/pres111_intro_matplotlib.ipynb).
The aim is to present more advanced use cases of matplotlib.
%% Cell type:markdown id: tags:
## Quick reminders
%% Cell type:code id: tags:
``` python
import numpy as np
import matplotlib.pyplot as plt
```
%% Cell type:code id: tags:
``` python
X = np.arange(0, 2, 0.01)
Y = np.exp(X) - 1
plt.plot(X, X, linewidth=3)
plt.plot(X, Y)
plt.plot(X, X**2)
plt.xlabel('Abscissa')
plt.ylabel('Ordinate')
```
%% Cell type:markdown id: tags:
## Object-oriented plots
While it does the job, the previous example does not reveal the full power of matplotlib. For that, we need to keep in mind that in matplotlib plots, **everything** is an object.
It is therefore possible to change any aspect of the figure by acting on the appropriate objects.
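As an illustration, the sketch below keeps references to the figure, axes and line objects and changes a few of their properties. The particular properties changed are arbitrary choices:

```python
import numpy as np
import matplotlib.pyplot as plt

X = np.arange(0, 2, 0.01)

fig, ax = plt.subplots()
(line,) = ax.plot(X, X**2)  # ax.plot returns a list of Line2D objects

# Every element of the figure can be modified through its own object
line.set_color("green")
line.set_linewidth(3)
ax.set_title("Square function")
ax.spines["top"].set_visible(False)
ax.xaxis.label.set_fontsize(14)

print(line.get_color(), ax.get_title())
```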
Note first that other plotting libraries (`plotly`, `bokeh`, ...) offer smoother interactions. Nevertheless, `matplotlib` gives access to backend-independent methods to add interactivity to plots.
These methods use [`Events`](https://matplotlib.org/api/backend_bases_api.html#matplotlib.backend_bases.Event) to catch user interactions (mouse clicks, key presses, mouse hovers, etc.).
These events must be connected to callback functions using the `mpl_connect` method of the figure canvas (`Figure.canvas`):
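A minimal sketch of such a connection is given below. It is a reconstruction, not necessarily the original cell: a callback named `add_datapoint` appends the clicked position to two lists and updates the plotted line.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
(line,) = ax.plot([], [], "o")

# Data stored at module level and modified by the callback
x_data = []
y_data = []

def add_datapoint(event):
    """Add the clicked position to the plot."""
    if event.inaxes is not ax:
        return  # ignore clicks outside the axes
    x_data.append(event.xdata)
    y_data.append(event.ydata)
    line.set_data(x_data, y_data)
    fig.canvas.draw_idle()

# mpl_connect returns a connection id that can be used to disconnect later
cid = fig.canvas.mpl_connect("button_press_event", add_datapoint)
```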
However, `add_datapoint` references `x_data` and `y_data`, which are defined outside the function: this breaks encapsulation!
A nicer solution is to use an object to handle the interactivity. We can also take advantage of this to add more functionality (such as clearing the figure when the mouse exits):
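One way to write such an object is sketched below. The class name `PointAdder` is hypothetical, and the attribute names `button_callback` and `clear_callback` are assumed to hold the connection ids:

```python
import matplotlib.pyplot as plt

class PointAdder:
    """Hypothetical helper that owns the data and the callbacks."""

    def __init__(self, ax):
        self.ax = ax
        self.x_data = []
        self.y_data = []
        (self.line,) = ax.plot([], [], "o")
        canvas = ax.figure.canvas
        # Keep the connection ids so the callbacks can be disconnected later
        self.button_callback = canvas.mpl_connect(
            "button_press_event", self.add_datapoint)
        self.clear_callback = canvas.mpl_connect(
            "figure_leave_event", self.clear)

    def add_datapoint(self, event):
        if event.inaxes is not self.ax:
            return  # ignore clicks outside the axes
        self.x_data.append(event.xdata)
        self.y_data.append(event.ydata)
        self._redraw()

    def clear(self, event):
        # Called when the mouse leaves the figure
        self.x_data.clear()
        self.y_data.clear()
        self._redraw()

    def _redraw(self):
        self.line.set_data(self.x_data, self.y_data)
        self.ax.figure.canvas.draw_idle()

fig, ax = plt.subplots()
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
adder = PointAdder(ax)
```

The data now live inside the `adder` object, so the callbacks no longer depend on variables defined elsewhere in the script.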
More examples could be shown but it always revolves around the same process: connecting an `Event` to a callback function.
Note that the connection can be severed using `mpl_disconnect`, which takes as argument the connection id returned by `mpl_connect` (in the previous case, `self.button_callback` or `self.clear_callback`).
Some usages of interactivity:
- Print the value of a point on click
- Trigger a plot in the third dimension of a 3D plot displayed in 2D
- Save a figure on closing
- Ideas ?
%% Cell type:markdown id: tags:
## Animations
From the matplotlib page (https://matplotlib.org/api/animation_api.html):
> The easiest way to make a live animation in matplotlib is to use one of the Animation classes.
><table>
<tr><td>FuncAnimation</td><td>Makes an animation by repeatedly calling a function func.</td></tr>
<tr><td>ArtistAnimation</td><td>Animation using a fixed set of Artist objects.</td></tr>
</table>
%% Cell type:markdown id: tags:
### Example from matplotlib page
This example uses `FuncAnimation` to animate the plot of a sin function.
The animation consists of repeated calls to the `update` function, which adds a datapoint to the plot at each frame.
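The code of this example is reproduced below from memory of the matplotlib documentation, so details may differ from the current version of that page:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

fig, ax = plt.subplots()
xdata, ydata = [], []
(ln,) = ax.plot([], [], "ro")

def init():
    # Called once, before the first frame
    ax.set_xlim(0, 2 * np.pi)
    ax.set_ylim(-1, 1)
    return (ln,)

def update(frame):
    # Called for each frame: add one datapoint of the sine curve
    xdata.append(frame)
    ydata.append(np.sin(frame))
    ln.set_data(xdata, ydata)
    return (ln,)

# Keep a reference to the animation, otherwise it is garbage collected
ani = FuncAnimation(fig, update, frames=np.linspace(0, 2 * np.pi, 128),
                    init_func=init, blit=True)
# In a script, call plt.show() here to display the animation
```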
The previous code executed in a regular Python script should display the animation without problem. In a Jupyter Notebook, if we use `%matplotlib inline`, we can use IPython to display it in HTML.
%% Cell type:code id: tags:
``` python
from IPython.display import HTML
HTML(ani.to_jshtml())
```
%% Cell type:markdown id: tags:
### Stroop test
The [Stroop effect](http://en.wikipedia.org/wiki/Stroop_effect) is when a psychological cause interferes with the reaction time of a task.
A common demonstration of this effect (called a Stroop test) is naming the color in which a word is written if the word describes another color. This usually takes longer than for a word that is not a color.
Ex: Naming blue for <div style='text-align:center; font-size:36px'><span style='color:blue'>RED</span> vs. <span style='color:blue'>BIRD</span></div>
_Fun fact: As this test relies on the meaning of the words, people who are more fluent in English should find the test more difficult!_
In this part, we show how `matplotlib` animations can generate a Stroop test that shows random color words in random colors at random positions.
#### With `FuncAnimation`
We will generate a single object `word` whose position, color and text will be updated by the repeatedly called function.
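A possible implementation is sketched below. The list of colors, the position range and the timing are arbitrary assumptions:

```python
import random
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Hypothetical choice of colors, used both as words and as text colors
colors = ["red", "blue", "green", "orange", "purple"]

fig, ax = plt.subplots()
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.axis("off")

# Single Text object, modified in place at each frame
word = ax.text(0.5, 0.5, "", fontsize=36, ha="center", va="center")

def update(frame):
    word.set_text(random.choice(colors).upper())
    word.set_color(random.choice(colors))
    word.set_position((random.uniform(0.2, 0.8), random.uniform(0.2, 0.8)))
    return (word,)

# One new word every second (interval is in milliseconds)
ani = FuncAnimation(fig, update, frames=20, interval=1000, blit=True)
```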
#### With `ArtistAnimation`
Rather than updating through a function, `ArtistAnimation` requires all the `Artists` that will be displayed during the whole animation to be generated first.
A list of `Artists` must therefore be supplied for each frame. Then, all frame lists must be gathered in a single list (of lists) that is passed as an argument to `ArtistAnimation`.
In our case, to reproduce the behaviour above, we need to have only one word per frame. Each frame will therefore have a list of a single element (the colored word for this frame).
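This can be sketched as follows, again with an arbitrary list of colors, number of frames and timing:

```python
import random
import matplotlib.pyplot as plt
from matplotlib.animation import ArtistAnimation

# Hypothetical choice of colors, used both as words and as text colors
colors = ["red", "blue", "green", "orange", "purple"]

fig, ax = plt.subplots()
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.axis("off")

n_frames = 20
frames = []
for _ in range(n_frames):
    # Create the Text artist of this frame once and for all
    word = ax.text(random.uniform(0.2, 0.8), random.uniform(0.2, 0.8),
                   random.choice(colors).upper(),
                   color=random.choice(colors),
                   fontsize=36, ha="center", va="center")
    frames.append([word])  # one list of Artists per frame

# ArtistAnimation shows only the Artists of the current frame
ani = ArtistAnimation(fig, frames, interval=1000, blit=True)
```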