Commit 76162001 authored by paugier's avatar paugier
Browse files

Pres on profiling

parent 528d73a1
......@@ -27,4 +27,5 @@ ipynb/index.html
pyfiles/dtw_cort_dist/V5_cython/*.c
pyfiles/dtw_cort_dist/V5_cython/*.html
**/V*/res_cort.npy
**/V*/res_dtw.npy
\ No newline at end of file
**/V*/res_dtw.npy
pyfiles/dtw_cort_dist/V*/prof.png
......@@ -10,157 +10,234 @@
"source": [
"# Profiling\n",
"\n",
"Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)\n"
"Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)\n",
"\n",
"### Measure ⏱, don't guess! Profile to find the bottlenecks.\n",
"\n",
"<p class=\"small\"><br></p>\n",
"\n",
"### Do not optimize everything!\n",
"\n",
"- *\"Premature optimization is the root of all evil\"* (Donald Knuth)\n",
"\n",
"- 80 / 20 rule, efficiency important for expensive things and NOT for small things"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"\n",
"# Road map\n",
"# Different types of profiling\n",
"\n",
"## Time profiling\n",
"- timeit\n",
"- script base time (unix cmd)\n",
"- function based profiling (cprofile)\n",
"- line base profiling \n",
"\n",
"## Memory profiling \n",
"- further readings\n"
"- Small code snippets\n",
"- Script based benchmark\n",
"- Function based profiling\n",
"- Line based profiling\n",
"\n",
"<p class=\"small\"><br></p>\n",
"\n",
"## Memory profiling \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"timeit\n",
"------\n",
"## Small code snippets\n",
"\n",
"- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html).\n",
"\n",
" `python3 -m timeit -s \"import math; l=[]\" \"for x in range(100): l.append(math.pow(x,2))\"`\n",
"\n",
" Problem: the module `timeit` does not try to guess how many times to execute the statement.\n",
"\n",
"- In IPython, you can use the magic command `%timeit` that execute a piece of code and stats the time it spends: "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"26.8 µs ± 373 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n",
"19.7 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"11.2 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"5.53 µs ± 11.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
]
}
],
"source": [
"import math\n",
"l = [] \n",
"%timeit for x in range(100): l.append(math.pow(x,2))\n",
"%timeit [math.pow(x,2) for x in range(100)]\n",
"l = []\n",
"%timeit for x in range(100): l.append(x*x)\n",
"%timeit [x*x for x in range(100)]\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:\n",
"\n",
"In ipython, you can use the magic command timeit that execute a piece of code and stats the time it spends:\n"
"`python3 -m pyperf timeit -s \"import math; l=[]\" \"for x in range(100): l.append(math.pow(x,2))\"`"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Basic profiling\n",
"-----------------\n",
"## Script base benchmark\n",
"\n",
"Evaluate the time execution of your script as a whole\n",
"\n",
"- Using the Unix command `time`:\n",
"\n",
"While writing code, you can use the magic command timeit: "
" `time myscript.py`\n",
"\n",
"- Using the Unix program [`perf`](https://perf.wiki.kernel.org)\n",
"\n",
" `perf myscript.py`\n",
"\n",
"Issues: \n",
"\n",
"- not accurate (only one run!)\n",
"- includes the import and initialization time. It can be better to modify the script to print the elapsed time measured with:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"100000 loops, best of 5: 16 µs per loop\n",
"100000 loops, best of 5: 12.7 µs per loop\n",
"100000 loops, best of 5: 6.7 µs per loop\n",
"100000 loops, best of 5: 3.98 µs per loop\n"
"elapsed time: 2.56e-04 s\n"
]
}
],
"source": [
"import math\n",
"l=[] \n",
"%timeit for x in range(100): l.append(math.pow(x,2))\n",
"%timeit [math.pow(x,2) for x in range(100)]\n",
"from time import time\n",
"\n",
"l = []\n",
"%timeit for x in range(100): l.append(x*x)\n",
"%timeit [x*x for x in range(100)]\n"
"\n",
"t_start = time()\n",
"[math.pow(x,2) for x in range(100)]\n",
"print(f\"elapsed time: {time() - t_start:.2e} s\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Basic profiling\n",
"-----------------\n",
"## Function based profiling (cProfile)\n",
"\n",
"cProfile (https://docs.python.org/3.7/library/profile.html): **deterministic profiling** of Python programs.\n",
"\n",
"2 steps: (1) run the profiler and (2) analyze the results.\n",
"\n",
"1. Run the profiler\n",
"\n",
" - With an already written script `python3 -m cProfile myscript.py`\n",
"\n",
"Evaluate you script as a whole, *e.g.* using the unix time function:\n",
" - Much better, write a dedicated script using the module cProfile. See `pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py`\n",
"\n",
"`time myscript intput_data`"
" **Warning: profiling is much slower than a classical run, so do not profile with a long during setting**\n",
"\n",
"2. Analyze the results\n",
"\n",
" The standard tool is `pstats` (https://docs.python.org/3.7/library/profile.html#module-pstats)\n",
"\n",
" Or visualize the results with `gprof2dot`, `SnakeViz`, `pyprof2calltree` and `kcachegrind`\n",
"\n",
" Example: `pyprof2calltree -i prof.pstats -k`\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Function based profiling (cprofile)\n",
"-----------------------------------------\n",
"## Statistical profiling\n",
"\n",
"See http://pramodkumbhar.com/2019/01/python-deterministic-vs-statistical-profilers/\n",
"\n",
"Use the cProfile module to profile the code. \n",
"Advantage compared to deterministic profiling: **very small overhead**\n",
"\n",
"- Option -s ask to sort using cumulative time \n",
"- profile_data.pyprof is the output of the profiling\n",
"- myscript intput_data: the script with its regular arguments\n",
"- [pyflame](https://github.com/uber/pyflame)\n",
"\n",
"**Warning: profiling is much slower than a classical run, so do not profile with a long during setting**\n",
"- [py-spy](https://github.com/benfred/py-spy)\n",
"\n",
"`python3 -m cProfile -s cumulative -o profile_data.pyprof myscript intput_data`\n",
"\n",
"Visualize you result (*e.g.*) using `pyprof2calltree` and `kcachegrind`\n",
"- [plop](https://github.com/bdarnell/plop)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Line based profiling\n",
"\n",
"`pyprof2calltree -i profile_data.pyprof -k`\n"
"- [line_profiler](https://github.com/rkern/line_profiler)\n",
"- [pprofile](https://github.com/vpelletier/pprofile)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Line based profiling\n",
"-----------------------\n",
"## Memory profiler\n",
"\n",
"- pprofile\n",
"- vprof \n"
"- [memory_profiler](https://pypi.org/project/memory-profiler/)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Memory profiler\n",
"-----------------\n",
"\n"
"## Time and memory profiler\n",
"\n",
"- [vprof](https://pypi.org/project/vprof/)\n"
]
},
{
......@@ -173,7 +250,7 @@
"source": [
"# Further reading \n",
"\n",
"More on profiling on the stackoverflow discussion: \n",
"More on profiling on a stackoverflow discussion: \n",
"\n",
"https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script\n"
]
......@@ -196,7 +273,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3rc1"
"version": "3.7.3"
}
},
"nbformat": 4,
......
%% Cell type:markdown id: tags:
# Profiling
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)
%% Cell type:markdown id: tags:
### Measure ⏱, don't guess! Profile to find the bottlenecks.
<p class="small"><br></p>
### Do not optimize everything!
- *"Premature optimization is the root of all evil"* (Donald Knuth)
- 80 / 20 rule: efficiency is important for expensive parts and NOT for small ones
%% Cell type:markdown id: tags:
# Road map
# Different types of profiling
## Time profiling
- timeit
- script base time (unix cmd)
- function based profiling (cprofile)
- line base profiling
- Small code snippets
- Script based benchmark
- Function based profiling
- Line based profiling
<p class="small"><br></p>
## Memory profiling
- further readings
%% Cell type:markdown id: tags:
timeit
------
## Small code snippets
In ipython, you can use the magic command timeit that execute a piece of code and stats the time it spends:
- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html).
%% Cell type:markdown id: tags:
`python3 -m timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
Basic profiling
-----------------
Problem: the module `timeit` does not try to guess how many times to execute the statement.
While writing code, you can use the magic command timeit:
- In IPython, you can use the magic command `%timeit` that executes a piece of code and reports statistics on the time it takes:
%% Cell type:code id: tags:
``` python
import math
l=[]
l = []
%timeit for x in range(100): l.append(math.pow(x,2))
%timeit [math.pow(x,2) for x in range(100)]
l = []
%timeit for x in range(100): l.append(x*x)
%timeit [x*x for x in range(100)]
```
%%%% Output: stream
100000 loops, best of 5: 16 µs per loop
100000 loops, best of 5: 12.7 µs per loop
100000 loops, best of 5: 6.7 µs per loop
100000 loops, best of 5: 3.98 µs per loop
26.8 µs ± 373 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
19.7 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
11.2 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.53 µs ± 11.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
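%% Cell type:markdown id: tags:
The `timeit` module can also be driven from Python code. A minimal sketch (the statements and the number of loops are only illustrative):
%% Cell type:code id: tags:
``` python
import timeit

# number fixes how many times the statement runs (no automatic calibration here)
duration = timeit.timeit(
    "[math.pow(x, 2) for x in range(100)]",
    setup="import math",
    number=10_000,
)
print(f"{duration / 10_000 * 1e6:.2f} µs per loop")

# repeat() returns one total per run, which gives an idea of the spread
print(timeit.repeat("[x * x for x in range(100)]", number=10_000, repeat=5))
```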
%% Cell type:markdown id: tags:
Basic profiling
-----------------
- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool, which can also be used in the same way as the module `timeit`:
`python3 -m pyperf timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
Evaluate you script as a whole, *e.g.* using the unix time function:
%% Cell type:markdown id: tags:
`time myscript intput_data`
## Script based benchmark
Evaluate the execution time of your script as a whole
- Using the Unix command `time`:
`time python3 myscript.py`
- Using the Unix program [`perf`](https://perf.wiki.kernel.org)
`perf stat python3 myscript.py`
Issues:
- not accurate (only one run!)
- includes the import and initialization time. It can be better to modify the script to print the elapsed time measured with:
%% Cell type:code id: tags:
``` python
from time import time
l = []
t_start = time()
[math.pow(x,2) for x in range(100)]
print(f"elapsed time: {time() - t_start:.2e} s")
```
%%%% Output: stream
elapsed time: 2.56e-04 s
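%% Cell type:markdown id: tags:
To avoid repeating the `time()` calls around every measured block, a small helper can be used (a sketch, not part of the course scripts; `perf_counter` is preferred over `time` for short measurements):
%% Cell type:code id: tags:
``` python
import math
from contextlib import contextmanager
from time import perf_counter

@contextmanager
def elapsed(label):
    """Print the wall-clock time spent in the with-block."""
    t_start = perf_counter()
    yield
    print(f"{label}: {perf_counter() - t_start:.2e} s")

with elapsed("list comprehension"):
    [math.pow(x, 2) for x in range(100)]
```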
%% Cell type:markdown id: tags:
## Function based profiling (cProfile)
cProfile (https://docs.python.org/3.7/library/profile.html): **deterministic profiling** of Python programs.
2 steps: (1) run the profiler and (2) analyze the results.
1. Run the profiler
- With an already written script `python3 -m cProfile myscript.py`
- Much better: write a dedicated script using the module `cProfile` (see `pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py`)
**Warning: profiling is much slower than a classical run, so do not profile with a long-running setting**
2. Analyze the results
The standard tool is `pstats` (https://docs.python.org/3.7/library/profile.html#module-pstats)
Or visualize the results with `gprof2dot`, `SnakeViz`, `pyprof2calltree` and `kcachegrind`
Example: `pyprof2calltree -i prof.pstats -k`
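A minimal sketch of the two steps from Python code (`work()` is only a stand-in for the function you actually want to profile):
``` python
import cProfile
import pstats

def work():
    # stand-in workload
    return sum(x * x for x in range(100_000))

# 1. run the profiler and dump the raw statistics to a file
cProfile.run("work()", "prof.pstats")

# 2. analyse the results with pstats
stats = pstats.Stats("prof.pstats")
stats.sort_stats("cumulative").print_stats(10)
```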
%% Cell type:markdown id: tags:
Function based profiling (cprofile)
-----------------------------------------
## Statistical profiling
See http://pramodkumbhar.com/2019/01/python-deterministic-vs-statistical-profilers/
Use the cProfile module to profile the code.
Advantage compared to deterministic profiling: **very small overhead**
- Option -s ask to sort using cumulative time
- profile_data.pyprof is the output of the profiling
- myscript intput_data: the script with its regular arguments
- [pyflame](https://github.com/uber/pyflame)
**Warning: profiling is much slower than a classical run, so do not profile with a long during setting**
- [py-spy](https://github.com/benfred/py-spy)
`python3 -m cProfile -s cumulative -o profile_data.pyprof myscript intput_data`
- [plop](https://github.com/bdarnell/plop)
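For example, `py-spy record -o profile.svg -- python3 myscript.py` samples the script while it runs and writes a flame graph, without modifying the code.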
%% Cell type:markdown id: tags:
Visualize you result (*e.g.*) using `pyprof2calltree` and `kcachegrind`
## Line based profiling
`pyprof2calltree -i profile_data.pyprof -k`
- [line_profiler](https://github.com/rkern/line_profiler)
- [pprofile](https://github.com/vpelletier/pprofile)
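The usual workflow with `line_profiler` is to decorate the interesting functions with `@profile` and run `kernprof -l -v myscript.py`. A minimal sketch of the explicit API (the profiled function is just an example):
``` python
from line_profiler import LineProfiler

def compute_squares(n):
    result = []
    for x in range(n):
        result.append(x * x)
    return result

profiler = LineProfiler()
wrapped = profiler(compute_squares)  # wrap the function to record per-line timings
wrapped(10_000)
profiler.print_stats()
```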
%% Cell type:markdown id: tags:
Line based profiling
-----------------------
## Memory profiler
- pprofile
- vprof
- [memory_profiler](https://pypi.org/project/memory-profiler/)
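A minimal sketch with `memory_profiler` (the `@profile` decorator plus `python3 -m memory_profiler myscript.py` gives a line-by-line report; the function API below is enough for a quick check):
``` python
from memory_profiler import memory_usage

def allocate():
    # stand-in workload: build a largish list of floats
    data = [float(x) for x in range(1_000_000)]
    return len(data)

# sample the memory of the process while allocate() runs (one sample every 0.1 s)
samples = memory_usage((allocate, (), {}), interval=0.1)
print(f"peak memory: {max(samples):.1f} MiB")
```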
%% Cell type:markdown id: tags:
Memory profiler
-----------------
## Time and memory profiler
- [vprof](https://pypi.org/project/vprof/)
%% Cell type:markdown id: tags:
# Further reading
More on profiling on the stackoverflow discussion:
More on profiling in a Stack Overflow discussion:
https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script
......
......@@ -23,7 +23,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -83,7 +83,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
import cProfile
import pstats
from time import time
from dtw_cort_dist_mat import main, compute
series, nb_series = main(only_init=True)
# reference run without the profiler
t0 = time()
a, b = compute(series, nb_series)
t_end = time()
print('\nelapsed time = {:.3f} s'.format(t_end - t0))
# profiled run (slower); the raw statistics are written to prof.pstats
t0 = time()
cProfile.runctx("a, b = compute(series, nb_series)", globals(), locals(), "prof.pstats")
t_end = time()
s = pstats.Stats('prof.pstats')
s.sort_stats('time').print_stats(12)
print('\nelapsed time = {:.3f} s'.format(t_end - t0))
print(
'\nwith gprof2dot and graphviz (command dot):\n'
'gprof2dot -f pstats prof.pstats | dot -Tpng -o prof.png')
......@@ -23,7 +23,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -79,7 +79,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -8,7 +8,7 @@ from libc.math cimport abs
@cython.boundscheck(False)
@cython.wraparound(False)
def DTWDistance(double[:] s1, double[:] s2):
def dtw_distance(double[:] s1, double[:] s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......
......@@ -6,7 +6,7 @@ from pathlib import Path
import numpy as np
from dtw_cort import cort, DTWDistance
from dtw_cort import cort, dtw_distance
util = run_path(Path(__file__).absolute().parent.parent / "util.py")
......@@ -30,7 +30,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -25,7 +25,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -83,7 +83,7 @@ def compute(series: "float64[:, :]", nb_series: int):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -26,7 +26,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -84,7 +84,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)