python-uga / training-hpc · Commit 76162001
Pres on profiling
authored Jun 13, 2019 by paugier · parent 528d73a1 · 18 changed files
.gitignore
@@ -27,4 +27,5 @@ ipynb/index.html
 pyfiles/dtw_cort_dist/V5_cython/*.c
 pyfiles/dtw_cort_dist/V5_cython/*.html
 **/V*/res_cort.npy
-**/V*/res_dtw.npy
\ No newline at end of file
+**/V*/res_dtw.npy
+pyfiles/dtw_cort_dist/V*/prof.png
ipynb/22_profiling.ipynb
...
...
@@ -10,157 +10,234 @@
"source": [
"# Profiling\n",
"\n",
"Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)\n"
"Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)\n",
"\n",
"### Measure ⏱, don't guess! Profile to find the bottlenecks.\n",
"\n",
"<p class=\"small\"><br></p>\n",
"\n",
"### Do not optimize everything!\n",
"\n",
"- *\"Premature optimization is the root of all evil\"* (Donald Knuth)\n",
"\n",
"- 80 / 20 rule, efficiency important for expensive things and NOT for small things"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"\n",
"# Road map\n",
"# Different types of profiling\n",
"\n",
"## Time profiling\n",
"- timeit\n",
"- script base time (unix cmd)\n",
"- function based profiling (cprofile)\n",
"- line base profiling \n",
"\n",
"## Memory profiling \n",
"- further readings\n"
"- Small code snippets\n",
"- Script based benchmark\n",
"- Function based profiling\n",
"- Line based profiling\n",
"\n",
"<p class=\"small\"><br></p>\n",
"\n",
"## Memory profiling \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"timeit\n",
"------\n",
"## Small code snippets\n",
"\n",
"- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html).\n",
"\n",
" `python3 -m timeit -s \"import math; l=[]\" \"for x in range(100): l.append(math.pow(x,2))\"`\n",
"\n",
" Problem: the module `timeit` does not try to guess how many times to execute the statement.\n",
"\n",
"- In IPython, you can use the magic command `%timeit` that executes a piece of code and reports statistics on the time it spends: "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"26.8 µs ± 373 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n",
"19.7 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"11.2 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"5.53 µs ± 11.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
]
}
],
"source": [
"import math\n",
"l = [] \n",
"%timeit for x in range(100): l.append(math.pow(x,2))\n",
"%timeit [math.pow(x,2) for x in range(100)]\n",
"l = []\n",
"%timeit for x in range(100): l.append(x*x)\n",
"%timeit [x*x for x in range(100)]\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:\n",
"\n",
"In ipython, you can use the magic command timeit that execute a piece of code and stats the time it spends:\n"
"    `python3 -m pyperf timeit -s \"import math; l=[]\" \"for x in range(100): l.append(math.pow(x,2))\"`"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Basic profiling\n",
"-----------------\n",
"## Script based benchmark\n",
"\n",
"Evaluate the execution time of your script as a whole\n",
"\n",
"- Using the Unix command `time`:\n",
"\n",
"While writing code, you can use the magic command timeit: "
" `time myscript.py`\n",
"\n",
"- Using the Unix program [`perf`](https://perf.wiki.kernel.org)\n",
"\n",
"    `perf stat python3 myscript.py`\n",
"\n",
"Issues: \n",
"\n",
"- not accurate (only one run!)\n",
"- includes the import and initialization time. It can be better to modify the script to print the elapsed time measured with:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"100000 loops, best of 5: 16 µs per loop\n",
"100000 loops, best of 5: 12.7 µs per loop\n",
"100000 loops, best of 5: 6.7 µs per loop\n",
"100000 loops, best of 5: 3.98 µs per loop\n"
"elapsed time: 2.56e-04 s\n"
]
}
],
"source": [
"import math\n",
"l=[] \n",
"%timeit for x in range(100): l.append(math.pow(x,2))\n",
"%timeit [math.pow(x,2) for x in range(100)]\n",
"from time import time\n",
"\n",
"l = []\n",
"%timeit for x in range(100): l.append(x*x)\n",
"%timeit [x*x for x in range(100)]\n"
"\n",
"t_start = time()\n",
"[math.pow(x,2) for x in range(100)]\n",
"print(f\"elapsed time: {time() - t_start:.2e} s\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Basic profiling\n",
"-----------------\n",
"## Function based profiling (cProfile)\n",
"\n",
"cProfile (https://docs.python.org/3.7/library/profile.html): **deterministic profiling** of Python programs.\n",
"\n",
"2 steps: (1) run the profiler and (2) analyze the results.\n",
"\n",
"1. Run the profiler\n",
"\n",
" - With an already written script `python3 -m cProfile myscript.py`\n",
"\n",
"Evaluate you script as a whole, *e.g.* using the unix time function: \n",
"   - Much better, write a dedicated script using the module cProfile. See `pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py`\n",
"\n",
"`time myscript intput_data`"
"   **Warning: profiling is much slower than a classical run, so do not profile with a long-duration setting**\n",
"\n",
"2. Analyze the results\n",
"\n",
" The standard tool is `pstats` (https://docs.python.org/3.7/library/profile.html#module-pstats)\n",
"\n",
" Or visualize the results with `gprof2dot`, `SnakeViz`, `pyprof2calltree` and `kcachegrind`\n",
"\n",
" Example: `pyprof2calltree -i prof.pstats -k`\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Function based profiling (cprofile)\n",
"-----------------------------------------\n",
"## Statistical profiling\n",
"\n",
"See http://pramodkumbhar.com/2019/01/python-deterministic-vs-statistical-profilers/\n",
"\n",
"Use the cProfile module to profile the code. \n",
"Advantage compared to deterministic profiling: **very small overhead**\n",
"\n",
"- Option -s ask to sort using cumulative time \n",
"- profile_data.pyprof is the output of the profiling\n",
"- myscript intput_data: the script with its regular arguments\n",
"- [pyflame](https://github.com/uber/pyflame)\n",
"\n",
"**Warning: profiling is much slower than a classical run, so do not profile with a long during setting**\n",
"- [py-spy](https://github.com/benfred/py-spy)\n",
"\n",
"`python3 -m cProfile -s cumulative -o profile_data.pyprof myscript intput_data`\n",
"\n",
"Visualize you result (*e.g.*) using `pyprof2calltree` and `kcachegrind`\n",
"- [plop](https://github.com/bdarnell/plop)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Line based profiling\n",
"\n",
"`pyprof2calltree -i profile_data.pyprof -k`\n"
"- [line_profiler](https://github.com/rkern/line_profiler)\n",
"- [pprofile](https://github.com/vpelletier/pprofile)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Line based profiling\n",
"-----------------------\n",
"## Memory profiler\n",
"\n",
"- pprofile\n",
"- vprof \n"
"- [memory_profiler](https://pypi.org/project/memory-profiler/)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Memory profiler\n",
"-----------------\n",
"\n"
"## Time and memory profiler\n",
"\n",
"- [vprof](https://pypi.org/project/vprof/)\n"
]
},
{
...
...
@@ -173,7 +250,7 @@
"source": [
"# Further reading \n",
"\n",
"More on profiling on the stackoverflow discussion: \n",
"More on profiling on a stackoverflow discussion: \n",
"\n",
"https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script\n"
]
...
...
@@ -196,7 +273,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3rc1"
"version": "3.7.3"
}
},
"nbformat": 4,
...
...
%% Cell type:markdown id: tags:
# Profiling
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)
%% Cell type:markdown id: tags:
### Measure ⏱, don't guess! Profile to find the bottlenecks.
<p class="small"><br></p>

### Do not optimize everything!

- *"Premature optimization is the root of all evil"* (Donald Knuth)

- 80 / 20 rule, efficiency important for expensive things and NOT for small things
%% Cell type:markdown id: tags:
# Different types of profiling

## Time profiling

- Small code snippets
- Script based benchmark
- Function based profiling
- Line based profiling

<p class="small"><br></p>

## Memory profiling
%% Cell type:markdown id: tags:
## Small code snippets

- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html).

  `python3 -m timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`

  Problem: the module `timeit` does not try to guess how many times to execute the statement.

- In IPython, you can use the magic command `%timeit` that executes a piece of code and reports statistics on the time it spends:
%% Cell type:code id: tags:
``` python
import math

l = []
%timeit for x in range(100): l.append(math.pow(x,2))
%timeit [math.pow(x,2) for x in range(100)]

l = []
%timeit for x in range(100): l.append(x*x)
%timeit [x*x for x in range(100)]
```

%%%% Output: stream

26.8 µs ± 373 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
19.7 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
11.2 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.53 µs ± 11.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
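The same measurements can also be scripted through the `timeit` Python API; `Timer.autorange` (Python >= 3.6) chooses the number of loops automatically. A minimal sketch, not part of the original notebook:

``` python
import timeit

# Timer.autorange runs the statement with increasing loop counts until
# the total time is long enough to be meaningful, like %timeit does.
timer = timeit.Timer("[x * x for x in range(100)]")
number, total = timer.autorange()
print(f"{total / number * 1e6:.2f} µs per loop ({number} loops)")
```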
%% Cell type:markdown id: tags:
- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:

  `python3 -m pyperf timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`

%% Cell type:markdown id: tags:

## Script based benchmark

Evaluate the execution time of your script as a whole

- Using the Unix command `time`:

  `time myscript.py`

- Using the Unix program [`perf`](https://perf.wiki.kernel.org):

  `perf stat python3 myscript.py`

Issues:

- not accurate (only one run!)
- includes the import and initialization time. It can be better to modify the script to print the elapsed time measured with:
%% Cell type:code id: tags:
``` python
from time import time

l = []

t_start = time()
[math.pow(x,2) for x in range(100)]
print(f"elapsed time: {time() - t_start:.2e} s")
```

%%%% Output: stream

elapsed time: 2.56e-04 s
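As a side note, for measuring short durations inside a script, `time.perf_counter` can be preferable to `time.time` because it uses the highest-resolution clock available; a small variant of the cell above:

``` python
import math
from time import perf_counter

# perf_counter is monotonic and high resolution, suited to short intervals.
t_start = perf_counter()
[math.pow(x, 2) for x in range(100)]
elapsed = perf_counter() - t_start
print(f"elapsed time: {elapsed:.2e} s")
```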
%% Cell type:markdown id: tags:
## Function based profiling (cProfile)

cProfile (https://docs.python.org/3.7/library/profile.html): **deterministic profiling** of Python programs.

2 steps: (1) run the profiler and (2) analyze the results.

1. Run the profiler

   - With an already written script: `python3 -m cProfile myscript.py`

   - Much better: write a dedicated script using the module cProfile. See `pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py`

   **Warning: profiling is much slower than a classical run, so do not profile with a long-duration setting**

2. Analyze the results

   The standard tool is `pstats` (https://docs.python.org/3.7/library/profile.html#module-pstats)

   Or visualize the results with `gprof2dot`, `SnakeViz`, `pyprof2calltree` and `kcachegrind`

   Example: `pyprof2calltree -i prof.pstats -k`
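The two steps can also be done in-process; a minimal sketch with `cProfile.Profile` and `pstats`, where `work` is just a stand-in for the real computation:

``` python
import cProfile
import io
import pstats

def work(n):
    # Stand-in for the computation to profile.
    return sum(i * i for i in range(n))

# Step 1: run the profiler around the interesting code only.
profiler = cProfile.Profile()
profiler.enable()
work(100_000)
profiler.disable()

# Step 2: analyze with pstats, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```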
%% Cell type:markdown id: tags:
## Statistical profiling

See http://pramodkumbhar.com/2019/01/python-deterministic-vs-statistical-profilers/

Advantage compared to deterministic profiling: **very small overhead**

- [pyflame](https://github.com/uber/pyflame)
- [py-spy](https://github.com/benfred/py-spy)
- [plop](https://github.com/bdarnell/plop)
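Sampling profilers like py-spy periodically capture the call stack instead of instrumenting every call, which is where the small overhead comes from. A toy sketch of the idea in pure Python (the real tools sample from outside the process or in native code; `sampler` and `busy` are illustrative names):

``` python
import sys
import threading
import time
from collections import Counter

samples = Counter()
stop = threading.Event()

def sampler(target_ident, interval=0.001):
    # Periodically inspect the target thread's current frame and
    # record which function it is executing.
    while not stop.is_set():
        frame = sys._current_frames().get(target_ident)
        if frame is not None:
            samples[frame.f_code.co_name] += 1
        time.sleep(interval)

def busy():
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

thread = threading.Thread(target=sampler, args=(threading.get_ident(),))
thread.start()
busy()
stop.set()
thread.join()
# Functions where the program spends time accumulate the most samples.
print(samples.most_common(3))
```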
%% Cell type:markdown id: tags:
## Line based profiling

- [line_profiler](https://github.com/rkern/line_profiler)
- [pprofile](https://github.com/vpelletier/pprofile)
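`line_profiler` reports per-line costs of decorated functions. The principle of line-level instrumentation can be sketched with the stdlib `sys.settrace` hook, here only counting line executions rather than timing them (a toy illustration; the real tool uses a C tracer for lower overhead, and `work` is a stand-in function):

``` python
import sys
from collections import Counter

hits = Counter()

def tracer(frame, event, arg):
    # Count every executed line inside `work`.
    if event == "line" and frame.f_code.co_name == "work":
        hits[frame.f_lineno] += 1
    return tracer

def work(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

sys.settrace(tracer)
work(100)
sys.settrace(None)

# The loop body line dominates, as a line profiler would show.
for lineno, count in hits.most_common():
    print(lineno, count)
```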
%% Cell type:markdown id: tags:
## Memory profiler

- [memory_profiler](https://pypi.org/project/memory-profiler/)
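Besides memory_profiler, the standard library offers `tracemalloc` for tracking allocations made by Python code; a minimal sketch:

``` python
import tracemalloc

# Track allocations between start() and get_traced_memory().
tracemalloc.start()
data = [list(range(1000)) for _ in range(100)]
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"current: {current / 1e6:.2f} MB, peak: {peak / 1e6:.2f} MB")
```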
%% Cell type:markdown id: tags:
## Time and memory profiler

- [vprof](https://pypi.org/project/vprof/)
%% Cell type:markdown id: tags:
# Further reading

More on profiling on a stackoverflow discussion:

https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script
...
...
pyfiles/dtw_cort_dist/V0_numpy_loops/dtw_cort_dist_mat.py
...
...
@@ -23,7 +23,7 @@ def serie_pair_index_generator(number):
     )
-def DTWDistance(s1, s2):
+def dtw_distance(s1, s2):
     """ Computes the dtw between s1 and s2 with distance the absolute distance
     :param s1: the first serie (ie an iterable over floats64)
...
...
@@ -83,7 +83,7 @@ def compute(series, nb_series):
     _dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
     _dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
     for t1, t2 in gen:
-        dist_dtw = DTWDistance(series[t1], series[t2])
+        dist_dtw = dtw_distance(series[t1], series[t2])
         _dist_mat_dtw[t1, t2] = dist_dtw
         _dist_mat_dtw[t2, t1] = dist_dtw
         dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
...
...
pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py (new file, mode 100644)

import cProfile
import pstats
from time import time

from dtw_cort_dist_mat import main, compute

series, nb_series = main(only_init=True)

t0 = time()
a, b = compute(series, nb_series)
t_end = time()
print("\nelapsed time = {:.3f} s".format(t_end - t0))

t0 = time()
cProfile.runctx(
    "a, b = compute(series, nb_series)", globals(), locals(), "prof.pstats"
)
t_end = time()

s = pstats.Stats("prof.pstats")
s.sort_stats("time").print_stats(12)

print("\nelapsed time = {:.3f} s".format(t_end - t0))
print(
    "\nwith gprof2dot and graphviz (command dot):\n"
    "gprof2dot -f pstats prof.pstats | dot -Tpng -o prof.png"
)
pyfiles/dtw_cort_dist/V1_numpy_vec/dtw_cort_dist_mat.py
...
...
@@ -23,7 +23,7 @@ def serie_pair_index_generator(number):
     )
-def DTWDistance(s1, s2):
+def dtw_distance(s1, s2):
     """ Computes the dtw between s1 and s2 with distance the absolute distance
     :param s1: the first serie (ie an iterable over floats64)
...
...
@@ -79,7 +79,7 @@ def compute(series, nb_series):
     _dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
     _dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
     for t1, t2 in gen:
-        dist_dtw = DTWDistance(series[t1], series[t2])
+        dist_dtw = dtw_distance(series[t1], series[t2])
         _dist_mat_dtw[t1, t2] = dist_dtw
         _dist_mat_dtw[t2, t1] = dist_dtw
         dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
...
...
pyfiles/dtw_cort_dist/V5_cython/dtw_cort.pyx
...
...
@@ -8,7 +8,7 @@ from libc.math cimport abs
 @cython.boundscheck(False)
 @cython.wraparound(False)
-def DTWDistance(double[:] s1, double[:] s2):
+def dtw_distance(double[:] s1, double[:] s2):
     """ Computes the dtw between s1 and s2 with distance the absolute distance
     :param s1: the first serie (ie an iterable over floats64)
...
...
pyfiles/dtw_cort_dist/V5_cython/dtw_cort_dist_mat.py
...
...
@@ -6,7 +6,7 @@ from pathlib import Path
 import numpy as np
-from dtw_cort import cort, DTWDistance
+from dtw_cort import cort, dtw_distance
 util = run_path(Path(__file__).absolute().parent.parent / "util.py")
...
...
@@ -30,7 +30,7 @@ def compute(series, nb_series):
     _dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
     _dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
     for t1, t2 in gen:
-        dist_dtw = DTWDistance(series[t1], series[t2])
+        dist_dtw = dtw_distance(series[t1], series[t2])
         _dist_mat_dtw[t1, t2] = dist_dtw
         _dist_mat_dtw[t2, t1] = dist_dtw
         dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
...
...
pyfiles/dtw_cort_dist/V6_pythran_aot/dtw_cort_dist_mat.py
...
...
@@ -25,7 +25,7 @@ def serie_pair_index_generator(number):
     )
-def DTWDistance(s1, s2):
+def dtw_distance(s1, s2):
     """ Computes the dtw between s1 and s2 with distance the absolute distance
     :param s1: the first serie (ie an iterable over floats64)
...
...
@@ -83,7 +83,7 @@ def compute(series: "float64[:, :]", nb_series: int):
     _dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
     _dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
     for t1, t2 in gen:
-        dist_dtw = DTWDistance(series[t1], series[t2])
+        dist_dtw = dtw_distance(series[t1], series[t2])
         _dist_mat_dtw[t1, t2] = dist_dtw
         _dist_mat_dtw[t2, t1] = dist_dtw
         dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
...
...
pyfiles/dtw_cort_dist/V6_pythran_jit/dtw_cort_dist_mat.py
...
...
@@ -26,7 +26,7 @@ def serie_pair_index_generator(number):
     )
-def DTWDistance(s1, s2):
+def dtw_distance(s1, s2):
     """ Computes the dtw between s1 and s2 with distance the absolute distance
     :param s1: the first serie (ie an iterable over floats64)
...
...
@@ -84,7 +84,7 @@ def compute(series, nb_series):
     _dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
     _dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)