Commit 165e85fd authored by Raphael Bacher's avatar Raphael Bacher
parents 305d06e8 5af6249e
......@@ -27,4 +27,5 @@ ipynb/index.html
pyfiles/dtw_cort_dist/V5_cython/*.c
pyfiles/dtw_cort_dist/V5_cython/*.html
**/V*/res_cort.npy
**/V*/res_dtw.npy
\ No newline at end of file
**/V*/res_dtw.npy
pyfiles/dtw_cort_dist/V*/prof.png
......@@ -2,103 +2,162 @@
# Profiling
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)
### Measure ⏱, don't guess! Profile to find the bottlenecks.
%% Cell type:markdown id: tags:
<p class="small"><br></p>
### Do not optimize everything!
- *"Premature optimization is the root of all evil"* (Donald Knuth)
- 80 / 20 rule, efficiency important for expensive things and NOT for small things
# Road map
%% Cell type:markdown id: tags:
# Different types of profiling
## Time profiling
- Small code snippets (`timeit`)
- Script based benchmarks (Unix command `time`)
- Function based profiling (`cProfile`)
- Line based profiling
<p class="small"><br></p>
## Memory profiling
- Further reading
%% Cell type:markdown id: tags:
## Small code snippets
- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html) that executes a piece of code and reports statistics on the time it takes:
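As a minimal sketch of using the module directly from Python code (the statement and the value of `number` are only illustrative):

``` python
import timeit

# Time a statement; `number` sets how many times it is executed
duration = timeit.timeit("[x * x for x in range(100)]", number=10_000)
print(f"total: {duration:.3f} s, per loop: {duration / 10_000 * 1e6:.2f} µs")
```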
%% Cell type:markdown id: tags:
`python3 -m timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`

Problem: when called from Python code, the module `timeit` does not try to guess how many times to execute the statement (the `number` argument has to be chosen by hand).

- In IPython, you can use the magic command `%timeit`, which executes a piece of code and reports statistics on the time it spends:
%% Cell type:code id: tags:
``` python
import math

l = []
%timeit for x in range(100): l.append(math.pow(x, 2))
%timeit [math.pow(x, 2) for x in range(100)]

l = []
%timeit for x in range(100): l.append(x * x)
%timeit [x * x for x in range(100)]
```
%% Cell type:markdown id: tags:
- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool, but it can also be used in the same way as the module `timeit`:

`python3 -m pyperf timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
%% Cell type:markdown id: tags:
## Script based benchmark

Evaluate the execution time of your script as a whole:

- Using the Unix command `time`:

`time myscript.py`

- Using the Unix program [`perf`](https://perf.wiki.kernel.org):

`perf stat myscript.py`

Issues:

- not accurate (only one run!)
- includes the import and initialization time. It can be better to modify the script to print the elapsed time, measured with:
%% Cell type:code id: tags:
``` python
import math
from time import time

t_start = time()
[math.pow(x, 2) for x in range(100)]
print(f"elapsed time: {time() - t_start:.2e} s")
```
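For short measurements, `time.perf_counter` (also in the standard library) is preferable to `time.time`: it uses the highest-resolution monotonic clock available. A minimal variant:

``` python
import math
from time import perf_counter

t_start = perf_counter()
result = [math.pow(x, 2) for x in range(100)]
elapsed = perf_counter() - t_start
print(f"elapsed time: {elapsed:.2e} s")
```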
%% Cell type:markdown id: tags:
## Function based profiling (cProfile)

cProfile (https://docs.python.org/3.7/library/profile.html): **deterministic profiling** of Python programs.

2 steps: (1) run the profiler and (2) analyze the results.

**Warning: profiling is much slower than a classical run, so do not profile with a long-running setting.**

1. Run the profiler

- With an already written script: `python3 -m cProfile myscript.py`

- The option `-s cumulative` sorts the results by cumulative time and `-o profile_data.pyprof` saves them to the file `profile_data.pyprof`:

`python3 -m cProfile -s cumulative -o profile_data.pyprof myscript.py input_data`

- Much better: write a dedicated script using the module cProfile. See `pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py`

2. Analyze the results

- The standard tool is `pstats` (https://docs.python.org/3.7/library/profile.html#module-pstats)

- Or visualize the results with `gprof2dot`, `SnakeViz`, `pyprof2calltree` and `kcachegrind`. Example: `pyprof2calltree -i prof.pstats -k`
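The two steps can also be done from a single script with the `cProfile` and `pstats` APIs; a minimal sketch profiling a toy function (`slow_sum` is illustrative):

``` python
import cProfile
import pstats


def slow_sum(n):
    """Toy function to profile: sum of squares in a pure Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Step 1: run the profiler and dump the results to a file
profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 100_000)
profiler.dump_stats("prof.pstats")

# Step 2: analyze the results with pstats
stats = pstats.Stats("prof.pstats")
stats.sort_stats("cumulative").print_stats(5)  # 5 most expensive entries
```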
%% Cell type:markdown id: tags:
## Statistical profiling
See http://pramodkumbhar.com/2019/01/python-deterministic-vs-statistical-profilers/
Advantage compared to deterministic profiling: **very small overhead**
- [pyflame](https://github.com/uber/pyflame)
- [py-spy](https://github.com/benfred/py-spy)
- [plop](https://github.com/bdarnell/plop)
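These tools all work by periodically sampling the stack of the running program. A toy sketch of the idea, using only the standard library (not a replacement for the tools above; `start_sampler` and `busy` are illustrative names):

``` python
import collections
import sys
import threading
import time


def start_sampler(thread_id, counts, interval=0.001, duration=0.5):
    """Sample the stack of `thread_id` every `interval` seconds and count
    which function is on top of it."""
    deadline = time.perf_counter() + deadline_offset if False else time.perf_counter() + duration

    def sampler():
        while time.perf_counter() < deadline:
            frame = sys._current_frames().get(thread_id)
            if frame is not None:
                counts[frame.f_code.co_name] += 1
            time.sleep(interval)

    thread = threading.Thread(target=sampler)
    thread.start()
    return thread


def busy():
    """CPU-bound work for the sampler to observe."""
    total = 0
    for i in range(5_000_000):
        total += i
    return total


counts = collections.Counter()
sampler_thread = start_sampler(threading.main_thread().ident, counts)
busy()
sampler_thread.join()
print(counts.most_common(3))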
%% Cell type:markdown id: tags:
## Line based profiling
- [line_profiler](https://github.com/rkern/line_profiler)
- [pprofile](https://github.com/vpelletier/pprofile)
%% Cell type:markdown id: tags:
## Memory profiler
- [memory_profiler](https://pypi.org/project/memory-profiler/)
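The standard library also ships `tracemalloc`, which traces memory allocations; a minimal sketch:

``` python
import tracemalloc

tracemalloc.start()

# Allocate something measurable
data = [list(range(100)) for _ in range(1_000)]

# Where were the allocations made?
snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics("lineno")  # group allocation stats by source line
for stat in top[:3]:
    print(stat)

current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
tracemalloc.stop()
```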
%% Cell type:markdown id: tags:
## Time and memory profiler
- [vprof](https://pypi.org/project/vprof/)
%% Cell type:markdown id: tags:
# Further reading
More on profiling in this Stack Overflow discussion:
https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script
......
......@@ -6,10 +6,11 @@
We consider here wrapping two static languages: C and Fortran.
Classically, one wraps already existing code to access it from Python.
Depending on the language to wrap, the tools to use are a bit different.
%% Cell type:markdown id: tags:
## Fortran with [f2py](https://docs.scipy.org/doc/numpy/f2py/)
......@@ -368,5 +369,17 @@
**6. Use your functions**
%% Cell type:markdown id: tags:
sq = dllib.square(2.0, 3.0)
%% Cell type:markdown id: tags:
Alternative techniques
------------------------------------
The historical tool is swig (http://swig.org/). It allows accessing C/C++ code from a variety of languages.
It requires writing an intermediate file that describes the C API.
Nowadays, wrapping C code can be done quite easily using CFFI, as presented before.
For wrapping C++ code, one may consider pybind11 (https://github.com/pybind/pybind11), which relies on features available since C++11.
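For plain C libraries, the standard library module `ctypes` is yet another lightweight option: it loads a shared library at runtime, with no compilation step. A minimal sketch calling `sqrt` from the system C math library (the library lookup via `find_library` is platform dependent):

``` python
import ctypes
from ctypes.util import find_library

# Locate and load the C math library (name resolution is platform dependent)
libm = ctypes.CDLL(find_library("m"))

# Declare the C signature: double sqrt(double)
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))
```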
......
#!/usr/bin/env python
import concurrent.futures as futures
import itertools
import logging
logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
level=logging.INFO)
def profile(func):
    """Decorator that prints the execution time of a function"""

    def wrapper(*args, **kwargs):
        import time

        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"{func.__name__}: {end - start:.6f} s")
        return result

    return wrapper
def factorize_naive(n):
""" A naive factorization method. Take integer 'n', return list of
factors.
"""
logging.debug("starting factorize_naive({})".format(n))
if n < 2:
logging.debug("ending factorize_naive({}) = []".format(n))
return []
factors = []
p = 2
while True:
if n == 1:
logging.debug("ending factorize_naive({}) = {}".format(n, factors))
return factors
r = n % p
if r == 0:
factors.append(p)
            n = n // p  # integer division: keep n an integer
elif p * p >= n:
factors.append(n)
logging.debug("ending factorize_naive({}) = {}".format(n, factors))
return factors
elif p > 2:
# Advance in steps of 2 over odd numbers
p += 2
else:
# If p == 2, get to 3
p += 1
def run_func(func, nb_jobs, inputs):
"""run fib in // for each element of the iterable
:param func: (callale) function to call. Take as input one parameter
:param nb_jobs: (int) the number of jobs to run in parallel
:param inputs: iterable over parameters
:return: (dic) key is the input, value is the value returned by the func
"""
# We can use a with statement to ensure threads are cleaned up promptly
future_to_data = {}
with futures.ProcessPoolExecutor(max_workers=nb_jobs) as executor:
future_to_data = {executor.submit(func, data): data for data in inputs}
# all the future object has been submitted -> they are running
# iteration over dictionary iterates over keys
logging.debug("end of submit")
"""
for f in future_to_data:
data = future_to_data[f]
yield data, f.result()
"""
for f in futures.as_completed(future_to_data):
data = future_to_data[f]
yield data, f.result()
def main():
"""run main
"""
inputs = itertools.chain(range(100), range(200000, 12000000, 100))
# logging.info("len inputs=%d", len(inputs))
for data, res in run_func(factorize_naive, 4, inputs):
logging.info("f({}) = {}".format(data, res))
if __name__ == "__main__":
main()
#!/usr/bin/python3
import multiprocessing as mp
import sys
def is_prime(number):
"""return True if number is prime, else False
:number: (int) a positive number (no check)
:returns: (bool) true if number is prime, False else
"""
for i in range(2, number):
if number % i == 0:
return False
return True
def first_primes(nb_to_check, nb_proc=None):
"""Prints the first prime numbers
:bound: (int) number of primes to check
:nb_proc: (int) the number of parallel jobs to run (default nb procs available)
:returns: a list of the first nb_to_check prime numbers
"""
inputs = range(2, nb_to_check)
with mp.Pool(nb_proc) as pool:
res = [False, False] + pool.map(is_prime, inputs)
return [idx for (idx, e) in enumerate(res) if e]
def main(argv):
if len(sys.argv) < 2:
print(f"usage: {argv[0]} nb_prime_to_check [nb_procs=4]")
sys.exit(1)
if len(sys.argv) < 3:
nb_proc = mp.cpu_count()
else:
nb_proc = int(sys.argv[2])
nb_to_check = int(sys.argv[1])
print(f"looking for the {nb_to_check} primes number using {nb_proc} procs")
primes = first_primes(nb_to_check)
print(f"first {nb_to_check} primes: {primes}")
if __name__ == '__main__':
main(sys.argv)
#!/usr/bin/python3
import multiprocessing as mp
import sys
import numpy as np
def distances(item, collection, dist):
"""Return the set of distances between item and elements of collection,
using dist as distance measure
Complexity = O(|collection| * O(dist))
:item: an element to look for
:collection: (iterable) collection of elements that are comparable with item
:dist: function that takes two items and return the distance between them
:returns:(list) distances between item and each elem of collection
"""
return [dist(item, e) for e in collection]
def abs_dist(x, y):
"""return the absolute of the difference of the inputs """
return abs(x-y)
def compute_abs_dist(col, nb_proc):
"""Computes the distance matrix between elements in collection
:col: (iterable) the set of elements to consider.
:nb_proc: the number of parallel jobs to run
:returns: distance matrix
"""
inputs = [(x, col, abs_dist) for x in col]
collection_size = len(col)
dist_mat = None
with mp.Pool(nb_proc) as pool:
dist_mat = pool.starmap(distances, inputs)
# results gets back sorted in the same order as provided
res = np.empty([collection_size, collection_size])
for idx, elem in enumerate(dist_mat):
res[idx,:] = elem
return res
def main(argv):
if len(sys.argv) < 2:
print(f"usage: {argv[0]} collection_size [nb_procs]")
sys.exit(1)
if len(sys.argv) < 3:
nb_proc = mp.cpu_count()
else:
nb_proc = int(sys.argv[2])
collection_size = int(sys.argv[1])
col = np.random.random(collection_size)
print(f"collection: {col}")
res = compute_abs_dist(col, nb_proc)
print(f"res={res}")
import matplotlib.pyplot as plt
plt.imshow(res)
plt.show()
if __name__ == '__main__':
main(sys.argv)
......@@ -23,7 +23,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -83,7 +83,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
import cProfile
import pstats
from time import time
from dtw_cort_dist_mat import main, compute
series, nb_series = main(only_init=True)
t0 = time()
a, b = compute(series, nb_series)
t_end = time()
print('\nelapsed time = {:.3f} s'.format(t_end - t0))
t0 = time()
cProfile.runctx("a, b = compute(series, nb_series)", globals(), locals(), "prof.pstats")
t_end = time()
s = pstats.Stats('prof.pstats')
s.sort_stats('time').print_stats(12)
print('\nelapsed time = {:.3f} s'.format(t_end - t0))
print(
'\nwith gprof2dot and graphviz (command dot):\n'
'gprof2dot -f pstats prof.pstats | dot -Tpng -o prof.png')
......@@ -23,7 +23,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -79,7 +79,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -8,7 +8,7 @@ from libc.math cimport abs
@cython.boundscheck(False)
@cython.wraparound(False)
def DTWDistance(double[:] s1, double[:] s2):
def dtw_distance(double[:] s1, double[:] s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......
......@@ -6,7 +6,7 @@ from pathlib import Path
import numpy as np
from dtw_cort import cort, DTWDistance
from dtw_cort import cort, dtw_distance
util = run_path(Path(__file__).absolute().parent.parent / "util.py")
......@@ -30,7 +30,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -25,7 +25,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -83,7 +83,7 @@ def compute(series: "float64[:, :]", nb_series: int):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -26,7 +26,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -84,7 +84,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -10,7 +10,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
" Computes the dtw between s1 and s2 with distance the absolute distance\n\n :param s1: the first serie (ie an iterable over floats64)\n :param s2: the second serie (ie an iterable over floats64)\n :returns: the dtw distance\n :rtype: float64\n "
len_s1 = len(s1)
len_s2 = len(s2)
......@@ -58,7 +58,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for (t1, t2) in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[(t1, t2)] = dist_dtw
_dist_mat_dtw[(t2, t1)] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -26,7 +26,7 @@ def serie_pair_index_generator(number):
@jit(cache=True)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -90,7 +90,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -24,7 +24,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance