
Commit 165e85fd authored by Raphael Bacher
parents 305d06e8 5af6249e
......@@ -27,4 +27,5 @@ ipynb/index.html
pyfiles/dtw_cort_dist/V5_cython/*.c
pyfiles/dtw_cort_dist/V5_cython/*.html
**/V*/res_cort.npy
**/V*/res_dtw.npy
\ No newline at end of file
**/V*/res_dtw.npy
pyfiles/dtw_cort_dist/V*/prof.png
......@@ -10,157 +10,234 @@
"source": [
"# Profiling\n",
"\n",
"Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)\n"
"Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)\n",
"\n",
"### Measure ⏱, don't guess! Profile to find the bottlenecks.\n",
"\n",
"<p class=\"small\"><br></p>\n",
"\n",
"### Do not optimize everything!\n",
"\n",
"- *\"Premature optimization is the root of all evil\"* (Donald Knuth)\n",
"\n",
"- 80 / 20 rule, efficiency important for expensive things and NOT for small things"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"\n",
"# Road map\n",
"# Different types of profiling\n",
"\n",
"## Time profiling\n",
"- timeit\n",
"- script base time (unix cmd)\n",
"- function based profiling (cprofile)\n",
"- line base profiling \n",
"\n",
"## Memory profiling \n",
"- further readings\n"
"- Small code snippets\n",
"- Script based benchmark\n",
"- Function based profiling\n",
"- Line based profiling\n",
"\n",
"<p class=\"small\"><br></p>\n",
"\n",
"## Memory profiling \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"timeit\n",
"------\n",
"## Small code snippets\n",
"\n",
"- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html).\n",
"\n",
" `python3 -m timeit -s \"import math; l=[]\" \"for x in range(100): l.append(math.pow(x,2))\"`\n",
"\n",
" Problem: the module `timeit` does not try to guess how many times to execute the statement.\n",
"\n",
"- In IPython, you can use the magic command `%timeit` that execute a piece of code and stats the time it spends: "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"26.8 µs ± 373 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n",
"19.7 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"11.2 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"5.53 µs ± 11.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
]
}
],
"source": [
"import math\n",
"l = [] \n",
"%timeit for x in range(100): l.append(math.pow(x,2))\n",
"%timeit [math.pow(x,2) for x in range(100)]\n",
"l = []\n",
"%timeit for x in range(100): l.append(x*x)\n",
"%timeit [x*x for x in range(100)]\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:\n",
"\n",
"In ipython, you can use the magic command timeit that execute a piece of code and stats the time it spends:\n"
"`python3 -m pyperf timeit -s \"import math; l=[]\" \"for x in range(100): l.append(math.pow(x,2))\"`"
]
},
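{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the `pyperf` Python API (assuming `pyperf` is installed; meant to be saved and run as a script, since `pyperf` spawns worker processes):\n",
"\n",
"```python\n",
"# bench_pow.py (hypothetical file name)\n",
"import pyperf\n",
"\n",
"runner = pyperf.Runner()\n",
"# benchmark the list comprehension used above\n",
"runner.timeit(\n",
"    name=\"math.pow comprehension\",\n",
"    stmt=\"[math.pow(x, 2) for x in range(100)]\",\n",
"    setup=\"import math\",\n",
")\n",
"```"
]
},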
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Basic profiling\n",
"-----------------\n",
"## Script base benchmark\n",
"\n",
"Evaluate the time execution of your script as a whole\n",
"\n",
"- Using the Unix command `time`:\n",
"\n",
"While writing code, you can use the magic command timeit: "
" `time myscript.py`\n",
"\n",
"- Using the Unix program [`perf`](https://perf.wiki.kernel.org)\n",
"\n",
" `perf myscript.py`\n",
"\n",
"Issues: \n",
"\n",
"- not accurate (only one run!)\n",
"- includes the import and initialization time. It can be better to modify the script to print the elapsed time measured with:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"100000 loops, best of 5: 16 µs per loop\n",
"100000 loops, best of 5: 12.7 µs per loop\n",
"100000 loops, best of 5: 6.7 µs per loop\n",
"100000 loops, best of 5: 3.98 µs per loop\n"
"elapsed time: 2.56e-04 s\n"
]
}
],
"source": [
"import math\n",
"l=[] \n",
"%timeit for x in range(100): l.append(math.pow(x,2))\n",
"%timeit [math.pow(x,2) for x in range(100)]\n",
"from time import time\n",
"\n",
"l = []\n",
"%timeit for x in range(100): l.append(x*x)\n",
"%timeit [x*x for x in range(100)]\n"
"\n",
"t_start = time()\n",
"[math.pow(x,2) for x in range(100)]\n",
"print(f\"elapsed time: {time() - t_start:.2e} s\")\n"
]
},
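{
"cell_type": "markdown",
"metadata": {},
"source": [
"To avoid repeating the `time()` boilerplate around each measured block, here is a minimal sketch of a reusable timer (standard library only; `timer` is a hypothetical helper, not part of the course material):\n",
"\n",
"```python\n",
"import math\n",
"from contextlib import contextmanager\n",
"from time import perf_counter\n",
"\n",
"@contextmanager\n",
"def timer(label=\"elapsed time\"):\n",
"    # perf_counter has a higher resolution than time() for short runs\n",
"    t_start = perf_counter()\n",
"    yield\n",
"    print(f\"{label}: {perf_counter() - t_start:.2e} s\")\n",
"\n",
"with timer():\n",
"    [math.pow(x, 2) for x in range(100)]\n",
"```"
]
},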
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Basic profiling\n",
"-----------------\n",
"## Function based profiling (cProfile)\n",
"\n",
"cProfile (https://docs.python.org/3.7/library/profile.html): **deterministic profiling** of Python programs.\n",
"\n",
"2 steps: (1) run the profiler and (2) analyze the results.\n",
"\n",
"1. Run the profiler\n",
"\n",
" - With an already written script `python3 -m cProfile myscript.py`\n",
"\n",
"Evaluate you script as a whole, *e.g.* using the unix time function:\n",
" - Much better, write a dedicated script using the module cProfile. See `pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py`\n",
"\n",
"`time myscript intput_data`"
" **Warning: profiling is much slower than a classical run, so do not profile with a long during setting**\n",
"\n",
"2. Analyze the results\n",
"\n",
" The standard tool is `pstats` (https://docs.python.org/3.7/library/profile.html#module-pstats)\n",
"\n",
" Or visualize the results with `gprof2dot`, `SnakeViz`, `pyprof2calltree` and `kcachegrind`\n",
"\n",
" Example: `pyprof2calltree -i prof.pstats -k`\n"
]
},
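{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the `pstats` analysis step (assuming a `prof.pstats` file was produced as above):\n",
"\n",
"```python\n",
"import pstats\n",
"\n",
"stats = pstats.Stats(\"prof.pstats\")\n",
"# sort by cumulative time and show the 10 most expensive functions\n",
"stats.sort_stats(\"cumulative\").print_stats(10)\n",
"```"
]
},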
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Function based profiling (cprofile)\n",
"-----------------------------------------\n",
"## Statistical profiling\n",
"\n",
"See http://pramodkumbhar.com/2019/01/python-deterministic-vs-statistical-profilers/\n",
"\n",
"Use the cProfile module to profile the code. \n",
"Advantage compared to deterministic profiling: **very small overhead**\n",
"\n",
"- Option -s ask to sort using cumulative time \n",
"- profile_data.pyprof is the output of the profiling\n",
"- myscript intput_data: the script with its regular arguments\n",
"- [pyflame](https://github.com/uber/pyflame)\n",
"\n",
"**Warning: profiling is much slower than a classical run, so do not profile with a long during setting**\n",
"- [py-spy](https://github.com/benfred/py-spy)\n",
"\n",
"`python3 -m cProfile -s cumulative -o profile_data.pyprof myscript intput_data`\n",
"\n",
"Visualize you result (*e.g.*) using `pyprof2calltree` and `kcachegrind`\n",
"- [plop](https://github.com/bdarnell/plop)"
]
},
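{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, `py-spy` can attach to an already running process (`py-spy top --pid <pid>`) or record a flame graph for a whole run (`py-spy record -o profile.svg -- python3 myscript.py`), where `myscript.py` is a placeholder."
]
},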
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Line based profiling\n",
"\n",
"`pyprof2calltree -i profile_data.pyprof -k`\n"
"- [line_profiler](https://github.com/rkern/line_profiler)\n",
"- [pprofile](https://github.com/vpelletier/pprofile)\n"
]
},
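{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of `line_profiler` usage (the `@profile` decorator is injected at run time by `kernprof`, so the script is run with `kernprof -l -v myscript.py`; `slow_function` is a hypothetical example):\n",
"\n",
"```python\n",
"@profile  # provided by kernprof at run time, no import needed\n",
"def slow_function():\n",
"    total = 0\n",
"    for x in range(100_000):\n",
"        total += x * x\n",
"    return total\n",
"\n",
"slow_function()\n",
"```"
]
},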
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Line based profiling\n",
"-----------------------\n",
"## Memory profiler\n",
"\n",
"- pprofile\n",
"- vprof \n"
"- [memory_profiler](https://pypi.org/project/memory-profiler/)"
]
},
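{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch with `memory_profiler` (assuming the package is installed; `build_lists` is a hypothetical example). Running `python3 -m memory_profiler myscript.py` prints line-by-line memory usage of the decorated function:\n",
"\n",
"```python\n",
"from memory_profiler import profile\n",
"\n",
"@profile\n",
"def build_lists():\n",
"    big = [x * x for x in range(1_000_000)]\n",
"    small = [x for x in range(10_000)]\n",
"    del big\n",
"    return small\n",
"\n",
"if __name__ == \"__main__\":\n",
"    build_lists()\n",
"```"
]
},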
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
"slide_type": "slide"
}
},
"source": [
"Memory profiler\n",
"-----------------\n",
"\n"
"## Time and memory profiler\n",
"\n",
"- [vprof](https://pypi.org/project/vprof/)\n"
]
},
{
......@@ -173,7 +250,7 @@
"source": [
"# Further reading \n",
"\n",
"More on profiling on the stackoverflow discussion: \n",
"More on profiling on a stackoverflow discussion: \n",
"\n",
"https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script\n"
]
......@@ -196,7 +273,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3rc1"
"version": "3.7.3"
}
},
"nbformat": 4,
......
......@@ -14,7 +14,8 @@
"We consider here wrapping two static languages: C and fortran.\n",
"\n",
"We classically wrapp already existing code to access them via python. \n",
"\n"
"\n",
"Depending on the language to wrap the tool to use are a bit different. \n"
]
},
{
......@@ -479,6 +480,21 @@
"source": [
"sq = dllib.square(2.0, 3.0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatives techniques: \n",
"------------------------------------\n",
"\n",
"The historical tool is swig (http://swig.org/). It allows to access C/C++ code from a variety of languages. \n",
"It requires the writing of an intermediate file that describes the C API. \n",
"\n",
"From now wrapping C code can be done quite easily using CFFI as presented before. \n",
"\n",
"For wrapping C++ code, one will consider pybind11 (https://github.com/pybind/pybind11) that relies on features available from the 11 versions of C++. "
]
}
],
"metadata": {
......
#!/usr/bin/env python
import concurrent.futures as futures
import itertools
import logging
logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
level=logging.INFO)
def profile(func):
    """Decorator that prints the execution time of the decorated function."""
    def wrapper(*args, **kwargs):
        import time
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(end - start)
        return result
    return wrapper
def factorize_naive(n):
""" A naive factorization method. Take integer 'n', return list of
factors.
"""
logging.debug("starting factorize_naive({})".format(n))
if n < 2:
logging.debug("ending factorize_naive({}) = []".format(n))
return []
factors = []
p = 2
while True:
if n == 1:
logging.debug("ending factorize_naive({}) = {}".format(n, factors))
return factors
r = n % p
if r == 0:
factors.append(p)
            n = n // p  # integer division: keep n (and the factors) integers
elif p * p >= n:
factors.append(n)
logging.debug("ending factorize_naive({}) = {}".format(n, factors))
return factors
elif p > 2:
# Advance in steps of 2 over odd numbers
p += 2
else:
# If p == 2, get to 3
p += 1
def run_func(func, nb_jobs, inputs):
"""run fib in // for each element of the iterable
:param func: (callale) function to call. Take as input one parameter
:param nb_jobs: (int) the number of jobs to run in parallel
:param inputs: iterable over parameters
:return: (dic) key is the input, value is the value returned by the func
"""
    # The with statement ensures the pool of worker processes is cleaned up promptly
future_to_data = {}
with futures.ProcessPoolExecutor(max_workers=nb_jobs) as executor:
future_to_data = {executor.submit(func, data): data for data in inputs}
    # all the future objects have been submitted -> they are running
# iteration over dictionary iterates over keys
logging.debug("end of submit")
"""
for f in future_to_data:
data = future_to_data[f]
yield data, f.result()
"""
for f in futures.as_completed(future_to_data):
data = future_to_data[f]
yield data, f.result()
def main():
"""run main
"""
inputs = itertools.chain(range(100), range(200000, 12000000, 100))
# logging.info("len inputs=%d", len(inputs))
for data, res in run_func(factorize_naive, 4, inputs):
logging.info("f({}) = {}".format(data, res))
if __name__ == "__main__":
main()
#!/usr/bin/python3
import multiprocessing as mp
import sys
def is_prime(number):
"""return True if number is prime, else False
:number: (int) a positive number (no check)
:returns: (bool) true if number is prime, False else
"""
for i in range(2, number):
if number % i == 0:
return False
return True
def first_primes(nb_to_check, nb_proc=None):
"""Prints the first prime numbers
:bound: (int) number of primes to check
:nb_proc: (int) the number of parallel jobs to run (default nb procs available)
:returns: a list of the first nb_to_check prime numbers
"""
inputs = range(2, nb_to_check)
with mp.Pool(nb_proc) as pool:
res = [False, False] + pool.map(is_prime, inputs)
return [idx for (idx, e) in enumerate(res) if e]
def main(argv):
if len(sys.argv) < 2:
print(f"usage: {argv[0]} nb_prime_to_check [nb_procs=4]")
sys.exit(1)
if len(sys.argv) < 3:
nb_proc = mp.cpu_count()
else:
nb_proc = int(sys.argv[2])
nb_to_check = int(sys.argv[1])
print(f"looking for the {nb_to_check} primes number using {nb_proc} procs")
primes = first_primes(nb_to_check)
print(f"first {nb_to_check} primes: {primes}")
if __name__ == '__main__':
main(sys.argv)
#!/usr/bin/python3
import multiprocessing as mp
import sys
import numpy as np
def distances(item, collection, dist):
"""Return the set of distances between item and elements of collection,
using dist as distance measure
Complexity = O(|collection| * O(dist))
:item: an element to look for
:collection: (iterable) collection of elements that are comparable with item
:dist: function that takes two items and return the distance between them
:returns:(list) distances between item and each elem of collection
"""
return [dist(item, e) for e in collection]
def abs_dist(x, y):
"""return the absolute of the difference of the inputs """
return abs(x-y)
def compute_abs_dist(col, nb_proc):
"""Computes the distance matrix between elements in collection
:col: (iterable) the set of elements to consider.
:nb_proc: the number of parallel jobs to run
:returns: distance matrix
"""
inputs = [(x, col, abs_dist) for x in col]
collection_size = len(col)
dist_mat = None
with mp.Pool(nb_proc) as pool:
dist_mat = pool.starmap(distances, inputs)
    # results come back in the same order as the inputs
res = np.empty([collection_size, collection_size])
for idx, elem in enumerate(dist_mat):
res[idx,:] = elem
return res
def main(argv):
if len(sys.argv) < 2:
print(f"usage: {argv[0]} collection_size [nb_procs]")
sys.exit(1)
if len(sys.argv) < 3:
nb_proc = mp.cpu_count()
else:
nb_proc = int(sys.argv[2])
collection_size = int(sys.argv[1])
col = np.random.random(collection_size)
print(f"collection: {col}")
res = compute_abs_dist(col, nb_proc)
print(f"res={res}")
import matplotlib.pyplot as plt
plt.imshow(res)
plt.show()
if __name__ == '__main__':
main(sys.argv)
......@@ -23,7 +23,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -83,7 +83,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
import cProfile
import pstats
from time import time
from dtw_cort_dist_mat import main, compute
series, nb_series = main(only_init=True)
t0 = time()
a, b = compute(series, nb_series)
t_end = time()
print('\nelapsed time = {:.3f} s'.format(t_end - t0))
t0 = time()
cProfile.runctx("a, b = compute(series, nb_series)", globals(), locals(), "prof.pstats")
t_end = time()
s = pstats.Stats('prof.pstats')
s.sort_stats('time').print_stats(12)
print('\nelapsed time = {:.3f} s'.format(t_end - t0))
print(
'\nwith gprof2dot and graphviz (command dot):\n'
'gprof2dot -f pstats prof.pstats | dot -Tpng -o prof.png')
......@@ -23,7 +23,7 @@ def serie_pair_index_generator(number):
)
def DTWDistance(s1, s2):
def dtw_distance(s1, s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......@@ -79,7 +79,7 @@ def compute(series, nb_series):
_dist_mat_dtw = np.zeros((nb_series, nb_series), dtype=np.float64)
_dist_mat_cort = np.zeros((nb_series, nb_series), dtype=np.float64)
for t1, t2 in gen:
dist_dtw = DTWDistance(series[t1], series[t2])
dist_dtw = dtw_distance(series[t1], series[t2])
_dist_mat_dtw[t1, t2] = dist_dtw
_dist_mat_dtw[t2, t1] = dist_dtw
dist_cort = 0.5 * (1 - cort(series[t1], series[t2]))
......
......@@ -8,7 +8,7 @@ from libc.math cimport abs
@cython.boundscheck(False)
@cython.wraparound(False)
def DTWDistance(double[:] s1, double[:] s2):
def dtw_distance(double[:] s1, double[:] s2):
""" Computes the dtw between s1 and s2 with distance the absolute distance
:param s1: the first serie (ie an iterable over floats64)
......