Commit 165e85fd authored by Raphael Bacher
parents 305d06e8 5af6249e
@@ -27,4 +27,5 @@ ipynb/index.html
pyfiles/dtw_cort_dist/V5_cython/*.c
pyfiles/dtw_cort_dist/V5_cython/*.html
**/V*/res_cort.npy
**/V*/res_dtw.npy
\ No newline at end of file
**/V*/res_dtw.npy
pyfiles/dtw_cort_dist/V*/prof.png
%% Cell type:markdown id: tags:
# Profiling
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)
%% Cell type:markdown id: tags:
### Measure ⏱, don't guess! Profile to find the bottlenecks.
<p class="small"><br></p>
### Do not optimize everything!
- *"Premature optimization is the root of all evil"* (Donald Knuth)
- 80/20 rule: efficiency matters for the expensive parts and NOT for the small ones
%% Cell type:markdown id: tags:
# Different types of profiling
## Time profiling
- Small code snippets
- Script based benchmark
- Function based profiling
- Line based profiling
<p class="small"><br></p>
## Memory profiling
- Further readings
%% Cell type:markdown id: tags:
## Small code snippets
- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html):
%% Cell type:markdown id: tags:
`python3 -m timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
Problem: the module `timeit` does not try to guess how many times to execute the statement.
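The same measurement can also be done directly from Python with the module's API. Here is a minimal sketch (the value of `number` is an arbitrary choice by the user, which is exactly the limitation mentioned above):
```python
# A minimal sketch with the timeit module (same snippet as the command above).
import timeit

number = 10_000  # number of executions: chosen by the user, not guessed
t = timeit.timeit(
    stmt="for x in range(100): l.append(math.pow(x, 2))",
    setup="import math; l = []",
    number=number,
)
print(f"{t / number:.2e} s per loop")
```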
- In IPython, you can use the magic command `%timeit`, which executes a piece of code and reports statistics on the time it takes:
%% Cell type:code id: tags:
``` python
import math
l = []
%timeit for x in range(100): l.append(math.pow(x,2))
%timeit [math.pow(x,2) for x in range(100)]
l = []
%timeit for x in range(100): l.append(x*x)
%timeit [x*x for x in range(100)]
```
%% Output
26.8 µs ± 373 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
19.7 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
11.2 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
5.53 µs ± 11.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%% Cell type:markdown id: tags:
- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:
`python3 -m pyperf timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
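`pyperf` also has a Python API. Below is a minimal sketch of the same benchmark, assuming `pyperf` is installed; it has to be saved in a script and run directly, since `pyperf` spawns worker processes (the benchmark name is illustrative):
```python
# A minimal sketch with pyperf's Python API (same snippet as the command above).
import pyperf

runner = pyperf.Runner()
runner.timeit(
    "append squares with math.pow",  # benchmark name (illustrative)
    stmt="for x in range(100): l.append(math.pow(x, 2))",
    setup="import math; l = []",
)
```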
%% Cell type:markdown id: tags:
## Script based benchmark
Evaluate the execution time of your script as a whole:
- Using the Unix command `time`:
`time myscript.py`
- Using the Unix tool [`perf`](https://perf.wiki.kernel.org):
`perf stat myscript.py`
Issues:
- not accurate (only one run!)
- includes the import and initialization time. It can be better to modify the script to print the elapsed time measured with:
%% Cell type:code id: tags:
``` python
import math
from time import time
l = []
t_start = time()
[math.pow(x,2) for x in range(100)]
print(f"elapsed time: {time() - t_start:.2e} s")
```
%% Output
elapsed time: 2.56e-04 s
%% Cell type:markdown id: tags:
## Function based profiling (cProfile)
cProfile (https://docs.python.org/3.7/library/profile.html): **deterministic profiling** of Python programs.
2 steps: (1) run the profiler and (2) analyze the results.
1. Run the profiler
- With an already written script: `python3 -m cProfile -s cumulative -o profile_data.pyprof myscript.py` (`-s cumulative` sorts by cumulative time, `-o` writes the results to `profile_data.pyprof`)
- Much better: write a dedicated script using the module `cProfile` (see `pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py`)
**Warning: profiling is much slower than a regular run, so do not profile with a long-running configuration**
2. Analyze the results
The standard tool is `pstats` (https://docs.python.org/3.7/library/profile.html#module-pstats)
Or visualize the results with `gprof2dot`, `SnakeViz`, `pyprof2calltree` and `kcachegrind`
Example: `pyprof2calltree -i prof.pstats -k`
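As a sketch of such a dedicated script, in the spirit of `prof.py` (the function `main` and the numbers are placeholders):
```python
# A minimal sketch of a dedicated profiling script using cProfile and pstats.
import cProfile
import pstats


def main():
    # Placeholder for the real computation to profile
    return sum(x * x for x in range(100_000))


# 1. Run the profiler and save the results (readable by pstats or pyprof2calltree)
cProfile.run("main()", "prof.pstats")

# 2. Analyze: sort by cumulative time and print the 10 most expensive calls
stats = pstats.Stats("prof.pstats")
stats.sort_stats("cumulative").print_stats(10)
```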
%% Cell type:markdown id: tags:
## Statistical profiling
See http://pramodkumbhar.com/2019/01/python-deterministic-vs-statistical-profilers/
Advantage compared to deterministic profiling: **very small overhead**
- [pyflame](https://github.com/uber/pyflame)
- [py-spy](https://github.com/benfred/py-spy)
- [plop](https://github.com/bdarnell/plop)
%% Cell type:markdown id: tags:
## Line based profiling
- [line_profiler](https://github.com/rkern/line_profiler) (see the sketch below)
- [pprofile](https://github.com/vpelletier/pprofile)
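As a sketch of how `line_profiler` is typically used (assuming it is installed): the `profile` decorator is injected by `kernprof` at run time, so the script must be run with `kernprof -l -v myscript.py`; the function below is a placeholder:
```python
# A minimal sketch for line_profiler: run with `kernprof -l -v myscript.py`.
# The `profile` decorator below is injected by kernprof, not imported.
@profile
def append_squares(n):
    result = []
    for x in range(n):
        result.append(x * x)
    return result


if __name__ == "__main__":
    append_squares(100_000)
```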
%% Cell type:markdown id: tags:
## Memory profiler
- [memory_profiler](https://pypi.org/project/memory-profiler/) (see the sketch below)
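As a sketch of how `memory_profiler` is typically used (assuming it is installed; the function and sizes are placeholders), run the script with `python3 -m memory_profiler myscript.py` to get a line-by-line memory report:
```python
# A minimal sketch for memory_profiler: run with
# `python3 -m memory_profiler myscript.py`.
from memory_profiler import profile


@profile
def build_lists(n):
    a = [0] * n                    # allocate a first list
    b = [x * x for x in range(n)]  # allocate a second list
    del a                          # release the first one
    return b


if __name__ == "__main__":
    build_lists(1_000_000)
```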
%% Cell type:markdown id: tags:
## Time and memory profiler
- [vprof](https://pypi.org/project/vprof/)
%% Cell type:markdown id: tags:
# Further reading
More on profiling in this Stack Overflow discussion:
https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script
%% Cell type:markdown id: tags:
# Wrapping codes in static languages
%% Cell type:markdown id: tags:
We consider here wrapping two static languages: C and Fortran.
Classically, we wrap already existing code to access it from Python.
Depending on the language to wrap, the tools to use are a bit different.
%% Cell type:markdown id: tags:
## Fortran with [f2py](https://docs.scipy.org/doc/numpy/f2py/)
`f2py` is a tool that allows calling Fortran code from Python. It is part of `numpy`, meaning that to use it, we only need to install and import numpy (which should already be done if you do scientific Python!):
```bash
pip3 install numpy
```
%% Cell type:markdown id: tags:
### How does it work?
The documentation gives several ways to wrap Fortran code but it all boils down to the same thing:
**f2py wraps the Fortran code in a Python module that can then be imported**
Consider this simple Fortran (F90) snippet that computes the sum of the squares of the elements of an array:
%% Cell type:markdown id: tags:
*pyfiles/f2py/file_to_wrap.f90*
```fortran
subroutine sum_squares(A, res)
implicit none
real, dimension(:) :: A
real :: res
integer :: i, N
N = size(A)
res = 0.
do i=1, N
res = res + A(i)*A(i)
end do
end subroutine
```
%% Cell type:markdown id: tags:
The Fortran code can then be wrapped in one command:
%% Cell type:markdown id: tags:
```bash
# Syntax: python3 -m numpy.f2py -c <Fortran_files> -m <module_name>
python3 -m numpy.f2py -c "../pyfiles/f2py/file_to_wrap.f90" -m wrap_f90
```
This command calls the module `f2py` of `numpy` to compile (`-c`) *file_to_wrap.f90* into a Python module (`-m`) named *wrap_f90*. The module can then be imported in Python:
%% Cell type:code id: tags:
``` python
import numpy as np
import wrap_f90
A = np.ones(10)
result = 0.
wrap_f90.sum_squares(A, result)
print(result)
```
%% Cell type:markdown id: tags:
### With intents
%% Cell type:markdown id: tags:
In Fortran, it is considered best practice to specify intents for subroutine arguments. This helps `f2py` wrap the code efficiently, but it also changes the wrapped interface a bit.
Let's wrap the code updated with intents:
%% Cell type:markdown id: tags:
*pyfiles/f2py/file_to_wrap2.f90*
```fortran
subroutine sum_squares(A, res)
implicit none
real, dimension(:), intent(in) :: A
real, intent(out) :: res
integer :: i, N
N = size(A)
res = 0.
do i=1, N
res = res + A(i)*A(i)
end do
end subroutine
```
%% Cell type:markdown id: tags:
Again, we wrap...
%% Cell type:markdown id: tags:
```bash
python3 -m numpy.f2py -c "../pyfiles/f2py/file_to_wrap2.f90" -m wrap_f90
```
%% Cell type:markdown id: tags:
And we import...
%% Cell type:code id: tags:
``` python
import numpy as np
import wrap_f90
A = np.ones(10)
result = wrap_f90.sum_squares(A)
print(result)
```
%% Cell type:markdown id: tags:
This time, f2py recognized that `result` was an outgoing argument. As a consequence, the subroutine was wrapped so that it returns this argument.
Note that using a `function` (in the Fortran sense of the term) leads to the same result (see the other example in *pyfiles/f2py/file_to_wrap2.f90*).
%% Cell type:markdown id: tags:
### With modules
%% Cell type:markdown id: tags:
In Fortran, it is also considered best practice to organize the subroutines in modules. These are highly similar to Python modules and are, in fact, interpreted as such by f2py!
Consider the following code that implements the dtw and cort computations in Fortran:
*pyfiles/dtw_cort_dist/V9_fortran/dtw_cort.f90*
```fortran
module dtw_cort
implicit none
contains
subroutine dtwdistance(s1, s2, dtw_result)
! Computes the dtw between s1 and s2 with distance the absolute distance
doubleprecision, intent(in) :: s1(:), s2(:)
doubleprecision, intent(out) :: dtw_result
integer :: i, j
integer :: len_s1, len_s2
doubleprecision :: dist
doubleprecision, allocatable :: dtw_mat(:, :)
len_s1 = size(s1)
len_s2 = size(s2)
allocate(dtw_mat(len_s1, len_s2))
dtw_mat(1, 1) = dabs(s1(1) - s2(1))
do j = 2, len_s2
dist = dabs(s1(1) - s2(j))
dtw_mat(1, j) = dist + dtw_mat(1, j-1)
end do
do i = 2, len_s1
dist = dabs(s1(i) - s2(1))
dtw_mat(i, 1) = dist + dtw_mat(i-1, 1)
end do
! Fill the dtw_matrix
do i = 2, len_s1
do j = 2, len_s2
dist = dabs(s1(i) - s2(j))
dtw_mat(i, j) = dist + dmin1(dtw_mat(i - 1, j), &
dtw_mat(i, j - 1), &
dtw_mat(i - 1, j - 1))
end do
end do
dtw_result = dtw_mat(len_s1, len_s2)
end subroutine dtwdistance
doubleprecision function cort(s1, s2)
! Computes the cort between s1 and s2 (assuming they have the same length)
doubleprecision, intent(in) :: s1(:), s2(:)
integer :: len_s1, t
doubleprecision :: slope_1, slope_2
doubleprecision :: num, sum_square_x, sum_square_y
len_s1 = size(s1)
num = 0
sum_square_x = 0
sum_square_y = 0
do t=1, len_s1 - 1
slope_1 = s1(t + 1) - s1(t)
slope_2 = s2(t + 1) - s2(t)
num = num + slope_1 * slope_2
sum_square_x = sum_square_x + slope_1 * slope_1
sum_square_y = sum_square_y + slope_2 * slope_2
end do
cort = num / (dsqrt(sum_square_x*sum_square_y))
end function cort
end module dtw_cort
```
%% Cell type:markdown id: tags:
The subroutines `dtwdistance` and `cort` are part of the `dtw_cort` module. The file can be wrapped as before
```bash
python3 -m numpy.f2py -c "../pyfiles/dtw_cort_dist/V9_fortran/dtw_cort.f90" -m distances_fort
```
%% Cell type:markdown id: tags:
But the import slightly changes, as `dtw_cort` is now a module of `distances_fort`:
%% Cell type:code id: tags:
``` python
import numpy as np
from distances_fort import dtw_cort

# Two arbitrary series of the same length, just for the example
s1 = np.random.rand(100)
s2 = np.random.rand(100)
cort_result = dtw_cort.cort(s1, s2)
print(cort_result)
```
%% Cell type:markdown id: tags:
Note that the wrapping integrates the documentation of the function (if written...) !
%% Cell type:code id: tags:
``` python
from distances_fort import dtw_cort
print(dtw_cort.cort.__doc__)
```
%% Cell type:markdown id: tags:
### To go further...
Running the command `python3 -m numpy.f2py` (without arguments) gives a lot of information on the supported arguments for further uses of `f2py`. Know that this way, you can:
* Specify the compiler to use
* Give the compiler flags (warnings, optimisations...)
* Specify the functions to wrap
* ...
The documentation of f2py (https://docs.scipy.org/doc/numpy/f2py/) can also help, covering notably:
* `Cf2py` directives to overcome F77 limitations (e.g. intents)
* How to integrate Fortran sources to your Python packages and compile them on install
* How to use `f2py` inside Python scripts
* ...
%% Cell type:markdown id: tags:
Wrapping C code
--------------------------
%% Cell type:markdown id: tags:
There are different ways of wrapping C code. We present CFFI here.
The workflow is the following:
1. Get your C code working (with a .c and a .h)
2. Set up your packaging to compile your code as a module
3. Compile your code
4. In the python code, declare the function you will be using
5. In the python code, open/load the compiled module
6. Use your functions
%% Cell type:markdown id: tags:
**1. Get your C code working (with a .c and a .h)**
OK, this is supposed to be already done.
%% Cell type:markdown id: tags:
**2. Set up your packaging to compile your code as a module**
We give the compilation instructions in the file setup.py:
%% Cell type:raw id: tags:
from setuptools import setup, Extension
version = "0.1"
module_distance = Extension(
name="cdtw",
sources=["cdtw.c"],
)
setup(
name="dtw_cort_dist_mat",
version=version,
description="data scientist tool for time series",
long_description="data scientist tool for time series",
classifiers=[],
author="Robert Bidochon",
author_email="robert@bidochon.fr",
license="GPL",
include_package_data=True,
install_requires=["cffi", "numpy", "setuptools"],
entry_points="",
ext_modules=[module_distance],
)
%% Cell type:markdown id: tags:
**3. Compile your code**
%% Cell type:markdown id: tags:
In a terminal, type:
`python3 setup.py build_ext`
%% Cell type:markdown id: tags:
**4. In the python code, declare the function you will be using**
%% Cell type:code id: tags:
``` python
from cffi import FFI
ffi = FFI()
ffi.cdef("double square(double x, double y);")
```
%% Cell type:markdown id: tags:
**5. In the python code, open/load the compiled module**
%% Cell type:raw id: tags:
dllib = ffi.dlopen(
str(my_dir / ("cdtw" + sysconfig.get_config_var("EXT_SUFFIX")))
)
%% Cell type:markdown id: tags:
**6. Use your functions**
%% Cell type:markdown id: tags:
`sq = dllib.square(2.0, 3.0)`
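Putting steps 4 to 6 together, here is a minimal self-contained sketch. It assumes the extension built by `setup.py` is named `cdtw`, exposes `double square(double x, double y);` and sits in the current directory (adapt the path to where the module was actually built):
```python
# A minimal sketch combining steps 4 to 6 (paths and names are assumptions).
import sysconfig
from pathlib import Path

from cffi import FFI

ffi = FFI()
# 4. Declare the C function we will call (signature copied from the header)
ffi.cdef("double square(double x, double y);")

# 5. Load the compiled extension, e.g. "cdtw.cpython-38-x86_64-linux-gnu.so"
here = Path(".")  # assumption: the module was built in the current directory
dllib = ffi.dlopen(str(here / ("cdtw" + sysconfig.get_config_var("EXT_SUFFIX"))))

# 6. Use the function
sq = dllib.square(2.0, 3.0)
print(sq)
```
For array arguments (as needed for the dtw and cort functions), a pointer to a contiguous numpy array can be obtained with something like `ffi.cast("double*", arr.ctypes.data)`.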
%% Cell type:markdown id: tags:
Alternative techniques
------------------------------------
The historical tool is swig (http://swig.org/). It allows accessing C/C++ code from a variety of languages.
It requires writing an intermediate file that describes the C API.
Nowadays, wrapping C code can be done quite easily with CFFI, as presented before.