Commit b19669b1 authored by paugier's avatar paugier
Browse files

Work on second week presentations

parent 6b5e018e
......@@ -10,44 +10,62 @@
"source": [
"# Processing time evaluation\n",
"\n",
"Pierre Augier (LEGI), Raphaël Bacher (Gipsa), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre),Loïc Huder (ISTerre)\n"
"Pierre Augier (LEGI), Raphaël Bacher (Gipsa), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Loïc Huder (ISTerre)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Given a problem (*e.g* finding which element is missing in a listof elements), we want to code a solution that solves the problem. The classical workflow is: \n",
"\n",
" * **design** an algorithm that solves the problem.\n",
" - study the input of the algorithm (it is special in one sens -- *e.g* the list we are given is almost sorted)\n",
" - design an algorithms comes (given the specificity of your data)\n",
" - choose the adequate data structure, *i.e.* the data structure that will optimize the relevant operations, a.k.a the operations that takes time or that are repeated a large number of time. \n",
" - check the adequation of\n",
" * **evaluate** the complexity of the algorithm (theoretical point of view)\n",
" * **take care** of the special cases (can my list be empty, in such a case what is my strategy ?)\n",
" * **write your specs**: *e.g.* if the list is empty, we raise an exception.\n",
" * **write some tests** to check your implementation is correct\n",
" * **code** *i.e.* do the implementation of both code and test\n",
" * **profile**: check where are the bottleneck \n",
" * **code** the implementation,\n",
" * **profile**\n",
" * ...."
"### Measure ⏱, don't guess! Profile to find the bottlenecks.\n",
"\n",
"\n",
"### Do not optimize everything!\n",
"\n",
"- *\"Premature optimization is the root of all evil\"* (Donald Knuth)\n",
"\n",
"- 80 / 20 rule, efficiency important for expensive things and NOT for small things"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Context: some notes on developing efficient software\n",
"\n",
"Given a problem (e.g. finding which element is missing in a list of elements), we want to code a solution that solves the problem. The classical workflow is: \n",
"\n",
"* **design** an algorithm that solves the problem.\n",
" - study the input of the algorithm (is it special in one sens?)\n",
" - design an algorithms comes (given the specificity of your data)\n",
" - choose the adequate data structure, i.e. the data structure that will optimize the relevant operations, a.k.a the operations that takes time or that are repeated a large number of time. \n",
"\n",
"* **evaluate** the complexity of the algorithm (theoretical point of view)\n",
"* **take care** of the special cases (can my list be empty, in such a case what is my strategy ?)\n",
"* **write your specs**: for example if the list is empty, we raise an exception.\n",
"* **write some tests** to check your implementation is correct\n",
"* **code**\n",
"* **profile**: find where are the bottlenecks\n",
"* **code**\n",
"* **profile**\n",
"* ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note 1:\n",
"----------\n",
"## Note 1\n",
" \n",
" If your data is large enough, a basic implementation of an algorithm with low complexity will run faster than a fined tuned implementation of a algorithm with high complexity\n",
"If your data is large enough, a basic implementation of an algorithm with low complexity will run faster than a fined tuned implementation of a algorithm with high complexity\n",
" \n",
"Example:\n",
"--------------\n",
"### Example\n",
"\n",
"Looking for the missing element problem: we know that all the element for 0 to N should be present. We can compute the sum and calculate the difference between the computed sum and the mathematical sum. This algorithm acces only once each element. It thus has an $O(N)$ complexity, where N is the number of elements. \n",
"Looking for the missing element problem: we know that all the element for 0 to N should be present. We can compute the sum and calculate the difference between the computed sum and the mathematical sum. This algorithm access only once each element. It thus has an $O(N)$ complexity, where N is the number of elements. \n",
"\n",
"An algorithm that checks if element e belongs to the list, for each e will has an $O(N^2)$ complexity and will be slower that the previous one for **sufficient large value of N**. \n"
]
......@@ -56,50 +74,31 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Note 2:\n",
"----------\n",
"## Note 2\n",
" \n",
" If your data has some specificity, take advantage of it. \n",
"If your data has some specificity, take advantage of it. \n",
" \n",
" \n",
"Example:\n",
"--------------\n",
"### Example\n",
"\n",
" - if your list is sorted, solving the above problem can be done by checking that two consecutive elements in the list are consective numbers. The complexity is thus $O(N)$\n",
" - sorting N elements can be done in $O(N)$ in the special case where the $N$ items belongs to a range of size N."
"- if your list is sorted, solving the above problem can be done by checking that two consecutive elements in the list are consective numbers. The complexity is thus $O(N)$\n",
"- sorting N elements can be done in $O(N)$ in the special case where the $N$ items belongs to a range of size N."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note 3:\n",
"----------\n",
"## Note 3\n",
" \n",
" Complexity analysis is done over the **worst case**, what is the worst input for our algorithm\n",
"Complexity analysis is done over the **worst case**, what is the worst input for our algorithm\n",
" \n",
"Example:\n",
"--------------\n",
"\n",
" - sorting elements\n",
" - worst case: elements are already sorted but in reverse order\n",
" \n",
" - counting\n",
" - all counts are 1 .... "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Measure ⏱, don't guess! Profile to find the bottlenecks.\n",
"\n",
"\n",
"### Do not optimize everything!\n",
"### Example\n",
"\n",
"- *\"Premature optimization is the root of all evil\"* (Donald Knuth)\n",
"Sorting elements: \n",
"\n",
"- 80 / 20 rule, efficiency important for expensive things and NOT for small things"
"worst case = elements are already sorted but in reverse order\n",
" "
]
},
{
......@@ -145,9 +144,20 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"18.7 µs ± 700 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"13.5 µs ± 566 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"7.27 µs ± 38.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n",
"3.74 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
]
}
],
"source": [
"import math\n",
"l = [] \n",
......@@ -161,6 +171,19 @@
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:\n",
"\n",
"`python3 -m pyperf timeit -s \"import math; l=[]\" \"for x in range(100): l.append(math.pow(x,2))\"`"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Do not guess (the return of word counting problem)\n",
"\n"
......@@ -168,9 +191,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"len(s) = 7160000, nbkeys 33 base, count, count_count, except, colection.counter\n",
"517 ms ± 2.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n",
"478 ms ± 3.98 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n",
"196 ms ± 680 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n",
"464 ms ± 12.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n",
"247 ms ± 843 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n",
"414 ms ± 3.04 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"def build_count_base(t): \n",
" d = {} \n",
......@@ -220,8 +257,30 @@
"%timeit build_count_count(s)\n",
"%timeit build_count_excpt(s)\n",
"%timeit build_count_counter(s)\n",
"%timeit build_count_defaultdict(s)\n",
"\n",
"%timeit build_count_defaultdict(s)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"with split\n",
"len(s) = 1100000, nbkeys 90 base, count, count_count, except, colection.counter\n",
"113 ms ± 4.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n",
"104 ms ± 2.48 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n",
"982 ms ± 2.85 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n",
"93.2 ms ± 347 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n",
"59.9 ms ± 1.52 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n",
"410 ms ± 2.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"print(\"with split\")\n",
"s2 = s.split()\n",
"print(f\"len(s) = {len(s2)}, nbkeys {len(set(s2))} base, count, count_count, except, colection.counter\")\n",
......@@ -237,25 +296,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conclusion\n",
"### Conclusion of these measurements\n",
"\n",
"The best performing algorithm depends upon the feature of the input.\n",
"The best performing algorithm depends on the feature of the input.\n",
"\n",
"In our case: how many different words do we have in our vocabulary: \n",
"In our case: how many different words do we have in our vocabulary?\n",
" - 4 for ADN, \n",
" - 26 for letters, \n",
" - 60 Millions for natural texts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:\n",
"\n",
"`python3 -m pyperf timeit -s \"import math; l=[]\" \"for x in range(100): l.append(math.pow(x,2))\"`"
]
},
{
"cell_type": "markdown",
"metadata": {
......@@ -284,9 +334,17 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"elapsed time: 9.47e-05 s\n"
]
}
],
"source": [
"from time import time\n",
"\n",
......@@ -422,7 +480,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
"version": "3.7.2"
}
},
"nbformat": 4,
......
%% Cell type:markdown id: tags:
# Processing time evaluation
Pierre Augier (LEGI), Raphaël Bacher (Gipsa), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre),Loïc Huder (ISTerre)
Pierre Augier (LEGI), Raphaël Bacher (Gipsa), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Loïc Huder (ISTerre)
%% Cell type:markdown id: tags:
Given a problem (*e.g* finding which element is missing in a listof elements), we want to code a solution that solves the problem. The classical workflow is:
### Measure ⏱, don't guess! Profile to find the bottlenecks.
* **design** an algorithm that solves the problem.
- study the input of the algorithm (it is special in one sens -- *e.g* the list we are given is almost sorted)
- design an algorithms comes (given the specificity of your data)
- choose the adequate data structure, *i.e.* the data structure that will optimize the relevant operations, a.k.a the operations that takes time or that are repeated a large number of time.
- check the adequation of
* **evaluate** the complexity of the algorithm (theoretical point of view)
* **take care** of the special cases (can my list be empty, in such a case what is my strategy ?)
* **write your specs**: *e.g.* if the list is empty, we raise an exception.
* **write some tests** to check your implementation is correct
* **code** *i.e.* do the implementation of both code and test
* **profile**: check where are the bottleneck
* **code** the implementation,
* **profile**
* ....
%% Cell type:markdown id: tags:
### Do not optimize everything!
- *"Premature optimization is the root of all evil"* (Donald Knuth)
Note 1:
----------
- 80 / 20 rule: efficiency matters for the expensive parts and NOT for the small ones
If your data is large enough, a basic implementation of an algorithm with low complexity will run faster than a finely tuned implementation of an algorithm with high complexity.
%% Cell type:markdown id: tags:
Example:
--------------
## Context: some notes on developing efficient software
Looking for the missing element problem: we know that all the element for 0 to N should be present. We can compute the sum and calculate the difference between the computed sum and the mathematical sum. This algorithm acces only once each element. It thus has an $O(N)$ complexity, where N is the number of elements.
Given a problem (e.g. finding which element is missing in a list of elements), we want to code a solution that solves the problem. The classical workflow is:
An algorithm that checks, for each element e, whether e belongs to the list has an $O(N^2)$ complexity and will be slower than the previous one for a **sufficiently large value of N**.
* **design** an algorithm that solves the problem.
- study the input of the algorithm (is it special in one sense?)
- design an algorithm suited to the specificity of your data
- choose the adequate data structure, i.e. the one that optimizes the relevant operations, namely the operations that take time or that are repeated a large number of times.
* **evaluate** the complexity of the algorithm (theoretical point of view)
* **take care** of the special cases (can my list be empty? In such a case, what is my strategy?)
* **write your specs**: for example, if the list is empty, we raise an exception.
* **write some tests** to check your implementation is correct
* **code**
* **profile**: find where the bottlenecks are
* **code**
* **profile**
* ...
%% Cell type:markdown id: tags:
Note 2:
----------
## Note 1
If your data has some specificity, take advantage of it.
If your data is large enough, a basic implementation of an algorithm with low complexity will run faster than a finely tuned implementation of an algorithm with high complexity.
### Example
Example:
--------------
Looking for the missing element problem: we know that all the elements from 0 to N should be present. We can compute the sum and calculate the difference between the computed sum and the mathematical sum $N(N+1)/2$. This algorithm accesses each element only once. It thus has an $O(N)$ complexity, where N is the number of elements.
- if your list is sorted, solving the above problem can be done by checking that two consecutive elements in the list are consecutive numbers. The complexity is thus $O(N)$.
- sorting N elements can be done in $O(N)$ in the special case where the $N$ items belong to a range of size N.
An algorithm that checks, for each element e, whether e belongs to the list has an $O(N^2)$ complexity and will be slower than the previous one for a **sufficiently large value of N**.
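As a minimal sketch (our own illustration, with hypothetical function names), the two approaches could look like this:
``` python
def find_missing_linear(l, n):
    # O(N): the mathematical sum 0 + 1 + ... + N minus the actual sum
    return n * (n + 1) // 2 - sum(l)
def find_missing_quadratic(l, n):
    # O(N^2): for each candidate e, the test `e not in l` is itself an O(N) scan
    for e in range(n + 1):
        if e not in l:
            return e
print(find_missing_linear([0, 1, 3, 4], 4))     # -> 2
print(find_missing_quadratic([0, 1, 3, 4], 4))  # -> 2
```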
%% Cell type:markdown id: tags:
Note 3:
----------
## Note 2
Complexity analysis is done over the **worst case**: what is the worst input for our algorithm?
If your data has some specificity, take advantage of it.
Example:
--------------
- sorting elements
- worst case: elements are already sorted but in reverse order
### Example
- counting
- all counts are 1 ....
- if your list is sorted, solving the above problem can be done by checking that two consecutive elements in the list are consecutive numbers. The complexity is thus $O(N)$.
- sorting N elements can be done in $O(N)$ in the special case where the $N$ items belong to a range of size N.
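Two minimal sketches of these special cases (our own illustrations, assuming integer items):
``` python
def find_missing_sorted(l):
    # O(N): in a sorted list of 0..N, the first index i where l[i] != i is the gap
    for i, e in enumerate(l):
        if e != i:
            return i
    return len(l)
def counting_sort(l, n):
    # O(N): only valid when all items belong to the range 0..n-1
    counts = [0] * n
    for e in l:
        counts[e] += 1
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)
    return result
print(find_missing_sorted([0, 1, 2, 4]))   # -> 3
print(counting_sort([3, 1, 0, 1, 2], 4))   # -> [0, 1, 1, 2, 3]
```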
%% Cell type:markdown id: tags:
### Measure ⏱, don't guess! Profile to find the bottlenecks.
## Note 3
Complexity analysis is done over the **worst case**: what is the worst input for our algorithm?
### Do not optimize everything!
### Example
- *"Premature optimization is the root of all evil"* (Donald Knuth)
Sorting elements:
worst case = elements are already sorted but in reverse order
- 80 / 20 rule: efficiency matters for the expensive parts and NOT for the small ones
%% Cell type:markdown id: tags:
# Different types of profiling
## Time profiling
- Small code snippets
- Script based benchmark
- Function based profiling
- Line based profiling
<p class="small"><br></p>
## Memory profiling
%% Cell type:markdown id: tags:
## Small code snippets
- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html).
`python3 -m timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
Problem: the function `timeit.timeit` does not try to guess how many times to execute the statement (by default, it runs it 1,000,000 times).
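For instance, a minimal sketch of programmatic use (our own example):
``` python
import math
import timeit
# timeit.timeit runs the statement `number` times (default: 1_000_000)
t = timeit.timeit("[math.pow(x, 2) for x in range(100)]",
                  globals={"math": math}, number=10_000)
print(f"{t / 10_000 * 1e6:.1f} µs per loop")
```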
- In IPython, you can use the magic command `%timeit`, which executes a piece of code and reports statistics on the time it takes:
%% Cell type:code id: tags:
``` python
import math
l = []
%timeit for x in range(100): l.append(math.pow(x,2))
%timeit [math.pow(x,2) for x in range(100)]
l = []
%timeit for x in range(100): l.append(x*x)
%timeit [x*x for x in range(100)]
```
%%%% Output: stream
18.7 µs ± 700 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
13.5 µs ± 566 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
7.27 µs ± 38.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
3.74 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%% Cell type:markdown id: tags:
- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:
`python3 -m pyperf timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
%% Cell type:markdown id: tags:
## Do not guess (the return of the word counting problem)
%% Cell type:code id: tags:
``` python
def build_count_base(t):
d = {}
for s in t:
if s in d:
d[s] += 1
else:
d[s] = 1
return d
def build_count_set(t):
d = {k:0 for k in set(t)}
for s in t:
d[s] += 1
return d
def build_count_count(t):
d = {k:t.count(k) for k in set(t)}
return d
def build_count_excpt(t):
d = {}
for s in t:
try:
d[s] += 1
except:
d[s] = 1
return d
import collections
def build_count_counter(t):
return collections.Counter(t)
def build_count_defaultdict(t):
d = collections.defaultdict(int)
for k in t:
d[k] += 1
return d
s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam tristique at velit in varius. Cras ut ultricies orci. Fusce vel consequat ante, vitae luctus tortor. Sed condimentum faucibus enim, sit amet pulvinar ligula feugiat ac. Sed interdum id risus id rhoncus. Nullam nisi justo, ultrices eu est nec, hendrerit maximus lorem. Nam urna eros, accumsan nec magna eu, elementum semper diam. Nulla tempus, nibh id elementum dapibus, ex diam lacinia est, sit amet suscipit nulla nibh eu sapien. Aliquam orci enim, malesuada in facilisis vitae, pharetra sit amet mi. Pellentesque mi tortor, sagittis quis odio quis, fermentum faucibus ex. Aenean sagittis nisl orci. Maecenas tristique velit sed leo facilisis porttitor. "
s = s*10000
len(s)
print(f"len(s) = {len(s)}, nbkeys {len(set(s))} base, count, count_count, except, colection.counter")
%timeit build_count_base(s)
%timeit build_count_set(s)
%timeit build_count_count(s)
%timeit build_count_excpt(s)
%timeit build_count_counter(s)
%timeit build_count_defaultdict(s)
```
%%%% Output: stream
len(s) = 7160000, nbkeys 33: base, set, count, except, counter, defaultdict
517 ms ± 2.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
478 ms ± 3.98 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
196 ms ± 680 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
464 ms ± 12.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
247 ms ± 843 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
414 ms ± 3.04 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%% Cell type:code id: tags:
``` python
print("with split")
s2 = s.split()
print(f"len(s) = {len(s2)}, nbkeys {len(set(s2))} base, count, count_count, except, colection.counter")
%timeit build_count_base(s2)
%timeit build_count_set(s2)
%timeit build_count_count(s2)
%timeit build_count_excpt(s2)
%timeit build_count_counter(s2)
%timeit build_count_defaultdict(s2)
```
%%%% Output: stream
with split
len(s) = 1100000, nbkeys 90: base, set, count, except, counter, defaultdict
113 ms ± 4.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
104 ms ± 2.48 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
982 ms ± 2.85 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
93.2 ms ± 347 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
59.9 ms ± 1.52 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
410 ms ± 2.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%% Cell type:markdown id: tags:
# Conclusion
### Conclusion of these measurements
The best performing algorithm depends upon the feature of the input.
The best performing algorithm depends on the features of the input.
In our case: how many different words do we have in our vocabulary:
In our case: how many different words do we have in our vocabulary?
- 4 for DNA,
- 26 for letters,
- 60 million for natural texts
%% Cell type:markdown id: tags:
- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool but we can also do the same as with the module `timeit`:
`python3 -m pyperf timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
%% Cell type:markdown id: tags:
## Script based benchmark
Evaluate the execution time of your script as a whole
- Using the Unix command `time`:
`time python3 myscript.py`
- Using the Unix program [`perf`](https://perf.wiki.kernel.org)
`perf stat python3 myscript.py`
Issues:
- not accurate (only one run!)
- includes the import and initialization time. It can be better to modify the script to print the elapsed time measured with:
%% Cell type:code id: tags:
``` python
import math
from time import time
l = []
t_start = time()
[math.pow(x,2) for x in range(100)]
print(f"elapsed time: {time() - t_start:.2e} s")
```
%%%% Output: stream
elapsed time: 9.47e-05 s
%% Cell type:markdown id: tags:
## Function based profiling (cProfile)
cProfile (https://docs.python.org/3.7/library/profile.html): **deterministic profiling** of Python programs.
2 steps: (1) run the profiler and (2) analyze the results.
1. Run the profiler
- With an already written script: `python3 -m cProfile myscript.py`
- Much better: write a dedicated script using the module cProfile. See `pyfiles/dtw_cort_dist/V0_numpy_loops/prof.py`
**Warning: profiling is much slower than a classical run, so do not profile with a long-running setting**
2. Analyze the results
The standard tool is `pstats` (https://docs.python.org/3.7/library/profile.html#module-pstats)
Or visualize the results with `gprof2dot`, `SnakeViz`, `pyprof2calltree` and `kcachegrind`
Example: `pyprof2calltree -i prof.pstats -k`
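A minimal sketch of such a dedicated script (our own illustration, not the actual `prof.py` from the repository):
``` python
import cProfile
import pstats
def to_profile():
    # hypothetical workload standing in for the real computation
    return sum(x ** 0.5 for x in range(10 ** 6))
# (1) run the profiler and save the statistics to a file
cProfile.run("to_profile()", "prof.pstats")
# (2) analyze the results: print the 10 most costly functions
stats = pstats.Stats("prof.pstats")
stats.sort_stats("cumulative").print_stats(10)
```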
%% Cell type:markdown id: tags:
## Statistical profiling
See http://pramodkumbhar.com/2019/01/python-deterministic-vs-statistical-profilers/
Advantage compared to deterministic profiling: **very small overhead**
- [pyflame](https://github.com/uber/pyflame)
- [py-spy](https://github.com/benfred/py-spy)
- [plop](https://github.com/bdarnell/plop)
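For example, with `py-spy` (assuming it is installed), a running script can be sampled without modifying it: `py-spy record -o profile.svg -- python3 myscript.py`.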
%% Cell type:markdown id: tags:
## Line based profiling
- [line_profiler](https://github.com/rkern/line_profiler)
- [pprofile](https://github.com/vpelletier/pprofile)
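A sketch of typical `line_profiler` usage (the script name and function are our own): decorate the function with `@profile` and run the script through `kernprof`:
``` python
# hypothetical script: demo_line.py -- run with: kernprof -l -v demo_line.py
@profile  # this decorator is injected as a builtin by kernprof, no import needed
def compute():
    squares = [x * x for x in range(10 ** 5)]  # which line costs the most?
    return sum(squares)
if __name__ == "__main__":
    compute()
```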
%% Cell type:markdown id: tags:
## Memory profiler
- [memory_profiler](https://pypi.org/project/memory-profiler/)
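A sketch of typical usage (hypothetical script and function names):
``` python
# hypothetical script: demo_memory.py
# run with: python3 -m memory_profiler demo_memory.py
from memory_profiler import profile
@profile  # prints line-by-line memory usage when the function runs
def allocate():
    big = [0] * 10 ** 7  # roughly 80 MB
    del big
if __name__ == "__main__":
    allocate()
```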
%% Cell type:markdown id: tags:
## Time and memory profiler
- [vprof](https://pypi.org/project/vprof/)
%% Cell type:markdown id: tags:
# Further reading
More on profiling on a stackoverflow discussion:
https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script
......
......@@ -15,11 +15,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We consider here wrapping two static languages: C and Fortran.\n",
"We mainly consider here wrapping codes in C and Fortran.\n",
"\n",
"We classically wrap already existing code to access them via Python. \n",
"We classically wrap already existing codes to access them via Python. \n",
"\n",
"Depending on the language to wrap the tool to use are different. \n"
"Depending on the language to wrap, the tools are different. \n"
]
},
{
......
%% Cell type:markdown id: tags:
# Wrapping codes in static languages
%% Cell type:markdown id: tags:
We consider here wrapping two static languages: C and Fortran.
We mainly consider here wrapping codes in C and Fortran.
We classically wrap already existing code to access them via Python.
We classically wrap already existing codes to access them via Python.
Depending on the language to wrap the tool to use are different.
Depending on the language to wrap, the tools are different.
%% Cell type:markdown id: tags:
## Fortran with [f2py](https://docs.scipy.org/doc/numpy/f2py/)
`f2py` is a tool that allows calling Fortran code from Python. It is part of `numpy`, meaning that to use it, we only need to install and import numpy (which should already be done if you do scientific Python!):
```bash
pip3 install numpy
```
%% Cell type:markdown id: tags:
### How does it work?
The documentation gives several ways to wrap Fortran codes, but it all boils down to the same thing:
**f2py wraps the Fortran code in a Python module that can then be imported**
Consider this simple Fortran (F90) snippet, which computes the sum of squares of the elements of an array:
%% Cell type:markdown id: tags:
*pyfiles/f2py/file_to_wrap.f90*
```fortran
subroutine sum_squares(A, res)
implicit none
real, dimension(:) :: A
real :: res
integer :: i, N
N = size(A)
res = 0.
do i=1, N
res = res + A(i)*A(i)
end do
end subroutine
```
%% Cell type:markdown id: tags:
The Fortran code can then be wrapped in one command:
%% Cell type:markdown id: tags:
```bash
# Syntax: python3 -m numpy.f2py -c <Fortran_files> -m <module_name>
python3 -m numpy.f2py -c "../pyfiles/f2py/file_to_wrap.f90" -m wrap_f90
```
This command calls the module `f2py` of `numpy` to compile (`-c`) *file_to_wrap.f90* into a Python module (`-m`) named *wrap_f90*. The module can then be imported in Python:
%% Cell type:code id: tags:
``` python
import numpy as np
import wrap_f90
A = np.ones(10)
result = 0.
wrap_f90.sum_squares(A, result)
print(result)
```
%% Cell type:markdown id: tags:
### With intents
%% Cell type:markdown id: tags:
In Fortran, it is considered best practice to give intents to subroutine arguments. This also helps `f2py` wrap the code efficiently, but it changes the wrapped subroutine a bit.
Let's wrap the code updated with intents:
%% Cell type:markdown id: tags:
*pyfiles/f2py/file_to_wrap2.f90*
```fortran
subroutine sum_squares(A, res)
implicit none
real, dimension(:), intent(in) :: A
real, intent(out) :: res
integer :: i, N
N = size(A)
res = 0.
do i=1, N
res = res + A(i)*A(i)
end do
end subroutine
```
%% Cell type:markdown id: tags:
Again, we wrap...
%% Cell type:markdown id: tags:
```bash
python3 -m numpy.f2py -c "../pyfiles/f2py/file_to_wrap2.f90" -m wrap_f90
```
%% Cell type:markdown id: tags:
And we import...
%% Cell type:code id: tags:
``` python
import numpy as np
import wrap_f90
A = np.ones(10)
result = wrap_f90.sum_squares(A)
print(result)
```
%% Cell type:markdown id: tags:
This time, f2py recognized that `result` was an outgoing argument. As a consequence, the subroutine was wrapped smartly and made to return the value.
Note that using a `function` (in the Fortran sense of the term) leads to the same result (see the other example in *pyfiles/f2py/file_to_wrap2.f90*).
%% Cell type:markdown id: tags:
### With modules
%% Cell type:markdown id: tags:
In Fortran, it is also considered best practice to organize the subroutines in modules. These are highly similar to Python modules and are, in fact, interpreted as such by f2py!
Consider the following code that implements the dtw and cort computations in Fortran:
*pyfiles/dtw_cort_dist/V9_fortran/dtw_cort.f90*
```fortran
module dtw_cort
implicit none
contains
subroutine dtwdistance(s1, s2, dtw_result)
! Computes the dtw between s1 and s2 with distance the absolute distance
doubleprecision, intent(in) :: s1(:), s2(:)
doubleprecision, intent(out) :: dtw_result
integer :: i, j
integer :: len_s1, len_s2
doubleprecision :: dist
doubleprecision, allocatable :: dtw_mat(:, :)
len_s1 = size(s1)
len_s2 = size(s2)
allocate(dtw_mat(len_s1, len_s2))
dtw_mat(1, 1) = dabs(s1(1) - s2(1))
do j = 2, len_s2
dist = dabs(s1(1) - s2(j))
dtw_mat(1, j) = dist + dtw_mat(1, j-1)
end do
do i = 2, len_s1
dist = dabs(s1(i) - s2(1))
dtw_mat(i, 1) = dist + dtw_mat(i-1, 1)
end do
! Fill the dtw_matrix
do i = 2, len_s1
do j = 2, len_s2
dist = dabs(s1(i) - s2(j))
dtw_mat(i, j) = dist + dmin1(dtw_mat(i - 1, j), &
dtw_mat(i, j - 1), &
dtw_mat(i - 1, j - 1))
end do
end do
dtw_result = dtw_mat(len_s1, len_s2)
end subroutine dtwdistance
doubleprecision function cort(s1, s2)
! Computes the cort between s1 and s2 (assuming they have the same length)
doubleprecision, intent(in) :: s1(:), s2(:)
integer :: len_s1, t
doubleprecision :: slope_1, slope_2
doubleprecision :: num, sum_square_x, sum_square_y
len_s1 = size(s1)
num = 0
sum_square_x = 0
sum_square_y = 0
do t=1, len_s1 - 1
slope_1 = s1(t + 1) - s1(t)
slope_2 = s2(t + 1) - s2(t)
num = num + slope_1 * slope_2
sum_square_x = sum_square_x + slope_1 * slope_1
sum_square_y = sum_square_y + slope_2 * slope_2
end do
cort = num / (dsqrt(sum_square_x*sum_square_y))
end function cort
end module dtw_cort
```
%% Cell type:markdown id: tags:
The subroutines `dtwdistance` and `cort` are part of the `dtw_cort` module. The file can be wrapped as before
```bash
python3 -m numpy.f2py -c "../pyfiles/dtw_cort_dist/V9_fortran/dtw_cort.f90" -m distances_fort
```
%% Cell type:markdown id: tags:
But the import slightly changes as `dtw_cort` is now a module of `distances_fort`:
%% Cell type:code id: tags:
``` python
import numpy as np
from distances_fort import dtw_cort
# create two sample series so that the snippet is runnable
s1 = np.random.rand(100)
s2 = np.random.rand(100)
cort_result = dtw_cort.cort(s1, s2)
print(cort_result)
```
%% Cell type:markdown id: tags:
Note that the wrapping integrates the documentation of the function (if written...)!
%% Cell type:code id: tags:
``` python
from distances_fort import dtw_cort
print(dtw_cort.cort.__doc__)
```
%% Cell type:markdown id: tags:
### To go further...
Running the command `python3 -m numpy.f2py` (without arguments) gives a lot of information on the supported arguments for further uses of `f2py`. This way, you can:
* Specify the compiler to use
* Give the compiler flags (warnings, optimisations...)
* Specify the functions to wrap
* ...
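For instance, one can pass optimization flags to the Fortran compiler (a sketch, assuming `gfortran` is used): `python3 -m numpy.f2py -c ../pyfiles/f2py/file_to_wrap2.f90 -m wrap_f90 --f90flags="-O3"`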
The documentation of f2py (https://docs.scipy.org/doc/numpy/f2py/) can also help, covering notably:
* `Cf2py` directives to overcome F77 limitations (e.g. intents)
* How to integrate Fortran sources to your Python packages and compile them on install
* How to use `f2py` inside Python scripts
* ...
%% Cell type:markdown id: tags:
## Wrapping C code
%% Cell type:markdown id: tags:
There are different ways of wrapping C code. We present CFFI.
The workflow is the following:
1. Get your C code working (with a .c and a .h)
2. Set up your packaging to compile your code as a module
3. Compile your code
4. In the python code, declare the function you will be using
5. In the python code, open/load the compiled module
6. Use your functions
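The steps are detailed below; as a preview, here is a minimal sketch of steps 4 to 6 in CFFI's ABI mode (the function name, signature and library path are hypothetical):
``` python
import numpy as np
from cffi import FFI
ffi = FFI()
# step 4: declare the functions we will use (must match the .h)
ffi.cdef("double square_sum(double *x, int n);")  # hypothetical signature
# step 5: open/load the compiled module
lib = ffi.dlopen("./cdtw.so")  # hypothetical path to the compiled library
# step 6: use the function, passing a pointer to the numpy array's buffer
a = np.ones(10)
print(lib.square_sum(ffi.cast("double *", a.ctypes.data), len(a)))
```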
%% Cell type:markdown id: tags:
**1. Get your C code working (with a .c and a .h)**
OK, this is supposed to be done already.
%% Cell type:markdown id: tags:
**2. Set up your packaging to compile your code as a module**
We give the compilation instructions in the file setup.py:
%% Cell type:markdown id: tags:
```python
from setuptools import setup, Extension
version = "0.1"
module_distance = Extension(
name="cdtw",
sources=["cdtw.c"],
)
setup(
name="dtw_cort_dist_mat",
version=version,
description="data scientist tool for time series",
long_description="data scientist tool for time series",
classifiers=[],
author="Robert Bidochon",
author_email="robert@bidochon.fr",
license="GPL",
include_package_data=True,
install_requires=["cffi", "numpy", "setuptools"],
entry_points="",
ext_modules