Pierre Augier (LEGI), Raphaël Bacher (Gipsa), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Loïc Huder (ISTerre)
%% Cell type:markdown id: tags:
### Measure ⏱, don't guess! Profile to find the bottlenecks.
### Do not optimize everything!
- *"Premature optimization is the root of all evil"* (Donald Knuth)
- 80 / 20 rule: efficiency matters for the expensive parts of the code, NOT for the cheap ones
%% Cell type:markdown id: tags:
## Context: some notes on developing efficient software
Given a problem (e.g. finding which element is missing in a list of elements), we want to code a solution that solves the problem. The classical workflow is:
* **design** an algorithm that solves the problem.
  - study the input of the algorithm (is it special in some sense?)
  - design an algorithm that takes the specificity of your data into account
  - choose the adequate data structure, i.e. the data structure that optimizes the relevant operations, a.k.a. the operations that take time or that are repeated a large number of times.
* **evaluate** the complexity of the algorithm (theoretical point of view)
* **take care** of the special cases (can my list be empty? If so, what is my strategy?)
* **write your specs**: for example, if the list is empty, we raise an exception.
* **write some tests** to check that your implementation is correct
* **code**
* **profile**: find where the bottlenecks are
* **code**
* **profile**
* ...
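
The "specs" and "tests" steps above can be sketched as follows. This is only an illustration under stated assumptions: the function name `find_missing` is hypothetical, and the input is assumed to hold the integers 0..N with exactly one of them removed.

```python
def find_missing(values):
    """Return the integer missing from `values` (assumed: 0..N minus one item).

    Spec: an empty list is an error and raises ValueError.
    """
    if not values:
        raise ValueError("cannot find a missing element in an empty list")
    n = len(values)  # one element is missing, so the full range is 0..n
    return n * (n + 1) // 2 - sum(values)


def test_find_missing():
    # Nominal cases: the missing element is in the middle or at the start
    assert find_missing([0, 1, 3]) == 2
    assert find_missing([1, 2, 3]) == 0
    # Special case from the spec: the empty list raises
    try:
        find_missing([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")


test_find_missing()
```

Writing the test before optimizing means every later, faster version can be checked against the same spec.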
%% Cell type:markdown id: tags:
## Note 1
If your data is large enough, a basic implementation of an algorithm with low complexity will run faster than a fine-tuned implementation of an algorithm with high complexity.
### Example
The missing element problem: we know that all the elements from 0 to N should be present. We can compute the sum of the list and subtract it from the mathematical sum $N(N+1)/2$; the difference is the missing element. This algorithm accesses each element only once. It thus has an $O(N)$ complexity, where N is the number of elements.
An algorithm that checks, for each e, whether element e belongs to the list has an $O(N^2)$ complexity and will be slower than the previous one for **sufficiently large values of N**.
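
A minimal sketch of the two approaches, to make the difference concrete. The function names are hypothetical, and `n` is assumed to be the largest expected value.

```python
def missing_by_sum(values, n):
    """O(N): every element is read exactly once by sum()."""
    expected = n * (n + 1) // 2  # mathematical sum of 0, 1, ..., n
    return expected - sum(values)


def missing_by_membership(values, n):
    """O(N^2) on a list: each `in` test may scan the whole list."""
    for candidate in range(n + 1):
        if candidate not in values:
            return candidate
    raise ValueError("no element is missing")


values = [0, 1, 2, 4, 5]  # 3 is missing from 0..5
print(missing_by_sum(values, 5), missing_by_membership(values, 5))
```

Both functions return the same answer; only the amount of work differs as N grows.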
%% Cell type:markdown id: tags:
## Note 2
If your data has some specificity, take advantage of it.
### Example
- if your list is sorted, the above problem can be solved by checking that any two consecutive elements of the list are consecutive numbers. The complexity is thus $O(N)$.
- sorting N elements can be done in $O(N)$ in the special case where the $N$ items belong to a range of size N.
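
Both special cases can be sketched in a few lines. The names below are hypothetical, and the linear-time sort shown is a counting sort, which is one way (assumed here, not stated in the notes) to exploit a known value range.

```python
def missing_in_sorted(values):
    """Find the missing integer in a sorted list drawn from 0..N, in one pass."""
    if not values:
        raise ValueError("empty list")
    if values[0] != 0:
        return 0
    for previous, current in zip(values, values[1:]):
        if current != previous + 1:
            return previous + 1
    return values[-1] + 1  # no gap found: the missing element is the last one


def counting_sort(values, size):
    """Sort integers known to lie in range(size) in O(N + size)."""
    counts = [0] * size
    for v in values:
        counts[v] += 1
    result = []
    for v, c in enumerate(counts):
        result.extend([v] * c)
    return result


print(missing_in_sorted([0, 1, 2, 4, 5]))
print(counting_sort([3, 0, 2, 3, 1], 4))
```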
%% Cell type:markdown id: tags:
## Note 3
Complexity analysis is done over the **worst case**: what is the worst input for our algorithm?
### Example
Sorting elements:
worst case = elements are already sorted but in reverse order (true for simple algorithms such as insertion sort; the worst input depends on the algorithm)
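
As an illustration, here is a sketch that takes insertion sort as one example of an algorithm whose worst input is a reverse-sorted list. The helper `insertion_sort_shifts` is hypothetical; it counts how many elements have to be moved.

```python
def insertion_sort_shifts(values):
    """Sort a copy with insertion sort and return the number of element shifts."""
    items = list(values)
    shifts = 0
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # Move every larger element one slot to the right
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            shifts += 1
            j -= 1
        items[j + 1] = key
    return shifts


n = 100
print(insertion_sort_shifts(range(n)))            # already sorted: best case
print(insertion_sort_shifts(range(n - 1, -1, -1)))  # reverse order: n*(n-1)/2 shifts
```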
%% Cell type:markdown id: tags:
# Different types of profiling
## Time profiling
- Small code snippets
- Script based benchmark
- Function based profiling
- Line based profiling
<p class="small"><br></p>
## Memory profiling
%% Cell type:markdown id: tags:
## Small code snippets
- There is a module [`timeit` in the standard library](https://docs.python.org/3/library/timeit.html).
`python3 -m timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
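Note: the command-line interface picks a suitable number of loops automatically, whereas the Python function `timeit.timeit` does not try to guess how many times to execute the statement (it runs it `number=1000000` times by default).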
- In IPython, you can use the magic command `%timeit`, which executes a piece of code and reports statistics on the time it takes:
%% Cell type:code id: tags:
``` python
import math
l = []
%timeit for x in range(100): l.append(math.pow(x,2))
%timeit [math.pow(x,2) for x in range(100)]
l = []
%timeit for x in range(100): l.append(x*x)
%timeit [x*x for x in range(100)]
```
%%%% Output: stream
18.7 µs ± 700 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
13.5 µs ± 566 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
7.27 µs ± 38.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
3.74 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
28.6 µs ± 1.19 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
19.6 µs ± 283 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
13 µs ± 462 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
7.45 µs ± 239 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%% Cell type:markdown id: tags:
- [`pyperf`](https://pypi.org/project/pyperf/) is a more powerful tool, but it can also be used in the same way as the module `timeit`:
`python3 -m pyperf timeit -s "import math; l=[]" "for x in range(100): l.append(math.pow(x,2))"`
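
Outside IPython and the command line, a similar measurement can be scripted with the Python API of the standard `timeit` module. The sketch below is only one possible way to do it; the helper name `best_time_per_loop` is hypothetical. It relies on `Timer.autorange()` to choose the loop count, since `timeit.timeit` itself does not.

```python
import timeit


def best_time_per_loop(stmt, setup="pass", repeat=3):
    """Return the best observed time per loop, in seconds, for `stmt`."""
    timer = timeit.Timer(stmt, setup=setup)
    loops, _ = timer.autorange()  # loop count such that one run lasts >= 0.2 s
    return min(timer.repeat(repeat=repeat, number=loops)) / loops


for stmt in ("[math.pow(x, 2) for x in range(100)]", "[x * x for x in range(100)]"):
    best = best_time_per_loop(stmt, setup="import math")
    print(f"{stmt}: {best * 1e6:.2f} µs per loop")
```

Taking the minimum over the repeats is a common convention, because higher values are usually caused by other processes interfering with the measurement.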
%% Cell type:markdown id: tags:
## Do not guess (the return of word counting problem)
%% Cell type:code id: tags:
``` python
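def build_count_base(t):
    # Plain dict with an explicit membership test
    d = {}
    for s in t:
        if s in d:
            d[s] += 1
        else:
            d[s] = 1
    return d

def build_count_set(t):
    # Pre-fill the dict with every distinct item, then count
    d = {k: 0 for k in set(t)}
    for s in t:
        d[s] += 1
    return d

def build_count_count(t):
    # One full scan of t (t.count) per distinct item
    d = {k: t.count(k) for k in set(t)}
    return d

def build_count_excpt(t):
    # Try first, handle the missing key afterwards
    d = {}
    for s in t:
        try:
            d[s] += 1
        except KeyError:
            d[s] = 1
    return d

import collections

def build_count_counter(t):
    return collections.Counter(t)

def build_count_defaultdict(t):
    # Missing keys start at int() == 0
    d = collections.defaultdict(int)
    for k in t:
        d[k] += 1
    return d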
s="Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam tristique at velit in varius. Cras ut ultricies orci. Fusce vel consequat ante, vitae luctus tortor. Sed condimentum faucibus enim, sit amet pulvinar ligula feugiat ac. Sed interdum id risus id rhoncus. Nullam nisi justo, ultrices eu est nec, hendrerit maximus lorem. Nam urna eros, accumsan nec magna eu, elementum semper diam. Nulla tempus, nibh id elementum dapibus, ex diam lacinia est, sit amet suscipit nulla nibh eu sapien. Aliquam orci enim, malesuada in facilisis vitae, pharetra sit amet mi. Pellentesque mi tortor, sagittis quis odio quis, fermentum faucibus ex. Aenean sagittis nisl orci. Maecenas tristique velit sed leo facilisis porttitor. "