
Commit 85fa4f64 authored by Loic Huder

Updated numpy presentation

parent 0fd19401
......@@ -75,9 +75,7 @@
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"metadata": {},
"outputs": [],
"source": [
"import numpy as np"
......@@ -120,7 +118,7 @@
{
"data": {
"text/plain": [
"array([4.64593213e-310, 6.89924492e-310, 0.00000000e+000, 0.00000000e+000])"
"array([1.07312086e-316, 0.00000000e+000, 5.41759977e-317, 6.93318354e-310])"
]
},
"execution_count": 3,
......@@ -210,7 +208,7 @@
}
],
"source": [
"# like range but produce 1d array\n",
"# like range but produce 1D numpy array\n",
"np.arange(4)"
]
},
......@@ -231,7 +229,7 @@
}
],
"source": [
"# np.arange can produce arrays of float\n",
"# np.arange can produce arrays of floats\n",
"np.arange(4.)"
]
},
......@@ -252,7 +250,7 @@
}
],
"source": [
"# another convenient function 1d array\n",
"# another convenient function to generate 1D arrays\n",
"np.linspace(10, 20, 5)"
]
},
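As a minimal sketch of the three constructors shown in the cells above (the variable names `ints`, `floats` and `points` are illustrative and not part of the notebook), the following standalone snippet shows how the argument type drives the resulting `dtype`, and that `np.linspace` includes both end points:

```python
import numpy as np

# Integer argument: behaves like range(4) but returns a 1D numpy array
ints = np.arange(4)

# Float argument: the resulting array holds floats
floats = np.arange(4.)

# 5 evenly spaced values from 10 to 20, both end points included
points = np.linspace(10, 20, 5)

print(ints, ints.dtype)
print(floats, floats.dtype)
print(points)
```

`np.arange` takes a step, whereas `np.linspace` takes a number of points, which is often more convenient when the end point must be included.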
......@@ -260,7 +258,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"An numpy array can be easily converted to a Python list."
"A numpy array can be easily converted to a Python list."
]
},
{
......@@ -292,7 +290,7 @@
}
},
"source": [
"# Why do we have numpy?\n",
"# Numpy efficiency\n",
"Besides offering convenient functions for manipulating data in arrays of arbitrary dimensions, numpy can be much more efficient than pure Python."
]
},
......@@ -305,7 +303,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"17.7 µs ± 528 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n"
"13.4 µs ± 1.61 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
]
}
],
......@@ -318,9 +316,7 @@
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": true
},
"metadata": {},
"outputs": [],
"source": [
"%%capture timeit_python\n",
......@@ -332,9 +328,7 @@
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"metadata": {},
"outputs": [],
"source": [
"%%capture timeit_numpy\n",
......@@ -345,9 +339,7 @@
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": true
},
"metadata": {},
"outputs": [],
"source": [
"def compute_time_in_second(timeit_result):\n",
......@@ -383,11 +375,11 @@
"name": "stdout",
"output_type": "stream",
"text": [
"18.6 us +- 1.94 us per loop (mean +- std. dev. of 7 runs, 100000 loops each)\n",
"11.9 us +- 437 ns per loop (mean +- std. dev. of 7 runs, 100000 loops each)\n",
"\n",
"2.19 us +- 272 ns per loop (mean +- std. dev. of 7 runs, 100000 loops each)\n",
"1.34 us +- 93.9 ns per loop (mean +- std. dev. of 7 runs, 1000000 loops each)\n",
"\n",
"Creation of object: ratio times (Python / NumPy): 8.493150684931509\n"
"Creation of object: ratio times (Python / NumPy): 8.880597014925373\n"
]
}
],
......@@ -398,9 +390,7 @@
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": true
},
"metadata": {},
"outputs": [],
"source": [
"n = 200000\n",
......@@ -414,9 +404,7 @@
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": true
},
"metadata": {},
"outputs": [],
"source": [
"%%capture timeit_python\n",
......@@ -434,7 +422,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"22.9 ms +- 667 us per loop (mean +- std. dev. of 7 runs, 10 loops each)\n",
"20.2 ms +- 2.77 ms per loop (mean +- std. dev. of 7 runs, 10 loops each)\n",
"\n"
]
}
......@@ -446,9 +434,7 @@
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": true
},
"metadata": {},
"outputs": [],
"source": [
"%%capture timeit_numpy\n",
......@@ -466,7 +452,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"348 us +- 18.7 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)\n",
"207 us +- 15 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)\n",
"\n"
]
}
......@@ -484,11 +470,11 @@
"name": "stdout",
"output_type": "stream",
"text": [
"22.9 ms +- 667 us per loop (mean +- std. dev. of 7 runs, 10 loops each)\n",
"20.2 ms +- 2.77 ms per loop (mean +- std. dev. of 7 runs, 10 loops each)\n",
"\n",
"348 us +- 18.7 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)\n",
"207 us +- 15 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)\n",
"\n",
"Additions: ratio times (Python / NumPy): 65.80459770114942\n"
"Additions: ratio times (Python / NumPy): 97.58454106280193\n"
]
}
],
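The timing cells are elided in this diff, so the snippet below is only a sketch of the same kind of comparison outside a notebook. It assumes the standard `timeit` module in place of the `%%timeit` and `%%capture` magics, and the helper names `add_python` and `add_numpy` are hypothetical:

```python
import timeit

import numpy as np

n = 200000
python_list = list(range(n))
numpy_array = np.arange(n)


def add_python():
    # Element-wise addition with a pure Python list comprehension
    return [x + x for x in python_list]


def add_numpy():
    # Same addition, performed by numpy in compiled code
    return numpy_array + numpy_array


# Best of 3 repeats, 10 calls each, reported as time per call in seconds
t_python = min(timeit.repeat(add_python, number=10, repeat=3)) / 10
t_numpy = min(timeit.repeat(add_numpy, number=10, repeat=3)) / 10

print("Additions: ratio times (Python / NumPy):", t_python / t_numpy)
```

The measured ratio depends on the machine and on the numpy build, which is consistent with the two different ratios recorded before and after this commit.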
......@@ -511,139 +497,281 @@
}
},
"source": [
"## How a np.ndarray can be used?"
"# Manipulating numpy arrays"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`numpy` arrays offer also the same functionality available in language such as Fortan or Matlab. \n",
"For example, we can define the slice of the array `A` that comprises its two first rows and the columns with an even index:"
"## Access elements\n",
"Elements in a `numpy` array can be accessed using indexing and slicing in any dimension. This is the same functionality available in Fortran or Matlab.\n",
"\n",
"For example, we can create an array `A` and perform various selection operations on it."
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.6642251 , 0.81037275, 0.82464498, 0.03382786, 0.52260623],\n",
" [0.29404821, 0.59309748, 0.1408015 , 0.24315286, 0.02499713],\n",
" [0.67212693, 0.66863594, 0.61845366, 0.21243744, 0.84314157],\n",
" [0.40933625, 0.77076404, 0.48664432, 0.5823091 , 0.45242895]])"
]
},
"execution_count": 69,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"A = np.random.random([4, 5])\n",
"A"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.2940482110988186"
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get the element from the second line, first column\n",
"A[1, 0]"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.6642251 , 0.81037275, 0.82464498, 0.03382786, 0.52260623],\n",
" [0.29404821, 0.59309748, 0.1408015 , 0.24315286, 0.02499713]])"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get the first two lines\n",
"A[:2]"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.52260623, 0.02499713, 0.84314157, 0.45242895])"
]
},
"execution_count": 72,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get the last column\n",
"A[:, -1]"
]
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 73,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1.],\n",
" [1.]])"
"array([[0.6642251 , 0.82464498, 0.52260623],\n",
" [0.29404821, 0.1408015 , 0.02499713]])"
]
},
"execution_count": 21,
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a = np.ones([4, 2])\n",
"a[:2, ::2]"
"# Get the first two lines and the columns with an even index\n",
"A[:2, ::2]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Mask:"
"##### Using a mask to select elements satisfying a condition:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 74,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ True True]\n",
" [ True True]\n",
" [ True True]\n",
" [ True True]]\n",
"[1. 1. 1. 1. 1. 1. 1. 1.]\n"
"[[ True True True False True]\n",
" [False True False False False]\n",
" [ True True True False True]\n",
" [False True False True False]]\n",
"[0.6642251 0.81037275 0.82464498 0.52260623 0.59309748 0.67212693\n",
" 0.66863594 0.61845366 0.84314157 0.77076404 0.5823091 ]\n"
]
}
],
"source": [
"cond = a > 0.5\n",
"cond = A > 0.5\n",
"print(cond)\n",
"print(a[cond])"
"print(A[cond])"
]
},
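To make the mask mechanism easier to follow than with random values, here is a small sketch on a fixed array (the values are chosen for illustration only). The comparison produces a boolean array of the same shape, and indexing with it returns the selected elements as a 1D array in row-major order:

```python
import numpy as np

A = np.array([[0.2, 0.7, 0.9],
              [0.6, 0.1, 0.4]])

# Boolean array with the same shape as A
cond = A > 0.5

# 1D array of the elements where cond is True, read line by line
selected = A[cond]

print(cond)
print(selected)
print("Number of selected elements:", cond.sum())
```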
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Element wise operations:"
"##### Apply operations to whole arrays (element-wise):"
]
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 75,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[2., 2.],\n",
" [2., 2.],\n",
" [2., 2.],\n",
" [2., 2.]])"
"array([[32.08344593, 33.76043147, 33.92648913, 25.33942296, 30.49917958],\n",
" [28.02694646, 31.28273947, 26.4278401 , 27.49065188, 25.2505962 ],\n",
" [32.17302392, 32.13343338, 31.56702157, 27.16950403, 34.14230335],\n",
" [29.2609187 , 33.30171766, 30.10326586, 31.16217489, 29.7289815 ]])"
]
},
"execution_count": 23,
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"2*a**2 "
"(A+5)**2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Apply functions element wise:"
"##### Apply functions element-wise:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 76,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[2.71828183, 2.71828183],\n",
" [2.71828183, 2.71828183],\n",
" [2.71828183, 2.71828183],\n",
" [2.71828183, 2.71828183]])"
"array([[1.94298431, 2.24874605, 2.28107079, 1.03440653, 1.68641712],\n",
" [1.34184859, 1.8095849 , 1.15119612, 1.27526354, 1.02531218],\n",
" [1.95839827, 1.95157343, 1.85605573, 1.23668874, 2.32365544],\n",
" [1.50581797, 2.16141704, 1.62684786, 1.79016734, 1.57212617]])"
]
},
"execution_count": 24,
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.exp(a)"
"np.exp(A) # With numpy arrays, use the functions from numpy!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Setting parts of arrays"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0. 0.81037275 0.82464498 0.03382786 0.52260623]\n",
" [0. 0.59309748 0.1408015 0.24315286 0.02499713]\n",
" [0. 0.66863594 0.61845366 0.21243744 0.84314157]\n",
" [0. 0.77076404 0.48664432 0.5823091 0.45242895]]\n"
]
}
],
"source": [
"A[:, 0] = 0.\n",
"print(A)"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 0. 1.23400004 1.21264305 29.56142899 1.91348656]\n",
" [ 0. 1.68606347 7.10219689 4.11263932 40.00458583]\n",
" [ 0. 1.49558219 1.61693601 4.70726825 1.18604045]\n",
" [ 0. 1.29741392 2.05488889 1.717301 2.21029178]]\n"
]
}
],
"source": [
"# BONUS: Safe element-wise inverse with masks\n",
"cond = (A != 0)\n",
"A[cond] = 1./A[cond]\n",
"print(A)"
]
},
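The masked inverse above modifies `A` in place. As an alternative sketch that leaves the input untouched, the `where` and `out` arguments of the `np.divide` ufunc can be used; the small array below is illustrative, and this variant is not what the notebook does:

```python
import numpy as np

A = np.array([[0., 2., 4.],
              [0., 0.5, 8.]])

# Approach used in the notebook, applied to a copy: invert only non-zero entries
B = A.copy()
cond = (B != 0)
B[cond] = 1. / B[cond]

# Alternative: let the ufunc skip the masked positions.
# Positions where the condition is False keep the value already in `out` (0 here).
inverse = np.divide(1.0, A, out=np.zeros_like(A), where=(A != 0))

print(B)
print(inverse)
```

Both approaches avoid the division-by-zero warning that a plain `1. / A` would raise for the zero entries.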
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Methods of np.ndarray"
"### Attributes and methods of np.ndarray (see the [doc](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html#numpy.ndarray))"
]
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 79,
"metadata": {},
"outputs": [
{
......@@ -655,7 +783,61 @@
}
],
"source": [
"print([s for s in dir(a) if not s.startswith('__')])"
"print([s for s in dir(A) if not s.startswith('__')])"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 0. 1.23400004 1.21264305 29.56142899 1.91348656]\n",
" [ 0. 1.68606347 7.10219689 4.11263932 40.00458583]\n",
" [ 0. 1.49558219 1.61693601 4.70726825 1.18604045]\n",
" [ 0. 1.29741392 2.05488889 1.717301 2.21029178]]\n",
"Mean value 5.155638331549335\n",
"Mean line [ 0. 1.4282649 2.99666621 10.02465939 11.32860116]\n",
"Mean column [ 6.78431173 10.5810971 1.80116538 1.45597912]\n"
]
}
],
"source": [
"# Ex1: Get the mean through different dimensions\n",
"print(A)\n",
"print('Mean value', A.mean())\n",
"print('Mean line', A.mean(axis=0))\n",
"print('Mean column', A.mean(axis=1))"
]
},
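The `axis` argument is a common source of confusion, so here is a small sketch on an array with predictable values (not the notebook's data). `axis=0` averages over the lines and returns one value per column, while `axis=1` averages over the columns and returns one value per line:

```python
import numpy as np

# 4 lines, 5 columns, filled with 0., 1., ..., 19.
A = np.arange(20.).reshape(4, 5)

mean_all = A.mean()         # scalar: mean of all 20 elements
mean_axis0 = A.mean(axis=0)  # shape (5,): the lines are averaged together
mean_axis1 = A.mean(axis=1)  # shape (4,): the columns are averaged together

print(mean_all)
print(mean_axis0, mean_axis0.shape)
print(mean_axis1, mean_axis1.shape)
```

A useful rule of thumb is that the axis passed to the method is the one that disappears from the shape of the result.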
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 0. 1.23400004 1.21264305 29.56142899 1.91348656]\n",
" [ 0. 1.68606347 7.10219689 4.11263932 40.00458583]\n",
" [ 0. 1.49558219 1.61693601 4.70726825 1.18604045]\n",
" [ 0. 1.29741392 2.05488889 1.717301 2.21029178]] (4, 5)\n",
"[ 0. 1.23400004 1.21264305 29.56142899 1.91348656 0.\n",
" 1.68606347 7.10219689 4.11263932 40.00458583 0. 1.49558219\n",
" 1.61693601 4.70726825 1.18604045 0. 1.29741392 2.05488889\n",
" 1.717301 2.21029178] (20,)\n"
]
}
],
"source": [
"# Ex2: Convert a 2D array to 1D, keeping all elements\n",
"print(A, A.shape)\n",
"A_flat = A.flatten()\n",
"print(A_flat, A_flat.shape)"
]
},
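One detail worth knowing about `flatten` is that it always returns a copy. The sketch below contrasts it with `ravel`, which returns a view when the memory layout allows it (as is the case for the contiguous array used here). The array and variable names are illustrative:

```python
import numpy as np

A = np.arange(6.).reshape(2, 3)

A_flat = A.flatten()  # always a copy: modifying it leaves A untouched
A_view = A.ravel()    # a view here, because A is contiguous in memory

A_flat[0] = 100.      # does not change A
A_view[1] = -1.       # changes A[0, 1] as well

print(A)
print(np.shares_memory(A, A_flat), np.shares_memory(A, A_view))
```

Use `flatten` when the original array must stay unchanged, and `ravel` when avoiding a copy matters.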
{
......@@ -667,14 +849,25 @@
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 82,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]\n",
"385.0\n"
]
}
],
"source": [
"b = np.linspace(0, 10, 100)\n",
"b = np.linspace(0, 10, 11)\n",
"c = b @ b\n",
"# before Python 3.5:\n",
"# c = b.dot(b)"
"# c = b.dot(b)\n",
"print(b)\n",
"print(c)"
]
},
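As a quick check of the value printed above, the sketch below shows that for a 1D array `b @ b` is the dot product of `b` with itself, that is, the sum of its squared elements. For the integers 0 to 10 this is 385:

```python
import numpy as np

b = np.linspace(0, 10, 11)  # 0., 1., ..., 10.

c_operator = b @ b          # matrix multiplication operator (Python >= 3.5)
c_method = b.dot(b)         # equivalent method call
c_explicit = np.sum(b**2)   # element-wise square followed by a sum

print(c_operator, c_method, c_explicit)
```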
{
......@@ -704,27 +897,27 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 83,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0.19375156 0.43818991 0.56220482 0.61151111 0.52704117]\n",
" [0.37458391 0.49837972 0.47439702 0.38486104 0.58566245]\n",
" [0.74819557 0.79487933 0.54072395 0.2959711 0.76028302]\n",
" [0.09396438 0.99097396 0.61193908 0.73813701 0.3097673 ]\n",
" [0.82997499 0.74597254 0.22798877 0.38195223 0.74567004]]\n"
"[[0.7389817 0.71953295 0.52116066 0.28674153 0.98808241]\n",
" [0.75973256 0.82099706 0.44055507 0.70815103 0.15250382]\n",
" [0.01277246 0.45752824 0.32131109 0.2052085 0.50415616]\n",
" [0.62710664 0.2525335 0.55259395 0.50973586 0.9498232 ]\n",
" [0.7880896 0.28579137 0.26261422 0.91569611 0.56513936]]\n"
]
},
{
"data": {
"text/plain": [
"-0.004926566213405665"
"0.06654501241365778"
]
},
"execution_count": 26,
"execution_count": 83,
"metadata": {},
"output_type": "execute_result"
}
......@@ -744,7 +937,7 @@
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 84,
"metadata": {},
"outputs": [
{
......@@ -760,7 +953,7 @@
],
"source": [
"dtypes = np.dtype([('country', 'S20'), ('density', 'i4'), \n",
" ('area', 'i4'), ('population', 'i4')])\n",
" ('area', 'i4'), ('population', 'i4')])\n",
"x = np.array([('Netherlands', 393, 41526, 16928800),\n",
" ('Belgium', 337, 30510, 11007020),\n",
" ('United Kingdom', 256, 243610, 62262000),\n",
......@@ -787,7 +980,7 @@
},
{
"cell_type": "code",
"execution_count": 28,
"execution_count": 85,
"metadata": {},
"outputs": [