{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"<img width=\"800px\" src=\"../fidle/img/header.svg\"></img>\n",
"# <!-- TITLE --> [NP1] - A short introduction to Numpy\n",
"<!-- DESC --> NumPy is an essential tool for scientific Python.\n",
"<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->\n",
"\n",
"## Objectives :\n",
" - Understand the main principles of Numpy and its potential\n",
"\n",
"Note : This notebook is strongly inspired by the UGA Python Introduction Course \n",
"See : **https://gricad-gitlab.univ-grenoble-alpes.fr/python-uga/py-training-2017**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"\n",
"Code using `numpy` usually starts with the following import statement:"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"NumPy provides the type `np.ndarray`. Such arrays are multidimensional sequences of homogeneous elements. They can be created, for example, with the following commands:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# from a list\n",
"l = [10.0, 12.5, 15.0, 17.5, 20.0]\n",
"np.array(l)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# fast but the values can be anything\n",
"np.empty(4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# slower than np.empty but the values are all 0.\n",
"np.zeros([2, 6])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# multidimensional array\n",
"a = np.ones([2, 3, 4])\n",
"print(a.shape, a.size, a.dtype)\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# like range, but produces a 1D NumPy array\n",
"np.arange(4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# np.arange can produce arrays of floats\n",
"np.arange(4.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# another convenient function to generate 1D arrays\n",
"np.linspace(10, 20, 5)"
]
},
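{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since the elements of an array are homogeneous, NumPy stores a single `dtype` for the whole array. The sketch below (variable names are illustrative only) shows how that type is chosen from mixed inputs, or set explicitly with the `dtype` argument. The integer type obtained in the first case depends on the platform."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"ints = np.array([1, 2, 3])                      # integers only\n",
"mixed = np.array([1, 2.5, 3])                   # one float makes the whole array float\n",
"forced = np.array([1, 2, 3], dtype=np.float32)  # type requested explicitly\n",
"print(ints.dtype, mixed.dtype, forced.dtype)"
]
},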
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A NumPy array can be easily converted to a Python list."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"a = np.linspace(10, 20, 5)\n",
"list(a)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Or even better\n",
"a.tolist()"
]
},
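{
"cell_type": "markdown",
"metadata": {},
"source": [
"One reason to prefer `tolist()` is the type of the resulting elements: `list(a)` keeps NumPy scalars, whereas `a.tolist()` converts them to plain Python numbers (and also converts nested dimensions into nested lists). A small check, as a sketch:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"a = np.linspace(10, 20, 5)\n",
"from_list = list(a)       # elements are still NumPy scalars\n",
"from_tolist = a.tolist()  # elements are plain Python floats\n",
"print(type(from_list[0]), type(from_tolist[0]))"
]
},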
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Elements in a `numpy` array can be accessed using indexing and slicing in any dimension. NumPy offers the same functionalities as those available in Fortran or Matlab.\n",
"\n",
"For example, we can create an array `A` and perform any kind of selection operations on it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"A = np.random.random([4, 5])\n",
"A"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the element from the second row, first column\n",
"A[1, 0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the first two rows\n",
"A[:2]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the last column\n",
"A[:, -1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the first two rows and the columns with an even index\n",
"A[:2, ::2]"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### 2.2 - Using a mask to select elements validating a condition:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cond = A > 0.5\n",
"print(cond)\n",
"print(A[cond])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The mask is in fact a particular case of the advanced indexing capabilities provided by NumPy. For example, it is even possible to use lists for indexing:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Selecting only particular columns\n",
"print(A)\n",
"A[:, [0, 1, 4]]"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Step 3 - Perform array manipulations\n",
"### 3.1 - Apply arithmetic operations to whole arrays (element-wise):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"(A+5)**2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"np.exp(A) # With NumPy arrays, use the functions from NumPy!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"A[:, 0] = 0.\n",
"print(A)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# BONUS: Safe element-wise inverse with masks\n",
"cond = (A != 0)\n",
"A[cond] = 1./A[cond]\n",
"print(A)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Step 4 - Attributes and methods of `np.ndarray` (see the [doc](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html#numpy.ndarray))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# List the public attributes and methods of the array, six per row\n",
"for i, v in enumerate([s for s in dir(A) if not s.startswith('__')]):\n",
"    print(f'{v:16}', end='')\n",
"    if (i+1) % 6 == 0:\n",
"        print('')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Ex1: Get the mean through different dimensions\n",
"print('Mean value', A.mean())\n",
"print('Mean of each column (axis=0)', A.mean(axis=0))\n",
"print('Mean of each row (axis=1)', A.mean(axis=1))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Ex2: Convert a 2D array in 1D keeping all elements\n",
"A_flat = A.flatten()\n",
"print(A_flat, A_flat.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"b = np.linspace(0, 10, 11)\n",
"c = b @ b\n",
"# before Python 3.5 (no @ operator):\n",
"# c = b.dot(b)\n",
"print(b)\n",
"print(c)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"| ` ` | Matlab | Numpy |\n",
"| ------------- | ------ | ----- |\n",
"| element wise | `.*` | `*` |\n",
"| dot product | `*` | `@` |"
]
},
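{
"cell_type": "markdown",
"metadata": {},
"source": [
"The difference between the two NumPy operators in the table is easiest to see on small 2D arrays. The matrices `M` and `N` below are arbitrary values chosen for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"M = np.array([[1, 2], [3, 4]])\n",
"N = np.array([[0, 1], [1, 0]])\n",
"elementwise = M * N  # multiplies matching entries\n",
"matmul = M @ N       # matrix product (rows of M times columns of N)\n",
"print(elementwise)\n",
"print(matmul)"
]
},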
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`numpy` arrays can also be sorted, even when they hold structured (record-like) data, provided the type of each column is explicitly declared with a `dtype`."
]
},
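{
"cell_type": "markdown",
"metadata": {},
"source": [
"The sorting of structured data mentioned above can be sketched as follows. The field names (`name`, `height`) and the values are made up for illustration; `np.sort` with its `order` argument sorts the records by the chosen field."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Each record has a name (string of up to 10 characters) and a height (float)\n",
"person_dtype = [('name', 'U10'), ('height', float)]\n",
"values = [('Arthur', 1.8), ('Lancelot', 1.9), ('Galahad', 1.7)]\n",
"people = np.array(values, dtype=person_dtype)\n",
"\n",
"# Sort the records by the 'height' field\n",
"by_height = np.sort(people, order='height')\n",
"print(by_height)"
]
},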
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"\n",
"We already saw `numpy.random` to generate `numpy` arrays filled with random values. This submodule also provides functions related to distributions (Poisson, Gaussian, etc.) and permutations."
]
},
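{
"cell_type": "markdown",
"metadata": {},
"source": [
"A short sketch of these features follows. It assumes NumPy 1.17 or later and uses the `np.random.default_rng` generator interface rather than the legacy `np.random.random` function seen earlier. The seed is fixed only to make the run reproducible, and the distribution parameters are arbitrary."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(seed=0)\n",
"gauss = rng.normal(loc=0.0, scale=1.0, size=1000)  # Gaussian samples\n",
"counts = rng.poisson(lam=3.0, size=1000)           # Poisson samples\n",
"perm = rng.permutation(10)                         # integers 0..9 in random order\n",
"print(gauss.mean(), counts.mean())\n",
"print(perm)"
]
},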
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To perform linear algebra with dense matrices, we can use the submodule `numpy.linalg`. For instance, to compute the determinant of a random matrix, we use the function `det`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"A = np.random.random([5,5])\n",
"print(A)\n",
"np.linalg.det(A)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"squared_subA = A[1:3, 1:3]\n",
"print(squared_subA)\n",
"np.linalg.inv(squared_subA)"
]
},
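{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick way to check an inverse is to multiply the matrix by it and compare the result with the identity, within floating-point tolerance. The sketch below uses a fixed 2x2 matrix `B` (chosen for illustration) rather than a random one, so that the check does not depend on the random draw."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"B = np.array([[4.0, 1.0], [2.0, 3.0]])\n",
"B_inv = np.linalg.inv(B)\n",
"# B @ B_inv should be the identity up to rounding errors\n",
"is_identity = np.allclose(B @ B_inv, np.eye(2))\n",
"print(is_identity)"
]
},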
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### 4.4 - Introduction to Pandas: Python Data Analysis Library\n",
"\n",
"Pandas is an open source library providing high-performance, easy-to-use data structures and data analysis tools for Python.\n",
"\n",
" - [Pandas tutorial](https://pandas.pydata.org/pandas-docs/stable/10min.html)\n",
" - [Grenoble Python Working Session](https://github.com/iutzeler/Pres_Pandas/)\n",
" - [Pandas for SQL Users](http://sergilehkyi.com/translating-sql-to-pandas/)\n",
" - [Pandas Introduction Training HPC Python@UGA](https://gricad-gitlab.univ-grenoble-alpes.fr/python-uga/training-hpc/-/blob/master/ipynb/11_pandas.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"<img width=\"80px\" src=\"../fidle/img/logo-paysage.svg\"></img>"
]
}
],
"metadata": {
"celltoolbar": "Diaporama",
"kernelspec": {
"display_name": "Python 3.9.2 ('fidle-env')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.2"
},
"vscode": {
"interpreter": {
"hash": "b3929042cc22c1274d74e3e946c52b845b57cb6d84f2d591ffe0519b38e4896d"
}