{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "<img width=\"800px\" src=\"../fidle/img/header.svg\"></img>\n", "\n", "# <!-- TITLE --> [NP1] - A short introduction to Numpy\n", "<!-- DESC --> Numpy is an essential tool for scientific Python.\n", "<!-- AUTHOR : Jean-Luc Parouty (CNRS/SIMaP) -->\n", "\n", "## Objectives :\n", " - Understand the main principles of Numpy and its potential\n", "\n", "Note : This notebook is strongly inspired by the UGA Python Introduction Course \n", "See : **https://gricad-gitlab.univ-grenoble-alpes.fr/python-uga/py-training-2017**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Step 1 - Numpy the beginning\n", "\n", "Code using `numpy` usually starts with the import statement:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy provides the type `np.ndarray`. Such arrays are multidimensional sequences of homogeneous elements. 
They can be created, for example, with the following commands:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# from a list\n", "l = [10.0, 12.5, 15.0, 17.5, 20.0]\n", "np.array(l)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# fast, but the values are uninitialized (they can be anything)\n", "np.empty(4)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# slower than np.empty, but the values are all 0.\n", "np.zeros([2, 6])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# multidimensional array\n", "a = np.ones([2, 3, 4])\n", "print(a.shape, a.size, a.dtype)\n", "a" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# like range, but produces a 1D numpy array\n", "np.arange(4)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# np.arange can produce arrays of floats\n", "np.arange(4.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# another convenient function to generate 1D arrays\n", "np.linspace(10, 20, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A NumPy array can be easily converted to a Python list." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.linspace(10, 20, 5)\n", "list(a)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Or even better\n", "a.tolist()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Step 2 - Access elements\n", "\n", "Elements in a `numpy` array can be accessed using indexing and slicing in any dimension. 
It also offers the same functionalities available in Fortran or Matlab.\n", "\n", "### 2.1 - Indexes and slices\n", "For example, we can create an array `A` and perform any kind of selection operations on it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = np.random.random([4, 5])\n", "A" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the element from the second row, first column\n", "A[1, 0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the first two rows\n", "A[:2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the last column\n", "A[:, -1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the first two rows and the columns with an even index\n", "A[:2, ::2]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 2.2 - Using a mask to select elements satisfying a condition:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cond = A > 0.5\n", "print(cond)\n", "print(A[cond])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The mask is in fact a particular case of the advanced indexing capabilities provided by NumPy. 
For example, it is even possible to use lists for indexing:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Selecting only particular columns\n", "print(A)\n", "A[:, [0, 1, 4]]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Step 3 - Perform array manipulations\n", "### 3.1 - Apply arithmetic operations to whole arrays (element-wise):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(A+5)**2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2 - Apply functions element-wise:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.exp(A) # With numpy arrays, use the functions from numpy !" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.3 - Setting parts of arrays" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A[:, 0] = 0.\n", "print(A)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# BONUS: Safe element-wise inverse with masks\n", "cond = (A != 0)\n", "A[cond] = 1./A[cond]\n", "print(A)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Step 4 - Attributes and methods of `np.ndarray` (see the [doc](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html#numpy.ndarray))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i,v in enumerate([s for s in dir(A) if not s.startswith('__')]):\n", " print(f'{v:16}', end='')\n", " if (i+1) % 6 == 0 :print('')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "# Ex1: Get the mean through different dimensions\n", "\n", "print(A)\n", "print('Mean value', A.mean())\n", "print('Mean line', A.mean(axis=0))\n", "print('Mean 
column', A.mean(axis=1))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "# Ex2: Convert a 2D array to 1D, keeping all elements\n", "\n", "print(A)\n", "print(A.shape)\n", "A_flat = A.flatten()\n", "print(A_flat, A_flat.shape)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### 4.1 - Remark: dot product" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = np.linspace(0, 10, 11)\n", "c = b @ b\n", "# before Python 3.5:\n", "# c = b.dot(b)\n", "print(b)\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.2 - For Matlab users\n", "\n", "| ` ` | Matlab | Numpy |\n", "| ------------- | ------ | ----- |\n", "| element-wise | `.*` | `*` |\n", "| dot product | `*` | `@` |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`numpy` arrays can also be sorted, even when they hold structured (compound) data, provided the types of the columns are explicitly stated with a `dtype`." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### 4.3 - NumPy and SciPy sub-packages:\n", "\n", "We already saw `numpy.random` to generate `numpy` arrays filled with random values. This submodule also provides functions related to distributions (Poisson, Gaussian, etc.) and permutations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To perform linear algebra with dense matrices, we can use the submodule `numpy.linalg`. 
For instance, to compute the determinant of a random matrix, we use the function `det`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "A = np.random.random([5,5])\n", "print(A)\n", "np.linalg.det(A)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "squared_subA = A[1:3, 1:3]\n", "print(squared_subA)\n", "np.linalg.inv(squared_subA)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### 4.4 - Introduction to Pandas: Python Data Analysis Library\n", "\n", "Pandas is an open source library providing high-performance, easy-to-use data structures and data analysis tools for Python.\n", "\n", " - [Pandas tutorial](https://pandas.pydata.org/pandas-docs/stable/10min.html)\n", " - [Grenoble Python Working Session](https://github.com/iutzeler/Pres_Pandas/)\n", " - [Pandas for SQL Users](http://sergilehkyi.com/translating-sql-to-pandas/)\n", " - [Pandas Introduction Training HPC Python@UGA](https://gricad-gitlab.univ-grenoble-alpes.fr/python-uga/training-hpc/-/blob/master/ipynb/11_pandas.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "<img width=\"80px\" src=\"../fidle/img/logo-paysage.svg\"></img>" ] } ], "metadata": { "celltoolbar": "Diaporama", "kernelspec": { "display_name": "Python 3.9.2 ('fidle-env')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" }, "vscode": { "interpreter": { "hash": "b3929042cc22c1274d74e3e946c52b845b57cb6d84f2d591ffe0519b38e4896d" } } }, "nbformat": 4, "nbformat_minor": 4 }