
Commit 31f4a453 authored by Laurence Viry

Using dplyr, 9/07/18

parent 08d46e65
Pipeline #9680 passed with stages
in 1 minute and 5 seconds
...@@ -26,6 +26,8 @@ ...@@ -26,6 +26,8 @@
"* changer le type de certaines variables pour les adapter aux traitements envisagés,\n", "* changer le type de certaines variables pour les adapter aux traitements envisagés,\n",
"* $\\ldots$\n", "* $\\ldots$\n",
"\n", "\n",
"On abordera le concept de *tidy data$, les extensions du tidyverse comme *dplyr* ou *ggplot2* partent du principe que les données sont “bien rangées” sous forme de tidy data.\n",
"\n",
"R fournit des outils et des capacités de programmation pour effectuer ces différentes tâches.\n", "R fournit des outils et des capacités de programmation pour effectuer ces différentes tâches.\n",
"\n", "\n",
"## Importer des données\n", "## Importer des données\n",
...@@ -1078,283 +1080,9 @@ ...@@ -1078,283 +1080,9 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 135, "execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th scope=col>Sepal.Width</th><th scope=col>Petal.Length</th><th scope=col>Petal.Width</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><td>3.5</td><td>1.4</td><td>0.2</td></tr>\n",
"\t<tr><td>3.0</td><td>1.4</td><td>0.2</td></tr>\n",
"\t<tr><td>3.2</td><td>1.3</td><td>0.2</td></tr>\n",
"\t<tr><td>3.1</td><td>1.5</td><td>0.2</td></tr>\n",
"\t<tr><td>3.6</td><td>1.4</td><td>0.2</td></tr>\n",
"\t<tr><td>3.9</td><td>1.7</td><td>0.4</td></tr>\n",
"\t<tr><td>3.4</td><td>1.4</td><td>0.3</td></tr>\n",
"\t<tr><td>3.4</td><td>1.5</td><td>0.2</td></tr>\n",
"\t<tr><td>2.9</td><td>1.4</td><td>0.2</td></tr>\n",
"\t<tr><td>3.1</td><td>1.5</td><td>0.1</td></tr>\n",
"\t<tr><td>3.7</td><td>1.5</td><td>0.2</td></tr>\n",
"\t<tr><td>3.4</td><td>1.6</td><td>0.2</td></tr>\n",
"\t<tr><td>3.0</td><td>1.4</td><td>0.1</td></tr>\n",
"\t<tr><td>3.0</td><td>1.1</td><td>0.1</td></tr>\n",
"\t<tr><td>4.0</td><td>1.2</td><td>0.2</td></tr>\n",
"\t<tr><td>4.4</td><td>1.5</td><td>0.4</td></tr>\n",
"\t<tr><td>3.9</td><td>1.3</td><td>0.4</td></tr>\n",
"\t<tr><td>3.5</td><td>1.4</td><td>0.3</td></tr>\n",
"\t<tr><td>3.8</td><td>1.7</td><td>0.3</td></tr>\n",
"\t<tr><td>3.8</td><td>1.5</td><td>0.3</td></tr>\n",
"\t<tr><td>3.4</td><td>1.7</td><td>0.2</td></tr>\n",
"\t<tr><td>3.7</td><td>1.5</td><td>0.4</td></tr>\n",
"\t<tr><td>3.6</td><td>1.0</td><td>0.2</td></tr>\n",
"\t<tr><td>3.3</td><td>1.7</td><td>0.5</td></tr>\n",
"\t<tr><td>3.4</td><td>1.9</td><td>0.2</td></tr>\n",
"\t<tr><td>3.0</td><td>1.6</td><td>0.2</td></tr>\n",
"\t<tr><td>3.4</td><td>1.6</td><td>0.4</td></tr>\n",
"\t<tr><td>3.5</td><td>1.5</td><td>0.2</td></tr>\n",
"\t<tr><td>3.4</td><td>1.4</td><td>0.2</td></tr>\n",
"\t<tr><td>3.2</td><td>1.6</td><td>0.2</td></tr>\n",
"\t<tr><td>⋮</td><td>⋮</td><td>⋮</td></tr>\n",
"\t<tr><td>3.2</td><td>5.7</td><td>2.3</td></tr>\n",
"\t<tr><td>2.8</td><td>4.9</td><td>2.0</td></tr>\n",
"\t<tr><td>2.8</td><td>6.7</td><td>2.0</td></tr>\n",
"\t<tr><td>2.7</td><td>4.9</td><td>1.8</td></tr>\n",
"\t<tr><td>3.3</td><td>5.7</td><td>2.1</td></tr>\n",
"\t<tr><td>3.2</td><td>6.0</td><td>1.8</td></tr>\n",
"\t<tr><td>2.8</td><td>4.8</td><td>1.8</td></tr>\n",
"\t<tr><td>3.0</td><td>4.9</td><td>1.8</td></tr>\n",
"\t<tr><td>2.8</td><td>5.6</td><td>2.1</td></tr>\n",
"\t<tr><td>3.0</td><td>5.8</td><td>1.6</td></tr>\n",
"\t<tr><td>2.8</td><td>6.1</td><td>1.9</td></tr>\n",
"\t<tr><td>3.8</td><td>6.4</td><td>2.0</td></tr>\n",
"\t<tr><td>2.8</td><td>5.6</td><td>2.2</td></tr>\n",
"\t<tr><td>2.8</td><td>5.1</td><td>1.5</td></tr>\n",
"\t<tr><td>2.6</td><td>5.6</td><td>1.4</td></tr>\n",
"\t<tr><td>3.0</td><td>6.1</td><td>2.3</td></tr>\n",
"\t<tr><td>3.4</td><td>5.6</td><td>2.4</td></tr>\n",
"\t<tr><td>3.1</td><td>5.5</td><td>1.8</td></tr>\n",
"\t<tr><td>3.0</td><td>4.8</td><td>1.8</td></tr>\n",
"\t<tr><td>3.1</td><td>5.4</td><td>2.1</td></tr>\n",
"\t<tr><td>3.1</td><td>5.6</td><td>2.4</td></tr>\n",
"\t<tr><td>3.1</td><td>5.1</td><td>2.3</td></tr>\n",
"\t<tr><td>2.7</td><td>5.1</td><td>1.9</td></tr>\n",
"\t<tr><td>3.2</td><td>5.9</td><td>2.3</td></tr>\n",
"\t<tr><td>3.3</td><td>5.7</td><td>2.5</td></tr>\n",
"\t<tr><td>3.0</td><td>5.2</td><td>2.3</td></tr>\n",
"\t<tr><td>2.5</td><td>5.0</td><td>1.9</td></tr>\n",
"\t<tr><td>3.0</td><td>5.2</td><td>2.0</td></tr>\n",
"\t<tr><td>3.4</td><td>5.4</td><td>2.3</td></tr>\n",
"\t<tr><td>3.0</td><td>5.1</td><td>1.8</td></tr>\n",
"</tbody>\n",
"</table>\n"
],
"text/latex": [
"\\begin{tabular}{r|lll}\n",
" Sepal.Width & Petal.Length & Petal.Width\\\\\n",
"\\hline\n",
"\t 3.5 & 1.4 & 0.2\\\\\n",
"\t 3.0 & 1.4 & 0.2\\\\\n",
"\t 3.2 & 1.3 & 0.2\\\\\n",
"\t 3.1 & 1.5 & 0.2\\\\\n",
"\t 3.6 & 1.4 & 0.2\\\\\n",
"\t 3.9 & 1.7 & 0.4\\\\\n",
"\t 3.4 & 1.4 & 0.3\\\\\n",
"\t 3.4 & 1.5 & 0.2\\\\\n",
"\t 2.9 & 1.4 & 0.2\\\\\n",
"\t 3.1 & 1.5 & 0.1\\\\\n",
"\t 3.7 & 1.5 & 0.2\\\\\n",
"\t 3.4 & 1.6 & 0.2\\\\\n",
"\t 3.0 & 1.4 & 0.1\\\\\n",
"\t 3.0 & 1.1 & 0.1\\\\\n",
"\t 4.0 & 1.2 & 0.2\\\\\n",
"\t 4.4 & 1.5 & 0.4\\\\\n",
"\t 3.9 & 1.3 & 0.4\\\\\n",
"\t 3.5 & 1.4 & 0.3\\\\\n",
"\t 3.8 & 1.7 & 0.3\\\\\n",
"\t 3.8 & 1.5 & 0.3\\\\\n",
"\t 3.4 & 1.7 & 0.2\\\\\n",
"\t 3.7 & 1.5 & 0.4\\\\\n",
"\t 3.6 & 1.0 & 0.2\\\\\n",
"\t 3.3 & 1.7 & 0.5\\\\\n",
"\t 3.4 & 1.9 & 0.2\\\\\n",
"\t 3.0 & 1.6 & 0.2\\\\\n",
"\t 3.4 & 1.6 & 0.4\\\\\n",
"\t 3.5 & 1.5 & 0.2\\\\\n",
"\t 3.4 & 1.4 & 0.2\\\\\n",
"\t 3.2 & 1.6 & 0.2\\\\\n",
"\t ⋮ & ⋮ & ⋮\\\\\n",
"\t 3.2 & 5.7 & 2.3\\\\\n",
"\t 2.8 & 4.9 & 2.0\\\\\n",
"\t 2.8 & 6.7 & 2.0\\\\\n",
"\t 2.7 & 4.9 & 1.8\\\\\n",
"\t 3.3 & 5.7 & 2.1\\\\\n",
"\t 3.2 & 6.0 & 1.8\\\\\n",
"\t 2.8 & 4.8 & 1.8\\\\\n",
"\t 3.0 & 4.9 & 1.8\\\\\n",
"\t 2.8 & 5.6 & 2.1\\\\\n",
"\t 3.0 & 5.8 & 1.6\\\\\n",
"\t 2.8 & 6.1 & 1.9\\\\\n",
"\t 3.8 & 6.4 & 2.0\\\\\n",
"\t 2.8 & 5.6 & 2.2\\\\\n",
"\t 2.8 & 5.1 & 1.5\\\\\n",
"\t 2.6 & 5.6 & 1.4\\\\\n",
"\t 3.0 & 6.1 & 2.3\\\\\n",
"\t 3.4 & 5.6 & 2.4\\\\\n",
"\t 3.1 & 5.5 & 1.8\\\\\n",
"\t 3.0 & 4.8 & 1.8\\\\\n",
"\t 3.1 & 5.4 & 2.1\\\\\n",
"\t 3.1 & 5.6 & 2.4\\\\\n",
"\t 3.1 & 5.1 & 2.3\\\\\n",
"\t 2.7 & 5.1 & 1.9\\\\\n",
"\t 3.2 & 5.9 & 2.3\\\\\n",
"\t 3.3 & 5.7 & 2.5\\\\\n",
"\t 3.0 & 5.2 & 2.3\\\\\n",
"\t 2.5 & 5.0 & 1.9\\\\\n",
"\t 3.0 & 5.2 & 2.0\\\\\n",
"\t 3.4 & 5.4 & 2.3\\\\\n",
"\t 3.0 & 5.1 & 1.8\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"Sepal.Width | Petal.Length | Petal.Width | \n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 3.5 | 1.4 | 0.2 | \n",
"| 3.0 | 1.4 | 0.2 | \n",
"| 3.2 | 1.3 | 0.2 | \n",
"| 3.1 | 1.5 | 0.2 | \n",
"| 3.6 | 1.4 | 0.2 | \n",
"| 3.9 | 1.7 | 0.4 | \n",
"| 3.4 | 1.4 | 0.3 | \n",
"| 3.4 | 1.5 | 0.2 | \n",
"| 2.9 | 1.4 | 0.2 | \n",
"| 3.1 | 1.5 | 0.1 | \n",
"| 3.7 | 1.5 | 0.2 | \n",
"| 3.4 | 1.6 | 0.2 | \n",
"| 3.0 | 1.4 | 0.1 | \n",
"| 3.0 | 1.1 | 0.1 | \n",
"| 4.0 | 1.2 | 0.2 | \n",
"| 4.4 | 1.5 | 0.4 | \n",
"| 3.9 | 1.3 | 0.4 | \n",
"| 3.5 | 1.4 | 0.3 | \n",
"| 3.8 | 1.7 | 0.3 | \n",
"| 3.8 | 1.5 | 0.3 | \n",
"| 3.4 | 1.7 | 0.2 | \n",
"| 3.7 | 1.5 | 0.4 | \n",
"| 3.6 | 1.0 | 0.2 | \n",
"| 3.3 | 1.7 | 0.5 | \n",
"| 3.4 | 1.9 | 0.2 | \n",
"| 3.0 | 1.6 | 0.2 | \n",
"| 3.4 | 1.6 | 0.4 | \n",
"| 3.5 | 1.5 | 0.2 | \n",
"| 3.4 | 1.4 | 0.2 | \n",
"| 3.2 | 1.6 | 0.2 | \n",
"| ⋮ | ⋮ | ⋮ | \n",
"| 3.2 | 5.7 | 2.3 | \n",
"| 2.8 | 4.9 | 2.0 | \n",
"| 2.8 | 6.7 | 2.0 | \n",
"| 2.7 | 4.9 | 1.8 | \n",
"| 3.3 | 5.7 | 2.1 | \n",
"| 3.2 | 6.0 | 1.8 | \n",
"| 2.8 | 4.8 | 1.8 | \n",
"| 3.0 | 4.9 | 1.8 | \n",
"| 2.8 | 5.6 | 2.1 | \n",
"| 3.0 | 5.8 | 1.6 | \n",
"| 2.8 | 6.1 | 1.9 | \n",
"| 3.8 | 6.4 | 2.0 | \n",
"| 2.8 | 5.6 | 2.2 | \n",
"| 2.8 | 5.1 | 1.5 | \n",
"| 2.6 | 5.6 | 1.4 | \n",
"| 3.0 | 6.1 | 2.3 | \n",
"| 3.4 | 5.6 | 2.4 | \n",
"| 3.1 | 5.5 | 1.8 | \n",
"| 3.0 | 4.8 | 1.8 | \n",
"| 3.1 | 5.4 | 2.1 | \n",
"| 3.1 | 5.6 | 2.4 | \n",
"| 3.1 | 5.1 | 2.3 | \n",
"| 2.7 | 5.1 | 1.9 | \n",
"| 3.2 | 5.9 | 2.3 | \n",
"| 3.3 | 5.7 | 2.5 | \n",
"| 3.0 | 5.2 | 2.3 | \n",
"| 2.5 | 5.0 | 1.9 | \n",
"| 3.0 | 5.2 | 2.0 | \n",
"| 3.4 | 5.4 | 2.3 | \n",
"| 3.0 | 5.1 | 1.8 | \n",
"\n",
"\n"
],
"text/plain": [
" Sepal.Width Petal.Length Petal.Width\n",
"1 3.5 1.4 0.2 \n",
"2 3.0 1.4 0.2 \n",
"3 3.2 1.3 0.2 \n",
"4 3.1 1.5 0.2 \n",
"5 3.6 1.4 0.2 \n",
"6 3.9 1.7 0.4 \n",
"7 3.4 1.4 0.3 \n",
"8 3.4 1.5 0.2 \n",
"9 2.9 1.4 0.2 \n",
"10 3.1 1.5 0.1 \n",
"11 3.7 1.5 0.2 \n",
"12 3.4 1.6 0.2 \n",
"13 3.0 1.4 0.1 \n",
"14 3.0 1.1 0.1 \n",
"15 4.0 1.2 0.2 \n",
"16 4.4 1.5 0.4 \n",
"17 3.9 1.3 0.4 \n",
"18 3.5 1.4 0.3 \n",
"19 3.8 1.7 0.3 \n",
"20 3.8 1.5 0.3 \n",
"21 3.4 1.7 0.2 \n",
"22 3.7 1.5 0.4 \n",
"23 3.6 1.0 0.2 \n",
"24 3.3 1.7 0.5 \n",
"25 3.4 1.9 0.2 \n",
"26 3.0 1.6 0.2 \n",
"27 3.4 1.6 0.4 \n",
"28 3.5 1.5 0.2 \n",
"29 3.4 1.4 0.2 \n",
"30 3.2 1.6 0.2 \n",
"⋮ ⋮ ⋮ ⋮ \n",
"121 3.2 5.7 2.3 \n",
"122 2.8 4.9 2.0 \n",
"123 2.8 6.7 2.0 \n",
"124 2.7 4.9 1.8 \n",
"125 3.3 5.7 2.1 \n",
"126 3.2 6.0 1.8 \n",
"127 2.8 4.8 1.8 \n",
"128 3.0 4.9 1.8 \n",
"129 2.8 5.6 2.1 \n",
"130 3.0 5.8 1.6 \n",
"131 2.8 6.1 1.9 \n",
"132 3.8 6.4 2.0 \n",
"133 2.8 5.6 2.2 \n",
"134 2.8 5.1 1.5 \n",
"135 2.6 5.6 1.4 \n",
"136 3.0 6.1 2.3 \n",
"137 3.4 5.6 2.4 \n",
"138 3.1 5.5 1.8 \n",
"139 3.0 4.8 1.8 \n",
"140 3.1 5.4 2.1 \n",
"141 3.1 5.6 2.4 \n",
"142 3.1 5.1 2.3 \n",
"143 2.7 5.1 1.9 \n",
"144 3.2 5.9 2.3 \n",
"145 3.3 5.7 2.5 \n",
"146 3.0 5.2 2.3 \n",
"147 2.5 5.0 1.9 \n",
"148 3.0 5.2 2.0 \n",
"149 3.4 5.4 2.3 \n",
"150 3.0 5.1 1.8 "
]
},
"metadata": {}, "metadata": {},
"output_type": "display_data" "outputs": [],
}
],
"source": [ "source": [
"read_excel(datasets,sheet = \"iris\",range = cell_cols(\"B:D\"))" "read_excel(datasets,sheet = \"iris\",range = cell_cols(\"B:D\"))"
] ]
...@@ -2358,29 +2086,258 @@ ...@@ -2358,29 +2086,258 @@
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": { "metadata": {},
"collapsed": true "source": [
"## Manipuler des données avec dplyr\n",
"\n",
"**dplyr** est un package facilitant le traitement et la manipulation de données contenues dans une ou plusieurs tables, la manipulation de données se fait en utilisant un nombre réduit de **verbes**, qui correspondent chacun à une action différente appliquée à un tableau de données.\n",
"\n",
"Les fonctions de dplyr sont en général plus rapides que leur équivalent sous R de base, elles permettent donc de traiter des données de grande dimension.\n",
"\n",
"**dplyr** fait partie du coeur du **tidyverse**, elle est donc chargée automatiquement avec :"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"library(tidyverse)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# lecture des donnees\n",
"library(nycflights13)\n",
"data(flights)\n",
"#data(airports)\n",
"#data(airlines)\n",
"#\n",
"arrange(select(filter(flights, dest == \"LAX\"), dep_delay, arr_delay), dep_delay)"
]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [ "source": [
"Pour en savoir plus:\n", "### Les verbes de dplyr\n",
"\n", "\n",
"* [Gestion des données avec R](https://www.fun-mooc.fr/c4x/UPSUD/42001S03/asset/data-management.html) (Christophe Lalanne & Bruno Falissard -MOOC \"Introduction à la statistique avec R\").\n", "* **SLICE** : sélectionne des lignes du tableau selon leur position."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Selectionner la 345eme ligne du tableau airports\n",
"slice(airports, 345)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **filter** : sélectionne des lignes d’un tableau de données selon une condition."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Selectionner les vols du mois de janvier en filtrant sur la variable month\n",
"filter(flights, month == 1)\n",
"# Vols avec un retard au départ (variable dep_delay) compris entre 10 et 15 minutes \n",
"filter(flights, dep_delay >= 10 & dep_delay <= 15)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **select** : permet de sélectionner des colonnes d’un tableau de données. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Extraire les colonnes lat et lon du tableau airports\n",
"select(airports, lat, lon)\n",
"# Eliminer les colonnes lat et lon du tableau airport\n",
"select(airports, -lat, -lon)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **rename** : permet de renommer facilement des colonnes d'un tableau de données (nouveau_nom = ancien_nom)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Renommer les colonnes lon et lat de airports en longitude et latitude \n",
"rename(airports, longitude = lon, latitude = lat)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* ** arrange** : réordonne les lignes d’un tableau selon une ou plusieurs colonnes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Trier le tableau flights selon le retard au départ croissant \n",
"arrange(flights, dep_delay)\n",
"# Trier le tableau flights selon selon le mois, puis selon le retard au départ\n",
"arrange(flights, month, dep_delay)\n",
"# Trier selon une colonne par ordre décroissant\n",
"arrange(flights, desc(dep_delay))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* **mutate** : permet de créer de nouvelles colonnes dans le tableau de données, en général à partir de variables existantes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# airports contient l’altitude en pieds, créer une nouvelle variable alt_m avec l’altitude en mètres\n",
"airports <- mutate(airports, alt_m = alt / 3.2808)\n",
"select(airports, name, alt, alt_m)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**mutate** est compatible avec les fonctions de recodages : forcats, if_else, case_when …"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"flights <- mutate(flights,\n",
" type_retard = case_when(\n",
" dep_delay > 0 & arr_delay > 0 ~ \"Retard départ et arrivée\",\n",
" dep_delay > 0 & arr_delay <= 0 ~ \"Retard départ\",\n",
" dep_delay <= 0 & arr_delay > 0 ~ \"Retard arrivée\",\n",
" TRUE ~ \"Aucun retard\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Enchaîner les opérations avec le pipe\n",
"\n", "\n",
"* [Begin'R (Bordeaux INP](http://beginr.u-bordeaux.fr/index.html#sommaire))\n", "Quand on manipule un tableau de données, il est très fréquent d’enchaîner plusieurs opérations. On va par exemple filtrer pour extraire une sous-population, sélectionner des colonnes puis trier selon une variable.\n",
"\n", "\n",
"* [Cookbook for R](http://www.cookbook-r.com/Manipulating_data/)\n", "#### Plusieurs méthodes\n",
"\n", "\n",
"* [Introduction à R et au tidyverse](https://juba.github.io/tidyverse/index.html)" "* Effectuer toutes les opérations en une fois en les “emboîtant” "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Effectuer les opérations les unes après les autres, en stockant les résultats intermédiaires dans un objet temporaire."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp <- filter(flights, dest == \"LAX\")\n",
"tmp <- select(tmp, dep_delay, arr_delay)\n",
"arrange(tmp, dep_delay)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Utiliser l'opérateur **pipe** noté **%>%**\n",
"\n",
"**expr %>% f**, le résultat de l’expression *expr*, à gauche du pipe, sera passé comme premier argument à la fonction *f*, à droite du pipe, ce qui revient à exécuter **f(expr)**."
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# filter(flights, dest == \"LAX\")\n",
"flights %>% filter(dest == \"LAX\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# select(filter(flights, dest == \"LAX\"), dep_delay, arr_delay)\n",
"flights %>% filter(dest == \"LAX\") %>% select(dep_delay, arr_delay)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pour en savoir plus consulter : https://juba.github.io/tidyverse/10-dplyr.html#preparation-2"
]
},
{
"cell_type": "markdown",
"metadata": { "metadata": {
"collapsed": true "collapsed": true
}, },
"outputs": [], "source": [
"source": [] "### Pour en savoir plus:\n",
"\n",
"* [Introduction à R et au tidyverse](https://juba.github.io/tidyverse/index.html)\n",
"\n",
"* [Gestion des données avec R](https://www.fun-mooc.fr/c4x/UPSUD/42001S03/asset/data-management.html) (Christophe Lalanne & Bruno Falissard -MOOC \"Introduction à la statistique avec R\").\n",
"\n",
"* [Begin'R (Bordeaux INP](http://beginr.u-bordeaux.fr/index.html#sommaire))\n",
"\n",
"* [Cookbook for R](http://www.cookbook-r.com/Manipulating_data/)\n"
]
} }
], ],
"metadata": { "metadata": {
...@@ -2395,7 +2352,7 @@ ...@@ -2395,7 +2352,7 @@
"mimetype": "text/x-r-source", "mimetype": "text/x-r-source",
"name": "R", "name": "R",
"pygments_lexer": "r", "pygments_lexer": "r",
"version": "3.3.2" "version": "3.4.3"
} }
}, },
"nbformat": 4, "nbformat": 4,
......
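
The pipe section added in this commit describes three ways to chain dplyr verbs: nesting the calls, storing intermediate results, and using `%>%`. The sketch below puts them side by side and checks that they give the same result. It is an illustration, not part of the commit: it assumes only that dplyr is installed, and it uses the built-in `iris` data frame in place of `nycflights13::flights` so that it runs without the extra package. The object names `nested`, `stepwise` and `piped` are made up for this example.

```r
# Illustrative sketch: iris stands in for flights (an assumption made here
# so that the example does not depend on the nycflights13 package).
library(dplyr)

# 1. Nested calls: written inside-out, so the first step (filter) is innermost.
nested <- arrange(
  select(filter(iris, Species == "setosa"), Sepal.Length, Petal.Length),
  Sepal.Length
)

# 2. Step by step, overwriting a temporary object after each verb.
tmp <- filter(iris, Species == "setosa")
tmp <- select(tmp, Sepal.Length, Petal.Length)
stepwise <- arrange(tmp, Sepal.Length)

# 3. Pipe: the left-hand result becomes the first argument of the next verb,
#    so the code reads in the order the operations are applied.
piped <- iris %>%
  filter(Species == "setosa") %>%
  select(Sepal.Length, Petal.Length) %>%
  arrange(Sepal.Length)

# Compare the three results.
identical(nested, stepwise)
identical(nested, piped)
```

The piped form completes the chain that the notebook's last pipe cell stops short of, since that cell ends at `select()` without the final `arrange()`.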