Commit 062c751a authored by Jean-Marc Vincent

Update TD5.espec.md

parent 6ca2bdc3
 ## TD5: Data Manipulation with dplyr
 Using the [first name data
-set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2015_txt.zip), answer some of the following questions (you also have to use
+set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2017_txt.zip), answer some of the following questions (you also have to use
 **ggplot2**):
 - First name frequency evolves along time?
@@ -13,24 +13,35 @@ set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2015_txt.z
 We have demonstrated the use of the six verbs in [our demonstration of
 dplyr](./TD5/TD5.Rmd).
-1. Install dplyr and magrittr
-```
-install.packages("ggplot2");
-install.packages("magrittr");
-```
+1. Using the [given names data set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2017_txt.zip), answer some of the following questions:
+- First name frequency evolves along time?
+- What can we say about ``Your name here'' (for each state, all the country)?
+- Is there some sort of geographical correlation with the data?
+- Which department has a larger variety of names along time?
+- _your own question_ (be creative)
+You need to use the _tidyverse_ for this analysis. Unzip the file _dpt2016_txt.zip_ (to get the **dpt2017.txt**). Read in R with this code. Note that you might need to install the `readr` package with the appropriate command.
+2. Download Raw Data from the website
+```{r}
+file = "dpt2017_txt.zip"
+if(!file.exists(file)){
+download.file("https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2017_txt.zip",
+destfile=file)
+}
+unzip(file)
+```
-3. Unzip the file _dpt2015_txt.zip_ (to get the **dpt2015.txt**)
-4. Read in R with this code. Note that you might need to
-install the `readr` package with the appropriate command.
-```
-library(readr);
-df <- read_tsv("dpt2015.txt", locale = locale(encoding = "ISO-8859-1"));
-```
+3. Build the Dataframe from file
+```{r}
+library(tidyverse)
+library(ggplot2)
+df <- read_tsv("dpt2017.txt", locale = locale(encoding = "ISO-8859-1"));
+df %>% head(n=10)
+```
 4. Create your own R markdown (Rmd). Follow the PL guidelines,
 explaining your data manipulation and plot.