Commit 062c751a authored by Jean-Marc Vincent

Update TD5.espec.md

parent 6ca2bdc3
## TD5: Data Manipulation with dplyr
Using the [first name data
set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2015_txt.zip), answer some of the following questions (you also have to use
set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2017_txt.zip), answer some of the following questions (you also have to use
**ggplot2**):
- How does first name frequency evolve over time?
......@@ -13,24 +13,35 @@ set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2015_txt.z
We have demonstrated the use of the six verbs in [our demonstration of
dplyr](./TD5/TD5.Rmd).
1. Install dplyr and magrittr
1. Using the [given names data set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2017_txt.zip), answer some of the following questions:
```
install.packages("dplyr")     # data manipulation verbs
install.packages("ggplot2")   # plotting
install.packages("magrittr")  # pipe operator %>%
```
- How does first name frequency evolve over time?
- What can we say about "Your name here" (for each department, and for the whole country)?
- Is there some sort of geographical correlation with the data?
- Which department has the largest variety of names over time?
- _your own question_ (be creative)
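As a starting point for the question about name variety, here is a minimal sketch that counts distinct names per department and year with `group_by()` and `summarise()`. It runs on a small made-up table rather than the INSEE file, and the column names `preusuel`, `annais`, `dpt` and `nombre` are assumptions about the file layout; check `names(df)` on the real data before reusing it.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the INSEE data (values are invented for illustration;
# column names are assumed, not taken from the handout).
toy <- tribble(
  ~preusuel, ~annais, ~dpt, ~nombre,
  "JEAN",    "1950",  "38", 120,
  "MARIE",   "1950",  "38", 200,
  "PAUL",    "1950",  "38",  50,
  "JEAN",    "1950",  "75", 300,
  "MARIE",   "1950",  "75", 250,
  "JEAN",    "2000",  "38",  10,
  "LEA",     "2000",  "75",  90
)

# Number of distinct first names per department and year,
# sorted so the most varied department/year pairs come first.
variety <- toy %>%
  group_by(dpt, annais) %>%
  summarise(distinct_names = n_distinct(preusuel), .groups = "drop") %>%
  arrange(desc(distinct_names))

print(variety)
```

The same pipeline should carry over to the full data frame once the real column names are confirmed.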
You need to use the _tidyverse_ for this analysis. Unzip the file _dpt2017_txt.zip_ (to get **dpt2017.txt**), then read it into R with the code below. Note that you might need to install the `readr` package first.
2. Download the raw data from the website
```{r}
file = "dpt2017_txt.zip"
if(!file.exists(file)){
download.file("https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2017_txt.zip",
destfile=file)
}
unzip(file)
```
3. Unzip the file _dpt2015_txt.zip_ (to get the **dpt2015.txt**)
3. Build the data frame from the file
4. Read it into R with this code. Note that you might need to
install the `readr` package first.
```{r}
library(tidyverse)
library(ggplot2)
```
library(readr);
df <- read_tsv("dpt2015.txt", locale = locale(encoding = "ISO-8859-1"));
```
df <- read_tsv("dpt2017.txt", locale = locale(encoding = "ISO-8859-1"));
df %>% head(n=10)
```
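Once the data frame is loaded, the question of how a name's frequency evolves over time can be approached roughly as follows. This is only a sketch on a small invented table: the column names (`sexe`, `preusuel`, `annais`, `dpt`, `nombre`) and the `"XXXX"` placeholder for unknown years are assumptions about the INSEE file that should be verified against the real `df`.

```r
library(dplyr)
library(tibble)
library(ggplot2)

# Toy stand-in for `df` (invented values; assumed column names).
toy <- tribble(
  ~sexe, ~preusuel, ~annais, ~dpt, ~nombre,
  1,     "JEAN",    "1950",  "38", 120,
  1,     "JEAN",    "1950",  "75", 300,
  1,     "JEAN",    "2000",  "38",  10,
  2,     "MARIE",   "1950",  "38", 200,
  2,     "MARIE",   "2000",  "75",  40,
  1,     "JEAN",    "XXXX",  "38",   5
)

# Drop rows with an unknown year, convert the year to an integer,
# then sum the counts over all departments for each name and year.
per_year <- toy %>%
  filter(annais != "XXXX") %>%
  mutate(year = as.integer(annais)) %>%
  group_by(preusuel, year) %>%
  summarise(total = sum(nombre), .groups = "drop") %>%
  arrange(preusuel, year)

# One line per first name; print(p) renders the plot.
p <- ggplot(per_year, aes(x = year, y = total, colour = preusuel)) +
  geom_line()
```

Replacing `toy` with `df` and filtering on a single `preusuel` value would give the curve for one chosen name.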
4. Create your own R markdown (Rmd). Follow the PL guidelines,
explaining your data manipulation and plot.
......