......@@ -4,6 +4,7 @@
- Run `mkdir TD2`
- You will put your TD2 work in that directory
1. Select a plot
- Select any plot you have seen in journals/articles/...
- Create a text with your **critical view** on that plot
......@@ -19,4 +20,4 @@
- `git commit -m "my TD2 has been completed"`
- `git push`
Congratulations, you're done for the TD2.
## TD5: Data Manipulation with dplyr
Using the [first name data
set of INSEE](https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2017_txt.zip),
answer some of the following questions (you also have to use
**ggplot2**):
- Does first name frequency evolve over time?
- What can we say about "your name here" (per department and for France as a whole)?
- Is there some sort of geographical correlation in the data?
- Which department has the largest variety of names over time?
- _your own question_ (be creative)
We have demonstrated the use of the six verbs in [our demonstration of
dplyr](./TD5/TD5.Rmd).
1. Install the tidyverse ecosystem
```
install.packages("tidyverse");
```
2. Unzip the file _dpt2017_txt.zip_ (to get **dpt2017.txt**)
3. Alternatively, download and unzip the file from R with the code below,
then read **dpt2017.txt** into a data frame. Note that you might need to
install the `readr` package with `install.packages("readr")`.
```{r}
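file <- "dpt2017_txt.zip"
# Download the archive only if it is not already present
if (!file.exists(file)) {
  download.file("https://www.insee.fr/fr/statistiques/fichier/2540004/dpt2017_txt.zip",
                destfile = file)
}
# Extract dpt2017.txt into the working directory
unzip(file)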
```
4. Create your own R Markdown (Rmd) file. Follow the literate programming guidelines,
explaining your data manipulation and plots.
- Use all the six verbs: _select()_, _filter()_, _arrange()_, _mutate()_, _group_by()_, _summarize()_
- Plot data with _ggplot()_
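As a starting point, here is a minimal sketch of the whole workflow on made-up data. It assumes that **dpt2017.txt** is tab-separated with the columns `sexe`, `preusuel`, `annais`, `dpt` and `nombre`; check your copy of the file before relying on this. The file `toy_path`, its counts, and the name `emma_per_year` are hypothetical and only serve the illustration.

```{r}
library(readr)
library(dplyr)
library(ggplot2)

# Made-up rows in the layout we ASSUME for dpt2017.txt: tab-separated, with
# columns sexe, preusuel (first name), annais (birth year), dpt, nombre (count).
toy_path <- tempfile(fileext = ".txt")
writeLines(c(
  "sexe\tpreusuel\tannais\tdpt\tnombre",
  "1\tLUCAS\t2000\t38\t10",
  "1\tLUCAS\t2001\t38\t20",
  "2\tEMMA\t2000\t38\t30",
  "2\tEMMA\t2001\t38\t40",
  "2\tEMMA\t2001\t69\t15"
), toy_path)

# For the real data, replace toy_path with "dpt2017.txt" and check the
# column types and the encoding if accented names look garbled.
names_df <- read_tsv(toy_path, col_types = "icici")

emma_per_year <- names_df %>%
  select(preusuel, annais, nombre) %>%    # keep only the needed columns
  filter(preusuel == "EMMA") %>%          # one first name
  group_by(annais) %>%                    # one group per birth year
  summarize(total = sum(nombre)) %>%      # sum over departments
  mutate(share = total / sum(total)) %>%  # fraction of all EMMA births
  arrange(desc(total))                    # most frequent year first

ggplot(emma_per_year, aes(x = annais, y = total)) +
  geom_col() +
  theme_bw()
```

The same chain should work on the real file once the column names are confirmed; only the `filter()` condition and the plotted variables need to change for the other questions.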
---
title: "Data Manipulation with dplyr"
author: "Lucas Mello Schnorr, Jean-Marc Vincent"
date: "February 28, 2017"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
This is a demonstration of how dplyr works.
First, we need some data.
```{r}
df <- data.frame(
  name = c("John", "Mary", "Alice", "Peter", "Roger", "Phyllis"),
  age = c(13, 15, 14, 13, 14, 13),
  sex = c("Male", "Female", "Female", "Male", "Male", "Female")
)
head(df);
```
Load the necessary packages:
```{r}
library(dplyr);
library(magrittr);
```
The pipe operator (`%>%`) passes the value on its left as the first argument of the function on its right:
```{r}
df %>% head(n=2);
```
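To make the equivalence concrete, the short sketch below compares a piped call with the same call written as nested functions. The vector `ages` is a hypothetical stand-in that repeats the ages used in `df`.

```{r}
library(magrittr)

ages <- c(13, 15, 14, 13, 14, 13)  # same ages as in df (illustration only)

# x %>% f(y) is the same call as f(x, y)
piped  <- ages %>% head(n = 2) %>% mean()
nested <- mean(head(ages, n = 2))

identical(piped, nested)
```

The piped form reads left to right in the order the operations are applied, which is why the rest of this demonstration uses it.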
You can chain multiple pipes:
```{r}
df %>% head(n=2) %>% nrow;
```
Now, let's use dplyr verbs, starting with _select()_ to select only one column:
```{r}
df %>% select(name);
```
You can also remove a column using the minus operator:
```{r}
df %>% select(-sex);
```
Now, let's use the _filter()_ verb (let's say we need only those of age 13):
```{r}
df %>% filter(age == 13);
```
If we need to re-order, we can use _arrange()_ (you can pass multiple variables as well):
```{r}
df %>% arrange(age);
```
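As a sketch of the multiple-variable case, the chunk below sorts by decreasing age and breaks ties by name. The data frame `people` is a hypothetical copy of two columns of `df`, so the chunk runs on its own.

```{r}
library(dplyr)

people <- data.frame(
  name = c("John", "Mary", "Alice", "Peter", "Roger", "Phyllis"),
  age = c(13, 15, 14, 13, 14, 13)
)

# Oldest first; rows with the same age are ordered alphabetically by name
sorted <- people %>% arrange(desc(age), name)
sorted
```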
And to create new columns, we can _mutate()_ (you can use existing variable names):
```{r}
df %>% mutate(birth = 2017 - age);
```
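One reading of "existing variable names" is that _mutate()_ can overwrite a column as well as add one. The sketch below illustrates that reading on a hypothetical two-row data frame `people`.

```{r}
library(dplyr)

people <- data.frame(name = c("John", "Mary"), age = c(13, 15))

# Expressions are evaluated in order: `birth` uses the original age,
# then reusing the name `age` replaces that column instead of adding one.
updated <- people %>% mutate(birth = 2017 - age, age = age + 1)
updated
```

Overwriting is convenient for unit conversions or type changes, but it loses the original values, so a new column name is often the safer choice.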
The _summarize()_ verb can be used to reduce data (to know the average age, for instance):
```{r}
df %>% summarize(mean_age = mean(age));
```
The _group_by()_ verb can be used to split data, apply some function and combine results (for example, to know the average age depending on sex):
```{r}
df %>% group_by(sex) %>% summarize(mean_age = mean(age));
```
We can also _group_by()_ multiple variables (to see the number of occurrences of age/sex combinations):
```{r}
df %>% group_by(age, sex) %>% summarize(N=n());
```
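Counting rows per group is common enough that dplyr offers _count()_ as a shortcut. The sketch below, which assumes a reasonably recent dplyr (1.0 or later, for the `.groups` argument), checks that both forms give the same table. The data frame `people` is a hypothetical copy of `df`.

```{r}
library(dplyr)

people <- data.frame(
  name = c("John", "Mary", "Alice", "Peter", "Roger", "Phyllis"),
  age = c(13, 15, 14, 13, 14, 13),
  sex = c("Male", "Female", "Female", "Male", "Male", "Female")
)

# Explicit form; .groups = "drop" returns an ungrouped result
by_hand <- people %>%
  group_by(age, sex) %>%
  summarize(N = n(), .groups = "drop")

# Shortcut form with the same column name for the counts
shortcut <- people %>% count(age, sex, name = "N")

all.equal(as.data.frame(by_hand), as.data.frame(shortcut))
```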
A dplyr pipeline can feed its result directly into _ggplot()_:
```{r, fig.width=2.5, fig.height=3}
library(ggplot2);
df %>%
  group_by(sex) %>%
  summarize(mean_age = mean(age)) %>%
  ggplot(aes(x = sex, y = mean_age)) +
  geom_point() + ylim(0, NA) + theme_bw();
```
# Mini-Projet
The Mini-Projet has to be developed in *groups of two students*.
## Datasets
Choose your data set from one of these sources:
- FR Data.gouv: https://www.data.gouv.fr/fr/
- Insee: https://www.insee.fr/fr/accueil
- Kaggle: https://www.kaggle.com/datasets
- US Data.gov: https://catalog.data.gov/dataset
## Guidelines
1. Justify the data set selection and describe it
2. Formulate three questions about the selected data set
3. Discuss with the professors and choose one question among these
4. Propose a methodology to answer that question
5. Implement this methodology using Literate Programming
6. The report must be accessible online (in your git repository)
## Report
The R Markdown (to be written in RStudio) must contain:
- Frontpage
- _Title_
- Student's name
- Table of contents
- Introduction
- Context / Dataset description
- How the dataset was obtained
- Description of the question
- Methodology
- Data clean-up procedures
- Scientific workflow
- Data representation choices
- Analysis in Literate Programming
- Conclusion
- References
## Important notes
The question may range from simple to complex. The simpler the
question, the deeper the analysis should be.
## Groups
Your project defence consists of (strictly) 5 minutes of presentation
and 2 minutes of questions. You may use any presentation support: if you
have a PDF, send it to us in advance; if you need your own machine,
be prepared (setup time counts as part of your defence time).
| id | Group | Time | Team | Theme | Repository | Report |
|----|-------|------|------|-------|------------|--------|
| 1 | 2 | 09:42 | Wael Mazen / Tetrel Loan | Video game sales | https://gricad-gitlab.univ-grenoble-alpes.fr/waelm/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/waelm/mspl-2018-2019/blob/master/Activities/Projet%20WAEL-TETREL/L3-MIAGE-MSPL-WAEL-TETREL.pdf |
| 2 | 2 | 10:00 | Lemaire Loïc / Lestani Robinson | Google store apps | https://gricad-gitlab.univ-grenoble-alpes.fr/lestanir/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/lestanir/mspl-2018-2019/blob/master/Projet/L3-MIAGE-MSPL-LESTANI-LEMAIRE.pdf |
| 3 | 2 | 10:07 | Vailhère Théo / Meliard Quentin | Alcohol consumption | https://gricad-gitlab.univ-grenoble-alpes.fr/meliardq/mspl-2018-2019 | |
| 4 | 2 | 10:14 | Ghammaz Ayoub / Orand Régis | Earthquakes | https://gricad-gitlab.univ-grenoble-alpes.fr/orandr/mspl-2018-2019 | see email |
| 5 | 2 | 10:21 | Madide Adam / Phan Dhanh | Happiness index | https://gricad-gitlab.univ-grenoble-alpes.fr/madidea/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/madidea/mspl-2018-2019/blob/master/Projet/L3-MIAGE-MSPL-MADIDE-PHAN.pdf |
| 6 | 2 | 10:28 | Fall Ayy / Sagara Idriss | Student performance | https://gricad-gitlab.univ-grenoble-alpes.fr/sagarai/MSPL | https://gricad-gitlab.univ-grenoble-alpes.fr/sagarai/MSPL/blob/master/Projet/ProjetR_Student_performance.Rmd |
| 7 | 2 | 10:35 | Mathieu Yacine / Martins Benoit | Unemployment rate | https://gricad-gitlab.univ-grenoble-alpes.fr/mathieya/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/mathieya/mspl-2018-2019/blob/master/Projet/Projet_Stats_MARTINS_MATHIEU.pdf |
| 8 | 2 | 10:42 | Diagne Khadidiatou / Thiam Cheikh | Fifa-sport | https://gricad-gitlab.univ-grenoble-alpes.fr/diagnek/mspl-2018-2019 | see email |
| 9 | 2 | 10:49 | Sajides Soufiane / Bernes Jules | Evolution of crime in France | https://gricad-gitlab.univ-grenoble-alpes.fr/bernesj/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/bernesj/mspl-2018-2019/blob/master/Activities/Project/script.pdf |
| 10 | 1 | 08:00 | Jlassis Imene / Kadidiatou Coulibaly | Unemployment rate | https://gricad-gitlab.univ-grenobles-alpes.fr/coulikb/mspl-2018-2019 | |
| 11 | 1 | 08:07 | Romain Baills / Firsov Oleksandr | Google Play Store | https://gricad-gitlab.univ-grenoble-alpes.fr/baillsr/mspl-2018-2019 | |
| 12 | 1 | 08:14 | Khadir Mehdi / Zouir Yamini | Road traffic injury accidents | https://gricad-gitlab.univ-grenobles-alpes.fr/khadirm/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/khadirm/mspl-2018-2019/blob/master/projet/projet_stat_Mehdi_Zoher.pdf |
| 13 | 1 | 08:21 | Jolan Daumas / Blaye Thibault | Kickstarter | https://gricad-gitlab.univ-grenobles-alpes.fr/daumasj/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/daumasj/mspl-2018-2019/blob/master/Projet/L3-MIAGE-MSPL-DAUMAS-BLAYE.html |
| 14 | 2 | 10:56 | Jovanovic Loic / Mourthadhoi Sultan | The Olympic Games | https://gricad-gitlab.univ-grenobles-alpes.fr/mourthas/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/mourthas/mspl-2018-2019/blob/master/Projet/L3-MIAGE-MSPL-MOURTHADHOI-JOVANOVIC.pdf |
| 15 | ? | 08:28 | Lucille Guillaume / Cedric Hannequin | Film study | https://gricad-gitlab.univ-grenoble-alpes.fr/guillalu/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/guillalu/mspl-2018-2019/blob/master/Projet/L3-MIAGE-MSPL-Hannequin-Guillaume.pdf. |
| 16 | ? | 08:35 | Hanh Do / ??? | Climate change | https://gricad-gitlab.univ-grenoble-alpes.fr/dothit/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/dothit/mspl-2018-2019/blob/master/Projet/L3-MIAGE-MSPL-DO-CISSE/L3-MIAGE-MSPL-DO-CISSE.pdf |
| 17 | ? | 08:42 | Emre AYDIN / elhadj amadou korka bah | ?? | https://gricad-gitlab.univ-grenoble-alpes.fr/aydine/mspl-2018-2019/ | https://gricad-gitlab.univ-grenoble-alpes.fr/aydine/mspl-2018-2019/blob/8a8dbdebabd466f48e059db6f7f1c2f580124066/Projet/L3-MIAGE-MSPL-Aydin-Elhadj.pdf |
| 18 | ? | 09:00 | Mathis Ruffieux | CO2 emissions & climate disruption | https://gricad-gitlab.univ-grenoble-alpes.fr/ruffieum/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/ruffieum/mspl-2018-2019/blob/master/L3-MIAGE-MSPL-RUFFIEUX.Rmd |
| 19 | ? | 09:07 | Massinissa Bitous / BOUAZIZ Lamine | Suicide rate in France | ?? | https://gricad-gitlab.univ-grenoble-alpes.fr/bitousm/projetstatfinal/blob/master/L3-MIAGE-MSPL-Bouaziz_Bitous.pdf |
| 20 | ? | 09:14 | Belahadji Ilyes / ElBahraoui Imade | Education levels | https://gricad-gitlab.univ-grenoble-alpes.fr/belahadi/ | https://gricad-gitlab.univ-grenoble-alpes.fr/belahadi/projet-stat/blob/master/L3-MIAGE-MSPL-Belahadji-ElBahraoui._Rapport_Stats.Rmd |
| 21 | ? | 09:21 | PERRIN Alexandre / MENDES Julien | Video games | https://gricad-gitlab.univ-grenoble-alpes.fr/perrinal/mspl-2018-2019/ | https://gricad-gitlab.univ-grenoble-alpes.fr/perrinal/mspl-2018-2019/blob/master/PROJET/L3-MIAGE-MSPL-PERRIN-MENDES.pdf |
| 22 | ? | 09:28 | Tariq HASDI | Films | ??? | see email |
| 23 | ? | 09:35 | Abderrahmane Bekkouch / Soufiane Kabad | Number of companies | https://gricad-gitlab.univ-grenoble-alpes.fr/bekkouca/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/bekkouca/mspl-2018-2019/blob/master/Projet%20MSPL/Projet2019-AB-SK.pdf |
| 24 | ? | 10:56 | Lamine Imad Eddine / Housbane Soraya | Education worldwide | https://gricad-gitlab.univ-grenoble-alpes.fr/laminei/mspl-2018-2019 | https://gricad-gitlab.univ-grenoble-alpes.fr/laminei/mspl-2018-2019/blob/master/LAMINE_HOUSBANE/Rapport.pdf |
......@@ -5,6 +5,32 @@ Modèles statistiques et Programmation Lettrée Licence MIAGE3 2018-2019
Welcome to the Literate Programming public repository, part of the
Miage L3.
## Final Project
The mini-project, to be done in pairs, must be submitted by **March 29 before midnight**
in your git repository, in a dedicated folder named Projet containing the R Markdown file (Rmd), the PDF generated by knitr, and any auxiliary data if needed.
Use L3-MIAGE-MSPL-Nom1-Nom2.* as the file name pattern.
You must notify us that the files have been uploaded by sending an email to
Jean-Marc.Vincent@univ-grenoble.fr and Adrien.Faure@inria.fr
Subject: [L3-MIAGE:MSPL] mini-projet Nom1 Nom2
Body (in addition to the usual content):
- link to the Rmd
- link to the PDF

Send the email from your etu.univ-grenoble-alpes.fr address.
We plan an "ultra-short" presentation (5 minutes) to make your case on Tuesday, April 2.
- [Specification of the **Mini-Projet**](./Project.espec.md)
The course is organized by sessions mixing fundamentals and activities
## Fundamentals
......@@ -22,12 +48,13 @@ The course is organized by sessions mixing fundamentals and activities
3. [Using RStudio for running a Statistical Analysis](./TD3.espec.md)
4. [Combining RStudio and The Grammar of Graphics (of ggplot2)](./TD4.espec.md)
5. [Data Manipulation with dplyr](./TD5.espec.md)
6. [Galilée et le Paradoxe du Duc de Toscane](./TD6.espec.md)
7. [Hypothesis Test : le Paradoxe du Duc de Toscane (la suite)](./TD7/TD7.Rmd)
<!-- 6. [Galilée et le Paradoxe du Duc de Toscane](./TD6.espec.md) -->
<!-- 7. [Hypothesis Test : le Paradoxe du Duc de Toscane (la suite)](./TD7/TD7.Rmd) -->
## Additional resources
- [RStudio Cheat Sheets](https://www.rstudio.com/resources/cheatsheets/)
- [Interesting approach to understand ggplot2](https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html#1)
## Contact Information
......