Commit 8cbc88fe authored by Loic Huder

Adding myself :)

parent eaba0705
......@@ -12,7 +12,7 @@
"\n",
"**A training to acquire strong basis in Python to use it efficiently**\n",
"\n",
"Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTERRE), Christophe Picard (LJK)\n",
"Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)\n",
"\n",
"## Practical session 1\n",
"\n",
......
%% Cell type:markdown id: tags:
# Python training UGA 2017
**A training to acquire strong basis in Python to use it efficiently**
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTERRE), Christophe Picard (LJK)
Pierre Augier (LEGI), Cyrille Bonamy (LEGI), Eric Maldonado (Irstea), Franck Thollard (ISTerre), Christophe Picard (LJK), Loïc Huder (ISTerre)
## Practical session 1
File parsing and dictionary usage
%% Cell type:markdown id: tags:
# Goal
The goal of this session is to end up with a script that computes some simple statistics from Meteo open data files. The file was modified and reduced for this exercise (only 1 station, with data for a single year: 2016). In the future, you can download other data here: https://public.opendatasoft.com/explore/dataset/donnees-synop-essentielles-omm/export/
## Material
The file contains lines of the form:
ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,Rainfall 3 last hours,Station
7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO
7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO
7761,2016-01-01T07:00:00+01:00,1.7,284.05,88,0.2,AJACCIO
7761,2016-01-01T10:00:00+01:00,1.6,287.05,75,0.2,AJACCIO
7761,2016-01-01T13:00:00+01:00,3.1,289.55,73,0.0,AJACCIO
This is a classic CSV file, with fields separated by ",".
The first line is the header.
## Information to extract
We want to compute some statistics for this station.
## Warning
The temperature is measured in kelvin (273.15 K $\leftrightarrow$ 0 °C)
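As a minimal sketch of this conversion (the helper name `kelvin_to_celsius` is hypothetical and not part of the exercise):

```python
KELVIN_OFFSET = 273.15  # 0 °C expressed in kelvin


def kelvin_to_celsius(temp_k):
    """Convert a temperature from kelvin to degrees Celsius."""
    return temp_k - KELVIN_OFFSET


# First temperature of the sample lines shown in "Material"
print(f"{kelvin_to_celsius(283.75):.2f} °C")  # prints "10.60 °C"
```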
%% Cell type:markdown id: tags:
# Step 1: load data
Write a script with a function `load_data()` that
- opens the file
- loads the data into one of the following structures (more details below):
  - 1.1 Single dictionary
  - 1.2 Multiple structures
  - 1.3 Class instance (object-oriented)
## 1.1: Single dictionary (pres07)
Load the data into a single dictionary with this structure:
```python
{'Date': [wind,temperature,humidity,rainfall]}
```
For example
```python
{
'2016-01-01T01': [2.0,283.75,94,0.2],
'2016-01-01T04': [2.2,283.95,91,0.2]
}
```
In this case, we can consider YYYY-MM-DDTHH as the key for the station dictionary.
Split each line and extract data.
#### Hint
You can use the method *split* from the str class.
%% Cell type:code id: tags:
``` python
s = "I am lucky"
l = s.split()
print(l)
```
%%%% Output: stream
['I', 'am', 'lucky']
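Applying `split` to the CSV lines, a `load_data()` for structure 1.1 might look like the sketch below. It is only one possible approach, under stated assumptions: the function takes an already opened file object (so that it can run here on an in-memory sample; in the exercise you would pass the result of `open()`), every value is converted to `float`, and an empty field is stored as `None`.

```python
import io

# Sample taken from the lines shown in the "Material" section
SAMPLE = """\
ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,Rainfall 3 last hours,Station
7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO
7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO
"""


def load_data(file):
    """Return {'YYYY-MM-DDTHH': [wind, temperature, humidity, rainfall]}."""
    data = {}
    next(file)  # skip the header line
    for line in file:
        fields = line.strip().split(",")
        key = fields[1][:13]  # keep only YYYY-MM-DDTHH
        # Assumption: an empty field means "no measurement" and becomes None
        data[key] = [float(v) if v else None for v in fields[2:6]]
    return data


station = load_data(io.StringIO(SAMPLE))
print(station["2016-01-01T01"])  # prints [2.0, 283.75, 94.0, 0.2]
```

Note that the humidity comes out as `94.0` rather than `94` because of the uniform `float` conversion.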
%% Cell type:markdown id: tags:
## 1.2: Multiple structures (pres07)
Load the data into multiple dictionaries or lists (one per field).
#### Example for dictionaries:
You can use the following structure
```python3
wind = {'Date1': wind_value1, 'Date2': wind_value2, ...}
temperature = {'Date1': temperature1, 'Date2': temperature2, ...}
...
```
#### Example for lists:
You can use the following structure
```python3
dates = ['Date1', 'Date2', ...]
wind = [wind_value1, wind_value2, ...]
temperature = [temperature1, temperature2, ...]
...
```
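The list variant could be filled as in the following sketch. It is illustrative only: just two fields are loaded to keep it short, and the function takes an already opened file object (an assumption made so that the example runs on an in-memory sample).

```python
import io

# Sample taken from the lines shown in the "Material" section
SAMPLE = """\
ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,Rainfall 3 last hours,Station
7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO
7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO
"""


def load_data(file):
    """Return (dates, wind, temperature): three lists sharing the same index."""
    dates, wind, temperature = [], [], []
    next(file)  # skip the header line
    for line in file:
        fields = line.strip().split(",")
        dates.append(fields[1][:13])  # keep only YYYY-MM-DDTHH
        wind.append(float(fields[2]))
        temperature.append(float(fields[3]))
    return dates, wind, temperature


dates, wind, temperature = load_data(io.StringIO(SAMPLE))
# zip() walks the lists in parallel: index i always refers to one measurement
for date, temp in zip(dates, temperature):
    print(date, temp)
```

With parallel lists, the measurement order is preserved, but the lists must always be appended to together so that they stay aligned.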
%% Cell type:markdown id: tags:
## 1.3: Class instance (pres08)
Load the data in an instance of a class `WeatherStation` that you will define yourself. `load_data()` can therefore be a method of this class.
#### Hint :
This is very similar to 1.2. The only difference is that the structures storing the data are attributes of a class.
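A possible skeleton for such a class is sketched below. The attribute names (`dates`, `temperature`) and the choice of passing an iterable of lines to `load_data()` are assumptions made for illustration; only two fields are stored to keep the example short.

```python
class WeatherStation:
    """Hold the measurements of one station (attribute names are illustrative)."""

    def __init__(self):
        self.dates = []
        self.temperature = []

    def load_data(self, lines):
        """Fill the attributes from an iterable of CSV lines (header first)."""
        lines = iter(lines)
        next(lines)  # skip the header line
        for line in lines:
            fields = line.strip().split(",")
            self.dates.append(fields[1][:13])  # keep only YYYY-MM-DDTHH
            self.temperature.append(float(fields[3]))


# Sample taken from the lines shown in the "Material" section
sample = [
    "ID OMM station,Date,Average wind 10 mn,Temperature,Humidity,Rainfall 3 last hours,Station",
    "7761,2016-01-01T01:00:00+01:00,2.0,283.75,94,0.2,AJACCIO",
    "7761,2016-01-01T04:00:00+01:00,2.2,283.95,91,0.2,AJACCIO",
]
station = WeatherStation()
station.load_data(sample)
print(station.dates, station.temperature)
```

An open file object is also an iterable of lines, so the same method would accept the result of `open()`.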
%% Cell type:markdown id: tags:
# Step 2: Compute max temperature and average temperature for the station
Write 2 functions `get_max_temperature()` and `get_average_temperature()` that:
- return a float
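One way to write these two functions is sketched here, assuming the temperatures are available as a list of floats passed as an argument (in the class-based variant they would be read from an attribute instead):

```python
def get_max_temperature(temperatures):
    """Return the highest temperature of the list, as a float."""
    return float(max(temperatures))


def get_average_temperature(temperatures):
    """Return the arithmetic mean of the list, as a float."""
    return sum(temperatures) / len(temperatures)


# Temperatures (in kelvin) of the five sample lines shown in "Material"
temperatures = [283.75, 283.95, 284.05, 287.05, 289.55]
print(get_max_temperature(temperatures))  # prints 289.55
print(f"{get_average_temperature(temperatures):.2f}")  # prints 285.67
```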
%% Cell type:markdown id: tags:
# Step 3: Compute sum of the rainfall for one station
Write 1 function `get_sum_rainfall()` that sums the rainfall.
- returns a float
Be careful, some measurements have no rainfall data.
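A minimal sketch, assuming missing measurements were stored as `None` when the file was loaded (calling `float('')` on an empty field would raise a `ValueError`, so empty fields have to be handled at load time):

```python
def get_sum_rainfall(rainfall):
    """Sum the rainfall values, ignoring measurements stored as None."""
    return float(sum(value for value in rainfall if value is not None))


# Hypothetical values: the None entry stands for a missing measurement
rainfall = [0.2, 0.2, None, 0.0, 0.2]
print(f"{get_sum_rainfall(rainfall):.1f}")  # prints 0.6
```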
%% Cell type:markdown id: tags:
# Step 4: Search max period without rainfall
Write 1 function `period_without_rainfall()`
- returns the start date, the end date and the number of days without rainfall
## Hint
This is the syntax to return multiple values from a function:
```
return date_min, date_max, period_max / 8
```
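The division by 8 reflects the 8 measurements per day (one every 3 hours). The search itself could be sketched as a single pass that tracks the current dry run. This is only one possible approach: it assumes two parallel lists `dates` and `rainfall`, and it treats a missing value (`None`) as ending a dry run.

```python
def period_without_rainfall(dates, rainfall):
    """Return (first date, last date, length in days) of the longest dry run."""
    date_min = date_max = None
    period_max = 0  # length of the longest run, in number of measurements
    start = None    # first date of the run currently being scanned
    length = 0      # length of the current run
    for date, rain in zip(dates, rainfall):
        if rain == 0:
            if length == 0:
                start = date
            length += 1
            if length > period_max:
                date_min, date_max, period_max = start, date, length
        else:
            # Rain (or a missing value, an assumption of this sketch) ends the run
            length = 0
    # 8 measurements per day, since there is one record every 3 hours
    return date_min, date_max, period_max / 8


# Hypothetical rainfall values on the dates of the sample lines
dates = ["2016-01-01T01", "2016-01-01T04", "2016-01-01T07",
         "2016-01-01T10", "2016-01-01T13"]
rainfall = [0.2, 0.0, 0.0, 0.2, 0.0]
print(period_without_rainfall(dates, rainfall))
# prints ('2016-01-01T04', '2016-01-01T07', 0.25)
```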
%% Cell type:markdown id: tags:
# Step 5: How many hours with humidity rate < 60
Write 1 function `get_hours_humidity(rate)`
- takes 1 parameter: the humidity rate
- returns the number of hours
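The sketch below counts hours, following the step title, under the assumption that each measurement stands for a 3-hour interval; dividing the number of matching measurements by 8 would give a duration in days instead. The `humidity` list argument is also an assumption of this example.

```python
HOURS_PER_MEASUREMENT = 3  # one record every 3 hours in the sample file


def get_hours_humidity(humidity, rate):
    """Return the number of hours with a humidity strictly below `rate`."""
    count = sum(1 for value in humidity if value is not None and value < rate)
    return count * HOURS_PER_MEASUREMENT


# Humidity values of the five sample lines shown in "Material"
humidity = [94, 91, 88, 75, 73]
print(get_hours_humidity(humidity, 90))  # prints 9
```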
%% Cell type:markdown id: tags:
## Final remark: Pandas
For this kind of data analysis, one should not use pure Python code without an external library! The Pandas library was written to do this in a few lines:
%% Cell type:code id: tags:
``` python
import pandas
df = pandas.read_csv(
'../TP/TP1_MeteoData/data/synop-2016.csv', sep=',', header=0)
# print(df.columns)
# print(df.age)
# print(df['Temperature'])
# print(df[(df['Station']=='AJACCIO')])
# print(df[(df['Station'] == 'AJACCIO')]['Rainfall 3 last hours'].sum())
temp = df[(df['Station'] == 'AJACCIO')]['Temperature'].mean()-273.15
print(f'The average of temperature at Ajaccio is {temp:.1f} °C.')
```
%%%% Output: stream
The average of temperature at Ajaccio is 16.3 °C.
%% Cell type:code id: tags:
``` python
```
......