Data assimilation algorithms applied to geomagnetic data in the spectral domain.
...
...
From Olivier Barrois, Nicolas Gillet, Loïc Huder, Yannick Martin and Franck Thollard.
## Installation
### Requirements
The installation of pygeodyn requires Python 3 (tested under 3.5) with the following packages installed:
* _setuptools_ (tested against 40.4.3): to be able to run `setup.py`.
* _numpy_ (at least 1.7): to be able to wrap Fortran files with _f2py_.
...
...
The other dependencies will be automatically installed by the next step but are listed here for reference:
* _matplotlib_
* _hypothesis_ (only for the tests)
### Install the pygeodyn package
Note: Two installation strategies are proposed: a classical installation and an installation in a virtual environment. The latter is preferred as it gives more control over the Python environment used to run the program. The rest of the README is independent of the chosen strategy.
#### Classical installation
If the requirements are satisfied, pygeodyn can be installed by running
```sh
python3 setup.py develop --user
```
> The `develop` option links the current folder into the Python library path. Any modification is then taken into account immediately, without reinstalling the package. It is also needed for relative path resolution.
Test if the installation succeeded by importing pygeodyn in python3.
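A minimal check could look like this (assuming the package exposes its version through the usual `__version__` attribute):
```python
import pygeodyn

# Assumed convention: the package version is exposed as __version__
print(pygeodyn.__version__)
```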
This should print the current version of pygeodyn if the installation succeeded.
#### How to install in a virtual environment
The virtual environment is a closed space that gives control over the installation of Python packages. This is therefore the preferred installation strategy, as it also works on clusters where users do not have full rights.
...
...
Once the virtual environment is set up, all package installations and computation launches must be done inside it.
Being inside the virtual env is indicated by the `(<folder_name>)` prefix (where `<folder_name>` is the name of the virtual env folder) at the left of the command line. If so, pygeodyn and its dependencies can now be installed.
_Setuptools_ and _pip_ should be installed in the virtual env, which allows installing the numpy requirement:
```bash
pip3 install numpy
```
Once this is completed, we can install pygeodyn as previously:
```bash
python3 setup.py develop
```
Once again, test if the installation succeeded by importing pygeodyn in python3, as shown above.
Please report any failed tests using the contact info given at the end of the README file.
Detailed information on the coverage of the tests can be found on [GRICAD](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/htmlcov/).
### Reinstallation
```sh
...
...
## Usage
pygeodyn uses configuration files (extension: `.conf`) to set the computation parameters. See [example.conf](./example.conf) for an example of such a configuration file.
### Launching the algorithm
The computation is launched by running `run_algo.py` (e.g. `python3 run_algo.py -conf example.conf`). Several parameters can be passed as arguments to tune the computation; only `-conf` is required, and default behaviours are used for the other arguments when they are not given.
...
...
* `-algo`: the name of the algorithm to use. For now, the only supported algorithm is **augkf** (Augmented state Kalman Filter implemented in Python), which is the default value.
* `-seed`: a number used for the generation of random quantities. Set this to get reproducible results (default: a new seed is generated at each execution).
##### Output files
* `-path`: path where the results folder will be created (default: `~/pygeodyn_results/`).
* `-cname`: name of the computation. Used to create a folder in `-path` where the results will be saved (default: `Current_computation`).
##### Logging
* `-v`: logging level: 1 DEBUG, 2 INFO (default), 3 WARNING, 4 ERROR, 5 CRITICAL.
When a new computation is launched, pygeodyn creates a folder named after `-cname` (default: `Current_computation`) at the location given by `-path` (default: `~/pygeodyn_results/`). All result files will be created in this output folder.
#### Hdf5 file
The default behaviour of pygeodyn is to automatically save the results at each time iteration in an hdf5 file named after `-cname`. This way, even in case of a crash, all previous iterations should be saved.
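The result file can be inspected with any HDF5 reader, for instance with _h5py_. A minimal sketch, assuming the default `-cname` and an `.hdf5` extension for the output file:
```python
import h5py

# File name assumed from the default -cname; adapt it to your computation name.
with h5py.File('Current_computation.hdf5', 'r') as results:
    results.visit(print)  # print the name of every group and dataset saved so far
```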
## For more information
### In-depth guide
We only presented here the high-level use of pygeodyn that is encapsulated in `run_algo.py` with supplied data. If you wish to use your own data, get more information on the implementation or use low-level features, please refer to the [In-depth guide](doc/guide.md) in the `doc` folder.
### Developer documentation
Documentation of the submodules and methods of the package is available on [GRICAD](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/index.html).
If [Sphinx](http://www.sphinx-doc.org/) is installed, it is possible to generate the documentation locally using:
```sh
sphinx-apidoc -e -o doc -f pygeodyn/
cd doc && sphinx-build -b html . html && cd ..
```
The doc will then be available in HTML format at doc/html/index.html.
...
...
If the use of this code led to a scientific output, please cite:
- [Huder, Gillet and Thollard, submitted (2019)]()
### Git repository
The source code is stored on a Git repository (https://gricad-gitlab.univ-grenoble-alpes.fr/Geodynamo/pygeodyn), which can also be used to give feedback through [Issues](https://gricad-gitlab.univ-grenoble-alpes.fr/Geodynamo/pygeodyn/issues).
### Contact information
For scientific inquiries, contact [Nicolas Gillet](mailto:nicolas.gillet@univ-grenoble-alpes.fr). For technical problems (including failed tests), contact [Loïc Huder](mailto:loic.huder@univ-grenoble-alpes.fr) or [Franck Thollard](mailto:franck.thollard@univ-grenoble-alpes.fr).
The README presented a way to use pygeodyn through `run_algo.py`. The `run_algo.py` program takes care of chaining and looping the data assimilation steps (and of numerous handy things such as the use of multiple processes with mpi4py, the saving of output files and of logs). For more details, we present here the structure of this program:
The initialisation of the algorithm is done by the [init_algorithm](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.run.html) method. This is a two-step method:
* First, `read_conf` reads the configuration file (see [example.conf](../example.conf)) to build a [ComputationConfig](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.inout.config.html) object named `config`.
* Secondly, the `config` object is passed to the `launch_setup` method, which generates a _Forecaster_ object and an _Analyser_ object. The _Forecaster_ and _Analyser_ respectively implement the methods _forecast_step_ and _analysis_step_, which are the main components of the data assimilation process. `run_algo.py` then runs forecast and analysis steps on `CoreStates` (see below) at a frequency given by the user in the configuration file, as sketched below.
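Schematically, and leaving aside the MPI handling, logging and file output, the structure described above can be summarised as follows (a sketch only, not the actual code; the real signatures are given in the linked documentation):
```python
# Schematic structure of run_algo.py (simplified):
#
#   1. read_conf('example.conf')  ->  a ComputationConfig object `config`
#   2. launch_setup(config)       ->  a Forecaster and an Analyser
#   3. loop over the computation times:
#        - Forecaster.forecast_step : propagate the core state
#        - Analyser.analysis_step   : assimilate the observations when available
#   4. the resulting CoreStates are saved to the output hdf5 file at each iteration
```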
The two objects _Forecaster_ and _Analyser_ integrate all the input data that result from the algorithm initialisation:
...
...
For examples, you can refer to the priors bundled with the program.
The reading methods are found in [`pygeodyn/inout/priors.py`](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.inout.priors.html). To use your own data, you must implement your reading method in the same file and design it with the same format. Namely:
```python
# The method name must have this name format and signature
...
...
Examples of valid reading methods are given for the bundled observations:
* [Ground and Virtual observatories](https://doi.org/10.1093/gji/ggy297) (obs_type: `go_vo`)
The strategy to use another type of observations follows the same paradigm as for priors: you must define a new observation type by implementing its reading method in [`pygeodyn/inout/observations.py`](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.inout.observations.html). In more detail, the format must be:
```python
# The method name must have this name format and signature
```
The implementation is trickier in this case and several points need to be highlighted:
* Your new reading method must take a `measure_type` argument that will be equal to either 'MF' (for magnetic field observations) or 'SV' (for secular variation observations).
* The returned object must be a dictionary whose keys are numpy.datetime64 objects corresponding to the dates of the observations.
* This dictionary must contain [_Observation_ objects](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.inout.observations.html) that are a handy way to store all the observation parameters:
    * The data `Y` as 2D numpy.ndarrays (nb_realisations x nb_observed_coefs)
    * The observation operator `H` (nb_observed_coefs x nb_coefs) that verifies `Y = HX`, where `X` is a vector containing the spectral coefficients of the magnetic field or secular variation (size: nb_coefs).
    * The observation error matrices `R` (nb_observed_coefs x nb_observed_coefs)
...
...
After having implemented your new method, set `obs_types` in the config file to the name of your new observation type.
## Using low-level forecast/analysis steps
### The CoreState object
A [_CoreState_](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.corestates.html) is the object on which the computation steps (forecasts and analysis) are performed. It provides an interface to store and act on the data of the involved physical quantities (called *measures*) such as the magnetic field ('MF'), the secular variation ('SV'), the core flow ('U') and the subgrid errors ('ER').
The usual way to use it is to store measure data as 3D NumPy arrays in the format (nb_realisations, nb_times, nb_coeffs) (as used for _computed_states_, _forecast_states_ and _analysed_states_ in `run_algo.py`), but it can handle 1D and 2D NumPy arrays as well.
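As an illustration of this layout (the sizes below are arbitrary, only the axis order matters):
```python
import numpy as np

# Arbitrary sizes for the sake of the example; the real ones come from the configuration
nb_realisations, nb_times, nb_coeffs = 20, 50, 360

# Measure data in the 3D format: axis 0 = realisation, axis 1 = time, axis 2 = coefficient
U_data = np.zeros((nb_realisations, nb_times, nb_coeffs))
print(U_data.shape)  # (20, 50, 360)
```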
...
...
cs_1D.Lu # Returns the max_degree of 'U' (here: 18, the forced value)
```
Other convenience getters can be found in the complete [documentation](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.corestates.html).
* 3D: it works exactly the same, but _CoreState_ also implements NumPy-like indexing:
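A minimal sketch of this indexing (`cs_3D` is an assumed variable name; only the `cs.U[...]` access pattern appears in the example at the end of this guide):
```python
# Sketch only: cs_3D is assumed to be a CoreState storing
# (nb_realisations, nb_times, nb_coeffs) arrays for each measure.
cs_3D.U[0]     # core flow data of realisation 0 (2D array: times x coefficients)
cs_3D.U[0, -1] # core flow coefficients of realisation 0 at the last stored index
cs_3D.Lu       # max degree of 'U', as in the 1D case
```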
We saw at the beginning of this section that the _forecast_step_ (resp. _analysis_step_) was implemented by the _Forecaster_ (resp. _Analyser_) object. The easiest way to set up these objects is to use the _init_algorithm_ method with the path to the desired configuration file:
```python
from pygeodyn.run import init_algorithm

algo_dict = {'name': 'augkf', 'nb_models': 10}
config_path = 'augkf.conf'
...
...
cs.U[0, 2] # Analysed values of all core flow coefs for the realisation 0
# and so on...
```
More information can be found in the documentation of the [forecast_step](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.augkf.forecaster.html) and [analysis_step](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.augkf.analyser.html) of the [augkf module](https://geodynamo.gricad-pages.univ-grenoble-alpes.fr/pygeodyn/pygeodyn.augkf.html), for example.