
Improvement of the load, preprocessing and store_data modules

This merge closes #24 (closed). A full set of modules (mostly load, preprocessing and store_data) has been developed and tested to enable a new full-catalog download approach:

Implementation

  1. The search criteria defined in the user's csv file are used to fill an intake-esgf catalog.
  2. This catalog is used to download one model.variant pair at a time.
  3. The downloaded dictionaries are preprocessed before saving:
    • the climatology can be computed at a frequency chosen by the user (partly solves #22 (closed) and closes #26 (closed))
    • the dictionary keys are reduced, either with a default method or to the keys requested by the user (solves #33 (closed))
    • the variable datasets of the dictionary are condensed for each independent entry (for example, two experiment_id values for the same model.variant generate two independent entries) (partly solves #22 (closed))
  4. The condensed dictionaries are saved, and their paths are recorded in a pandas DataFrame that is itself saved.
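
To make the key reduction and condensing steps concrete, here is a minimal sketch in plain Python. It is not the code of this merge request: the function names `reduce_keys` and `condense`, the facet order in `facet_names` and the rule "drop facets that never vary" are all assumptions made for illustration, and short strings stand in for the downloaded datasets.

```python
from collections import defaultdict


def reduce_keys(keys, wanted=None):
    """Shorten dot-separated dataset keys (hypothetical helper).

    Assumed default method: drop every facet position whose value is the
    same in all keys. If `wanted` is given, keep only those positions.
    """
    split = [key.split(".") for key in keys]
    if len({len(parts) for parts in split}) > 1:
        raise ValueError("all keys must have the same number of facets")
    if wanted is None:
        n_facets = len(split[0]) if split else 0
        wanted = [i for i in range(n_facets) if len({p[i] for p in split}) > 1]
    if not wanted:
        # Nothing distinguishes the keys, so leave them unchanged.
        return {key: key for key in keys}
    return {key: ".".join(parts[i] for i in wanted)
            for key, parts in zip(keys, split)}


def condense(datasets, facet_names, variable_facet="variable_id"):
    """Group the variables of each independent entry (hypothetical helper).

    An entry is identified by every facet except the variable one, so two
    experiments of the same model.variant end up as two separate entries.
    """
    var_pos = facet_names.index(variable_facet)
    entries = defaultdict(dict)
    for key, data in datasets.items():
        parts = key.split(".")
        entry = ".".join(p for i, p in enumerate(parts) if i != var_pos)
        entries[entry][parts[var_pos]] = data
    return dict(entries)


# Assumed facet order; strings are placeholders for the real datasets.
facet_names = ["source_id", "experiment_id", "member_id", "variable_id"]
datasets = {
    "CESM2.historical.r1i1p1f1.tas": "tas-historical",
    "CESM2.historical.r1i1p1f1.pr": "pr-historical",
    "CESM2.ssp585.r1i1p1f1.tas": "tas-ssp585",
}

short_keys = reduce_keys(list(datasets))
condensed = condense(datasets, facet_names)
```

With this input, only the experiment and variable facets vary, so they are the only ones kept in `short_keys`, while `condensed` holds one entry per experiment of the same model.variant.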

These changes are summed up by the following issues, which are now closed: #18 (closed), #25 (closed), #26 (closed), #27 (closed), #28 (closed) and #33 (closed).

The internal documentation of the edited and newly created modules has been written, which closes #13 (closed) and its sub-issues #14 (closed), #15 (closed), #16 (closed), #17 (closed), #31 (closed) and #32 (closed).

Follow-up work

The main notebook needs to be documented and completed. This is tracked in main issue #19 (closed), which needs a dedicated branch.

The README.md files of the modules need to be written. This is tracked in main issue #2 (closed), which needs a dedicated branch.

In particular

  • The new options make it necessary to solve issue #20 (closed) by finding a way to describe and define them.
  • All the changes made need to be checked for proper integration into the notebook, which is tracked in issue #21 (closed).
