Jobs

Primary jobs

Depending on the workflow mode selected, different primary jobs will be executed. The jobs are what Autosubmit submits to the remote platforms. They are a combination of templates (bash scripts) and the configuration selected by the user.

The templates are located in /workflow/templates.

  • local_setup: performs basic checks and compresses the workflow project so that it can be sent over the network.

  • synchronize: syncs the workflow project with the remote platform.

  • remote_setup:
    • loads the necessary environment and then compiles the different models/applications.

    • performs checks on the remote platform.

    • installs the running applications and the GSV interface.

  • ini: prepares any necessary initial data for the climate model runs. Runs in the login node of the HPC.

  • sim: runs one chunk of climate simulation. Runs in the HPC.

  • dqc: performs basic checks on the data produced by the simulation. Runs in the HPC. It has two modes: BASIC and FULL.

  • dn: notifies when the requested data has been produced by the model. Runs in the login node of the HPC.

  • opa: creates the statistics required by the data consumers (Apps). Runs in the HPC.

  • applications: creates usable output using the applications from the different use cases. Runs in the HPC.

Platform on which each primary job runs, and the relation of each job to the workflow modes.

  • Login nodes for LUMI, interactive partition for MareNostrum5

Additional jobs

The additional jobs are optional.

  • transfer: transfers the data produced in the simulation to the Data Bridge. Runs in the HPC or the client machine in MN5.

  • backup: copies the rundir and certain restarts to another partition. Runs in the HPC.

  • check_mem: monitors the memory consumption of the SIM jobs. Runs in the login node of the HPC.

  • wipe: contains two jobs:
    • wipe-check: checks which data has already been transferred to the HPC-FDB. Runs in the HPC or the client machine in MN5.

    • wipe: wipes already transferred data from the HPC-FDB. Runs in the HPC.

  • clean: compresses the rundir and the logs from the HPC. Purges the data of the FDB (deletes repeated entries). Runs in the HPC.

  • clean_restarts: deletes certain restart directories on the remote platform. Runs in the HPC.
    • KEEP_EVERY: determines which restart directories are kept. The first one is always kept; after that, restart directories are kept at the frequency given by this variable.

  • scaling: performs a scaling test of the model. Runs in the HPC.

  • aqua: contains 3 jobs:
    • LRA_GENERATOR: generates the LRA (Low Resolution Archive) files.

    • AQUA_ANALYSIS: performs the analysis of the AQUA files.

    • AQUA_PUSH: pushes the AQUA plots to LUMI-O.

  • postprocessing for application: one job per application (POSTPROCESS_${APPNAME}) that runs postprocessing scripts outside of the core streaming. It runs at the end of the chunk and is intended for the app and end-to-end modes.
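
The KEEP_EVERY rule of clean_restarts can be illustrated with a minimal sketch. It is not the job's actual template code: the function names are hypothetical, and it assumes that the restart directories are ordered from oldest to newest and that the frequency is counted from the first restart (so the 1st, the (1 + KEEP_EVERY)-th, and so on are kept). The real job may count differently.

```python
def restarts_to_keep(restart_ids, keep_every):
    """Return the restarts that survive the cleaning.

    Assumption: restart_ids is ordered oldest first, and the frequency
    is counted starting from the first restart.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be a positive integer")
    # Position 0 is the 1st restart (always kept); after it, one restart
    # is kept every keep_every positions.
    return [rid for pos, rid in enumerate(restart_ids) if pos % keep_every == 0]


def restarts_to_delete(restart_ids, keep_every):
    """Return the restarts that would be removed from the remote platform."""
    kept = set(restarts_to_keep(restart_ids, keep_every))
    return [rid for rid in restart_ids if rid not in kept]


chunks = list(range(1, 11))           # restarts written after chunks 1..10
print(restarts_to_keep(chunks, 3))    # [1, 4, 7, 10]
print(restarts_to_delete(chunks, 3))  # [2, 3, 5, 6, 8, 9]
```

With KEEP_EVERY set to 1 nothing is deleted under this reading, since every restart falls on the keeping frequency.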

JOB                    | PLATFORM
transfer               | HPC
backup                 | HPC
check_mem              | HPC Login node
wipe                   | HPC
clean                  | HPC
clean_restarts         | HPC
scaling                | HPC
LRA_GENERATOR          | HPC
AQUA_ANALYSIS          | HPC
AQUA_PUSH              | Autosubmit VM
POSTPROCESS_${APPNAME} | HPC

Selectable configuration

The workflow now runs with the new version of CUSTOM_CONFIG and the minimal-configuration features of Autosubmit. This configuration scheme allows for a distributed, hierarchical parametrization of the workflow, which makes the workflow more customizable, modular and user-friendly. The structure, domain and use of this scheme will likely evolve as it adapts to the needs of other work packages.

The user sets the parameters of the simulation in the file main.yml. Depending on the values selected, a different set of configuration files is loaded.

The following parameters will be used to load the configuration files:

  • RUN.WORKFLOW

  • MODEL.NAME

  • MODEL.SIMULATION

  • MODEL.GRID_ATM

  • CONFIGURATION.ADDITIONAL_JOBS.*

  • APP
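
The dotted names in this list address nested keys of the configuration. As a rough illustration, the sketch below resolves such names against a nested mapping that stands in for the parsed main.yml. The lookup helper and every value in the example are hypothetical placeholders, not real allowed values, and the sketch says nothing about how Autosubmit resolves keys internally.

```python
def lookup(config, dotted_key):
    """Resolve a dotted key such as 'MODEL.NAME' against nested dicts.

    A trailing '*' (as in 'CONFIGURATION.ADDITIONAL_JOBS.*') returns the
    whole sub-mapping found at that level.
    """
    node = config
    for part in dotted_key.split("."):
        if part == "*":
            break
        node = node[part]
    return node


# Hypothetical stand-in for a parsed main.yml; all values are placeholders.
main = {
    "RUN": {"WORKFLOW": "end-to-end"},
    "MODEL": {
        "NAME": "example-model",
        "SIMULATION": "example-simulation",
        "GRID_ATM": "example-grid",
    },
    "CONFIGURATION": {"ADDITIONAL_JOBS": {"TRANSFER": "True", "BACKUP": "False"}},
}

selectors = [
    "RUN.WORKFLOW",
    "MODEL.NAME",
    "MODEL.SIMULATION",
    "MODEL.GRID_ATM",
    "CONFIGURATION.ADDITIONAL_JOBS.*",
]
for key in selectors:
    print(key, "->", lookup(main, key))
```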

The user can overwrite any parameter by defining it in the main.yml file. That definition takes priority over the default configuration files loaded previously.
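
The priority of main.yml over the defaults can be pictured as a recursive merge of nested mappings in which the later source wins. The sketch below is only an illustration of that idea, not Autosubmit's actual loading code; the deep_merge helper and all keys and values shown are hypothetical.

```python
import copy


def deep_merge(base, override):
    """Return a new mapping where values from override win over base.

    Nested dicts are merged key by key; any other value is replaced whole.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults, as if read from the default configuration files.
defaults = {
    "MODEL": {"NAME": "example-model", "GRID_ATM": "grid-default"},
    "RUN": {"WORKFLOW": "end-to-end"},
}
# Hypothetical user definition, as if read from main.yml.
user_main = {"MODEL": {"GRID_ATM": "grid-user"}}

effective = deep_merge(defaults, user_main)
print(effective["MODEL"])  # {'NAME': 'example-model', 'GRID_ATM': 'grid-user'}
```

Only the key redefined by the user changes; sibling keys such as MODEL.NAME keep their default values, and the input mappings are left unmodified.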

Note

For a comprehensive list of the allowed values, see Configuration keys.

The minimal.yml file defines the basic information of the experiment. It is the last file loaded in the configuration process. For more information, see the Autosubmit documentation on minimal experiments.