Run ACCESS-ESM
Prerequisites
General prerequisites
Before running ACCESS-ESM, you need to fulfil general prerequisites outlined in the First Steps section.
If you are unsure whether ACCESS-ESM is the right choice for your experiment, take a look at the overview of ACCESS Models.
Model-specific prerequisites
ACCESS-ESM is installed on NCI's supercomputer Gadi and uses payu, a tool for running and managing model experiments. Following these prerequisites ensures you have access to this infrastructure.
- Join the access and hh5 projects at NCI
  To join these projects, request membership on the respective access and hh5 NCI project pages.
  For more information on joining specific NCI projects, refer to How to connect to a project.
- Payu
  Payu on Gadi is available through the conda/analysis3 environment in the hh5 project.
  After obtaining hh5 project membership, load the conda/analysis3 environment to automatically retrieve payu as follows:
      module use /g/data/hh5/public/modules
      module load conda/analysis3
  To check that payu is available, run:
      payu --version
  This prints the installed payu version (for example, 1.0.19).
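The availability check above can also be scripted. The following is a minimal sketch, not part of payu itself: `have_command` is a hypothetical helper name, and the reminder message is only a suggestion based on the prerequisites listed here.

```shell
# Hypothetical helper: succeeds only if the named command is on the PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command payu; then
    # payu is available: print its version
    payu --version
else
    # payu is not available: remind the user to load the module first
    echo "payu not found: load the conda/analysis3 module first" >&2
fi
```

Running a check like this at the top of a wrapper script avoids a less obvious failure later, when the first payu command is reached.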
Get ACCESS-ESM configuration
ACCESS-ESM configurations are available on the coecms GitHub, collated in a single repository.
To get it on Gadi, clone the ACCESS-ESM GitHub repo by running:
git clone https://github.com/coecms/access-esm.git
This will create the access-esm folder.
Some loaded modules (e.g., matlab/R2018a) may interfere with git commands. If you have trouble cloning the repository, run the following command before trying again:
module purge
After this step, don't forget to reload the conda/analysis3 module to retrieve payu, as specified in the Model-specific prerequisites section.
Different ACCESS-ESM configurations are stored in different branches of the ACCESS-ESM GitHub repo.
To check all the available branches on the repo, run the following command inside the newly-created access-esm folder:
git branch -a
In the output, the asterisk (*) indicates the local branch you are currently in.
The red-coloured branches are the available remote branches, formatted as remotes/origin/<branch-name>.
To switch to a specific branch you can run the following command:
git checkout <branch-name>
For example, the pre-industrial configuration of ACCESS-ESM is available in the pre-industrial branch. To use the pre-industrial configuration, run:
git checkout pre-industrial
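To get only the names that can be passed to git checkout, the remotes/origin/ prefix can be stripped from the branch listing. This is a small sketch under stated assumptions: `remote_branch_names` is a hypothetical helper, and the sample listing (including the master branch) is made up for illustration; only pre-industrial is taken from this guide.

```shell
# Hypothetical helper: read `git branch -a` output on stdin and print
# only the remote branch names, without the remotes/origin/ prefix.
remote_branch_names() {
    sed -n 's|^ *remotes/origin/\([^ ]*\).*$|\1|p' | grep -v '^HEAD$'
}

# Demonstration on a made-up listing; in a real clone you would run:
#   git branch -a | remote_branch_names
remote_branch_names <<'EOF'
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/pre-industrial
EOF
```

The HEAD entry is filtered out because it is a pointer to another branch rather than a configuration of its own.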
Edit ACCESS-ESM configuration
It is good practice to create a new git branch to store all your modifications for a particular run, so as not to modify the reference configuration.
To create a new branch called "example_run", as a copy of the pre-industrial branch, execute the following from within the access-esm directory:
git checkout -b example_run --no-track origin/pre-industrial
This command will also switch to the new example_run branch.
Payu
Payu is a workflow management tool for running numerical models in supercomputing environments.
The general layout of a payu-supported model run consists of two main directories:
- The laboratory directory, where all the model components reside. For ACCESS-ESM, it is typically /scratch/$PROJECT/$USER/access-esm.
- The control directory, where the model configuration resides and from where the model is run (in this example, the cloned directory ~/access-esm).
Keeping these directories distinct separates the small configuration files from the larger binary outputs and inputs. In this way, the configuration files can be placed in the $HOME directory (as it is the only filesystem actively backed up on Gadi), without overloading it with too much data.
Furthermore, this separation allows multiple self-resubmitting experiments that share common executables and input data to be run simultaneously.
To set up the laboratory directory, run the following command from the control directory:
payu init
This creates the laboratory directory, together with relevant subdirectories, depending on the configuration. The main subdirectories of interest are:
- work → a temporary directory where the model is run. It gets cleaned after each run.
- archive → the directory where output is stored after each run.
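As a quick sanity check after payu init, the laboratory layout described above can be inspected from the shell. This is a minimal sketch: `lab_path` is a hypothetical helper, the fallback values myproject and myuser are placeholders, and the path pattern is the typical one quoted in this section (a different laboratory setting in config.yaml would change it).

```shell
# Hypothetical helper: build the typical laboratory path.
# $1 = NCI project, $2 = username, $3 = laboratory name (default: access-esm)
lab_path() {
    printf '/scratch/%s/%s/%s\n' "$1" "$2" "${3:-access-esm}"
}

# Placeholders are used if PROJECT or USER are not set in the environment.
lab=$(lab_path "${PROJECT:-myproject}" "${USER:-myuser}")

# Report whether the main subdirectories of interest exist.
for sub in work archive; do
    if [ -d "$lab/$sub" ]; then
        echo "found:   $lab/$sub"
    else
        echo "missing: $lab/$sub (run 'payu init' from the control directory)"
    fi
done
```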
Edit the Master Configuration file
The config.yaml file, located in the control directory, is the Master Configuration file.
This file, which controls the general model configuration, contains several parts:
- PBS resources
      jobname: pre-industrial
      queue: normal
      walltime: 3:10:00
  These lines can be edited to change the PBS directives for the PBS job.
  For example, to run ACCESS-ESM under the tm70 project (ACCESS-NRI), add the following line:
      project: tm70
  The project entry should always refer to a project with allocated Service Units (SU) that you are a member of. If not set explicitly, ACCESS-ESM will run using your default project (this default project still needs to have allocated SU). For more information, check how to join relevant NCI projects.
- Link to the laboratory directory
      # note: if laboratory is relative path, it is relative to /scratch/$PROJECT/$USER
      laboratory: access-esm
  These lines set the laboratory directory path, which is relative to /scratch/$PROJECT/$USER. Absolute paths can also be specified.
- Model
      model: access
  This line tells payu which driver to use for the main model (access refers to ACCESS-ESM).
- Submodels
      submodels:
          - name: atmosphere
            model: um
            ncpus: 192
            exe: /g/data/access/payu/access-esm/bin/coe/um7.3x
            input:
              - /g/data/access/payu/access-esm/input/pre-industrial/atmosphere
              - /g/data/access/payu/access-esm/input/pre-industrial/start_dump
          - name: ocean
            model: mom
            ncpus: 180
            exe: /g/data/access/payu/access-esm/bin/coe/mom5xx
            input:
              - /g/data/access/payu/access-esm/input/pre-industrial/ocean/common
              - /g/data/access/payu/access-esm/input/pre-industrial/ocean/pre-industrial
          - name: ice
            model: cice
            ncpus: 12
            exe: /g/data/access/payu/access-esm/bin/coe/cicexx
            input:
              - /g/data/access/payu/access-esm/input/pre-industrial/ice
          - name: coupler
            model: oasis
            ncpus: 0
            input:
              - /g/data/access/payu/access-esm/input/pre-industrial/coupler
  ACCESS-ESM is a coupled model deploying multiple submodels (i.e. model components). This section specifies the submodels and configuration options required to execute the model correctly.
  Each submodel contains additional configuration options that are read in when the submodel is running. These options are specified in the subfolder of the control directory whose name matches the submodel's name (e.g., configuration options for the atmosphere submodel are in the ~/access-esm/esm-pre-industrial/atmosphere directory).
- Collate
      collate:
          exe: /g/data/access/payu/access-esm/bin/mppnccombine
          restart: true
          mem: 4GB
  The collate process combines a number of smaller files, which contain different parts of the model grid, into target output files. Restart files are typically tiled in the same way and will also be combined together if the restart option is set to true.
- Restart
      restart: /g/data/access/payu/access-esm/restart/pre-industrial
  This is the location of the files used for a warm restart.
- Start date and internal run length
      calendar:
          start:
              year: 101
              month: 1
              days: 1
          runtime:
              years: 1
              months: 0
              days: 0
  This section specifies the start date and internal run length.
  The internal run length (controlled by runtime) can be different from the total run length. Also, while runtime can be reduced, it should not be increased to more than 1 year to avoid errors. For more information about the difference between internal run and total run lengths, or how to run the model for more than 1 year, refer to the section Run configuration for multiple years.
- Number of runs per PBS submission
      runspersub: 1
  ACCESS-ESM configurations are often run in multiple steps (or cycles), with payu running a maximum of runspersub internal runs for every PBS job submission.
  If you increase runspersub, you may need to increase the walltime in the PBS resources.
To find out more about other configuration settings for the config.yaml file, check out how to configure your experiment with payu.
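Before submitting, it can help to review the top-level settings discussed above in one place. The following is a minimal sketch using grep: `show_settings` is a hypothetical helper, and it only matches keys that start at the beginning of a line, so nested entries such as runtime are not shown.

```shell
# Hypothetical helper: print selected top-level keys from a payu
# configuration file ($1). Nested (indented) keys are not matched.
show_settings() {
    grep -E '^(jobname|queue|walltime|project|laboratory|runspersub):' "$1"
}

# Run from the control directory, where config.yaml lives.
if [ -f config.yaml ]; then
    show_settings config.yaml
fi
```

If project does not appear in the output, the run will use your default project, as described in the PBS resources item.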
Edit a single ACCESS-ESM component configuration
Each ACCESS-ESM component contains additional configuration options that are read in when the model component is running. These options are typically useful to modify the physics used in the model or the input data.
They are specified in the subfolder of the control directory whose name matches the submodel's name as specified in the config.yaml submodels section (e.g., configuration options for the atmosphere submodel are in the ~/access-esm/esm-pre-industrial/atmosphere directory).
To modify these options, please refer to the User Guide of each individual model component.
Run ACCESS-ESM configuration
After editing the configuration, you are ready to run ACCESS-ESM.
ACCESS-ESM suites run on Gadi through a PBS job submission managed by payu.
Payu setup (optional)
As a first step, from within the control directory, it is good practice to run:
payu setup
This will prepare the model run, based on the experiment configuration.
Run configuration
To run the ACCESS-ESM configuration for one internal run length (controlled by runtime in the config.yaml file), execute:
payu run -f
This will submit a single job to the queue with a total run length of runtime. If there is no previous run, it will start from the start date indicated in the config.yaml file. Otherwise, it will perform a warm restart from a previously saved restart file.
The -f option ensures that payu will run even if there is an existing non-empty work directory created from a previous failed run.
Run configuration for multiple years
If you want to run the ACCESS-ESM configuration for multiple internal run lengths (controlled by runtime in the config.yaml file), use the -n option:
payu run -f -n <number-of-runs>
This will run the configuration number-of-runs times with a total run length of runtime * number-of-runs. The number of consecutive PBS jobs submitted to the queue depends on the runspersub value specified in the config.yaml file.
Understand runtime, runspersub, and -n parameters
With the correct use of the runtime, runspersub and -n parameters, you can have full control of your run.
- runtime defines the internal run length.
- runspersub defines the maximum number of internal runs for every PBS job submission.
- -n sets the number of internal runs to be performed.
Now some practical examples:
- Run 20 years of simulation with resubmission every 5 years
  To have a total run length of 20 years with a 5-year resubmission cycle, leave runtime as the default value of 1 year and set runspersub to 5. Then, run the configuration with -n set to 20:
      payu run -f -n 20
  This will submit subsequent jobs for the following years: 1 to 5, 6 to 10, 11 to 15, and 16 to 20, which is a total of 4 PBS jobs.
- Run 7 years of simulation with resubmission every 3 years
  To have a total run length of 7 years with a 3-year resubmission cycle, leave runtime as the default value of 1 year and set runspersub to 3. Then, run the configuration with -n set to 7:
      payu run -f -n 7
  This will submit subsequent jobs for the following years: 1 to 3, 4 to 6, and 7, which is a total of 3 PBS jobs.
- Run 3 months and 10 days of simulation in a single submission
  To have a total run length of 3 months and 10 days in a single submission, set the runtime as follows:
      years: 0
      months: 3
      days: 10
  Set runspersub to 1 (or any value > 1) and run the configuration without the -n option (or with -n set to 1):
      payu run -f
- Run 1 year and 4 months of simulation with resubmission every 4 months
  To have a total run length of 1 year and 4 months (16 months), you need to split it into multiple internal runs, for example 4 internal runs of 4 months each. In this case, set the runtime as follows:
      years: 0
      months: 4
      days: 0
  Since the internal run length is set to 4 months, set runspersub to 1 to resubmit your jobs every 4 months (i.e. every internal run). Then, run the configuration with -n set to 4:
      payu run -f -n 4
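The job counts in the examples above follow a simple rule: the number of PBS jobs is the number of internal runs divided by runspersub, rounded up. The sketch below reproduces that arithmetic; `pbs_jobs` is a hypothetical helper written for illustration, and it assumes each job is filled with runspersub internal runs before the next job is submitted, as the examples describe.

```shell
# Hypothetical helper: number of PBS jobs needed for a run.
# $1 = number of internal runs (the -n value), $2 = runspersub
pbs_jobs() {
    # Integer ceiling division: (n + runspersub - 1) / runspersub
    echo $(( ($1 + $2 - 1) / $2 ))
}

pbs_jobs 20 5   # 20 one-year runs, 5 per job  -> prints 4
pbs_jobs 7 3    # 7 one-year runs, 3 per job   -> prints 3
pbs_jobs 4 1    # 4 four-month runs, 1 per job -> prints 4
```

Knowing the job count in advance is useful when deciding the walltime, since each job must fit up to runspersub internal runs.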
Monitor ACCESS-ESM runs
Currently, there is no specific tool to monitor ACCESS-ESM runs.
You can execute the following command to show the status of all your submitted PBS jobs:
qstat -u $USER
If you changed the jobname in the PBS resources of the Master Configuration file, that name will appear as your job's Name instead of pre-industrial.
In the output, the S column indicates the status of your run, where:
- Q → Job waiting in the queue to start
- R → Job running
- E → Job ending
If there are no jobs listed with your jobname (or if no job is listed), your run either successfully completed or was terminated due to an error.
Stop a run
If you want to manually terminate a run, you can do so by executing:
qdel <job-ID>
Error and output log files
While the model is running, payu saves the standard output and standard error in the respective access.out and access.err files in the control directory. You can examine the contents of these files to check on the status of a run as it progresses.
When the model completes its run, or if it crashes, the output and error log files are by default renamed as jobname.o<job-ID> and jobname.e<job-ID>, respectively.
Model Live Diagnostics
ACCESS-NRI developed the Model Live Diagnostics framework to check, monitor, visualise, and evaluate model behaviour and progress of ACCESS models currently running on Gadi.
For complete documentation on how to use this framework, check the Model Diagnostics documentation.
ACCESS-ESM outputs
At the end of the model run, output files (and restart files) are moved from the work directory into the archive directory under /scratch/$PROJECT/$USER/access-esm/archive/access-esm, where they are further subdivided for each internal run. They are also symlinked in the control directory at ~/access-esm/archive.
The naming format for a typical output folder is outputXXX and for a restart folder restartXXX, where XXX is the internal run number starting from 000.
Thus, if output folders already exist, the internal number of the new output folder will be set to the first available XXX number.
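The numbering rule just described can be mirrored in a few lines of shell, for example to predict which folder the next internal run will write to. This is only an illustration of the rule as stated here, not payu's own implementation: `next_run_number` is a hypothetical helper, and the demonstration uses a temporary directory rather than a real archive.

```shell
# Hypothetical helper: print the first XXX (zero-padded to three digits)
# for which <archive_dir>/outputXXX does not exist yet.
next_run_number() {
    archive_dir=$1
    n=0
    while [ -d "$(printf '%s/output%03d' "$archive_dir" "$n")" ]; do
        n=$((n + 1))
    done
    printf '%03d\n' "$n"
}

# Demonstration on a throwaway directory with two existing outputs.
demo=$(mktemp -d)
mkdir "$demo/output000" "$demo/output001"
next_run_number "$demo"   # prints 002
rm -r "$demo"
```

In a real experiment the argument would be the archive path given above, or the archive symlink in the control directory.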
Outputs and restarts are separated in the respective folders for each model component.
For the atmospheric output data, the files are usually UM fieldsfiles, named as <UM-suite-identifier>a.p<output-stream-identifier><time-identifier>.