
Model Output: Formats and Variables

For climate modelling, we need to store multidimensional scientific data (variables) such as temperature, humidity, pressure, wind speed and direction.

Example of a three-dimensional data array

Because comparable model outputs simplify model evaluation, ACCESS-NRI supports the Coupled Model Intercomparison Project (CMIP) and the use of common data formats and variables.

Network Common Data Format (NetCDF)

Numerous organisations and scientific groups worldwide have adopted a file format called NetCDF as a standard way to store multidimensional scientific data.

NetCDF, which has the file extension *.nc, is a self-describing, machine-independent data format for array-oriented scientific data.

  • Self-describing
    *.nc files include not only the data, but also a header with metadata that describes the data layout.
  • Machine-independent
    *.nc files can be accessed by computers with different ways of storing integers, characters and floating-point numbers.
  • Array-oriented
    *.nc data typically spans multiple dimensions (e.g., latitude, longitude and time), and each variable (e.g., temperature or humidity) is stored as an array defined over those dimensions.

    Schematic of a NetCDF file with data (temperature and pressure as variables stored over the dimensions latitude, longitude, and time) and metadata

NetCDF metadata

Metadata, which is typically described as information about the data, enables users of data from different sources to decide which quantities are comparable. This facilitates building applications with powerful extraction, regridding and display capabilities.

To facilitate this process, the Climate and Forecast (CF) metadata conventions are designed to promote the processing and sharing of NetCDF files. The conventions define metadata that provide a definitive description of what the data in each variable represents, and of the spatial and temporal properties of the data.
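
As a minimal sketch of what such metadata can look like in practice, the snippet below builds a small in-memory dataset with xarray and attaches CF-style attributes to it. The variable name `tas`, the coordinate values, the constant data and the `CF-1.8` version string are illustrative assumptions; `standard_name`, `long_name`, `units` and the global `Conventions` attribute are names used by the CF conventions.

```python
import numpy as np
import xarray as xr

# Illustrative coordinates (assumed values, not taken from a real model run).
time = np.array(["2000-01-01", "2000-01-02"], dtype="datetime64[ns]")
lat = np.array([-30.0, -20.0, -10.0])
lon = np.array([120.0, 130.0, 140.0, 150.0])

# Placeholder temperature data with shape (time, lat, lon).
values = np.full((time.size, lat.size, lon.size), 280.0)

dataset = xr.Dataset(
    data_vars={
        # Each entry is (dimension names, data, attributes).
        "tas": (
            ("time", "lat", "lon"),
            values,
            {
                "standard_name": "air_temperature",
                "long_name": "Near-Surface Air Temperature",
                "units": "K",
            },
        ),
    },
    coords={
        "time": time,
        "lat": ("lat", lat, {"standard_name": "latitude", "units": "degrees_north"}),
        "lon": ("lon", lon, {"standard_name": "longitude", "units": "degrees_east"}),
    },
    # Global attribute declaring which convention the metadata follows.
    attrs={"Conventions": "CF-1.8"},
)

# Printing the dataset lists dimensions, coordinates, variables and attributes.
print(dataset)
print(dataset["tas"].attrs)
```

Because the units and standard name travel with the variable, a tool reading this dataset can tell what quantity `tas` holds without relying on the variable name alone.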

NetCDF data and variables

Data in a NetCDF file is stored in the form of arrays, where each NetCDF dimension has a name and a length.
For example, temperature variation over time at a fixed location is stored as a one-dimensional array, whereas temperature over a region (i.e. varying location) at a fixed time is stored as a two-dimensional array. Thus, three-dimensional (3D) data would be temperature varying with time over a region, and four-dimensional (4D) data would be temperature varying with time over a region with varying altitude.
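
The dimensionality described above can be sketched with plain NumPy arrays. The sizes (12 time steps, 3 latitudes, 4 longitudes, 5 altitude levels), the dimension ordering and the zero-filled values are assumptions chosen only for illustration.

```python
import numpy as np

n_time, n_lat, n_lon, n_level = 12, 3, 4, 5  # assumed sizes

# 1D: temperature over time at one fixed location.
temp_1d = np.zeros(n_time)
# 2D: temperature over a region (lat, lon) at one fixed time.
temp_2d = np.zeros((n_lat, n_lon))
# 3D: temperature over a region, varying with time.
temp_3d = np.zeros((n_time, n_lat, n_lon))
# 4D: as above, with altitude as an additional dimension.
temp_4d = np.zeros((n_time, n_level, n_lat, n_lon))

for label, array in [("1D", temp_1d), ("2D", temp_2d), ("3D", temp_3d), ("4D", temp_4d)]:
    print(label, array.ndim, array.shape)

# Fixing the time index of the 3D array yields a 2D (lat, lon) slice.
first_step = temp_3d[0]
print(first_step.shape)
```

Unlike these bare arrays, a NetCDF file also records the name of each dimension, so the meaning of each axis does not depend on remembering the ordering.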

Common variables

Variables used in climate modelling can differ in naming conventions, units and so on, often for historical reasons. Using common variables is key not only for ease and compatibility when working with the data, but also for uniting the climate modelling community. Hence, there are collated lists of widely used variables, such as:

CMIP6 variables

You can search the extensive list of Coupled Model Intercomparison Project phase 6 (CMIP6) variables by either the MIP variable name or associated CF standard name.
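
As a rough illustration of searching by either name, the sketch below uses a small hand-written table of four CMIP6 variables. The table `CMIP6_SUBSET` and the helper functions `find_by_mip_name` and `find_by_standard_name` are hypothetical and cover only a tiny subset; the full CMIP6 variable list remains the authoritative source.

```python
# A tiny, hand-written subset of CMIP6 variables (illustrative only).
# Each entry maps a MIP variable name to (CF standard name, units).
CMIP6_SUBSET = {
    "tas": ("air_temperature", "K"),
    "pr": ("precipitation_flux", "kg m-2 s-1"),
    "psl": ("air_pressure_at_mean_sea_level", "Pa"),
    "huss": ("specific_humidity", "1"),
}


def find_by_mip_name(name):
    """Return (standard_name, units) for a MIP variable name, or None."""
    return CMIP6_SUBSET.get(name)


def find_by_standard_name(standard_name):
    """Return all MIP variable names in the subset with this CF standard name."""
    return [
        mip_name
        for mip_name, (std_name, _units) in CMIP6_SUBSET.items()
        if std_name == standard_name
    ]


print(find_by_mip_name("tas"))
print(find_by_standard_name("precipitation_flux"))
```

The second helper returns a list because, in the full CMIP6 tables, several MIP variables can share one CF standard name.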

ERA5 variables

ERA5 is the fifth generation European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalysis of the global climate, which spans a period from January 1940 to present. ERA5 provides hourly estimates of a significant number of atmospheric, land and oceanic climate variables.

A full list of ERA5 parameters is available on the ECMWF database. It covers both the ERA5 parameter listings and the ERA5-Land parameter listings.

Loading NetCDF files

Our Model Evaluation and Diagnostics tools are based on the reading and storing of files via the Python package xarray.
For more information, refer to a quick overview of xarray and xarray tutorials.

xarray is a Python package available through the conda environment on NCI.
Hence, you can either use it directly (as shown below) or through the dataset capabilities of the ACCESS-NRI Model Intake Catalog Tool.

import xarray as xr
# Replace the empty string with the path to a NetCDF (*.nc) file.
dataset = xr.open_dataset("")
Example of an actual NetCDF file with data (precipitation/rainfall over the dimensions latitude, longitude, and time) and metadata.
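
Once a dataset is open, its variables and dimensions can be inspected and reduced in the same way regardless of where the file came from. The sketch below avoids depending on a file on disk by building a small in-memory dataset instead; the variable name `pr`, the coordinates and the values are assumptions made for illustration. With a real file, `dataset` would come from `xr.open_dataset`.

```python
import numpy as np
import xarray as xr

# Stand-in for an opened file: precipitation over (time, lat, lon).
values = np.arange(24, dtype="float64").reshape(2, 3, 4)
dataset = xr.Dataset(
    data_vars={"pr": (("time", "lat", "lon"), values, {"units": "kg m-2 s-1"})},
    coords={
        "time": np.array(["2000-01-01", "2000-01-02"], dtype="datetime64[ns]"),
        "lat": [-30.0, -20.0, -10.0],
        "lon": [120.0, 130.0, 140.0, 150.0],
    },
)

# List the variables in the dataset and the dimensions of one of them.
print(list(dataset.data_vars))
print(dataset["pr"].dims)

pr = dataset["pr"]

# Average over the time dimension, leaving a (lat, lon) field.
time_mean = pr.mean(dim="time")

# Select the time series at one grid point by coordinate value.
point = pr.sel(lat=-20.0, lon=130.0)
print(time_mean.dims, point.dims)
```

Selecting by coordinate value with `sel`, rather than by integer position, keeps the code readable and independent of how the grid happens to be ordered in the file.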

Last update: July 12, 2024