Data Processing Tools

We need you to contribute!

The ACCESS-Hive is a community resource that is a work in progress. Can you help to add content to this page? We’d love to receive your contribution. See our contributing guidelines for details of how to provide content. You can also open an issue highlighting any content you’d like us to provide but aren’t able to contribute yourself.

Tools

Kerchunk

Documentation | Sources

Kerchunk is a library that provides a unified way to represent a variety of chunked, compressed data formats (e.g. NetCDF/HDF5, GRIB2, TIFF, …), allowing efficient access to the data from traditional file systems or cloud object storage. It also provides a flexible way to create virtual datasets from multiple files.
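To make the idea of a reference set concrete, here is a minimal standard-library sketch. It does not use Kerchunk itself. It assumes the general shape of a Kerchunk-style reference mapping, in which each key points either to inline content or to a `[path, offset, length]` byte range inside an existing file. The helper `read_reference`, the temporary file, and the keys `var/0` and `var/1` are hypothetical and exist only for illustration.

```python
import json
import os
import tempfile


def read_reference(refs, key):
    """Return the bytes that one reference entry points to (hypothetical helper)."""
    entry = refs["refs"][key]
    if isinstance(entry, str):
        # Inline entry: small content stored directly in the reference set.
        return entry.encode("utf-8")
    # Byte-range entry: [path, offset, length] into an existing file.
    path, offset, length = entry
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Stand-in for an archival file: a 4-byte header followed by two 8-byte chunks.
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".bin")
tmp.write(b"HEAD" + b"chunk-00" + b"chunk-01")
tmp.close()

# Assumed layout of a reference set: metadata inline, chunks as byte ranges.
refs = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "var/0": [tmp.name, 4, 8],
        "var/1": [tmp.name, 12, 8],
    },
}

chunk0 = read_reference(refs, "var/0")
chunk1 = read_reference(refs, "var/1")
zgroup = json.loads(read_reference(refs, ".zgroup"))

os.remove(tmp.name)  # clean up the stand-in file
```

The original file is never rewritten or copied: only the small mapping is new, which is why one such mapping can describe chunks spread over many files.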

CMOR3

Climate Model Output Rewriter Version 3

Documentation | Sources

CMOR is used to produce CF-compliant netCDF files. The structure of the files created by CMOR and the metadata they contain fulfill the requirements of many of the climate community’s standard model experiments (which are referred to here as “MIPs” and include, for example, AMIP, PMIP, APE, and IPCC scenario runs).

xMIP

Documentation | Tutorial on NCI | Sources

This package facilitates the cleaning, organization and interactive analysis of Model Intercomparison Projects (MIPs) within the Pangeo software stack.

APP4 (The ACCESS Post Processor)

Documentation | Sources

APP4 is a CMORisation tool designed to convert ACCESS model output to ESGF-compliant formats, primarily for publication to CMIP6. The code was originally built for CMIP5 and was further developed for CMIP6-era activities. It uses CMOR3 and files created with the CMIP6 data request to generate CF-compliant files according to the CMIP6 data standards.

ACCESS-Archiver

Documentation | Sources

The ACCESS Archiver is designed to archive model output from ACCESS simulations. Its focus is to copy ACCESS model output from its initial location to a secondary location (typically from /scratch to /g/data), while converting UM files to netCDF, compressing MOM/CICE files, and culling restart files to 10-yearly intervals. Conversion and compression together save 50-80% of storage space.

Synda

Synda is a command-line tool for searching and downloading files from the Earth System Grid Federation (ESGF) archive.

FluxnetLSM

Citation 1 | Sources

FluxnetLSM is an R package for post-processing FLUXNET datasets for use in land surface modelling. It performs quality control and data conversion of FLUXNET data and collated site metadata, and supports the FLUXNET2015, La Thuile, OzFlux and ICOS data releases.

xskillscore

Documentation | Sources

xskillscore is a Python library for computing a wide variety of skill metrics. Its typical application is to verify deterministic and probabilistic forecasts relative to observations.
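As an illustration of what such skill metrics measure, the following standard-library sketch computes two deterministic scores (root-mean-square error and Pearson correlation) and one probabilistic score (the Brier score) on toy data. It does not use xskillscore, which works on xarray objects. The function names and sample values here are assumptions made only for this example.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between paired forecast and observed values."""
    n = len(forecast)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def pearson_r(forecast, observed):
    """Pearson correlation coefficient between forecast and observed values."""
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_f * var_o)


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    n = len(probabilities)
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / n


# Toy deterministic forecast verified against observations.
forecast = [1.0, 2.0, 3.0, 4.0]
observed = [1.0, 2.0, 3.0, 6.0]
det_rmse = rmse(forecast, observed)
det_corr = pearson_r(forecast, observed)

# Toy probabilistic forecast of a binary event (1 = event occurred).
probabilities = [0.9, 0.1, 0.8, 0.3]
outcomes = [1, 0, 1, 0]
prob_brier = brier_score(probabilities, outcomes)
```

Lower RMSE and Brier score indicate a better forecast, while a correlation closer to 1 indicates that the forecast tracks the observed variability more closely.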

Analysis blogposts and tutorials

Accessing NetCDF and GRIB file collections as cloud-native virtual datasets using Kerchunk, Peter March, Sep 2022


  1. A. M. Ukkola, N. Haughton, M. G. De Kauwe, G. Abramowitz, and A. J. Pitman. FluxnetLSM R package (v1.0): a community tool for processing FLUXNET data for use in land surface modelling. Geoscientific Model Development, 10(9):3379–3390, 2017. URL: https://gmd.copernicus.org/articles/10/3379/2017/, doi:10.5194/gmd-10-3379-2017


Last update: May 31, 2023
Created: May 31, 2023