Introduction to nctoolkit

nctoolkit is a multi-purpose tool for analyzing and post-processing netCDF files. It aims to handle almost all analysis and post-processing chains easily and efficiently, and it is designed explicitly with climate change and oceanographic work in mind. Under the hood it uses Climate Data Operators (CDO), but it works as a stand-alone package, and no knowledge of CDO is required to use it.

Let’s look at what it can do using a historical global dataset of sea surface temperature, which you can find here.

The preferred way to import nctoolkit is:

[1]:
import nctoolkit as nc
nctoolkit is using Climate Data Operators version 1.9.10

It lets you quickly visualize data

nctoolkit offers plotting functionality that will let you automatically plot data from almost any type of netCDF file. It’s as simple as the following, which calculates mean historical sea surface temperature and then plots it:

[2]:
ds = nc.open_data("sst.mon.mean.nc")
ds.tmean()
ds.plot()

It lets you calculate spatial averages

To calculate the spatial mean at each time step, use the spatial_mean method:

[3]:
ds = nc.open_data("sst.mon.mean.nc")
ds.spatial_mean()
ds.plot()
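
On a regular lat-lon grid, cells shrink towards the poles, so a spatial mean is usually weighted by cell area. The sketch below is plain Python, not nctoolkit's implementation. It illustrates cos-latitude weighting on a small made-up grid, and the function name and data are hypothetical.

```python
import math


def area_weighted_mean(values, lats):
    """Mean of a lat x lon grid, weighting each row by cos(latitude).

    `values` holds one row per latitude; None marks a missing cell
    (for example, land in a sea surface temperature field).
    """
    total = 0.0
    weight_sum = 0.0
    for row, lat in zip(values, lats):
        weight = math.cos(math.radians(lat))
        for value in row:
            if value is not None:
                total += weight * value
                weight_sum += weight
    return total / weight_sum


# Hypothetical 3 x 2 grid of temperatures at 60N, the equator and 60S.
lats = [60.0, 0.0, -60.0]
sst = [[2.0, 4.0], [28.0, None], [1.0, 3.0]]

weighted = area_weighted_mean(sst, lats)
valid = [v for row in sst for v in row if v is not None]
unweighted = sum(valid) / len(valid)
```

Here the equatorial cell counts twice as much as each high-latitude cell, so the weighted mean is pulled towards the warm tropics compared with a simple average.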

It lets you do mathematical operations

nctoolkit offers an ‘assign’ method for performing mathematical operations on variables. This works in a way that will be familiar to users of Pandas. The method is illustrated below in a processing chain that works out how much warmer each part of the ocean is than the global mean.

[4]:
ds = nc.open_data("sst.mon.mean.nc")
ds.tmean()
ds.assign(delta = lambda x: x.sst - spatial_mean(x.sst), drop = True)
ds.plot()

It lets you crop data

We can crop to a specific region using the crop method. To get a region covering most of Europe, we could do this:

[5]:
ds = nc.open_data("sst.mon.mean.nc")
ds.crop(lon = [-13, 38], lat = [30, 67])
ds.plot()
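
Conceptually, cropping keeps only the grid cells whose coordinates fall inside the requested longitude and latitude ranges. A minimal sketch in plain Python, assuming a regular grid stored as nested lists (crop_grid and the data are hypothetical, and this is not how nctoolkit does it internally):

```python
def crop_grid(values, lats, lons, lon_range, lat_range):
    """Keep the rows and columns whose coordinates lie inside the box."""
    rows = [i for i, lat in enumerate(lats)
            if lat_range[0] <= lat <= lat_range[1]]
    cols = [j for j, lon in enumerate(lons)
            if lon_range[0] <= lon <= lon_range[1]]
    cropped = [[values[i][j] for j in cols] for i in rows]
    return cropped, [lats[i] for i in rows], [lons[j] for j in cols]


# Hypothetical 4 x 4 grid; the box matches the European example above.
lats = [20.0, 40.0, 60.0, 80.0]
lons = [-30.0, 0.0, 30.0, 60.0]
values = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

cropped, new_lats, new_lons = crop_grid(
    values, lats, lons, lon_range=[-13, 38], lat_range=[30, 67]
)
```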

It lets you regrid data

nctoolkit has built-in methods for regridding data to user-specified grids. One of the most useful is to_latlon, which lets you regrid to a regular lat-lon grid. You just need to specify the extent of the new grid, the resolution and the regridding method.

[6]:
ds = nc.open_data("sst.mon.mean.nc")
ds.to_latlon(lon = [-13, 38], lat = [30, 67], res = 0.5, method = "nn")
ds.plot()
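
To make the three ingredients (extent, resolution, method) concrete, here is a rough plain-Python sketch of nearest-neighbour regridding, assuming "nn" stands for nearest neighbour and that both grids are regular. It picks the closest source coordinate along each axis and ignores spherical distance, so it only illustrates the idea. All function names and data are hypothetical.

```python
def nearest_index(coords, target):
    """Index of the source coordinate closest to `target`."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


def make_axis(start, stop, res):
    """Evenly spaced coordinates from `start` to `stop` in steps of `res`."""
    n = int(round((stop - start) / res)) + 1
    return [start + k * res for k in range(n)]


def regrid_nearest(values, lats, lons, lon_range, lat_range, res):
    """Fill a new regular grid with the value of the nearest source cell."""
    new_lats = make_axis(lat_range[0], lat_range[1], res)
    new_lons = make_axis(lon_range[0], lon_range[1], res)
    rows = [nearest_index(lats, lat) for lat in new_lats]
    cols = [nearest_index(lons, lon) for lon in new_lons]
    regridded = [[values[i][j] for j in cols] for i in rows]
    return regridded, new_lats, new_lons


# Hypothetical coarse 2 x 2 grid refined to a 1-degree grid.
lats = [0.0, 3.0]
lons = [0.0, 3.0]
values = [[1, 2], [3, 4]]

regridded, new_lats, new_lons = regrid_nearest(
    values, lats, lons, lon_range=[0.0, 3.0], lat_range=[0.0, 3.0], res=1.0
)
```

Nearest-neighbour regridding copies values rather than blending them, so the result keeps the blocky structure of the coarse grid.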

It lets you calculate temporal averages

nctoolkit features a suite of methods, beginning with the letter t, that let you calculate temporal statistics. For example, if we wanted to calculate how much sea surface temperature varies each year, we could do this:

[7]:
ds = nc.open_data("sst.mon.mean.nc")
ds.trange("year")
ds.tmean()
ds.plot()
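
The chain above first takes the range (maximum minus minimum) within each year and then averages those yearly ranges over time. For a single grid cell, the same logic can be sketched in plain Python as follows. The function name and the series are hypothetical, and this is not nctoolkit's code.

```python
from collections import defaultdict


def mean_annual_range(monthly):
    """Average over years of (max - min) within each year.

    `monthly` is a list of (year, value) pairs for one grid cell.
    """
    by_year = defaultdict(list)
    for year, value in monthly:
        by_year[year].append(value)
    ranges = [max(vals) - min(vals) for vals in by_year.values()]
    return sum(ranges) / len(ranges)


# Hypothetical series with three values per year, for brevity.
series = [
    (2000, 10.0), (2000, 18.0), (2000, 12.0),
    (2001, 9.0), (2001, 15.0), (2001, 11.0),
]
result = mean_annual_range(series)
```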

It lets you calculate anomalies

In an example above we calculated the global mean sea surface temperature every month since 1850. But calculating the anomaly might be more interesting. The code below calculates the change in global annual mean sea surface temperature relative to 1850-1869. The window argument lets you calculate it on a rolling basis.

[8]:
ds = nc.open_data("sst.mon.mean.nc")
ds.spatial_mean()
ds.annual_anomaly(baseline = [1850, 1869], window= 20)
ds.plot("sst")
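
As a rough illustration of what a rolling anomaly means, the sketch below subtracts the baseline-period mean from a trailing rolling mean of annual values. It assumes the window is aligned to the last year it covers and that the baseline is a simple mean of annual means. nctoolkit's exact alignment and baseline handling may differ, and rolling_anomaly is a hypothetical helper.

```python
def rolling_anomaly(annual, baseline, window):
    """Rolling mean of annual values minus the baseline-period mean.

    `annual` maps year -> annual mean; `baseline` is [first_year, last_year].
    Each result is labelled with the last year of its window (an assumption).
    """
    years = sorted(annual)
    base_values = [annual[y] for y in years if baseline[0] <= y <= baseline[1]]
    base = sum(base_values) / len(base_values)
    anomalies = {}
    for k in range(window - 1, len(years)):
        chunk = [annual[y] for y in years[k - window + 1:k + 1]]
        anomalies[years[k]] = sum(chunk) / window - base
    return anomalies


# Hypothetical annual means with a short baseline and window.
annual = {1850: 10.0, 1851: 12.0, 1852: 14.0, 1853: 16.0}
result = rolling_anomaly(annual, baseline=[1850, 1851], window=2)
```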

It lets you calculate zonal averages

It is easy to calculate zonal averages using nctoolkit. The example below calculates the change in temperature since 1850-1869 in each latitude band:

[9]:
ds = nc.open_data("sst.mon.mean.nc")
ds.annual_anomaly(baseline = [1850, 1869], window= 20)
ds.zonal_mean()
ds.plot()
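
A zonal mean collapses the longitude dimension, leaving one value per latitude band. A minimal plain-Python sketch, assuming a regular grid stored as one row per latitude with None for missing cells (the function and data here are hypothetical, not nctoolkit's implementation):

```python
def zonal_mean(values):
    """Average each latitude row across longitudes, skipping missing cells."""
    means = []
    for row in values:
        valid = [v for v in row if v is not None]
        means.append(sum(valid) / len(valid) if valid else None)
    return means


# Hypothetical grid: three latitude bands, three longitudes.
grid = [
    [1.0, 3.0, None],
    [10.0, 20.0, 30.0],
    [None, None, None],  # a band with no ocean cells
]
band_means = zonal_mean(grid)
```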

Getting started with nctoolkit

To get started with nctoolkit it is best to start here, and to consider getting the cheatsheet.