Introduction to nctoolkit
nctoolkit is a multi-purpose tool for analyzing and post-processing netCDF files. It is designed explicitly with climate change and oceanographic work in mind. Under the hood, it uses Climate Data Operators (CDO), but it operates as a stand-alone package, and no knowledge of CDO is required to use it.
Let’s look at what it can do using a historical global dataset of sea surface temperature, which you can learn about here.
Here we will use monthly average temperature for the years 1991-2020, extracted from a THREDDS server.
The preferred way to import nctoolkit is:
import nctoolkit as nc
It lets you quickly visualize data
nctoolkit offers plotting functionality that lets you automatically plot data from almost any type of netCDF file. It’s as simple as the following, which opens the monthly mean sea surface temperature dataset and then plots it:
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE2/sst.mon.ltm.1991-2020.nc")
ds.plot()
It lets you easily subset data
To look at a particular region, we can use the ‘subset’ method, which can also select particular time periods such as years or months in the same call. Here we select the first month of the year within a longitude and latitude box:
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE2/sst.mon.ltm.1991-2020.nc")
ds.subset(month = 1, lon = [-13, 38], lat = [30, 67])
ds.plot()
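To make the idea of a month and bounding-box subset concrete, here is a minimal NumPy sketch. It is not how nctoolkit implements ‘subset’; the 1-degree grid, the synthetic values and the helper name subset_box are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 1-degree global grid with synthetic values (not real SST).
lon = np.arange(-179.5, 180.0, 1.0)   # 360 cell centres
lat = np.arange(-89.5, 90.0, 1.0)     # 180 cell centres
field = np.random.default_rng(0).normal(size=(12, lat.size, lon.size))  # (month, lat, lon)

def subset_box(field, lat, lon, month, lon_range, lat_range):
    """Select one calendar month (1-12) and a lon/lat bounding box."""
    lon_keep = (lon >= lon_range[0]) & (lon <= lon_range[1])
    lat_keep = (lat >= lat_range[0]) & (lat <= lat_range[1])
    # np.ix_ builds an open mesh so rows and columns are selected together.
    boxed = field[month - 1][np.ix_(lat_keep, lon_keep)]
    return boxed, lat[lat_keep], lon[lon_keep]

box, box_lat, box_lon = subset_box(field, lat, lon, 1, [-13, 38], [30, 67])
print(box.shape)
```

The same lon and lat ranges as in the nctoolkit call are used here, so the result only contains cells whose centres fall inside that box.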
It lets you calculate temporal averages
nctoolkit features a suite of methods, beginning with the letter t, that let you calculate temporal statistics. For example, if we wanted to calculate a seasonal average, we could do this:
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE2/sst.mon.ltm.1991-2020.nc")
ds.tmean("season")
ds.plot()
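As a rough illustration of what a seasonal average of monthly data involves, the sketch below groups twelve monthly values into four seasons with NumPy. The DJF/MAM/JJA/SON grouping and the helper name seasonal_mean are assumptions, the input values are synthetic, and months are given equal weight regardless of their length, which may differ from what nctoolkit does.

```python
import numpy as np

# Season label for each calendar month, January first
# (assumed DJF/MAM/JJA/SON convention).
SEASON_OF_MONTH = ["DJF", "DJF", "MAM", "MAM", "MAM", "JJA",
                   "JJA", "JJA", "SON", "SON", "SON", "DJF"]

def seasonal_mean(monthly):
    """Average a (12, ...) array of monthly values into one value per season."""
    monthly = np.asarray(monthly, dtype=float)
    labels = np.array(SEASON_OF_MONTH)
    return {s: monthly[labels == s].mean(axis=0)
            for s in ("DJF", "MAM", "JJA", "SON")}

# Synthetic input: the month number is used as the value.
result = seasonal_mean(np.arange(1, 13))
print(result)
```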
It lets you calculate spatial averages
Calculating the spatial average of a variable is as simple as:
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE2/sst.mon.ltm.1991-2020.nc")
ds.subset(variables = "sst")
ds.subset(lon = [-13, 38], lat = [30, 67])
ds.spatial_mean()
ds.plot()
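On a regular latlon grid, cells shrink towards the poles, so a spatial average is normally weighted by cell area. The sketch below shows that idea in NumPy, assuming cell area can be approximated by the cosine of latitude. The helper name area_weighted_mean and the values are hypothetical, and this is not a description of nctoolkit's internals.

```python
import numpy as np

def area_weighted_mean(field, lat):
    """Mean of a (lat, lon) field, weighting rows by cos(latitude) and skipping NaNs."""
    field = np.asarray(field, dtype=float)
    weights = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(field)
    valid = ~np.isnan(field)
    # Missing cells contribute neither to the weighted sum nor to the total weight.
    return np.sum(np.where(valid, field, 0.0) * weights) / np.sum(weights * valid)

# Synthetic field: warm at the equator, cold at 60 degrees north and south.
lat = np.array([-60.0, 0.0, 60.0])
field = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 0.0]])
mean = area_weighted_mean(field, lat)
print(mean)
```

An unweighted mean of the same field would be lower, because it would give the two high-latitude rows as much influence as the equatorial row.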
It lets you do mathematical operations
nctoolkit offers an ‘assign’ method for performing mathematical operations on variables. This works in a way that will be familiar to users of Pandas. The method is illustrated below in a processing chain that works out how much warmer each part of the ocean is than the global mean.
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE2/sst.mon.ltm.1991-2020.nc")
ds.tmean()
ds.assign(delta = lambda x: x.sst - spatial_mean(x.sst), drop = True)
ds.plot()
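For readers less familiar with the Pandas pattern that ‘assign’ is being compared to, here is the equivalent idiom in Pandas itself. The values are synthetic, and the plain column mean stands in for the spatial mean, so this only illustrates the lambda-based syntax.

```python
import pandas as pd

# Synthetic values for illustration only (not real sea surface temperatures).
df = pd.DataFrame({"sst": [10.0, 20.0, 30.0]})

# assign evaluates the lambda against the DataFrame and stores the result
# as a new column called delta.
df = df.assign(delta=lambda x: x.sst - x.sst.mean())
print(df)
```

In both cases the keyword name becomes the name of the new variable, and the lambda refers to existing variables as attributes of x.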
It lets you regrid data
nctoolkit has built-in methods for regridding data to user-specified grids. One of the most useful is to_latlon, which lets you regrid to a regular latlon grid. You just need to specify the extent of the new grid, the resolution and the regridding method.
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE2/sst.mon.ltm.1991-2020.nc")
ds.to_latlon(lon = [-13, 38], lat = [30, 67], res = 1, method = "bil")
ds.plot()
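Assuming that method = "bil" refers to bilinear interpolation, the sketch below shows what bilinear regridding between two regular grids looks like in plain NumPy. The helper name bilinear_regrid is hypothetical. The sketch treats the grid as flat, ignores missing values, and holds edge values constant outside the source grid, so it should be read as an illustration of the technique, not of nctoolkit's behaviour.

```python
import numpy as np

def bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a (lat, lon) field between ascending regular grids."""
    field = np.asarray(field, dtype=float)
    # Interpolate along longitude for every source latitude row...
    along_lon = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # ...then along latitude for every target longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in along_lon.T]).T

# Synthetic 2-degree field where each value equals lat + lon.
src_lat = np.array([0.0, 2.0])
src_lon = np.array([0.0, 2.0])
field = np.array([[0.0, 2.0], [2.0, 4.0]])

# Regrid to 1-degree resolution over the same extent.
dst_lat = np.array([0.0, 1.0, 2.0])
dst_lon = np.array([0.0, 1.0, 2.0])
regridded = bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon)
print(regridded)
```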
It lets you calculate zonal averages
Zonal averages are easy to calculate with the zonal_mean method.
ds = nc.open_thredds("https://psl.noaa.gov/thredds/dodsC/Datasets/COBE2/sst.mon.ltm.1991-2020.nc")
ds.zonal_mean()
ds.plot()
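A zonal average collapses the longitude dimension, leaving one value per latitude band. A minimal NumPy sketch of that operation follows, with synthetic values and NaN standing in for missing cells such as land. The function here is a hypothetical stand-in and says nothing about how nctoolkit handles missing data.

```python
import numpy as np

def zonal_mean(field):
    """Average a (lat, lon) field over longitude, ignoring NaNs."""
    return np.nanmean(np.asarray(field, dtype=float), axis=1)

# Two latitude bands, three longitudes; one cell is missing.
field = np.array([[1.0, 3.0, np.nan],
                  [4.0, 6.0, 8.0]])
profile = zonal_mean(field)
print(profile)
```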