Speeding up code

Lazy evaluation

Under the hood, nctoolkit relies mostly on CDO to carry out the specified manipulations of netCDF files. Each time CDO is called, a new temporary file is generated. This can result in slower than necessary processing chains, as IO takes up far too much time.

I will demonstrate this using a netCDF file of sea surface temperature. To download the file we can just use wget:

[1]:
import nctoolkit as nc
import warnings
warnings.filterwarnings('ignore')
from IPython.display import clear_output
!wget ftp://ftp.cdc.noaa.gov/Datasets/COBE2/sst.mon.ltm.1981-2010.nc
clear_output()

We can then set up the dataset which we will use for manipulating the SST climatology.

[2]:
ff =  "sst.mon.ltm.1981-2010.nc"
sst = nc.open_data(ff)

Now, let’s select the variable sst, clip the file to the northern hemisphere, calculate the mean value in each grid cell for the first half of the year, and then calculate the spatial mean.

[3]:
sst.select_variables("sst")
sst.clip(lat = [0,90])
sst.select_months(list(range(1,7)))
sst.mean()
sst.spatial_mean()

The dataset’s history is as follows:

[4]:
sst.history
[4]:
['cdo -L -selname,sst sst.mon.ltm.1981-2010.nc /tmp/nctoolkitqhgujflsnctoolkittmpipj7up1l.nc',
 'cdo -L  -sellonlatbox,-180,180,0,90 /tmp/nctoolkitqhgujflsnctoolkittmpipj7up1l.nc /tmp/nctoolkitqhgujflsnctoolkittmp920v1_r7.nc',
 'cdo -L -selmonth,1,2,3,4,5,6 /tmp/nctoolkitqhgujflsnctoolkittmp920v1_r7.nc /tmp/nctoolkitqhgujflsnctoolkittmpbnck_dy2.nc',
 'cdo -L -timmean /tmp/nctoolkitqhgujflsnctoolkittmpbnck_dy2.nc /tmp/nctoolkitqhgujflsnctoolkittmpjmzt1l67.nc',
 'cdo -L -fldmean /tmp/nctoolkitqhgujflsnctoolkittmpjmzt1l67.nc /tmp/nctoolkitqhgujflsnctoolkittmpdus63y8i.nc']

In total, there are 5 operations, with a temporary file created each time. However, we only want to generate one temporary file. Can we do that? Yes, thanks to CDO’s method chaining ability. To use it, we need to set the session’s evaluation to lazy, using options. Once this is done, nctoolkit will only evaluate things when it needs to, e.g. when you call a method that cannot possibly be chained, or when you evaluate the dataset explicitly using run. This works as follows:

[5]:
ff =  "sst.mon.ltm.1981-2010.nc"
nc.options(lazy = True)
sst = nc.open_data(ff)
sst.select_variables("sst")
sst.clip(lat = [0,90])
sst.select_months(list(range(1,7)))
sst.mean()
sst.spatial_mean()
sst.run()

We can now see that the history is much cleaner, with only one command.

[6]:
sst.history
[6]:
['cdo -L -fldmean -timmean -selmonth,1,2,3,4,5,6  -sellonlatbox,-180,180,0,90 -selname,sst sst.mon.ltm.1981-2010.nc /tmp/nctoolkitqhgujflsnctoolkittmpkdkiwey2.nc']
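Conceptually, lazy evaluation collapses the queue of operations into one chained CDO call. The sketch below illustrates the idea only; it is not nctoolkit’s actual implementation, though the operator strings are taken from the history output above.

```python
# Sketch of how a queue of lazy operations becomes one chained CDO call.
# This illustrates the idea; nctoolkit's real internals differ.
queued = [
    "-selname,sst",                 # select_variables("sst")
    "-sellonlatbox,-180,180,0,90",  # clip(lat = [0,90])
    "-selmonth,1,2,3,4,5,6",        # select_months(list(range(1,7)))
    "-timmean",                     # mean()
    "-fldmean",                     # spatial_mean()
]

# CDO applies the operator nearest the input file first, so the queue
# is reversed when the single command is assembled at run() time.
command = "cdo -L " + " ".join(reversed(queued)) + " infile.nc outfile.nc"
print(command)
```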

How does this impact run time? Let’s time the original, unchained method.

[7]:
%%time
nc.options(lazy = False)
ff =  "sst.mon.ltm.1981-2010.nc"
sst = nc.open_data(ff)
sst.select_variables("sst")
sst.clip(lat = [0,90])
sst.select_months(list(range(1,7)))
sst.mean()
sst.spatial_mean()
CPU times: user 37.2 ms, sys: 61.6 ms, total: 98.7 ms
Wall time: 667 ms
[8]:
%%time
nc.options(lazy = True)
ff =  "sst.mon.ltm.1981-2010.nc"
sst = nc.open_data(ff)
sst.select_variables("sst")
sst.clip(lat = [0,90])
sst.select_months(list(range(1,7)))
sst.mean()
sst.spatial_mean()
sst.run()
CPU times: user 17.3 ms, sys: 4.28 ms, total: 21.6 ms
Wall time: 161 ms

This was almost 4 times faster. Exact speed improvements will, of course, depend on specific IO requirements. Sometimes lazy evaluation will have a negligible impact, but in other cases it can make code over 10 times faster.

Processing files in parallel

When processing a dataset made up of multiple files, it is possible to carry out the processing in parallel for more or less all of the methods available in nctoolkit. To carry out processing in parallel with 6 cores, we would use options as follows:

[9]:
nc.options(cores = 6)

By default, the number of cores in use is 1. Using many cores can crash your computer if the total RAM in use is excessive, so it is best practice to check the RAM used with one core first.
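For intuition only, per-file parallel processing amounts to something like the sketch below, which maps a placeholder task over a list of files using Python’s standard concurrent.futures. This is not nctoolkit’s implementation; process_file and the file names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the chained CDO call nctoolkit would run
# on each member file of a multi-file dataset.
def process_file(path):
    return f"processed {path}"

# Hypothetical member files of a multi-file dataset.
files = ["sst_year1.nc", "sst_year2.nc", "sst_year3.nc"]

# With nc.options(cores = 3), each file's processing chain can run
# concurrently, roughly as this executor maps the task over the files.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(process_file, files))
print(results)
```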

Using thread-safe libraries

If the CDO installation called by nctoolkit was compiled with thread-safe HDF5, then you can achieve potentially significant speed-ups with the following command:

[10]:
nc.options(thread_safe = True)

If you are not sure whether HDF5 has been built thread-safe, a simple way to find out is to run the code below. If it fails, you can be more or less certain that it is not thread-safe.

[11]:
nc.options(lazy = True)
nc.options(thread_safe = True)
ff =  "sst.mon.ltm.1981-2010.nc"
sst = nc.open_data(ff)
sst.select_variables("sst")
sst.clip(lat = [0,90])
sst.select_months(list(range(1,7)))
sst.mean()
sst.spatial_mean()
sst.run()