API Reference

Session options

options(**kwargs)

Define session options.

Opening/copying data

open_data([x, checks])

Read netCDF data as a Dataset object

open_url([x, ftp_details, wait, file_stop])

Read netCDF data from a URL as a DataSet object

open_thredds([x, wait, checks])

Read THREDDS data as a Dataset object

open_geotiff([x])

Open a GeoTIFF and convert it to a Dataset. This requires rioxarray to be installed.

from_xarray(ds)

Convert an xarray dataset to an nctoolkit dataset. This will first save the xarray dataset as a temporary netCDF file.

DataSet.copy()

Make a deep copy of a DataSet object.

Merging or analyzing multiple datasets

merge(*datasets[, match])

Merge datasets

cor_time([x, y])

Calculate the temporal correlation coefficient between two datasets. This will calculate the temporal correlation coefficient, for each grid cell, between two datasets.

cor_space([x, y])

Calculate the spatial correlation coefficient between two datasets. This will calculate the spatial correlation coefficient, for each time step, between two datasets.
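
As a rough illustration of the quantity both functions report, the sketch below computes a Pearson correlation coefficient in plain Python. It is not nctoolkit's implementation: the helper `pearson` and the sample values are hypothetical, and the same formula would be applied along the time axis for a temporal correlation and across grid cells for a spatial one.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences.

    Hypothetical helper for illustration only. A constant input has zero
    variance and will raise ZeroDivisionError.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Made-up values of one variable taken from two datasets
model = [1.0, 2.0, 3.0, 4.0]
obs = [2.0, 4.0, 6.0, 8.0]
r = pearson(model, obs)  # perfectly linear relationship, so r is 1.0
```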

Adding and removing files to a dataset

DataSet.append([x])

append: Add new file(s) to a dataset.

DataSet.remove([x])

remove: Remove file(s) from a dataset

Accessing attributes

DataSet.variables

List variables contained in a dataset

DataSet.contents

Detailed list of variables contained in a dataset.

DataSet.times

List times contained in a dataset

DataSet.years

List years contained in a dataset

DataSet.months

List months contained in a dataset

DataSet.levels

List levels contained in a dataset

DataSet.size

The size of an object. This will print the number of files, total size, and smallest and largest files in a DataSet object.

DataSet.current

The current file or files in the DataSet object

DataSet.history

The history of operations on the DataSet

DataSet.start

The starting file or files of the DataSet object

DataSet.calendar

List calendars of dataset files

DataSet.ncformat

List formats of files contained in a dataset

Plotting

DataSet.plot([vars, autoscale, out, coast])

plot: Automatically plot a dataset.

Variable modification

DataSet.assign([drop])

assign: Create new variables using mathematical operations on existing variables.

DataSet.rename([newnames])

rename: Rename variables in a dataset

DataSet.as_missing([value])

Change a range or individual value to missing.

DataSet.missing_as([value])

Convert missing values to a constant

DataSet.set_fill([value])

Set the fill value

DataSet.sum_all([drop, new_name])

sum_all: Calculate the sum of all variables for each time step

netCDF file attribute modification

DataSet.set_longnames([name_dict])

Set the long names of variables

DataSet.set_units([unit_dict])

Set the units for variables

Vertical/level methods

DataSet.top()

top: Extract the top/surface level from a dataset

DataSet.bottom()

bottom: Extract the bottom level from a dataset

DataSet.vertical_interp([levels, fixed, ...])

vertical_interp: Vertically interpolate a dataset based on given vertical levels

DataSet.vertical_mean([thickness, ...])

vertical_mean: Calculate the depth-averaged mean for each variable.

DataSet.vertical_min()

vertical_min: Calculate the vertical minimum of variable values.

DataSet.vertical_max()

vertical_max: Calculate the vertical maximum of variable values.

DataSet.vertical_range()

vertical_range: Calculate the vertical range of variable values.

DataSet.vertical_sum()

vertical_sum: Calculate the vertical sum of variable values.

DataSet.vertical_integration([thickness, ...])

vertical_integration: Calculate the vertically integrated sum over the water column.

DataSet.vertical_cumsum()

vertical_cumsum: Calculate the vertical cumulative sum of variable values.

DataSet.invert_levels()

Invert the levels of 3D variables.

DataSet.bottom_mask()

bottom_mask: Create a mask identifying the deepest cell without missing values.
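
The difference between a depth-averaged mean, a vertical integration and a vertical cumulative sum is easiest to see on one water column. The following is a minimal plain-Python sketch, not nctoolkit's implementation; the function names, the sample values, and the assumption that the cumulative sum is an unweighted running sum by level are all illustrative.

```python
from itertools import accumulate


def vertical_integral(values, thickness):
    """Sum of value * layer thickness over all levels (illustrative helper)."""
    return sum(v * t for v, t in zip(values, thickness))


def depth_averaged_mean(values, thickness):
    """Thickness-weighted mean over all levels (illustrative helper)."""
    return vertical_integral(values, thickness) / sum(thickness)


# Made-up column: three levels, listed from the surface downwards
values = [4.0, 2.0, 1.0]        # e.g. a concentration per level
thickness = [10.0, 20.0, 70.0]  # layer thickness in metres

integral = vertical_integral(values, thickness)     # 40 + 40 + 70 = 150.0
mean = depth_averaged_mean(values, thickness)       # 150 / 100 = 1.5
running = list(accumulate(values))                  # [4.0, 6.0, 7.0]
```

Note that the thick bottom layer dominates the depth-averaged mean (1.5), which is why it differs from the plain average of the three values.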

Rolling methods

DataSet.rolling_mean([window, align])

rolling_mean: Calculate a rolling mean based on a window

DataSet.rolling_min([window, align])

rolling_min: Calculate a rolling minimum based on a window

DataSet.rolling_max([window, align])

rolling_max: Calculate a rolling maximum based on a window

DataSet.rolling_sum([window, align])

rolling_sum: Calculate a rolling sum based on a window

DataSet.rolling_range([window, align])

rolling_range: Calculate a rolling range based on a window

DataSet.rolling_stdev([window, align])

rolling_stdev: Calculate a rolling standard deviation based on a window

DataSet.rolling_var([window, align])

rolling_var: Calculate a rolling variance based on a window
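
As a sketch of what a rolling statistic over a window means, the plain-Python example below computes a rolling mean over complete windows only. It is not nctoolkit's implementation; the function is hypothetical, and how the `align` argument assigns a time stamp to each window is not modelled here.

```python
def rolling_mean(series, window):
    """Mean of each run of `window` consecutive values (illustrative helper).

    Only complete windows are returned, so the result has
    len(series) - window + 1 entries.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Made-up time series with five time steps
smoothed = rolling_mean([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
# smoothed is [2.0, 3.0, 4.0]
```

The other rolling statistics follow the same pattern, with `min`, `max`, a sum or a variance replacing the mean inside each window.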

Evaluation setting

DataSet.run()

Run all stored commands in a dataset

Cleaning functions


Ensemble creation

create_ensemble([path, recursive])

create_ensemble: Generate an ensemble of files from a directory.

Arithmetic methods

DataSet.abs()

abs: Method to get the absolute value of variables

DataSet.add([x, var])

add: Add to a dataset

DataSet.assign([drop])

assign: Create new variables using mathematical operations on existing variables.

DataSet.exp()

exp: Method to get the exponential of variables

DataSet.log()

log: Method to get the natural log, ln, of variables

DataSet.log10()

log10: Method to get the base 10 log, log10, of variables

DataSet.multiply([x, var])

multiply: Multiply a dataset.

DataSet.power([x])

power: Powers of variables in dataset

DataSet.sqrt()

sqrt: Method to get the square root of variables

DataSet.square()

square: Method to get the square of variables

DataSet.subtract([x, var])

subtract: Subtract from a dataset.

DataSet.divide([x, var])

divide: Divide the data.

Ensemble statistics

DataSet.ensemble_mean([nco, ignore_time])

ensemble_mean: Calculate an ensemble mean

DataSet.ensemble_min([nco, ignore_time])

ensemble_min: Calculate an ensemble minimum.

DataSet.ensemble_max([nco, ignore_time])

ensemble_max: Calculate an ensemble maximum

DataSet.ensemble_percentile([p])

ensemble_percentile: Calculate an ensemble percentile.

DataSet.ensemble_range()

ensemble_range: Calculate an ensemble range

DataSet.ensemble_stdev()

ensemble_stdev: Calculate an ensemble standard deviation

DataSet.ensemble_sum()

ensemble_sum: Calculate an ensemble sum

DataSet.ensemble_var()

ensemble_var: Calculate an ensemble variance
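
Ensemble statistics combine the members of an ensemble value by value. The sketch below shows the idea in plain Python, assuming every member holds the same positions (for example the same grid cells and time steps); the helper `ensemble_stat` and the sample values are hypothetical and this is not how nctoolkit computes the result.

```python
def ensemble_stat(members, func):
    """Apply `func` across members at each position (illustrative helper)."""
    if len({len(m) for m in members}) != 1:
        raise ValueError("all ensemble members must have the same length")
    return [func(values) for values in zip(*members)]


# Three made-up ensemble members, each with three values
members = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [5.0, 6.0, 10.0],
]

ens_mean = ensemble_stat(members, lambda v: sum(v) / len(v))   # [3.0, 4.0, 6.0]
ens_min = ensemble_stat(members, min)                          # [1.0, 2.0, 3.0]
ens_range = ensemble_stat(members, lambda v: max(v) - min(v))  # [4.0, 4.0, 7.0]
```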

Subsetting operations

DataSet.subset(**kwargs)

subset: A method for subsetting datasets to specific variables, years, longitudes etc.

DataSet.crop([lon, lat, nco, nco_vars])

crop: Crop to a rectangular longitude and latitude box

DataSet.drop(**kwargs)

drop: Remove variables, days, months, years or time steps from a dataset

Time-based methods

DataSet.set_date([year, month, day, base_year])

Set the date in a dataset

DataSet.set_day(x)

Set the day for each time step in a dataset

DataSet.shift(**kwargs)

shift: Shift times in dataset by a number of hours, days, months, or years.

Interpolation, matching and resampling methods

DataSet.regrid([grid, method, recycle, one_grid])

regrid: Regrid a dataset to a target grid

DataSet.to_latlon([lon, lat, res, method, ...])

to_latlon: Regrid a dataset to a regular latlon grid

DataSet.match_points([df, variables, ...])

match_points: Match dataset to a spatiotemporal points dataframe

DataSet.resample_grid([factor])

resample_grid: Resample the horizontal grid of a dataset

DataSet.time_interp([start, end, resolution])

time_interp: Temporally interpolate variables based on date range and time resolution

DataSet.timestep_interp([steps])

timestep_interp: Temporally interpolate a dataset to given number of time steps between existing time steps

DataSet.fill_na([n])

fill_na: Fill missing values with a distance-weighted average.

DataSet.box_mean([x, y])

box_mean: Calculate the grid box mean for all variables.

DataSet.box_max([x, y])

box_max: Calculate the grid box max for all variables.

DataSet.box_min([x, y])

box_min: Calculate the grid box min for all variables.

DataSet.box_sum([x, y])

box_sum: Calculate the grid box sum for all variables.

DataSet.box_range([x, y])

box_range: Calculate the grid box range for all variables.

Masking methods

DataSet.mask_box([lon, lat])

mask_box: Mask a lon/lat box

Anomaly methods

DataSet.annual_anomaly([baseline, metric, ...])

annual_anomaly: Calculate annual anomalies for each variable based on a baseline period.

DataSet.monthly_anomaly([baseline])

monthly_anomaly: Calculate monthly anomalies based on a baseline period.
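
An anomaly relative to a baseline period can be sketched as follows in plain Python. This is an illustration rather than nctoolkit's implementation: the function, the sample values, and the two `metric` names used here ("absolute" for a difference, "relative" for a ratio) are assumptions.

```python
def annual_anomalies(annual_means, baseline, metric="absolute"):
    """Anomaly of each annual mean relative to the baseline-period mean.

    annual_means: dict mapping year -> annual mean value
    baseline: (first_year, last_year), inclusive
    metric: "absolute" (difference) or "relative" (ratio); assumed names
    """
    start, end = baseline
    base_values = [v for y, v in annual_means.items() if start <= y <= end]
    if not base_values:
        raise ValueError("no years fall inside the baseline period")
    base = sum(base_values) / len(base_values)
    if metric == "absolute":
        return {y: v - base for y, v in annual_means.items()}
    if metric == "relative":
        return {y: v / base for y, v in annual_means.items()}
    raise ValueError("metric must be 'absolute' or 'relative'")


# Made-up annual means; the baseline mean over 2000-2002 is 12.0
annual_means = {2000: 10.0, 2001: 12.0, 2002: 14.0, 2003: 15.0}
absolute = annual_anomalies(annual_means, (2000, 2002))
relative = annual_anomalies(annual_means, (2000, 2002), metric="relative")
```

A monthly anomaly follows the same logic, with the baseline computed separately for each calendar month.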

Statistical methods

DataSet.tmean([over, align, window])

tmean: Calculate the temporal mean of all variables.

DataSet.tmin([over, align, window])

tmin: Calculate the temporal minimum of all variables.

DataSet.tmedian([over, align])

tmedian: Calculate the temporal median of all variables.

DataSet.tpercentile([p, over, align])

tpercentile: Calculate the temporal percentile of all variables. Useful for: monthly percentile, annual/yearly percentile, seasonal percentile, daily percentile, daily climatology, monthly climatology, seasonal climatology.

DataSet.tmax([over, align, window])

tmax: Calculate the temporal maximum of all variables.

DataSet.tsum([over, align, window])

tsum: Calculate the temporal sum of all variables.

DataSet.trange([over, align, window])

trange: Calculate the temporal range of all variables. Useful for: monthly range, annual/yearly range, seasonal range, daily range, daily climatology, monthly climatology, seasonal climatology.

DataSet.tstdev([over, align, window])

tstdev: Calculate the temporal standard deviation of all variables. Useful for: monthly standard deviation, annual/yearly standard deviation, seasonal standard deviation, daily standard deviation, daily climatology, monthly climatology, seasonal climatology.

DataSet.tcumsum([align])

tcumsum: Calculate the temporal cumulative sum of all variables

DataSet.tvar([over, align, window])

tvar: Calculate the temporal variance of all variables. Useful for: monthly variance, annual/yearly variance, seasonal variance, daily variance, daily climatology, monthly climatology, seasonal climatology.

DataSet.cor_space([var1, var2])

cor_space: Calculate the correlation coefficient between two variables in space.

DataSet.cor_time([var1, var2])

cor_time: Calculate the correlation coefficient in time between two variables

DataSet.spatial_mean()

spatial_mean: Calculate the area weighted spatial mean for all variables.

DataSet.spatial_min()

spatial_min: Calculate the spatial minimum for all variables.

DataSet.spatial_max()

spatial_max: Calculate the spatial maximum for all variables.

DataSet.spatial_percentile([p])

spatial_percentile: Calculate the spatial percentile for all variables

DataSet.spatial_range()

spatial_range: Calculate the spatial range for all variables.

DataSet.spatial_sum([by_area])

spatial_sum: Calculate the spatial sum for all variables.

DataSet.spatial_stdev()

spatial_stdev: Calculate the spatial standard deviation for all variables.

DataSet.spatial_var()

spatial_var: Calculate the spatial variance for all variables.

DataSet.centre([by, by_area])

centre: Calculate the latitudinal or longitudinal centre for each year/month combination in files.

DataSet.zonal_mean()

zonal_mean: Calculate the zonal mean for each time step

DataSet.zonal_min()

zonal_min: Calculate the zonal minimum for each time step

DataSet.zonal_max()

zonal_max: Calculate the zonal maximum for each time step

DataSet.zonal_range()

zonal_range: Calculate the zonal range for each time step

DataSet.zonal_sum([by_area])

zonal_sum: Calculate the zonal sum for each time step

DataSet.meridonial_mean()

meridonial_mean: Calculate the meridional mean for each year/month combination in files.

DataSet.meridonial_min()

meridonial_min: Calculate the meridional minimum for each year/month combination in files.

DataSet.meridonial_max()

meridonial_max: Calculate the meridional maximum for each year/month combination in files.

DataSet.meridonial_range()

meridonial_range: Calculate the meridional range for each year/month combination in files.
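
To show why an area-weighted spatial mean (as in spatial_mean) differs from a plain average, the sketch below weights each row of a small regular latitude/longitude grid by the cosine of its latitude. This is an illustrative approximation in plain Python under that regular-grid assumption, not nctoolkit's implementation; the function name and the sample field are hypothetical. A zonal mean, by contrast, simply averages along each latitude row.

```python
import math


def area_weighted_mean(field, lats):
    """Mean of a 2D field weighted by cos(latitude) (illustrative helper).

    field[i][j] is the value at latitude lats[i] (degrees) and longitude
    index j on a regular latitude/longitude grid.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for row, lat in zip(field, lats):
        weight = math.cos(math.radians(lat))  # cell area shrinks towards the poles
        weighted_sum += weight * sum(row)
        total_weight += weight * len(row)
    return weighted_sum / total_weight


# Made-up field with two latitude rows and two longitude columns
lats = [0.0, 60.0]
field = [[1.0, 1.0], [4.0, 4.0]]

weighted = area_weighted_mean(field, lats)             # approximately 2.0
unweighted = sum(sum(row) for row in field) / 4        # 2.5
zonal_means = [sum(row) / len(row) for row in field]   # [1.0, 4.0]
```

Cells at 60 degrees cover about half the area of cells at the equator, so the high-latitude values count for less in the weighted result.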

Merging methods

DataSet.merge([join, match, check])

merge: Merge a multi-file ensemble into a single file

Splitting methods

DataSet.split([by])

split: Split the dataset

Output and formatting methods

DataSet.to_nc(out[, zip, overwrite])

to_nc: Save a dataset to a named file.

DataSet.to_xarray([decode_times])

to_xarray: Open a dataset as an xarray object

DataSet.to_dataframe([decode_times])

to_dataframe: Convert a dataset to a pandas data frame

DataSet.zip()

zip: Zip the dataset

DataSet.format([ext])

format: Change the netCDF format of a dataset.
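
The entries in this reference can be combined into a short workflow. The sketch below uses only calls listed above (open_data, DataSet.top, DataSet.tmean and DataSet.to_nc); the wrapper function and the file paths are hypothetical, and it assumes each method updates the dataset in place and that nctoolkit and its system dependencies are installed.

```python
def surface_time_mean(path_in, path_out):
    """Save the time-mean of the surface level of `path_in` to `path_out`.

    Hypothetical wrapper for illustration; both paths are supplied by the
    caller and must point to netCDF files.
    """
    # Imported inside the function so that defining it does not require
    # nctoolkit to be installed.
    import nctoolkit as nc

    ds = nc.open_data(path_in)  # read netCDF data as a dataset
    ds.top()                    # extract the top/surface level
    ds.tmean()                  # temporal mean of all variables
    ds.to_nc(path_out)          # save the result to a named file
    return path_out
```

A call such as `surface_time_mean("input.nc", "surface_mean.nc")` would then run the whole chain, with both file names being placeholders.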

Miscellaneous methods

DataSet.na_count([over, align, window])

na_count: Calculate the number of missing values.

DataSet.na_frac([over, align, window])

na_frac: Calculate the fraction of missing values in each grid cell across all time steps.

DataSet.distribute([m, n])

distribute: Split the dataset into multiple evenly sized horizontal and vertical new files

DataSet.collect()

Collect a dataset that has been split using distribute

DataSet.cell_area([join])

cell_area: Calculate the area of grid cells.

DataSet.first_above([x])

first_above: Identify the time step when a value is first above a threshold.

DataSet.first_below([x])

first_below: Identify the time step when a value is first below a threshold. This will do the comparison with either a number, a Dataset or a netCDF file.

DataSet.last_above([x])

last_above: Identify the final time step when a value is above a threshold. This will do the comparison with either a number, a Dataset or a netCDF file.

DataSet.last_below([x])

last_below: Identify the last time step when a value is below a threshold. This will do the comparison with either a number, a Dataset or a netCDF file.

DataSet.cdo_command([command, ensemble, check])

cdo_command: Apply a cdo command

DataSet.nco_command([command, ensemble])

Apply an nco command

DataSet.compare([expression])

Compare all variables to a constant

DataSet.gt(x)

Method to calculate if a variable in a dataset is greater than that in another file or dataset. This currently only works with single-file datasets.

DataSet.lt(x)

Method to calculate if a variable in a dataset is less than that in another file or dataset. This currently only works with single-file datasets.

DataSet.reduce_dims()

reduce_dims: Reduce dimensions of data

DataSet.reduce_grid([mask])

reduce_grid: Reduce the dataset to non-zero locations in a mask

DataSet.set_precision(x)

Set the precision in a dataset

DataSet.check()

check: Check contents of files for common data problems.

DataSet.is_corrupt()

is_corrupt: Check if files are corrupt

DataSet.fix_nemo_ersem_grid()

A quick hack to change the grid file in North West European shelf Nemo grids.

DataSet.set_gridtype(grid)

Set the grid type.

DataSet.surface_mask()

surface_mask: Create a mask identifying the shallowest cell without missing values.

DataSet.strip_variables([vars])

strip_variables: Remove any variables, such as bnds etc., from variables.

DataSet.no_leaps()

Remove leap years.

DataSet.as_double(x)

Set a variable/dimension to double. This is mostly useful for cases when time is stored as an int, but you need a double.

DataSet.as_type(x)

Set a variable/dimension to double. This is mostly useful for cases when time is stored as an int, but you need a double.

DataSet.reset()

Simple method to fully reset a dataset
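
The threshold methods listed above (first_above, first_below, last_above, last_below) can be pictured on a single time series. The plain-Python sketch below is illustrative only and not nctoolkit's implementation; the helper names, the zero-based time step index, and the `None` result when the threshold is never crossed are assumptions.

```python
def first_above(series, threshold):
    """Index of the first time step whose value exceeds `threshold`.

    Returns None if no value is above the threshold (illustrative helper).
    """
    for step, value in enumerate(series):
        if value > threshold:
            return step
    return None


def last_above(series, threshold):
    """Index of the final time step whose value exceeds `threshold`."""
    for step in range(len(series) - 1, -1, -1):
        if series[step] > threshold:
            return step
    return None


# Made-up series with six time steps
series = [0.2, 0.8, 1.5, 0.9, 1.2, 0.4]
first = first_above(series, 1.0)   # step 2, where the value is 1.5
last = last_above(series, 1.0)     # step 4, where the value is 1.2
never = first_above(series, 5.0)   # None, the threshold is never exceeded
```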

Ecological methods

DataSet.phenology([var, metric, p])

phenology: Calculate phenologies from a dataset