API Reference

Session options

`options`(**kwargs)	Define session options.

Opening/copying data

`open_data`([x, checks])	Read netCDF data as a DataSet object.
`open_url`([x, ftp_details, wait, file_stop])	Read netCDF data from a URL as a DataSet object.
`open_thredds`([x, wait, checks])	Read THREDDS data as a DataSet object.
`open_geotiff`([x])	Open a GeoTIFF and convert it to a DataSet. This requires rioxarray to be installed.
`from_xarray`(ds)	Convert an xarray dataset to an nctoolkit dataset. This will first save the xarray dataset as a temporary netCDF file.
`DataSet.copy`()	Make a deep copy of a DataSet object.

Merging or analyzing multiple datasets

`merge`(*datasets[, match])	Merge datasets
`cor_time`([x, y])	Calculate the temporal correlation coefficient between two datasets, for each grid cell.
`cor_space`([x, y])	Calculate the spatial correlation coefficient between two datasets, for each time step.

Adding and removing files to a dataset

`DataSet.append`([x])	append: Add new file(s) to a dataset.
`DataSet.remove`([x])	remove: Remove file(s) from a dataset

Accessing attributes

`DataSet.variables`	List variables contained in a dataset
`DataSet.contents`	Detailed list of variables contained in a dataset.
`DataSet.times`	List times contained in a dataset
`DataSet.years`	List years contained in a dataset
`DataSet.months`	List months contained in a dataset
`DataSet.levels`	List levels contained in a dataset
`DataSet.size`	The size of a DataSet object. This will print the number of files, the total size, and the smallest and largest files in the DataSet.
`DataSet.current`	The current file or files in the DataSet object
`DataSet.history`	The history of operations on the DataSet
`DataSet.start`	The starting file or files of the DataSet object
`DataSet.calendar`	List calendars of dataset files
`DataSet.ncformat`	List formats of files contained in a dataset

Plotting

`DataSet.plot`([vars, autoscale, out, coast])	plot: Automatically plot a dataset.
`DataSet.pub_plot`([var, extent, title, ...])	pub_plot: Static plotting.

Variable modification

`DataSet.assign`([drop])	assign: Create new variables using mathematical operations on existing variables.
`DataSet.rename`([newnames])	rename: Rename variables in a dataset
`DataSet.as_missing`([value])	Change a range or individual value to missing.
`DataSet.missing_as`([value])	Convert missing values to a constant
`DataSet.set_fill`([value])	Set the fill value
`DataSet.sum_all`([drop, new_name])	sum_all: Calculate the sum of all variables for each time step

netCDF file attribute modification

`DataSet.set_longnames`([name_dict])	Set the long names of variables
`DataSet.set_units`([unit_dict])	Set the units for variables

Vertical/level methods

`DataSet.top`()	top: Extract the top/surface level from a dataset
`DataSet.bottom`([choice])	bottom: Extract the bottom level or value from a dataset
`DataSet.vertical_interp`([levels, fixed, ...])	vertical_interp: Vertically interpolate a dataset based on given vertical levels
`DataSet.vertical_mean`([thickness, ...])	vertical_mean: Calculate the depth-averaged mean for each variable.
`DataSet.vertical_min`()	vertical_min: Calculate the vertical minimum of variable values.
`DataSet.vertical_max`()	vertical_max: Calculate the vertical maximum of variable values.
`DataSet.vertical_range`()	vertical_range: Calculate the vertical range of variable values.
`DataSet.vertical_sum`()	vertical_sum: Calculate the vertical sum of variable values.
`DataSet.vertical_integration`([thickness, ...])	vertical_integration: Calculate the vertically integrated sum over the water column.
`DataSet.vertical_cumsum`()	vertical_cumsum: Calculate the vertical cumulative sum of variable values.
`DataSet.invert_levels`()	Invert the levels of 3D variables.
`DataSet.bottom_mask`()	bottom_mask: Create a mask identifying the deepest cell without missing values.
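
As a rough illustration of the difference between a depth-averaged mean and a vertical integration, the following pure-Python sketch works on a single water-column profile. It is not nctoolkit's implementation: the helper functions and the plain-list `thickness` input are hypothetical stand-ins, and it assumes each level has a known cell thickness.

```python
def vertical_integration(values, thickness):
    """Sum of value * cell thickness over all levels (hypothetical helper)."""
    return sum(v * t for v, t in zip(values, thickness))


def vertical_mean(values, thickness):
    """Thickness-weighted mean over all levels (hypothetical helper)."""
    return vertical_integration(values, thickness) / sum(thickness)


# One profile: a value per level and the thickness of each level in metres.
profile = [4.0, 2.0, 1.0]
thickness = [10.0, 20.0, 70.0]

integrated = vertical_integration(profile, thickness)
depth_mean = vertical_mean(profile, thickness)
print(integrated, depth_mean)
```

Thin surface layers contribute less to the depth-averaged mean than thick deep layers, which is why the result differs from a simple average of the level values.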

Rolling methods

`DataSet.rolling_mean`([window, align])	rolling_mean: Calculate a rolling mean based on a window
`DataSet.rolling_min`([window, align])	rolling_min: Calculate a rolling minimum based on a window
`DataSet.rolling_max`([window, align])	rolling_max: Calculate a rolling maximum based on a window
`DataSet.rolling_sum`([window, align])	rolling_sum: Calculate a rolling sum based on a window
`DataSet.rolling_range`([window, align])	rolling_range: Calculate a rolling range based on a window
`DataSet.rolling_stdev`([window, align])	rolling_stdev: Calculate a rolling standard deviation based on a window
`DataSet.rolling_var`([window, align])	rolling_var: Calculate a rolling variance based on a window
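
To show what a rolling statistic computes, here is a minimal pure-Python sketch on a plain list of values. It is not nctoolkit's implementation: `rolling_mean` below is a hypothetical stand-alone function, and `window` is assumed to mean the number of consecutive time steps in each window. The `align` argument of the methods above is not modelled.

```python
def rolling_mean(values, window):
    """Mean of each run of `window` consecutive values (hypothetical helper)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Five time steps smoothed with a three-step window give three outputs.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = rolling_mean(series, window=3)
print(smoothed)
```

The other rolling methods follow the same pattern, with the mean replaced by a minimum, maximum, sum, range, standard deviation or variance over each window.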

Evaluation setting

`DataSet.run`()	Run all stored commands in a dataset

Cleaning functions

Ensemble creation

`create_ensemble`([path, recursive])	create_ensemble: Generate an ensemble of files from a directory.

Arithmetic methods

`DataSet.abs`()	abs: Method to get the absolute value of variables
`DataSet.add`([x, var])	add: Add to a dataset
`DataSet.assign`([drop])	assign: Create new variables using mathematical operations on existing variables.
`DataSet.exp`()	exp: Method to get the exponential of variables
`DataSet.log`()	log: Method to get the natural log, ln, of variables
`DataSet.log10`()	log10: Method to get the base 10 log, log10, of variables
`DataSet.multiply`([x, var])	multiply: Multiply a dataset.
`DataSet.power`([x])	power: Powers of variables in dataset
`DataSet.sqrt`()	sqrt: Method to get the square root of variables
`DataSet.square`()	square: Method to get the square of variables
`DataSet.subtract`([x, var])	subtract: Subtract from a dataset.
`DataSet.divide`([x, var])	divide: Divide the data.

Ensemble statistics

`DataSet.ensemble_mean`([nco, ignore_time])	ensemble_mean: Calculate an ensemble mean
`DataSet.ensemble_min`([nco, ignore_time])	ensemble_min: Calculate an ensemble minimum.
`DataSet.ensemble_max`([nco, ignore_time])	ensemble_max: Calculate an ensemble maximum
`DataSet.ensemble_percentile`([p])	ensemble_percentile: Calculate an ensemble percentile.
`DataSet.ensemble_range`()	ensemble_range: Calculate an ensemble range
`DataSet.ensemble_stdev`()	ensemble_stdev: Calculate an ensemble standard deviation
`DataSet.ensemble_sum`()	ensemble_sum: Calculate an ensemble sum
`DataSet.ensemble_var`()	ensemble_var: Calculate an ensemble variance
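
Conceptually, an ensemble statistic combines the same position across every ensemble member. The sketch below illustrates this in pure Python, assuming each member is a plain list of values on a shared grid and time axis. The helpers are hypothetical and do not reflect how nctoolkit performs the calculation; the `nco` and `ignore_time` arguments are not modelled.

```python
def ensemble_mean(members):
    """Element-wise mean across ensemble members (hypothetical helper)."""
    n = len(members)
    return [sum(column) / n for column in zip(*members)]


def ensemble_range(members):
    """Element-wise max minus min across members (hypothetical helper)."""
    return [max(column) - min(column) for column in zip(*members)]


# Three members, each holding values for the same three positions.
members = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [5.0, 6.0, 7.0],
]

mean_values = ensemble_mean(members)
range_values = ensemble_range(members)
print(mean_values, range_values)
```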

Subsetting operations

`DataSet.subset`(**kwargs)	subset: A method for subsetting datasets to specific variables, years, longitudes etc.
`DataSet.crop`([lon, lat, nco, nco_vars])	crop: Crop to a rectangular longitude and latitude box
`DataSet.drop`(**kwargs)	drop: Remove variables, days, months, years or time steps from a dataset

Time-based methods

`DataSet.set_date`([year, month, day, base_year])	Set the date in a dataset
`DataSet.set_day`(x)	Set the day for each time step in a dataset
`DataSet.shift`(**kwargs)	shift: Shift times in dataset by a number of hours, days, months, or years.

Interpolation, matching and resampling methods

`DataSet.regrid`([grid, method, recycle, one_grid])	regrid: Regrid a dataset to a target grid
`DataSet.to_latlon`([lon, lat, res, method, ...])	to_latlon: Regrid a dataset to a regular latlon grid
`DataSet.match_points`([df, variables, ...])	match_points: Match dataset to a spatiotemporal points dataframe
`DataSet.resample_grid`([factor])	resample_grid: Resample the horizontal grid of a dataset
`DataSet.time_interp`([start, end, resolution])	time_interp: Temporally interpolate variables based on date range and time resolution
`DataSet.timestep_interp`([steps])	timestep_interp: Temporally interpolate a dataset to given number of time steps between existing time steps
`DataSet.fill_na`([n])	fill_na: Fill missing values with a distance-weighted average.
`DataSet.box_mean`([x, y])	box_mean: Calculate the grid box mean for all variables.
`DataSet.box_max`([x, y])	box_max: Calculate the grid box max for all variables.
`DataSet.box_min`([x, y])	box_min: Calculate the grid box min for all variables.
`DataSet.box_sum`([x, y])	box_sum: Calculate the grid box sum for all variables.
`DataSet.box_range`([x, y])	box_range: Calculate the grid box range for all variables.

Masking methods

`DataSet.mask_box`([lon, lat])	mask_box: Mask a lon/lat box

Anomaly methods

`DataSet.annual_anomaly`([baseline, metric, ...])	annual_anomaly: Calculate annual anomalies for each variable based on a baseline period.
`DataSet.monthly_anomaly`([baseline])	monthly_anomaly: Calculate monthly anomalies based on a baseline period.
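
The idea behind an anomaly relative to a baseline period can be sketched in pure Python as follows. This is an illustration only, not nctoolkit's implementation: it assumes the data have already been reduced to one annual mean per year, that the baseline is an inclusive range of years, and that the anomaly is the absolute difference from the baseline mean. The `annual_anomaly` function here is a hypothetical helper, and the `metric` argument of the method above is not modelled.

```python
def annual_anomaly(annual_means, baseline):
    """Subtract the baseline-period mean from every year (hypothetical helper)."""
    first_year, last_year = baseline
    baseline_values = [
        value for year, value in annual_means.items()
        if first_year <= year <= last_year
    ]
    baseline_mean = sum(baseline_values) / len(baseline_values)
    return {year: value - baseline_mean for year, value in annual_means.items()}


# Annual means for four years; the first three form the baseline period.
annual_means = {2000: 10.0, 2001: 12.0, 2002: 11.0, 2003: 15.0}
anomalies = annual_anomaly(annual_means, baseline=(2000, 2002))
print(anomalies)
```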

Statistical methods

`DataSet.tmean`([over, align, window])	tmean: Calculate the temporal mean of all variables.
`DataSet.tmin`([over, align, window])	tmin: Calculate the temporal minimum of all variables.
`DataSet.tmedian`([over, align])	tmedian: Calculate the temporal median of all variables.
`DataSet.tpercentile`([p, over, align])	tpercentile: Calculate the temporal percentile of all variables. Useful for: monthly percentile, annual/yearly percentile, seasonal percentile, daily percentile, daily climatology, monthly climatology, seasonal climatology.
`DataSet.tmax`([over, align, window])	tmax: Calculate the temporal maximum of all variables.
`DataSet.tsum`([over, align, window])	tsum: Calculate the temporal sum of all variables.
`DataSet.trange`([over, align, window])	trange: Calculate the temporal range of all variables. Useful for: monthly range, annual/yearly range, seasonal range, daily range, daily climatology, monthly climatology, seasonal climatology.
`DataSet.tstdev`([over, align, window])	tstdev: Calculate the temporal standard deviation of all variables. Useful for: monthly standard deviation, annual/yearly standard deviation, seasonal standard deviation, daily standard deviation, daily climatology, monthly climatology, seasonal climatology.
`DataSet.tcumsum`([align])	tcumsum: Calculate the temporal cumulative sum of all variables
`DataSet.tvar`([over, align, window])	tvar: Calculate the temporal variance of all variables. Useful for: monthly variance, annual/yearly variance, seasonal variance, daily variance, daily climatology, monthly climatology, seasonal climatology.
`DataSet.cor_space`([var1, var2])	cor_space: Calculate the correlation coefficient between two variables in space.
`DataSet.cor_time`([var1, var2])	cor_time: Calculate the correlation coefficient in time between two variables
`DataSet.spatial_mean`()	spatial_mean: Calculate the area weighted spatial mean for all variables.
`DataSet.spatial_min`()	spatial_min: Calculate the spatial minimum for all variables.
`DataSet.spatial_max`()	spatial_max: Calculate the spatial maximum for all variables.
`DataSet.spatial_percentile`([p])	spatial_percentile: Calculate the spatial percentile for all variables
`DataSet.spatial_range`()	spatial_range: Calculate the spatial range for all variables.
`DataSet.spatial_sum`([by_area])	spatial_sum: Calculate the spatial sum for all variables.
`DataSet.spatial_stdev`()	spatial_stdev: Calculate the spatial standard deviation for all variables.
`DataSet.spatial_var`()	spatial_var: Calculate the spatial variance for all variables.
`DataSet.centre`([by, by_area])	centre: Calculate the latitudinal or longitudinal centre for each year/month combination in files.
`DataSet.zonal_mean`()	zonal_mean: Calculate the zonal mean for each time step
`DataSet.zonal_min`()	zonal_min: Calculate the zonal minimum for each time step
`DataSet.zonal_max`()	zonal_max: Calculate the zonal maximum for each time step
`DataSet.zonal_range`()	zonal_range: Calculate the zonal range for each time step
`DataSet.zonal_sum`([by_area])	zonal_sum: Calculate the zonal sum for each time step
`DataSet.meridonial_mean`()	meridonial_mean: Calculate the meridional mean for each year/month combination in files.
`DataSet.meridonial_min`()	meridonial_min: Calculate the meridional minimum for each year/month combination in files.
`DataSet.meridonial_max`()	meridonial_max: Calculate the meridional maximum for each year/month combination in files.
`DataSet.meridonial_range`()	meridonial_range: Calculate the meridional range for each year/month combination in files.
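
An area-weighted spatial mean differs from a plain average because grid cells shrink towards the poles. The pure-Python sketch below illustrates the weighting, assuming a regular latitude-longitude grid where cell area is proportional to the cosine of latitude. It is a simplified stand-in, not nctoolkit's implementation, and `area_weighted_mean` is a hypothetical helper.

```python
import math


def area_weighted_mean(values, latitudes):
    """Mean of values weighted by cos(latitude) (hypothetical helper)."""
    weights = [math.cos(math.radians(lat)) for lat in latitudes]
    weighted_total = sum(v * w for v, w in zip(values, weights))
    return weighted_total / sum(weights)


# Two cells: one on the equator and one at 60 degrees north.
values = [10.0, 40.0]
latitudes = [0.0, 60.0]

weighted = area_weighted_mean(values, latitudes)
unweighted = sum(values) / len(values)
print(weighted, unweighted)
```

The cell at 60 degrees north covers about half the area of the equatorial cell, so it pulls the weighted mean less strongly than it pulls the unweighted one.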

Merging methods

`DataSet.merge`([join, match, check])	merge: Merge a multi-file ensemble into a single file

Splitting methods

`DataSet.split`([by])	split: Split the dataset

Output and formatting methods

`DataSet.to_nc`(out[, zip, overwrite])	to_nc: Save a dataset to a named file.
`DataSet.to_xarray`([decode_times])	to_xarray: Open a dataset as an xarray object
`DataSet.to_dataframe`([decode_times, drop_bnds])	to_dataframe: Convert a dataset to a pandas data frame
`DataSet.zip`()	zip: Zip the dataset
`DataSet.format`([ext])	format: Change the netCDF format of a dataset.

Miscellaneous methods

`DataSet.na_count`([over, align, window])	na_count: Calculate the number of missing values.
`DataSet.na_frac`([over, align, window])	na_frac: Calculate the fraction of missing values in each grid cell across all time steps.
`DataSet.distribute`([m, n])	distribute: Split the dataset into multiple evenly sized horizontal and vertical new files
`DataSet.collect`()	Collect a dataset that has been split using distribute
`DataSet.cell_area`([join])	cell_area: Calculate the area of grid cells.
`DataSet.first_above`([x])	first_above: Identify the time step when a value is first above a threshold.
`DataSet.first_below`([x])	first_below: Identify the time step when a value is first below a threshold. This will do the comparison with either a number, a Dataset or a netCDF file.
`DataSet.last_above`([x])	last_above: Identify the final time step when a value is above a threshold. This will do the comparison with either a number, a Dataset or a netCDF file.
`DataSet.last_below`([x])	last_below: Identify the last time step when a value is below a threshold. This will do the comparison with either a number, a Dataset or a netCDF file.
`DataSet.cdo_command`([command, ensemble, check])	cdo_command: Apply a cdo command
`DataSet.nco_command`([command, ensemble])	Apply an nco command
`DataSet.compare`([expression])	Compare all variables to a constant
`DataSet.gt`(x)	Calculate whether a variable in the dataset is greater than that in another file or dataset. This currently only works with single-file datasets.
`DataSet.lt`(x)	Calculate whether a variable in the dataset is less than that in another file or dataset. This currently only works with single-file datasets.
`DataSet.reduce_dims`()	reduce_dims: Reduce dimensions of data
`DataSet.reduce_grid`([mask])	reduce_grid: Reduce the dataset to non-zero locations in a mask
`DataSet.set_precision`(x)	Set the precision in a dataset
`DataSet.check`()	check: Check contents of files for common data problems.
`DataSet.is_corrupt`()	is_corrupt: Check if files are corrupt
`DataSet.fix_nemo_ersem_grid`()	A quick hack to change the grid file in North West European shelf Nemo grids.
`DataSet.set_gridtype`(grid)	Set the grid type.
`DataSet.surface_mask`()	surface_mask: Create a mask identifying the shallowest cell without missing values.
`DataSet.strip_variables`([vars])	strip_variables: Remove any variables, such as bnds etc., from variables.
`DataSet.no_leaps`()	Remove leap years.
`DataSet.as_double`(x)	Set a variable/dimension to double. This is mostly useful for cases when time is stored as an int, but you need a double.
`DataSet.as_type`(x)	Set a variable/dimension to double. This is mostly useful for cases when time is stored as an int, but you need a double.
`DataSet.reset`()	Simple method to fully reset a dataset

Ecological methods

`DataSet.phenology`([var, metric, p])	phenology: Calculate phenologies from a dataset