API Reference

Session options

options(**kwargs)

Define session options.

Reading/copying data

open_data([x, suppress_messages, checks])

Read NetCDF data as a DataSet object

DataSet.copy(self)

Make a deep copy of a DataSet object

Merging or analyzing multiple datasets

merge(*datasets[, match])

Merge datasets

cor_time([x, y])

Calculate the temporal correlation coefficient between two datasets. The coefficient is calculated in each grid cell.

cor_space([x, y])

Calculate the spatial correlation coefficient between two datasets. The coefficient is calculated for each time step.
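The difference between the two correlation functions above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the `pearson` helper, the toy arrays and the choice of the Pearson coefficient are all assumptions made for illustration.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

# Toy data laid out as data[time][cell]: three time steps, two grid cells.
x = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0]]
y = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]

# Temporal correlation: one coefficient per grid cell, computed across time.
n_cells = len(x[0])
temporal = [
    pearson([row[c] for row in x], [row[c] for row in y])
    for c in range(n_cells)
]

# Spatial correlation: one coefficient per time step, computed across cells.
spatial = [pearson(x_row, y_row) for x_row, y_row in zip(x, y)]

print(temporal)
print(spatial)
```

The temporal result has one value per grid cell, while the spatial result has one value per time step.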

Adding file(s) to a dataset

append

Accessing attributes

DataSet.variables

List variables contained in a dataset

DataSet.years

List years contained in a dataset

DataSet.months

List months contained in a dataset

DataSet.times

List times contained in a dataset

DataSet.levels

List levels contained in a dataset

DataSet.size

The size of an object. This will print the number of files, total size, and smallest and largest files in a DataSet object.

DataSet.current

The current file or files in the DataSet object

DataSet.history

The history of operations on the DataSet

DataSet.start

The starting file or files of the DataSet object

Plotting

DataSet.plot(self[, log, vars, panel])

Autoplotting method.

DataSet.view(self)

Open the current dataset’s file in ncview

Variable modification

DataSet.mutate(self[, operations])

Create new variables using mathematical expressions, and keep original variables

DataSet.transmute(self[, operations])

Create new variables using mathematical expressions, and drop original variables
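The keep-versus-drop distinction between the two methods above can be pictured with plain dictionaries. This is a conceptual sketch only: the variable names and the Kelvin-to-Celsius expression are hypothetical, and the format the library expects for `operations` is not stated in this reference.

```python
# A dataset pictured as a mapping from variable name to values (hypothetical).
variables = {"sst": [280.0, 285.0]}

# A new variable derived from an existing one by a mathematical expression.
derived = {"sst_c": [value - 273.15 for value in variables["sst"]]}

# mutate-like behaviour: new variables are added, originals are kept.
mutated = {**variables, **derived}

# transmute-like behaviour: only the new variables remain.
transmuted = dict(derived)

print(sorted(mutated))
print(sorted(transmuted))
```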

DataSet.rename(self, newnames)

Rename variables in a dataset

DataSet.set_missing(self[, value])

Set the missing value for a single number or a range

DataSet.sum_all(self[, drop])

Calculate the sum of all variables for each time step

NetCDF file attribute modification

DataSet.set_longnames(self[, name_dict])

Set the long names of variables

DataSet.set_units(self[, unit_dict])

Set the units for variables

Vertical/level methods

DataSet.surface(self)

Extract the top/surface level from a dataset. This extracts the first vertical level from each file in a dataset.

DataSet.bottom(self)

Extract the bottom level from a dataset. This extracts the bottom level from each NetCDF file.

DataSet.vertical_interp(self[, levels])

Vertically interpolate a dataset based on given vertical levels. This is calculated for each time step and grid cell.

DataSet.vertical_mean(self)

Calculate the depth-averaged mean for each variable. This is calculated for each time step and grid cell.

DataSet.vertical_min(self)

Calculate the vertical minimum of variable values. This is calculated for each time step and grid cell.

DataSet.vertical_max(self)

Calculate the vertical maximum of variable values. This is calculated for each time step and grid cell.

DataSet.vertical_range(self)

Calculate the vertical range of variable values. This is calculated for each time step and grid cell.

DataSet.vertical_sum(self)

Calculate the vertical sum of variable values. This is calculated for each time step and grid cell.

DataSet.vertical_cum_sum(self)

Calculate the vertical cumulative sum of variable values. This is calculated for each time step and grid cell.
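To make the vertical statistics concrete, the sketch below works on a single water column, that is, one grid cell at one time step. The values and the surface-to-bottom ordering are assumptions for illustration, and the code is not the library's implementation.

```python
from itertools import accumulate

# One column of values, assumed to be ordered from the surface downwards.
column = [4.0, 3.0, 2.0, 1.0]

vertical_min = min(column)
vertical_max = max(column)
vertical_range = vertical_max - vertical_min

# A vertical sum collapses the column to a single value ...
vertical_sum = sum(column)

# ... whereas a cumulative sum keeps one running total per level.
vertical_cum_sum = list(accumulate(column))

print(vertical_sum, vertical_cum_sum)
```

A vertical sum returns one value per column, while a cumulative sum keeps the number of levels unchanged.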

DataSet.invert_levels(self)

Invert the levels of 3D variables. This is done for each time step and grid cell.

DataSet.bottom_mask(self)

Create a mask identifying the deepest cell without missing values.

Rolling methods

DataSet.rolling_mean(self[, window])

Calculate a rolling mean based on a window

DataSet.rolling_min(self[, window])

Calculate a rolling minimum based on a window

DataSet.rolling_max(self[, window])

Calculate a rolling maximum based on a window

DataSet.rolling_sum(self[, window])

Calculate a rolling sum based on a window

DataSet.rolling_range(self[, window])

Calculate a rolling range based on a window
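The rolling methods above all apply a statistic to successive windows of time steps. A minimal pure-Python sketch of that idea follows. The `rolling` helper is hypothetical, and how the library aligns the window or treats the ends of the series is not stated here, so the sketch simply returns one value per complete window.

```python
def rolling(values, window, stat):
    """Apply `stat` to every run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        stat(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]

def mean(chunk):
    return sum(chunk) / len(chunk)

def value_range(chunk):
    return max(chunk) - min(chunk)

series = [1.0, 3.0, 5.0, 2.0, 4.0]

rolling_mean = rolling(series, 3, mean)
rolling_min = rolling(series, 3, min)
rolling_max = rolling(series, 3, max)
rolling_sum = rolling(series, 3, sum)
rolling_range = rolling(series, 3, value_range)

print(rolling_mean)
```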

Evaluation setting

DataSet.run(self)

Run all stored commands in a dataset

Cleaning functions


Ensemble creation

create_ensemble([path, var, recursive])

Generate an ensemble

Arithmetic methods

DataSet.mutate(self[, operations])

Create new variables using mathematical expressions, and keep original variables

DataSet.transmute(self[, operations])

Create new variables using mathematical expressions, and drop original variables

DataSet.add(self[, x, var])

Add to a dataset. This will add a constant, another dataset or a NetCDF file to the dataset.

DataSet.subtract(self[, x, var])

Subtract from a dataset. This will subtract a constant, another dataset or a NetCDF file from the dataset.

DataSet.multiply(self[, x, var])

Multiply a dataset. This will multiply a dataset by a constant, another dataset or a NetCDF file.

DataSet.divide(self[, x, var])

Divide a dataset. This will divide the dataset by a constant, another dataset or a NetCDF file.

Ensemble statistics

DataSet.ensemble_mean(self[, nco, ignore_time])

Calculate an ensemble mean

DataSet.ensemble_min(self[, nco, ignore_time])

Calculate an ensemble minimum

DataSet.ensemble_max(self[, nco, ignore_time])

Calculate an ensemble maximum

DataSet.ensemble_percentile(self[, p])

Calculate an ensemble percentile. This will calculate the percentiles for each time step in the files.

DataSet.ensemble_range(self)

Calculate an ensemble range. The range is calculated for each time step; for example, if each file in the ensemble has 12 months of data, the statistic will be calculated for each month.
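The per-time-step behaviour of the ensemble statistics can be sketched in pure Python. The example assumes every ensemble member has the same number of time steps and represents each file as a simple list for a single grid cell. It is an illustration of the idea, not the library's code.

```python
# Each inner list is one ensemble member (file); entries are time steps.
ensemble = [
    [1.0, 4.0, 7.0],
    [2.0, 6.0, 5.0],
    [3.0, 2.0, 9.0],
]

# zip(*ensemble) groups the values that share a time step across members.
by_step = list(zip(*ensemble))

ensemble_mean = [sum(step) / len(step) for step in by_step]
ensemble_min = [min(step) for step in by_step]
ensemble_max = [max(step) for step in by_step]
ensemble_range = [max(step) - min(step) for step in by_step]

print(ensemble_mean)
print(ensemble_range)
```

Each result has one value per time step, regardless of how many members the ensemble contains.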

Subsetting operations

DataSet.clip(self[, lon, lat, nco])

Clip to a rectangular longitude and latitude box

DataSet.select_variables(self[, vars])

Select variables from a dataset

DataSet.remove_variables(self[, vars])

Remove variables. This will remove the stated variables from files in the dataset.

DataSet.select_years(self[, years])

Select years from a dataset. This method will subset the dataset to only contain years within the list given.

DataSet.select_months(self[, months])

Select months from a dataset. This method will subset the dataset to only contain months within the list given.

DataSet.select_season(self[, season])

Select season from a dataset

DataSet.select_timestep(self[, times])

Select timesteps from a dataset

Time-based methods

DataSet.set_date(self[, year, month, day, …])

Set the date in a dataset. You should only do this if you need to fix or change a dataset that has a single date, not multiple dates.

DataSet.select_months(self[, months])

Select months from a dataset. This method will subset the dataset to only contain months within the list given.

DataSet.select_season(self[, season])

Select season from a dataset

DataSet.select_years(self[, years])

Select years from a dataset. This method will subset the dataset to only contain years within the list given.

DataSet.shift_hours(self[, shift])

Shift times in dataset by a number of hours

DataSet.shift_days(self[, shift])

Shift times in dataset by a number of days
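The effect of shifting times by hours or days can be illustrated with the standard library. The sign convention used here (a positive shift moves times later) is an assumption, as are the example timestamps. The sketch does not call the library.

```python
from datetime import datetime, timedelta

# Hypothetical time axis of a dataset.
times = [datetime(2000, 1, 1, 0), datetime(2000, 1, 1, 12)]

# Shift by a number of hours (negative values move times earlier).
shifted_by_hours = [t + timedelta(hours=-6) for t in times]

# Shift by a number of days.
shifted_by_days = [t + timedelta(days=1) for t in times]

print(shifted_by_hours)
print(shifted_by_days)
```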

Interpolation methods

DataSet.regrid(self[, grid, method])

Regrid a dataset to a target grid

DataSet.to_latlon(self[, lon, lat, res, method])

Regrid a dataset to a regular latlon grid

DataSet.time_interp(self[, start, end, …])

Temporally interpolate variables based on date range and time resolution

Masking methods

DataSet.mask_box(self[, lon, lat])

Mask a lon/lat box

Summary methods

DataSet.annual_anomaly(self[, baseline, …])

Calculate annual anomalies for each variable based on a baseline period. The anomaly is derived by first calculating the climatological annual mean for the given baseline period.

DataSet.monthly_anomaly(self[, baseline])

Calculate monthly anomalies based on a baseline period. The anomaly is derived by first calculating the climatological monthly mean for the given baseline period.
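The baseline approach described in the two anomaly entries above can be sketched as follows. The example assumes an absolute anomaly (value minus baseline mean) and an inclusive baseline given as a first and last year. Both are assumptions, as are the toy records for a single grid cell.

```python
from collections import defaultdict

# Toy records for one grid cell: (year, month, value).
records = [
    (2000, 1, 10.0), (2000, 7, 20.0),
    (2001, 1, 12.0), (2001, 7, 22.0),
    (2002, 1, 16.0), (2002, 7, 24.0),
]
baseline = (2000, 2001)  # inclusive first and last baseline years (assumed)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def in_baseline(year):
    return baseline[0] <= year <= baseline[1]

# Annual anomaly: annual mean minus the climatological annual mean.
by_year = defaultdict(list)
for year, _, value in records:
    by_year[year].append(value)
annual_means = {year: mean(vals) for year, vals in by_year.items()}
baseline_annual = mean(m for y, m in annual_means.items() if in_baseline(y))
annual_anomaly = {y: m - baseline_annual for y, m in annual_means.items()}

# Monthly anomaly: value minus the baseline mean for the same calendar month.
by_month = defaultdict(list)
for year, month, value in records:
    if in_baseline(year):
        by_month[month].append(value)
monthly_clim = {month: mean(vals) for month, vals in by_month.items()}
monthly_anomaly = {(y, m): v - monthly_clim[m] for y, m, v in records}

print(annual_anomaly)
print(monthly_anomaly)
```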

DataSet.phenology(self[, var, metric, p])

Calculate phenologies from a dataset. Each file in an ensemble must cover only a single year and should ideally include every day of that year.

Statistical methods

DataSet.mean(self)

Calculate the temporal mean of all variables

DataSet.min(self)

Calculate the temporal minimum of all variables

DataSet.percentile(self[, p])

Calculate the temporal percentile of all variables

DataSet.max(self)

Calculate the temporal maximum of all variables

DataSet.sum(self)

Calculate the temporal sum of all variables

DataSet.range(self)

Calculate the temporal range of all variables

DataSet.var(self)

Calculate the temporal variance of all variables

DataSet.cum_sum(self)

Calculate the temporal cumulative sum of all variables

DataSet.cor_space(self[, var1, var2])

Calculate the correlation coefficient between two variables in space. This is calculated for each time step.

DataSet.cor_time(self[, var1, var2])

Calculate the correlation coefficient in time between two variables. The correlation is calculated for each grid cell, ignoring missing values.

DataSet.spatial_mean(self)

Calculate the area-weighted spatial mean for all variables. This is performed for each time step.
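Area weighting matters because grid cells on a regular longitude/latitude grid shrink towards the poles. The sketch below approximates cell area by the cosine of latitude, which is an assumption that holds only for a regular lon/lat grid. The `area_weighted_mean` helper is hypothetical, and the library may derive cell areas differently.

```python
from math import cos, radians

def area_weighted_mean(field, lats):
    """Mean of field[lat][lon], weighting each row by cos(latitude)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, lats):
        weight = cos(radians(lat))
        weighted_sum += weight * sum(row)
        weight_total += weight * len(row)
    return weighted_sum / weight_total

# Two latitude rows, two longitudes each, for a single time step.
lats = [0.0, 60.0]
field = [[10.0, 10.0], [40.0, 40.0]]

weighted = area_weighted_mean(field, lats)
unweighted = sum(sum(row) for row in field) / 4

print(weighted, unweighted)
```

Cells at 60 degrees carry half the weight of cells at the equator, so the weighted mean sits closer to the equatorial values than the unweighted mean does.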

DataSet.spatial_min(self)

Calculate the spatial minimum for all variables. This is performed for each time step.

DataSet.spatial_max(self)

Calculate the spatial maximum for all variables. This is performed for each time step.

DataSet.spatial_percentile(self[, p])

Calculate the spatial percentile for all variables. This is performed for each time step.

DataSet.spatial_range(self)

Calculate the spatial range for all variables. This is performed for each time step.

DataSet.spatial_sum(self[, by_area])

Calculate the spatial sum for all variables. This is performed for each time step.

DataSet.monthly_mean(self)

Calculate the monthly mean for each year/month combination in files.

DataSet.monthly_min(self)

Calculate the monthly minimum for each year/month combination in files.

DataSet.monthly_max(self)

Calculate the monthly maximum for each year/month combination in files.

DataSet.monthly_range(self)

Calculate the monthly range for each year/month combination in files.

DataSet.daily_mean_climatology(self)

Calculate a daily mean climatology

DataSet.daily_min_climatology(self)

Calculate a daily minimum climatology

DataSet.daily_max_climatology(self)

Calculate a daily maximum climatology

DataSet.daily_range_climatology(self)

Calculate a daily range climatology

DataSet.monthly_mean_climatology(self)

Calculate the monthly mean climatologies. Defined as the mean value in each month across all years.

DataSet.monthly_min_climatology(self)

Calculate the monthly minimum climatologies. Defined as the minimum value in each month across all years.

DataSet.monthly_max_climatology(self)

Calculate the monthly maximum climatologies. Defined as the maximum value in each month across all years.

DataSet.monthly_range_climatology(self)

Calculate the monthly range climatologies. Defined as the range of values in each month across all years.

DataSet.annual_mean(self)

Calculate the annual mean for each variable

DataSet.annual_min(self)

Calculate the annual minimum for each variable

DataSet.annual_max(self)

Calculate the annual maximum for each variable

DataSet.annual_range(self)

Calculate the annual range for each variable

DataSet.seasonal_mean(self)

Calculate the seasonal mean for each year.

DataSet.seasonal_min(self)

Calculate the seasonal minimum for each year.

DataSet.seasonal_max(self)

Calculate the seasonal maximum for each year.

DataSet.seasonal_range(self)

Calculate the seasonal range for each year.

DataSet.seasonal_mean_climatology(self)

Calculate a climatological seasonal mean

DataSet.seasonal_min_climatology(self)

Calculate a climatological seasonal minimum. This is defined as the minimum value in each season across all years.

DataSet.seasonal_max_climatology(self)

Calculate a climatological seasonal maximum. This is defined as the maximum value in each season across all years.

DataSet.seasonal_range_climatology(self)

Calculate a climatological seasonal range. This is defined as the range of values in each season across all years.

DataSet.zonal_mean(self)

Calculate the zonal mean for each year/month combination in files.

DataSet.zonal_min(self)

Calculate the zonal minimum for each year/month combination in files.

DataSet.zonal_max(self)

Calculate the zonal maximum for each year/month combination in files.

DataSet.zonal_range(self)

Calculate the zonal range for each year/month combination in files.

Seasonal methods

DataSet.seasonal_mean(self)

Calculate the seasonal mean for each year.

DataSet.seasonal_min(self)

Calculate the seasonal minimum for each year.

DataSet.seasonal_max(self)

Calculate the seasonal maximum for each year.

DataSet.seasonal_range(self)

Calculate the seasonal range for each year.

DataSet.seasonal_mean_climatology(self)

Calculate a climatological seasonal mean

DataSet.seasonal_min_climatology(self)

Calculate a climatological seasonal minimum. This is defined as the minimum value in each season across all years.

DataSet.seasonal_max_climatology(self)

Calculate a climatological seasonal maximum. This is defined as the maximum value in each season across all years.

DataSet.seasonal_range_climatology(self)

Calculate a climatological seasonal range. This is defined as the range of values in each season across all years.

DataSet.select_season(self[, season])

Select season from a dataset

Merging methods

DataSet.merge(self[, match])

Merge a multi-file ensemble into a single file. Merging will occur based on the time steps in the first file.

DataSet.merge_time(self)

Time-based merging of a multi-file ensemble into a single file. This method is ideal if you have the same data split over multiple files covering different data sets.

Climatology methods

DataSet.daily_mean_climatology(self)

Calculate a daily mean climatology

DataSet.daily_min_climatology(self)

Calculate a daily minimum climatology

DataSet.daily_max_climatology(self)

Calculate a daily maximum climatology

DataSet.daily_range_climatology(self)

Calculate a daily range climatology

DataSet.monthly_mean_climatology(self)

Calculate the monthly mean climatologies. Defined as the mean value in each month across all years.

DataSet.monthly_min_climatology(self)

Calculate the monthly minimum climatologies. Defined as the minimum value in each month across all years.

DataSet.monthly_max_climatology(self)

Calculate the monthly maximum climatologies. Defined as the maximum value in each month across all years.

DataSet.monthly_range_climatology(self)

Calculate the monthly range climatologies. Defined as the range of values in each month across all years.
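The monthly climatologies above group values by calendar month across all years and then reduce each group. A minimal pure-Python sketch of that grouping follows, using hypothetical records for a single grid cell. It illustrates the definitions and is not the library's implementation.

```python
from collections import defaultdict

# Toy records for one grid cell: (year, month, value).
series = [
    (2000, 1, 3.0), (2000, 2, 8.0),
    (2001, 1, 5.0), (2001, 2, 6.0),
    (2002, 1, 7.0), (2002, 2, 4.0),
]

# Group values by calendar month, ignoring the year.
by_month = defaultdict(list)
for _, month, value in series:
    by_month[month].append(value)

# Reduce each month's values across all years.
climatology = {
    month: {
        "mean": sum(vals) / len(vals),
        "min": min(vals),
        "max": max(vals),
        "range": max(vals) - min(vals),
    }
    for month, vals in sorted(by_month.items())
}

print(climatology)
```

The result has one entry per calendar month, however many years the input covers.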

Splitting methods

DataSet.split(self[, by])

Split the dataset. Each file in the ensemble will be separated into new files based on the splitting argument.

Output methods

DataSet.write_nc(self, out[, zip, overwrite])

Save a dataset to a named file. This will only work with single-file datasets.

DataSet.to_xarray(self[, decode_times])

Open a dataset as an xarray object

DataSet.to_dataframe(self[, decode_times])

Open a dataset as a pandas data frame

DataSet.zip(self)

Zip the dataset. This will compress the files within the dataset.

Miscellaneous methods

DataSet.cell_areas(self[, join])

Calculate the area of grid cells.

DataSet.cdo_command(self[, command])

Apply a cdo command

DataSet.nco_command(self[, command, ensemble])

Apply an nco command

DataSet.compare_all(self[, expression])

Compare all variables to a constant

DataSet.reduce_dims(self)

Reduce dimensions of data. This will remove any dimensions with only one value.

DataSet.reduce_grid(self, mask)

Reduce the dataset to non-zero locations in a mask. The mask is a single-variable dataset or a path to a .nc file.