nctoolkit.DataSet.merge

DataSet.merge(join='variables', match=['year', 'month', 'day'], check=True)

merge: Merge a multi-file ensemble into a single file

Two methods are available: (1) merging files with different variables but the same time steps, and (2) merging files with the same variables but different times.

Parameters:
  • join (str) – This defines the type of merging to carry out. “variables”: this will merge by variable, so that an ensemble with different variables, but the same number of time steps, is merged into a single file. “time”: this will merge files with the same variables, but different times, into a single file with ordered times. join defaults to “variables”, and uses partial matches, so “var” will give variable-based merging.

  • match (list, str) – Optional argument when join = ‘variables’. A list or str stating what must match in the netCDF files. Defaults to year/month/day. This list must be some combination of year/month/day. An error will be thrown if the elements of time in match do not match across all netCDF files. The only exception is if there is a single date file in the ensemble.

  • check (bool) – By default nctoolkit checks whether the files have the same variables etc. Set check to False if you are confident merging will be problem free. If you are unsure whether files have the same variables, set check to True to find out. Note: if you do not explicitly provide check and there are more than 30 files in a dataset, checks will be turned off.
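
The parameter rules above can be illustrated with a small pure-Python sketch. This is not nctoolkit's implementation: `resolve_join` and `resolve_check` are hypothetical helpers, prefix matching is only one assumed reading of "partial matches", and using `None` to mean "check was not explicitly provided" is also an assumption made for illustration.

```python
from typing import Optional


def resolve_join(join: str = "variables") -> str:
    """Hypothetical helper: map a possibly partial join string to a full option."""
    key = join.strip().lower()
    for option in ("variables", "time"):
        # Assumption: a partial match means the input is a prefix of the option.
        if key and option.startswith(key):
            return option
    raise ValueError(f"join must partially match 'variables' or 'time', got {join!r}")


def resolve_check(check: Optional[bool], n_files: int) -> bool:
    """Hypothetical helper: decide whether pre-merge checks would run."""
    if check is None:
        # Not explicitly provided: checks are off for more than 30 files.
        return n_files <= 30
    # An explicit value is always respected.
    return check
```

Under these assumptions, `resolve_join("var")` gives variable-based merging, and an ensemble of 31 files with no explicit check would skip the checks.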

Examples

If you wanted to merge files with the same variables, but different time steps, you would do:

>>> ds.merge(join='time')

If you wanted to merge files with different variables, but the same time steps, you would do:

>>> ds.merge(join='variables')

If you wanted to merge files with different variables and the same time steps, but only needed to ensure that the month in each time step matched, you would do:

>>> ds.merge(join='variables', match='month')

The above may be useful if you have a dataset with monthly data, but some files have the first of the month, and some have the 15th of the month.
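
To see why matching on month alone helps in that situation, here is a minimal sketch using only the standard library. It does not call nctoolkit or read any netCDF files: `time_keys` is a hypothetical helper, and the two date lists stand in for the time steps of two imagined files.

```python
from datetime import date


def time_keys(times, match=("year", "month", "day")):
    """Hypothetical helper: reduce each date to the components named in match."""
    if isinstance(match, str):
        match = [match]
    return [tuple(getattr(t, part) for part in match) for t in times]


# Assumed time steps: one file dated on the 1st of each month, one on the 15th.
file_a = [date(2000, 1, 1), date(2000, 2, 1), date(2000, 3, 1)]
file_b = [date(2000, 1, 15), date(2000, 2, 15), date(2000, 3, 15)]

# Comparing only the month treats the two sets of time steps as equivalent.
same_on_month = time_keys(file_a, "month") == time_keys(file_b, "month")

# Comparing year, month and day does not, because the days differ.
same_on_full_date = time_keys(file_a) == time_keys(file_b)
```

In this sketch the time steps agree when only the month is compared and disagree on the full date, which mirrors the case where the default match would raise an error but match='month' would not.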