Subsetting data
nctoolkit has many built-in methods for subsetting data. The main method
is subset. This lets you select specific variables, years, months,
seasons and timesteps.
Selecting variables
If you want to select specific variables, you would do the following:
ds.subset(variables = ["var1", "var2"])
If you only want to select one variable, you can do this:
ds.subset(variables = "var1")
Selecting years
If you want to select specific years, you can do the following:
ds.subset(years = [2000, 2001])
Again, if you want a single year the following will work:
ds.subset(years = 2000)
The subset method allows partial matches for its argument names. So if we
want to select the year 2000, the following will work:
ds.subset(year = 2000)
When selecting years, you can also supply a range. So the following will work:
ds.subset(years = range(2000, 2010))
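Keep in mind that Python's range excludes its end value, so range(2000, 2010) covers 2000 through 2009. The following check is plain Python and does not involve nctoolkit:

```python
# range() stops before its end value, so 2010 itself is not included.
years = list(range(2000, 2010))

print(years[0], years[-1], len(years))
```

To include 2010 as well, range(2000, 2011) would be needed.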
Selecting months
You can select months in the same way as years. The following examples will all do the same thing:
ds.subset(months = [1,2,3,4])
ds.subset(months = range(1,5))
ds.subset(mon = [1,2,3,4])
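One way to picture why mon works in place of months is prefix matching on the argument name. The sketch below is a hypothetical illustration only: the helper resolve_keyword and the list FULL_NAMES are assumptions made for this example, not nctoolkit's actual implementation.

```python
# Hypothetical illustration of prefix-based keyword matching.
# This is NOT nctoolkit's code; the names below are assumptions.
FULL_NAMES = ["variables", "years", "months", "seasons", "times", "lon", "lat"]


def resolve_keyword(key):
    """Return the single full name that starts with `key`, or raise ValueError."""
    matches = [name for name in FULL_NAMES if name.startswith(key)]
    if len(matches) != 1:
        raise ValueError(f"{key!r} is ambiguous or unknown: {matches}")
    return matches[0]


print(resolve_keyword("mon"))
print(resolve_keyword("year"))
```

Under this scheme a prefix such as "l" would be rejected, since it could mean either lon or lat.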
Selecting seasons
You can easily select seasons. For example, if you want to select winter, you would do the following:
ds.subset(season = "DJF")
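"DJF" stands for December, January and February, which is winter in the northern hemisphere. The mapping below spells out the conventional three-letter season codes in plain Python. It is an assumption that the other codes are accepted in the same way as "DJF", so check them against your installed version.

```python
# Conventional three-letter season codes and their calendar months.
# Illustrative only: this dictionary is not part of nctoolkit.
SEASON_MONTHS = {
    "DJF": [12, 1, 2],
    "MAM": [3, 4, 5],
    "JJA": [6, 7, 8],
    "SON": [9, 10, 11],
}

winter_months = SEASON_MONTHS["DJF"]
print(winter_months)
```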
Selecting timesteps
You can select specific timesteps from a dataset in a similar manner. Timesteps are counted from zero. For example, if you want to select the first two timesteps in a dataset, either of the following will work:
ds.subset(time = [0,1])
ds.subset(time = range(0,2))
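The indices used above are zero-based, so 0 is the first timestep and 1 is the second. A plain-Python sketch of the same indexing, using a made-up list as a stand-in for a dataset's time axis:

```python
# Stand-in for a dataset's time axis; the dates are made up for illustration.
timesteps = ["2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01"]

# Zero-based indices: 0 is the first timestep, 1 is the second.
wanted = range(0, 2)
selected = [timesteps[i] for i in wanted]
print(selected)
```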
Geographic subsetting
If you want to select a geographic subregion of a dataset, you can also use subset. This will select all data within a specific longitude/latitude box. You just need to supply the minimum and maximum longitude and latitude required. In the example below, a dataset is cropped to longitudes between -80 and 90 and latitudes between 50 and 80:
ds.subset(lon = [-80, 90], lat = [50, 80])
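The effect of a longitude/latitude box can be sketched in plain Python. The helper in_box and the sample points are hypothetical, and treating the box edges as inclusive is an assumption made for this example rather than documented nctoolkit behaviour.

```python
def in_box(lon, lat, lon_range, lat_range):
    """True if the point lies inside the box (edges inclusive by assumption)."""
    return (lon_range[0] <= lon <= lon_range[1]
            and lat_range[0] <= lat <= lat_range[1])


# Made-up grid points as (lon, lat) pairs.
points = [(-100, 60), (-80, 50), (0, 65), (90, 80), (10, 40)]

# Keep only the points that fall inside the box used in the example above.
kept = [p for p in points if in_box(*p, lon_range=[-80, 90], lat_range=[50, 80])]
print(kept)
```

Here (-100, 60) is dropped because its longitude lies west of the box, and (10, 40) is dropped because its latitude lies south of it.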