Matchups with point data

A common challenge when working with netCDF data is matching gridded values to point observations. This is often difficult because point data is sparse in both space and time and, in the ocean, can be at varying depths. From version 0.4.4 onwards, nctoolkit has a dedicated class, Matchpoint, for dealing with this problem. Here we will provide an overview of how to use it.

Matching data at specific locations

First, we will illustrate how matchpoint works for data at spatial locations and depths. After this we will deal with different times. The data will be ocean nitrate from NOAA’s World Ocean Atlas.

We can download part of it as follows:

[1]:
import nctoolkit as nc
ds = nc.open_thredds('https://data.nodc.noaa.gov/thredds/dodsC/ncei/woa/nitrate/all/1.00/woa18_all_n01_01.nc', checks = False)
ds.crop(lon =  [-40, 20], lat = [40, 70], nco = True)
ds.select(variables = "n_an")
ds.run()
nctoolkit is using Climate Data Operators version 2.0.5

This is a subset of the data covering a large part of the North Atlantic, and it has nitrate values from the sea surface to the sea floor.

[2]:
ds.plot()

Now, let’s say we had the following dataframe of 4 coordinates and depths. How would we identify the nitrate values using nctoolkit?

[3]:
import pandas as pd
df = pd.DataFrame({"lon":[-10, -12, -14, -16], "lat":[45, 50, 53, 55], "depth":[4, 2, 30, 40]})
df
[3]:
   lon  lat  depth
0  -10   45      4
1  -12   50      2
2  -14   53     30
3  -16   55     40

We start by creating a Matchpoint object:

[4]:
matcher = nc.open_matchpoint()

We then need to add the gridded nitrate dataset, and also the depths. We can find the depths as follows:

[5]:
depths = ds.levels
depths
[5]:
[0.0,
 5.0,
 10.0,
 15.0,
 20.0,
 25.0,
 30.0,
 35.0,
 40.0,
 45.0,
 50.0,
 55.0,
 60.0,
 65.0,
 70.0,
 75.0,
 80.0,
 85.0,
 90.0,
 95.0,
 100.0,
 125.0,
 150.0,
 175.0,
 200.0,
 225.0,
 250.0,
 275.0,
 300.0,
 325.0,
 350.0,
 375.0,
 400.0,
 425.0,
 450.0,
 475.0,
 500.0,
 550.0,
 600.0,
 650.0,
 700.0,
 750.0,
 800.0]

Note: to carry out matchups by depth, you will need to provide the depths in an appropriate format. When the depths are the same at all grid points, which is the case here, you can provide a list. When the depths vary by grid point, you will need to provide a dataset or netCDF file with the depths at each point.
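Before matching, it can be useful to confirm that every point lies inside the cropped region and within the range of depth levels. The following is a minimal sketch using only the Python standard library. The helper names `in_range` and `point_is_covered` are hypothetical and not part of nctoolkit, the bounds are copied by hand from the `crop` call and the depth list above, and how nctoolkit itself treats points outside these ranges is not covered here.

```python
# Bounds copied from ds.crop(...) and from the first and last entries of ds.levels
LON_RANGE = (-40.0, 20.0)
LAT_RANGE = (40.0, 70.0)
DEPTH_RANGE = (0.0, 800.0)

# Same values as the dataframe above, written as plain records
points = [
    {"lon": -10, "lat": 45, "depth": 4},
    {"lon": -12, "lat": 50, "depth": 2},
    {"lon": -14, "lat": 53, "depth": 30},
    {"lon": -16, "lat": 55, "depth": 40},
]


def in_range(value, bounds):
    """Return True if value lies within the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high


def point_is_covered(point):
    """Hypothetical helper: is the point inside the region and depth range?"""
    return (
        in_range(point["lon"], LON_RANGE)
        and in_range(point["lat"], LAT_RANGE)
        and in_range(point["depth"], DEPTH_RANGE)
    )


# Collect any points that fall outside the gridded data
uncovered = [p for p in points if not point_is_covered(p)]
print(uncovered)
```

With a pandas dataframe, `df.to_dict("records")` produces records in the same form as `points`.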

We can then add the dataset to the matcher, as follows:

[6]:
matcher.add_data(ds, depths = depths)
All variables will be used

Once that is done, we need to add the points to the matcher. You will need to specify the names of the columns that hold longitude and latitude and, if you want to do depth matchups, the name of the depth column.

[7]:
matcher.add_points(df, lon = "lon", lat = "lat", depth = "depth")
Warning: You have not provided year in map
Warning: You have not provided month in map
Warning: You have not provided day in map

Now that the dataset and the points have been added, we can do the matchups, using matchup:

[8]:
matcher.matchup()
Points will be matched for all time steps

This has now found the nitrate values for each location. You can find the matchups as follows:

[9]:
matcher.values
[9]:
   lon  lat  depth      n_an  day  month  year
0  -10   45      4  5.661312   16      1  1958
1  -12   50      2  8.932838   16      1  1958
2  -14   53     30  8.672163   16      1  1958
3  -16   55     40  6.973096   16      1  1958

Spatial matchup approach

The approach taken to matching up data spatially is as follows. First, the data is regridded horizontally to the lon/lat pairs provided, using bilinear interpolation. If depths are provided, the data is then interpolated vertically using 1D interpolation from scipy.
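The two steps can be illustrated with a small, self-contained sketch. This is not nctoolkit's implementation: it uses only the Python standard library in place of the regridding backend and scipy, assumes a regular grid with ascending coordinates and no missing values, and all function names (`bracket`, `bilinear`, `match_point`) and the synthetic field are hypothetical.

```python
from bisect import bisect_right


def bracket(axis, x):
    """Return (i, w) so that x = (1 - w) * axis[i] + w * axis[i + 1].

    Assumes axis is sorted in ascending order and has at least two entries.
    """
    if not axis[0] <= x <= axis[-1]:
        raise ValueError(f"{x} is outside the range of the axis")
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    w = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, w


def bilinear(layer, lons, lats, lon, lat):
    """Bilinear interpolation on one depth layer, where layer[j][i] is the
    value at (lats[j], lons[i])."""
    i, wx = bracket(lons, lon)
    j, wy = bracket(lats, lat)
    south = (1 - wx) * layer[j][i] + wx * layer[j][i + 1]
    north = (1 - wx) * layer[j + 1][i] + wx * layer[j + 1][i + 1]
    return (1 - wy) * south + wy * north


def match_point(field, lons, lats, depths, lon, lat, depth):
    """Step 1: interpolate every layer horizontally to (lon, lat).
    Step 2: interpolate the resulting profile linearly to the point depth."""
    profile = [bilinear(layer, lons, lats, lon, lat) for layer in field]
    k, wz = bracket(depths, depth)
    return (1 - wz) * profile[k] + wz * profile[k + 1]


# Synthetic field for illustration only: value = lon + 2 * lat + 3 * depth
lons = [-11.0, -10.0, -9.0]
lats = [44.0, 45.0, 46.0]
depths = [0.0, 5.0, 10.0]
field = [[[x + 2 * y + 3 * z for x in lons] for y in lats] for z in depths]

value = match_point(field, lons, lats, depths, lon=-10.5, lat=45.25, depth=4.0)
print(value)
```

Because the synthetic field is linear in each coordinate, the interpolated value should agree with the formula evaluated directly at the point, which makes the sketch easy to check. Real ocean data contains missing values over land and below the sea floor, which this sketch does not handle.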

Spatiotemporal matchups

In progress….