This is a basic demonstration of the core features of xopr for loading and plotting radar data.
%load_ext autoreload
%autoreload 2
import numpy as np
import xarray as xr
import hvplot.xarray
import geoviews as gv
import geoviews.feature as gf
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import xopr
hvplot.extension('bokeh')
You’ll first establish an OPR session. This object serves to contain any needed information about how to connect to OPR and to the STAC API.
Generally, you can just use opr = xopr.OPRConnection()
, but you may want to customize other options. The most common would be to create a local radar cache, which you can do as shown in the cell below.
If you specify a cache_dir
, then fsspec will automatically manage a cache of any radar data that you need to download. This makes re-running things fast.
# Establish an OPR session
# You'll probably want to set a cache directory if you're running this locally to speed
# up subsequent requests. You can do other things like customize the STAC API endpoint,
# but you shouldn't need to do that for most use cases.
opr = xopr.OPRConnection(cache_dir="radar_cache")
# Or you can open a connection without a cache directory (for example, if you're parallelizing
# this on a cloud cluster without persistent storage).
#opr = xopr.OPRConnection()
In the STAC catalog, every season (an entity such as 2022_Antarctica_BaslerMKB
) is a distinct collection. You can list the available collections.
# List the available OPR datasets
collections = opr.get_collections()
print([c['id'] for c in collections])
selected_collection = '2022_Antarctica_BaslerMKB' # Select a collection for demonstration
print(f"Selected collection: {selected_collection}")
['2016_Antarctica_DC8', '2017_Antarctica_Basler', '2017_Antarctica_P3', '2018_Antarctica_DC8', '2019_Antarctica_GV', '2022_Antarctica_BaslerMKB', '2023_Antarctica_BaslerMKB']
Selected collection: 2022_Antarctica_BaslerMKB
Similarly, you can list available flights for a given season. When we add a season into the STAC catalog, we add every flight for which a CSARP_standard
product is available. (And we also link to other data products, if those are available.)
# List flights in the selected collection
flights = opr.get_flights(selected_collection)
print(f"Found {len(flights)} flights in collection {selected_collection}")
print(f"The first 3 flights are: {[f['flight_id'] for f in flights[:3]]}")
Found 23 flights in collection 2022_Antarctica_BaslerMKB
The first 3 flights are: ['20221210_01', '20221212_01', '20221213_01']
Once you pick a flight, we can actually start loading data. How much data this needs to transfer will vary depending on both how long the flight is and how the underlying data is stored.
We’re working on migrating OPR data files to be cloud-optimized, however most of them aren’t and some of them are old-school MATLAB v5 files. xopr is designed to hide these differences from you as much as possible, but that can only go so far.
selected_flight = '20230109_01' # Or from the list of flights like this: flights[0]['flight_id']
print(f"Selected flight: {selected_flight}")
stac_items = opr.query_frames(seasons=[selected_collection], flight_ids=[selected_flight])
frames = opr.load_frames(stac_items)
Selected flight: 20230109_01
Let’s look a single frame. This corresponds to a single .mat
file that you might download from the OPR website. The structure of this should look familiar to you.
If you’ve tried directly loading one of these files in Python, you’ll probably know that there are some quirks. We try to handle those behind the scenes and give you a nicely-formatted xarray Dataset.
If you’ve never heard of xarray, this might be a good time to go read the xarray overview doc.
# Inspect an individual frame
frames[0].xopr
Merging frames together to get a full flight line is easy with built-in xarray functions.
from xopr.util import merge_dicts_no_conflicts
# Combine the frames into a single xarray Dataset representing the flight line
flight_line = xr.concat(frames, dim='slow_time', combine_attrs=merge_dicts_no_conflicts).sortby('slow_time')
flight_line.xopr
This is an example of one way to make a plot of where this flight line is. This plot is Bokeh-based, so it’s fully interactive. Use the tools on the right to pan and zoom.
# Plot a map of where the data was collected with a basemap on an EPSG:3031 projection
proj = ccrs.Stereographic(central_latitude=-90, true_scale_latitude=-71)
coords = flight_line[['Longitude', 'Latitude']].to_dataframe().dropna()
line_plot = gv.Points([coords.values]).opts(size=1)
start_point = gv.Points([coords.iloc[0].values]).opts(color='green', size=5, projection=proj)
(gf.ocean * gf.coastline * line_plot * start_point).opts(projection=proj, aspect='equal')
Another common radar operation is stacking. Depending on our goals, we might want to do that in one of a couple of ways.
If we want to stack every 10 traces, we can do something like this:
stacked = flight_line.Data.rolling(slow_time=10, center=True).mean()
If our goal is to display a radargram, it might be more useful to average over fixed-time windows, which we can do like this:
stacked = flight_line.resample(slow_time='2s').mean()
The advantage of the latter approach is that you end up with a uniform spacing in the slow_time
dimension. This allows for more efficient plotting (imshow
can only be used with fixed spacing -- try pcolormesh
if your spacing is variable).
#stacked = flight_line.Data.rolling(slow_time=10, center=True).mean()
stacked = flight_line.resample(slow_time='2s').mean()
pwr_dB = 10*np.log10(np.abs(stacked.Data))
OPR data often also includes traced layers (surface, bed, and occasionally internal layers). There are two distinct formats that OPR uses, which you may be familiar with from the CReSIS imb.picker
tool. One is a database that stores layer picks. The other is a layer file that is available as a separate data product.
xopr allows you to fetch the relevant layer information from either source. In the cell below, we first try to load from the OPS database. If that fails (because there is no data for the currently selected flight) then we load the layer files. In the background, the former works by querying the OPS API and the latter uses thes STAC catalog.
Once again, we do our best to hide the different formats. Once you’ve loaded the layers either way, you’ll get a dictionary mapping layer IDs to xarray Datasets of the same basic structure.
stacked.xopr
layers = None
try:
layers = opr.get_layers_db(stacked) # Fetch layers from the database
except Exception as e:
print(f"Error fetching layers: {e}")
print("Trying to load layers from file instead...")
layers = opr.get_layers_files(stacked)
{'status': 2, 'data': 'WARNING: NO LAYER POINTS FOUND FOR THE GIVEN PARAMETERS.'}
Error fetching layers: Failed to fetch layer points. Received response with status 2.
Trying to load layers from file instead...
/home/runner/work/xopr/xopr/src/xopr/opr_access.py:970: FutureWarning: In a future version of xarray the default value for data_vars will change from data_vars='all' to data_vars=None. This is likely to lead to different results when multiple datasets have matching variables with overlapping values. To opt in to new defaults and get rid of these warnings now use `set_options(use_new_combine_kwarg_defaults=True) or set data_vars explicitly.
layers_flight = xr.concat(layer_frames, dim='slow_time', combine_attrs='drop_conflicts')
layers[1] # Display the surface layer as an example
Finally, let’s make a radargram. If we were successful in loading layers, we will also plot the surface and bed layers here.
fig, ax = plt.subplots(figsize=(15, 4))
pwr_dB.plot.imshow(x='slow_time', cmap='gray', ax=ax)
ax.invert_yaxis()
if layers:
layers[1]['twtt'].plot(ax=ax, x='slow_time', linestyle=':', label='Surface')
layers[2]['twtt'].plot(ax=ax, x='slow_time', linestyle='--', label='Bed')
ax.legend()

OPR data comes from many sources. Collecting and processing that data has required the contributions of many people over multiple decades. We want to make it easy for you, as a user, to figure out how to appropriately cite the data you’ve used. This is still just an early prototype, but here’s an idea of what it might look like to generate a report on how to cite your data:
print(opr.generate_citation(flight_line))
== Data Citation ==
This data was collected by The University of Texas at Austin.
Please cite the dataset DOI: https://doi.org/10.18738/T8/J38CO5
Please include the following funder acknowledgment:
This work was supported by the Center for Oldest Ice Exploration, an NSF Science and Technology Center (NSF 2019719) and the G. Unger Vetlesen Foundation.
== Processing Citation ==
Data was processed using the Open Polar Radar (OPR) Toolbox: https://doi.org/10.5281/zenodo.5683959
Please cite the OPR Toolbox as:
Open Polar Radar. (2024). opr (Version 3.0.1) [Computer software]. https://gitlab.com/openpolarradar/opr/. https://doi.org/10.5281/zenodo.5683959
And include the following acknowledgment:
We acknowledge the use of software from Open Polar Radar generated with support from the University of Kansas, NASA grants 80NSSC20K1242 and 80NSSC21K0753, and NSF grants OPP-2027615, OPP-2019719, OPP-1739003, IIS-1838230, RISE-2126503, RISE-2127606, and RISE-2126468.