Data Parsing Functions
gcmprocpy provides a range of functions for data extraction and manipulation. Below are the key data parsing routines along with their detailed parameters and usage examples.
Note
For live examples with output, see the Data Exploration and Data Extraction notebooks.
Data Containers
These dataclasses are used throughout gcmprocpy to hold dataset metadata and extracted plot data.
- class gcmprocpy.containers.PlotData(values: numpy.ndarray, variable_unit: str, variable_long_name: str, model: str, filename: str, levs: numpy.ndarray | None = None, lats: numpy.ndarray | None = None, lons: numpy.ndarray | None = None, mtime: list | None = None, mtime_values: list | None = None, selected_lat: float | None = None, selected_lon: float | None = None, selected_lev: float | None = None)[source]
Container for data returned by arr_* functions when plot_mode=True.
- values
The extracted variable values (numpy array).
- Type:
numpy.ndarray
- variable_unit
The unit string after any conversion.
- Type:
str
- variable_long_name
The long descriptive name of the variable.
- Type:
str
- model
The model type (‘TIE-GCM’ or ‘WACCM-X’).
- Type:
str
- filename
The source dataset filename.
- Type:
str
- levs
Level/ilevel coordinate array (if applicable).
- Type:
numpy.ndarray
- lats
Latitude coordinate array (if applicable).
- Type:
numpy.ndarray
- lons
Longitude coordinate array (if applicable).
- Type:
numpy.ndarray
- mtime
Single model time as [day, hour, min, sec] (for single-time plots).
- Type:
list
- mtime_values
List of model times (for multi-time plots like lev_time, lat_time).
- Type:
list
- selected_lat
The latitude value used for selection (if applicable).
- Type:
float
- selected_lon
The longitude value used for selection (if applicable).
- Type:
float
- selected_lev
The level value used for selection (if applicable).
- Type:
float
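The optional fields above are only populated when they apply, so consumer code usually has to check for None. The following is a minimal sketch of that pattern, not the library's own class: PlotDataSketch is a hypothetical stand-in carrying a subset of the documented fields, make_title is an illustrative helper, and the long name and filename are made-up sample values.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PlotDataSketch:
    """Hypothetical stand-in with a subset of the documented PlotData fields."""
    values: Sequence[float]
    variable_unit: str
    variable_long_name: str
    model: str
    filename: str
    mtime: Optional[list] = None
    selected_lat: Optional[float] = None
    selected_lon: Optional[float] = None
    selected_lev: Optional[float] = None


def make_title(pd: PlotDataSketch) -> str:
    """Build a plot title, skipping any selection that was not applied."""
    parts = [f"{pd.variable_long_name} ({pd.variable_unit})", pd.model]
    if pd.selected_lev is not None:
        parts.append(f"lev={pd.selected_lev}")
    if pd.selected_lat is not None:
        parts.append(f"lat={pd.selected_lat}")
    if pd.selected_lon is not None:
        parts.append(f"lon={pd.selected_lon}")
    if pd.mtime is not None:
        day, hour, minute, sec = pd.mtime
        parts.append(f"day {day} {hour:02d}:{minute:02d}:{sec:02d}")
    return " | ".join(parts)


example = PlotDataSketch(
    values=[900.0, 950.0],
    variable_unit="K",
    variable_long_name="Neutral Temperature",  # illustrative value
    model="TIE-GCM",
    filename="example.nc",                     # illustrative value
    mtime=[360, 0, 0, 0],
    selected_lev=4.0,
)
title = make_title(example)
```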
Model Defaults
MODEL_DEFAULTS is a dictionary containing model-specific default variable names,
species mappings, wind scale factors, and color scheme configurations for TIE-GCM and WACCM-X.
- gcmprocpy.containers.MODEL_DEFAULTS = {'TIE-GCM': {'density': {'cmap': 'viridis', 'line_color': 'white', 'vars': ['NE', 'DEN', 'O2', 'O1', 'N2', 'NO', 'N4S', 'HE', 'OP', 'NMF2', 'TEC']}, 'electric': {'cmap': 'bwr', 'line_color': 'black', 'vars': ['POTEN']}, 'electron_density': 'NE', 'species': {'co2': 'CO2', 'h': 'H', 'ho2': 'HO2', 'hox': 'HOX', 'n2': 'N2', 'no': 'NO', 'no2': 'NO2', 'noz': 'NOZ', 'o': 'O1', 'o2': 'O2', 'o3': 'O3', 'oh': 'OH', 'ox': 'OX', 'temp': 'TN'}, 'temperature': 'TN', 'temperature_type': {'cmap': 'inferno', 'line_color': 'white', 'vars': ['TN', 'TE', 'TI', 'QJOULE']}, 'wind': {'cmap': 'bwr', 'line_color': 'black', 'vars': ['UN', 'VN', 'WN', 'UI_ExB', 'VI_ExB', 'WI_ExB']}, 'wind_scale': 0.01, 'wind_u': 'UN', 'wind_v': 'VN', 'wind_w': 'WN'}, 'WACCM-X': {'density': {'cmap': 'viridis', 'line_color': 'white', 'vars': ['EDens', 'OpDens', 'O2p', 'NOp', 'N2p', 'Op', 'ElecColDens', 'O3', 'NO', 'NO2', 'N2O', 'CO', 'CO2', 'CH4', 'H2O', 'HE', 'O', 'O2', 'N2', 'HNO3', 'NOY', 'CLOY', 'BROY']}, 'electric': {'cmap': 'bwr', 'line_color': 'black', 'vars': ['ED1', 'ED2', 'POTEN']}, 'electron_density': 'EDens', 'radiation': {'cmap': 'plasma', 'line_color': 'white', 'vars': ['FSDS', 'FSNS', 'FSNT', 'FLDS', 'FLNS', 'FLNT', 'FLUT', 'QRL_TOT', 'QRS_TOT', 'QRS_EUV', 'QRS_AUR', 'QTHERMAL', 'SWCF', 'LWCF']}, 'species': {'co2': 'CO2', 'h': 'H', 'ho2': 'HO2', 'hox': 'HOX', 'n2': 'N2', 'no': 'NO', 'no2': 'NO2', 'noz': 'NOZ', 'o': 'O', 'o2': 'O2', 'o3': 'O3', 'oh': 'OH', 'ox': 'OX', 'temp': 'T'}, 'temperature': 'T', 'temperature_type': {'cmap': 'inferno', 'line_color': 'white', 'vars': ['T', 'TREFHT', 'THETA']}, 'wind': {'cmap': 'bwr', 'line_color': 'black', 'vars': ['U', 'V', 'OMEGA', 'UTGW_TOTAL', 'VTGW_TOTAL']}, 'wind_scale': 1.0, 'wind_u': 'U', 'wind_v': 'V', 'wind_w': 'OMEGA'}}
- Example:
Access default wind variable names for a model.
    from gcmprocpy import MODEL_DEFAULTS

    # TIE-GCM wind variables
    print(MODEL_DEFAULTS['TIE-GCM']['wind_u'])  # 'UN'
    print(MODEL_DEFAULTS['TIE-GCM']['wind_v'])  # 'VN'

    # WACCM-X wind variables
    print(MODEL_DEFAULTS['WACCM-X']['wind_u'])  # 'U'
    print(MODEL_DEFAULTS['WACCM-X']['wind_v'])  # 'V'

    # Species name mapping
    print(MODEL_DEFAULTS['TIE-GCM']['species']['temp'])  # 'TN'
    print(MODEL_DEFAULTS['WACCM-X']['species']['temp'])  # 'T'

    # Wind unit scale factor (cm/s → m/s for TIE-GCM)
    print(MODEL_DEFAULTS['TIE-GCM']['wind_scale'])  # 0.01
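As a rough illustration of what the wind_scale entry is for, the sketch below applies it to raw wind values so that both models end up in m/s. WIND_DEFAULTS is a trimmed, hand-copied subset of the documented dictionary, and wind_to_m_per_s is a hypothetical helper, not part of gcmprocpy.

```python
# Trimmed copy of the documented defaults; only the keys used here.
WIND_DEFAULTS = {
    'TIE-GCM': {'wind_u': 'UN', 'wind_v': 'VN', 'wind_scale': 0.01},
    'WACCM-X': {'wind_u': 'U', 'wind_v': 'V', 'wind_scale': 1.0},
}


def wind_to_m_per_s(model, raw_values):
    """Scale raw wind values by the model's wind_scale factor (hypothetical helper)."""
    scale = WIND_DEFAULTS[model]['wind_scale']
    return [v * scale for v in raw_values]


# TIE-GCM winds are stored in cm/s, so 2500 cm/s becomes about 25 m/s.
tiegcm_u = wind_to_m_per_s('TIE-GCM', [2500.0, -1200.0])
# WACCM-X winds are already in m/s, so the factor of 1.0 leaves them unchanged.
waccmx_u = wind_to_m_per_s('WACCM-X', [25.0, -12.0])
```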
Species Name Lookup
- gcmprocpy.containers.get_species_names(model)[source]
Return species name mapping for a model type.
Uses MODEL_DEFAULTS as the single source of truth for mapping canonical role names to dataset variable names.
- Parameters:
model (str) – Model type ('TIE-GCM' or 'WACCM-X').
- Returns:
Mapping from canonical names (e.g. 'temp', 'o', 'o2') to dataset variable names (e.g. 'TN', 'O1', 'O2').
- Return type:
dict
- Raises:
ValueError – If model is not recognized.
- Example:
Get species variable names for a specific model.
    from gcmprocpy import get_species_names

    sp = get_species_names('TIE-GCM')
    print(sp['temp'])  # 'TN'
    print(sp['o'])     # 'O1'
    print(sp['o2'])    # 'O2'

    sp = get_species_names('WACCM-X')
    print(sp['temp'])  # 'T'
    print(sp['o'])     # 'O'
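The documented behavior (look up the species mapping for a model, raise ValueError for an unknown model) could be sketched roughly as follows. This is an assumption about the shape of the logic, not the library's source: SPECIES_DEFAULTS is a trimmed copy of the documented dictionary and species_names_sketch is a hypothetical name.

```python
# Trimmed copy of the documented MODEL_DEFAULTS; only a few species are kept.
SPECIES_DEFAULTS = {
    'TIE-GCM': {'species': {'temp': 'TN', 'o': 'O1', 'o2': 'O2'}},
    'WACCM-X': {'species': {'temp': 'T', 'o': 'O', 'o2': 'O2'}},
}


def species_names_sketch(model):
    """Return a copy of the species mapping, or raise ValueError for an unknown model."""
    try:
        # Copy so callers cannot mutate the shared defaults by accident.
        return dict(SPECIES_DEFAULTS[model]['species'])
    except KeyError:
        known = sorted(SPECIES_DEFAULTS)
        raise ValueError(f"Unrecognized model {model!r}; expected one of {known}") from None


tiegcm_species = species_names_sketch('TIE-GCM')
waccmx_species = species_names_sketch('WACCM-X')
```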
Data Exploration
Listing Dimensions
This function reads all the datasets and returns the unique dimensions present.
- gcmprocpy.data_parse.dim_list(datasets)[source]
Retrieves a sorted list of unique dimension names across all datasets.
- Parameters:
datasets (list of tuples) – A list of tuples, where each tuple contains an xarray dataset and its filename.
- Returns:
A sorted list of unique dimension names across all datasets.
- Return type:
list
- Example:
Load datasets and list unique dimensions.
    datasets = gy.load_datasets(directory, dataset_filter)
    dims = gy.dim_list(datasets)
    print(dims)
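To make the "list of (dataset, filename) tuples" input shape concrete, here is a minimal sketch of how unique dimension names might be gathered from it. The stand-in objects only mimic a dims mapping of name to size; dim_list_sketch, the filenames, and the sizes are all illustrative assumptions rather than gcmprocpy internals.

```python
from types import SimpleNamespace


def dim_list_sketch(datasets):
    """Collect the unique dimension names across (dataset, filename) pairs."""
    dims = set()
    for ds, _filename in datasets:
        # Iterating a name -> size mapping yields the dimension names.
        dims.update(ds.dims)
    return sorted(dims)


# Stand-ins for loaded datasets; sizes and filenames are made up.
fake_datasets = [
    (SimpleNamespace(dims={'time': 24, 'lev': 57, 'lat': 72, 'lon': 144}), 'a.nc'),
    (SimpleNamespace(dims={'time': 24, 'ilev': 58, 'lat': 72, 'lon': 144}), 'b.nc'),
]
dims = dim_list_sketch(fake_datasets)
```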
Listing Variables
This function reads all the datasets and returns the variables listed in them.
- gcmprocpy.data_parse.var_list(datasets)[source]
Reads all the datasets and returns the variables listed in them.
- Parameters:
datasets (xarray.Dataset) – The loaded dataset opened using xarray.
- Returns:
A sorted list of variable entries in the datasets.
- Return type:
list
- Example:
Load datasets and list unique variables.
    datasets = gy.load_datasets(directory, dataset_filter)
    var_names = gy.var_list(datasets)  # avoid shadowing the built-in vars()
    print(var_names)
Listing Timestamps
This function compiles and returns a list of all timestamps present in the provided datasets.
- gcmprocpy.data_parse.time_list(datasets)[source]
Compiles and returns a list of all timestamps present in the provided datasets. This function is particularly useful for aggregating time data from multiple sources.
- Parameters:
datasets (list of tuples) – Each tuple in the list contains an xarray dataset and its corresponding filename. The function will iterate through each dataset to gather timestamps.
- Returns:
A list containing all the datetime64 timestamps found in the datasets.
- Return type:
list of np.datetime64
- Example:
Load datasets and list unique timestamps.
    datasets = gy.load_datasets(directory, dataset_filter)
    times = gy.time_list(datasets)
    print(times)
Listing Levels
This function reads all the datasets and returns the unique lev and ilev entries in sorted order.
- gcmprocpy.data_parse.level_list(datasets, log_level=True)[source]
Reads all the datasets and returns the unique lev and ilev entries in sorted order.
- Parameters:
datasets (list of tuples) – A list of tuples, where each tuple contains an xarray dataset and its filename.
log_level (bool) – A flag indicating whether to display level in log values. Default is True.
- Returns:
A sorted list of unique lev and ilev entries from the datasets.
- Return type:
list
- Example:
Load datasets and list unique lev and ilev entries.
    datasets = gy.load_datasets(directory, dataset_filter)
    lev_ilevs = gy.level_list(datasets)
    print(lev_ilevs)
Listing Longitudes
This function reads all the datasets and returns the unique longitude (lon) entries in sorted order.
- gcmprocpy.data_parse.lon_list(datasets)[source]
Reads all the datasets and returns the unique longitude (lon) entries in sorted order.
- Parameters:
datasets (list of tuples) – A list of tuples, where each tuple contains an xarray dataset and its filename.
- Returns:
A sorted list of unique longitude entries from the datasets.
- Return type:
list
- Example:
Load datasets and list unique longitude entries.
    datasets = gy.load_datasets(directory, dataset_filter)
    lons = gy.lon_list(datasets)
    print(lons)
Listing Latitudes
This function reads all the datasets and returns the unique latitude (lat) entries in sorted order.
- gcmprocpy.data_parse.lat_list(datasets)[source]
Reads all the datasets and returns the unique latitude (lat) entries in sorted order.
- Parameters:
datasets (list of tuples) – A list of tuples, where each tuple contains an xarray dataset and its filename.
- Returns:
A sorted list of unique latitude entries from the datasets.
- Return type:
list
- Example:
Load datasets and list unique latitude entries.
    datasets = gy.load_datasets(directory, dataset_filter)
    lats = gy.lat_list(datasets)
    print(lats)
Variable Information
This function provides detailed information about a specific variable in the datasets.
- gcmprocpy.data_parse.var_info(datasets, variable_name)[source]
Retrieves the attributes and dimension information of a specified variable from all datasets.
- Parameters:
datasets (list of tuples) – A list of tuples, where each tuple contains an xarray dataset and its filename.
variable_name (str) – The name of the variable to retrieve attributes for.
- Returns:
A dictionary where keys are filenames and values are dictionaries of attributes for the specified variable.
- Return type:
dict
- Example:
Load datasets and get information about a specific variable.
    datasets = gy.load_datasets(directory, dataset_filter)
    info = gy.var_info(datasets, 'variable_name')
    print(info)
Dimension Information
This function provides detailed information about a specific dimension in the datasets.
- gcmprocpy.data_parse.dim_info(datasets, dimension)[source]
Retrieves information about a specified dimension’s size across all datasets.
- Parameters:
datasets (list of tuples) – A list of tuples, where each tuple contains an xarray dataset and its filename.
dimension (str) – The name of the dimension to retrieve information for.
- Returns:
A dictionary where keys are filenames and values are the size of the specified dimension. If the dimension does not exist in a dataset, the value is None.
- Return type:
dict
- Example:
Load datasets and get information about a specific dimension.
    datasets = gy.load_datasets(directory, dataset_filter)
    info = gy.dim_info(datasets, 'dimension_name')
    print(info)
Data Xarrays
Selected Time
This function extracts and processes data for a given variable at a specific time from multiple datasets. It also handles unit conversion and provides additional information if needed for plotting.
- gcmprocpy.data_parse.arr_var(datasets, variable_name, time, selected_unit=None, log_level=True, plot_mode=False)
- Example:
Extract all level data for a variable at a specific time.
    datasets = gy.load_datasets(directory, dataset_filter)
    time_value = '2022-01-01T12:00:00'

    # Get raw xarray DataArray
    data = gy.arr_var(datasets, 'TN', time=time_value)
    print(data.shape)  # (nlev, nlat, nlon)

    # Get PlotData object with metadata
    result = gy.arr_var(datasets, 'TN', time=time_value, plot_mode=True)
    print(result.variable_unit, result.variable_long_name)

    # Using model time (TIE-GCM)
    data = gy.arr_var(datasets, 'TN', mtime=[360, 0, 0, 0])
Selected Time, Level
This function extracts data from the dataset based on the specified variable, time, and level (lev/ilev).
- gcmprocpy.data_parse.arr_lat_lon(datasets, variable_name, time, selected_lev_ilev=None, selected_unit=None, plot_mode=False)
- Example:
Extract a latitude-longitude slice at a specific time and pressure level.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Raw xarray DataArray (lat x lon)
    data = gy.arr_lat_lon(datasets, 'TN', time='2022-01-01T12:00:00', selected_lev_ilev=4.0)
    print(data.shape)  # (nlat, nlon)

    # PlotData object for use with custom plotting
    result = gy.arr_lat_lon(datasets, 'TN', time='2022-01-01T12:00:00',
                            selected_lev_ilev=4.0, plot_mode=True)
    print(result.lats, result.lons, result.values.shape)

    # Using model time (TIE-GCM)
    data = gy.arr_lat_lon(datasets, 'TN', mtime=[360, 0, 0, 0], selected_lev_ilev=4.0)

    # Specify level as height in km
    data = gy.arr_lat_lon(datasets, 'TN', time='2022-01-01T12:00:00',
                          selected_lev_ilev=300.0, level_type='height')
Batch Selected Time, Level (Multiple Variables)
This function extracts multiple variables at once for a given time and level, reducing redundant dataset traversal.
- gcmprocpy.data_parse.batch_arr_lat_lon(datasets, variable_names, time, selected_lev_ilev=None, selected_unit=None, plot_mode=False)
- Example:
Load datasets and extract multiple variables in a single pass.
    datasets = gy.load_datasets(directory, dataset_filter)
    time_value = '2022-01-01T12:00:00'

    results = gy.batch_arr_lat_lon(datasets, ['TN', 'O1', 'NO'], time=time_value,
                                   selected_lev_ilev=4.0, plot_mode=True)
    for name, result in results.items():
        print(f'{name}: {result.values.shape}')
Selected Time, Latitude, Longitude
This function extracts data from the dataset for a given variable name, latitude, longitude, and time.
- gcmprocpy.data_parse.arr_lev_var(datasets, variable_name, time, selected_lat, selected_lon, selected_unit=None, log_level=True, plot_mode=False)
- Example:
Extract a vertical profile at a specific latitude, longitude, and time.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Raw xarray DataArray (1D array of values at each level)
    data = gy.arr_lev_var(datasets, 'TN', latitude=30.0, time='2022-01-01T12:00:00',
                          longitude=45.0)

    # PlotData object with level information
    result = gy.arr_lev_var(datasets, 'TN', latitude=30.0, time='2022-01-01T12:00:00',
                            longitude=45.0, plot_mode=True)
    print(result.levs, result.values)

    # Using local time instead of longitude
    data = gy.arr_lev_var(datasets, 'TN', latitude=0.0, time='2022-01-01T12:00:00',
                          local_time=12.0)
Variable vs Latitude (Meridional 1D)
This function extracts a 1D meridional profile of a variable along latitude at a fixed pressure level and longitude (or zonal mean).
- gcmprocpy.data_parse.arr_var_lat(datasets, variable_name, time, selected_lev_ilev, selected_lon, selected_unit=None, plot_mode=False)
- Example:
Extract a 1D meridional slice at a specific level, time, and longitude.
    datasets = gy.load_datasets(directory, dataset_filter)

    # PlotData object with 1D values aligned to latitudes
    result = gy.arr_var_lat(datasets, 'TN', time='2022-01-01T12:00:00',
                            selected_lev_ilev=4.0, selected_lon=30.0, plot_mode=True)
    print(result.lats, result.values)

    # Zonal mean across all longitudes
    result = gy.arr_var_lat(datasets, 'TN', time='2022-01-01T12:00:00',
                            selected_lev_ilev=4.0, selected_lon='mean', plot_mode=True)
Variable vs Longitude (Zonal 1D)
This function extracts a 1D zonal profile of a variable along longitude at a fixed pressure level and latitude (or meridional mean).
- gcmprocpy.data_parse.arr_var_lon(datasets, variable_name, time, selected_lev_ilev, selected_lat, selected_unit=None, plot_mode=False)
- Example:
Extract a 1D zonal slice at a specific level, time, and latitude.
    datasets = gy.load_datasets(directory, dataset_filter)

    # PlotData object with 1D values aligned to longitudes
    result = gy.arr_var_lon(datasets, 'TN', time='2022-01-01T12:00:00',
                            selected_lev_ilev=4.0, selected_lat=2.5, plot_mode=True)
    print(result.lons, result.values)

    # Meridional mean across all latitudes
    result = gy.arr_var_lon(datasets, 'TN', time='2022-01-01T12:00:00',
                            selected_lev_ilev=4.0, selected_lat='mean', plot_mode=True)

    # Area-weighted meridional mean (cos-lat); see the note below
    result = gy.arr_var_lon(datasets, 'TN', time='2022-01-01T12:00:00',
                            selected_lev_ilev=4.0, selected_lat='wmean', plot_mode=True)
Note
'mean' vs 'wmean'. Anywhere selected_lat / selected_lon
accepts 'mean' (an unweighted average over that axis), it also accepts
'wmean' for a cos(lat) area-weighted average. Weighting only changes
the result when latitude is the collapsed axis, because cells around a
circle of constant latitude are equal-area. So 'wmean' matters for meridional
and global means, where a plain mean over-weights the poles, and is identical
to 'mean' for a zonal (longitude) mean. For a true global mean use
selected_lat='wmean', selected_lon='mean' in arr_lev_var().
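The difference between the two averages can be checked with a few lines of arithmetic. The sketch below is a standalone illustration of cos(lat) weighting on a made-up meridional profile; plain_mean and cos_lat_mean are hypothetical helpers and do not reproduce how gcmprocpy computes 'wmean' internally.

```python
import math


def plain_mean(values):
    """Unweighted average: every latitude counts equally."""
    return sum(values) / len(values)


def cos_lat_mean(values, lats_deg):
    """Average weighted by cos(latitude), which is proportional to cell area."""
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


lats = [-75.0, -45.0, -15.0, 15.0, 45.0, 75.0]

# A made-up field that grows toward the poles.
polar_heavy = [abs(lat) for lat in lats]
unweighted = plain_mean(polar_heavy)
weighted = cos_lat_mean(polar_heavy, lats)

# A uniform field gives the same answer either way.
uniform = [5.0] * len(lats)
uniform_weighted = cos_lat_mean(uniform, lats)
```

For the pole-heavy field the weighted mean comes out lower than the plain mean, because the high-latitude values are down-weighted.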
Selected Time Latitude
This function extracts and processes data from the dataset based on a specific variable, time, and latitude.
- gcmprocpy.data_parse.arr_lev_lon(datasets, variable_name, time, selected_lat, selected_unit=None, log_level=True, plot_mode=False)
- Example:
Extract a level-longitude cross section at a specific latitude and time.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Raw xarray DataArray (nlev x nlon)
    data = gy.arr_lev_lon(datasets, 'TN', latitude=30.0, time='2022-01-01T12:00:00')
    print(data.shape)

    # PlotData object for custom contour plotting
    result = gy.arr_lev_lon(datasets, 'TN', latitude=30.0, time='2022-01-01T12:00:00',
                            plot_mode=True)
    print(result.levs, result.lons, result.values.shape)
Selected Time, Longitude
This function extracts data from a dataset based on the specified variable name, time, and longitude.
- gcmprocpy.data_parse.arr_lev_lat(datasets, variable_name, time, selected_lon, selected_unit=None, log_level=True, plot_mode=False)
- Example:
Extract a level-latitude cross section at a specific longitude and time.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Raw xarray DataArray (nlev x nlat)
    data = gy.arr_lev_lat(datasets, 'TN', time='2022-01-01T12:00:00', selected_lon=45.0)
    print(data.shape)

    # PlotData object
    result = gy.arr_lev_lat(datasets, 'TN', time='2022-01-01T12:00:00',
                            selected_lon=45.0, plot_mode=True)
    print(result.levs, result.lats, result.values.shape)
Selected Latitude, Longitude Over Time-range
This function extracts and processes data from multiple datasets using data across different levels and times for a given latitude and longitude.
- gcmprocpy.data_parse.arr_lev_time(datasets, variable_name, selected_lat, selected_lon, selected_unit=None, log_level=True, plot_mode=False)
- Example:
Extract a level-time cross section at a specific latitude and longitude.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Raw xarray DataArray (nlev x ntime)
    data = gy.arr_lev_time(datasets, 'TN', latitude=30.0, longitude=45.0)
    print(data.shape)

    # With time range filter
    data = gy.arr_lev_time(datasets, 'TN', latitude=30.0, longitude=45.0,
                           time_minimum='2022-01-01T00:00:00',
                           time_maximum='2022-01-02T00:00:00')

    # PlotData object (model times are held in mtime_values)
    result = gy.arr_lev_time(datasets, 'TN', latitude=30.0, longitude=45.0, plot_mode=True)
    print(result.levs, result.mtime_values, result.values.shape)
Selected Level, Longitude Over Time-range
This function extracts and processes data from the dataset based on the specified variable name, longitude, and level/ilev.
- gcmprocpy.data_parse.arr_lat_time(datasets, variable_name, selected_lon, selected_lev_ilev=None, selected_unit=None, plot_mode=False)
- Example:
Extract a latitude-time cross section at a specific level and longitude.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Raw xarray DataArray (nlat x ntime)
    data = gy.arr_lat_time(datasets, 'TN', selected_lev_ilev=4.0, longitude=45.0)
    print(data.shape)

    # With time range filter
    data = gy.arr_lat_time(datasets, 'TN', selected_lev_ilev=4.0, longitude=45.0,
                           time_minimum='2022-01-01T00:00:00',
                           time_maximum='2022-01-02T00:00:00')

    # PlotData object (model times are held in mtime_values)
    result = gy.arr_lat_time(datasets, 'TN', selected_lev_ilev=4.0, longitude=45.0,
                             plot_mode=True)
    print(result.lats, result.mtime_values, result.values.shape)

    # Specify level as height in km
    data = gy.arr_lat_time(datasets, 'TN', selected_lev_ilev=300.0, longitude=0.0,
                           level_type='height')
Selected Level, Latitude Over Time-range
This function extracts and processes data from the dataset based on the specified variable name, latitude, and level/ilev. Returns a 2D array of longitudes x time.
- gcmprocpy.data_parse.arr_lon_time(datasets, variable_name, selected_lat, selected_lev_ilev=None, selected_unit=None, plot_mode=False)
- Example:
Extract a longitude-time cross section at a specific level and latitude.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Raw xarray DataArray (nlon x ntime)
    data = gy.arr_lon_time(datasets, 'TN', latitude=0.0, selected_lev_ilev=4.0)
    print(data.shape)

    # PlotData object (model times are held in mtime_values)
    result = gy.arr_lon_time(datasets, 'TN', latitude=0.0, selected_lev_ilev=4.0,
                             plot_mode=True)
    print(result.lons, result.mtime_values, result.values.shape)

    # Specify level as height in km
    data = gy.arr_lon_time(datasets, 'TN', latitude=0.0, selected_lev_ilev=250.0,
                           level_type='height')
Selected Level, Latitude, Longitude Over Time-range
This function extracts a 1D time series of a variable at a specific latitude, longitude, and level/ilev.
- gcmprocpy.data_parse.arr_var_time(datasets, variable_name, selected_lat, selected_lon, selected_lev_ilev=None, selected_unit=None, plot_mode=False)
- Example:
Extract a time series at a specific latitude, longitude, and level.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Raw xarray DataArray (1D time series)
    data = gy.arr_var_time(datasets, 'TN', latitude=0.0, longitude=45.0,
                           selected_lev_ilev=4.0)
    print(data.shape)

    # PlotData object (model times are held in mtime_values)
    result = gy.arr_var_time(datasets, 'TN', latitude=0.0, longitude=45.0,
                             selected_lev_ilev=4.0, plot_mode=True)
    print(result.mtime_values, result.values)

    # Specify level as height in km
    data = gy.arr_var_time(datasets, 'TN', latitude=0.0, longitude=45.0,
                           selected_lev_ilev=300.0, level_type='height')
Satellite Track Interpolation
This function interpolates model data along a satellite trajectory using trilinear interpolation (time, latitude, longitude). Input is three arrays of equal length representing the satellite’s position at each point along its orbit.
- gcmprocpy.data_parse.arr_sat_track(datasets, variable_name, sat_time, sat_lat, sat_lon, selected_lev_ilev=None, selected_unit=None, plot_mode=False)[source]
Interpolates model data along a satellite trajectory.
Takes arrays of satellite time/lat/lon points and interpolates the model field to those locations using xarray’s built-in interpolation.
- Parameters:
datasets (list[ModelDataset]) – Loaded model datasets.
variable_name (str) – The name of the variable to extract.
sat_time (array-like) – Satellite timestamps as numpy datetime64 values.
sat_lat (array-like) – Satellite latitudes in degrees.
sat_lon (array-like) – Satellite longitudes in degrees.
selected_lev_ilev (Union[float, str, None]) – Level value to extract at, ‘mean’ to average over all levels, or None to return all levels.
selected_unit (str, optional) – Desired unit for the variable.
plot_mode (bool, optional) – If True, returns a PlotData object.
- Returns:
If selected_lev_ilev is given: 1D array of shape (n_points,). If selected_lev_ilev is None: 2D array of shape (n_levels, n_points). If plot_mode is True, returns a PlotData object.
- Return type:
Union[numpy.ndarray, PlotData]
- Example:
Interpolate temperature along a satellite track.
    import numpy as np

    datasets = gy.load_datasets(directory, dataset_filter)
    times = gy.time_list(datasets)
    sat_time = np.array([times[0] + np.timedelta64(i * 6, 'm') for i in range(20)])
    sat_lat = np.linspace(-60, 60, 20)
    sat_lon = np.linspace(-120, 120, 20)

    # 1D array at a specific level
    values = gy.arr_sat_track(datasets, 'TN', sat_time, sat_lat, sat_lon,
                              selected_lev_ilev=5.0)

    # 2D array (levels x track points)
    values = gy.arr_sat_track(datasets, 'TN', sat_time, sat_lat, sat_lon)

    # PlotData object
    result = gy.arr_sat_track(datasets, 'TN', sat_time, sat_lat, sat_lon,
                              selected_lev_ilev=5.0, plot_mode=True)
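For readers unfamiliar with trilinear interpolation, the following pure-Python sketch shows the idea on a tiny regular grid: each track point is a weighted blend of the eight surrounding (time, lat, lon) grid values. It is only an illustration. The function itself is documented as using xarray's built-in interpolation, and the helpers here (_bracket, trilinear), the numeric time axis in minutes, and the synthetic field are all assumptions. Longitude wraparound is ignored.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return (i, frac) such that x sits at axis[i] + frac * (axis[i+1] - axis[i])."""
    i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
    frac = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, frac


def trilinear(field, times, lats, lons, t, lat, lon):
    """Interpolate field[time][lat][lon] to a single (t, lat, lon) point."""
    it, ft = _bracket(times, t)
    ia, fa = _bracket(lats, lat)
    io, fo = _bracket(lons, lon)
    total = 0.0
    # Blend the eight corners of the enclosing grid cell.
    for dt, wt in ((0, 1 - ft), (1, ft)):
        for da, wa in ((0, 1 - fa), (1, fa)):
            for do, wo in ((0, 1 - fo), (1, fo)):
                total += wt * wa * wo * field[it + dt][ia + da][io + do]
    return total


# Tiny synthetic grid; time is in minutes purely for illustration.
times = [0.0, 60.0]
lats = [-10.0, 10.0]
lons = [0.0, 20.0]
field = [[[t + 2 * la + 3 * lo for lo in lons] for la in lats] for t in times]

# Three equal-length "satellite" arrays, as in the documented input shape.
sat_time = [15.0, 30.0, 45.0]
sat_lat = [-5.0, 0.0, 5.0]
sat_lon = [5.0, 10.0, 15.0]
track_values = [
    trilinear(field, times, lats, lons, t, la, lo)
    for t, la, lo in zip(sat_time, sat_lat, sat_lon)
]
```

Because the synthetic field is linear in all three coordinates, the interpolated values match the underlying formula at each track point, which makes the sketch easy to check by hand.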
Data Utilities
mTime to Time
This function searches for a specific time in a dataset based on the provided model time (mtime) and returns the corresponding np.datetime64 time value. It iterates through multiple datasets to find a match.
- gcmprocpy.data_parse.get_time(datasets, mtime)[source]
Searches for a specific time in a dataset based on the provided model time (mtime) and returns the corresponding np.datetime64 time value. It iterates through multiple datasets to find a match.
- Parameters:
datasets (list[tuple]) – Each tuple contains an xarray dataset and its filename. The function will search each dataset for the time value.
mtime (list[int]) – Model time represented as a list of integers in the format [day, hour, minute].
- Returns:
The corresponding datetime value in the dataset for the given mtime. Returns None if no match is found.
- Return type:
np.datetime64
- Example:
Convert a model time (mtime) to a datetime value.
    datasets = gy.load_datasets(directory, dataset_filter)

    # TIE-GCM model time: [Day, Hour, Min, Sec]
    mtime = [360, 0, 0, 0]
    time = gy.get_time(datasets, mtime)
    print(time)  # np.datetime64 value
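The "search for a match, return None otherwise" behavior described above can be pictured with a small lookup over (mtime, timestamp) pairs. This is only a sketch under assumptions: get_time_sketch is a hypothetical name, the records list stands in for the time and mtime variables of the loaded datasets, and the sample pairs are invented.

```python
from datetime import datetime


def get_time_sketch(records, mtime):
    """Return the timestamp paired with mtime, or None if no record matches."""
    target = list(mtime)
    for model_time, timestamp in records:
        if list(model_time) == target:
            return timestamp
    return None


# Invented (mtime, timestamp) pairs standing in for dataset contents.
records = [
    ([360, 0, 0, 0], datetime(2022, 12, 26, 0, 0, 0)),
    ([360, 1, 0, 0], datetime(2022, 12, 26, 1, 0, 0)),
]
found = get_time_sketch(records, [360, 1, 0, 0])
missing = get_time_sketch(records, [361, 0, 0, 0])
```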
Time to mTime
This function finds and returns the model time (mtime) array that corresponds to a specific time in a dataset. The mtime is an array representing [Day, Hour, Min].
- gcmprocpy.data_parse.get_mtime(ds, time)[source]
Finds and returns the model time (mtime) array that corresponds to a specific time in a dataset. The mtime is an array representing [Day, Hour, Min].
- Parameters:
ds (xarray.Dataset) – The dataset opened using xarray, containing time and mtime data.
time (Union[str, numpy.datetime64]) – The timestamp for which the corresponding mtime is to be found.
- Returns:
The mtime array containing [Day, Hour, Min] for the given timestamp. Returns None if no corresponding mtime is found.
- Return type:
numpy.ndarray
- Example:
Convert a datetime string to model time (mtime).
    datasets = gy.load_datasets(directory, dataset_filter)

    # Get mtime for a specific datetime
    mtime = gy.get_mtime(datasets, '2022-01-01T12:00:00')
    print(mtime)  # e.g., [1, 12, 0] for Day 1, 12:00

    # Use with time_list to convert all times
    times = gy.time_list(datasets)
    for t in times[:5]:
        mt = gy.get_mtime(datasets, t)
        print(f'{t} -> mtime {mt}')
Data Caching
All arr_* data extraction functions (and derived-variable handlers) are transparently memoized
by a bounded LRU cache. Repeated calls with the same (datasets, variable, time, level, ...)
tuple return the cached result in O(1), which speeds up timeline scrubbing, re-plotting, and
composite plots that extract the same field multiple times.
The cache is keyed on the Python identity (id) of the datasets list plus all positional
and keyword arguments (lists are normalized to tuples so batch calls cache correctly). Unhashable
arguments (e.g. raw numpy arrays in arr_sat_track) transparently bypass the cache.
- gcmprocpy.containers.clear_data_cache()[source]
Drop all cached results. Call on dataset reload.
The default cache holds up to 128 entries and evicts least-recently-used results. Call
clear_data_cache() after reloading datasets or otherwise mutating them in place, so that
stale results don’t leak across sessions. The GUI does this automatically on dataset reload.
clear_derived_cache is kept as a backwards-compatible alias for clear_data_cache.
- Example:
Invalidate the cache after reloading datasets.
    from gcmprocpy import clear_data_cache, load_datasets

    datasets = load_datasets(directory, dataset_filter)
    # ... use datasets ...

    # Reload from disk, then drop stale cached extractions
    datasets = load_datasets(directory, dataset_filter)
    clear_data_cache()
Height Interpolation
gcmprocpy supports converting between pressure levels and geometric height (km) using the model’s
height variable (ZG for TIE-GCM, Z3 for WACCM-X). This enables specifying levels as heights
and plotting vertical axes in km instead of pressure coordinates.
Height to Pressure Level
This function converts a target height in km to the nearest pressure level by looking up the model’s geometric height field (ZG or Z3).
- gcmprocpy.data_parse.height_to_pres_level(datasets, time, target_height_km, latitude=None, longitude=None)[source]
Convert a target height (km) to the nearest pressure level.
Finds the pressure level whose average geometric height is closest to the requested height. Optionally narrows to a specific lat/lon.
- Parameters:
datasets – Loaded datasets.
time – Timestamp for height lookup.
target_height_km (float) – Desired height in km.
latitude (float, optional) – Latitude to evaluate height at.
longitude (float, optional) – Longitude to evaluate height at.
- Returns:
The pressure level (lev or ilev value) closest to target_height_km.
- Return type:
float
- Example:
Find the pressure level closest to 300 km altitude.
    datasets = gy.load_datasets(directory, dataset_filter)
    time = '2022-01-01T12:00:00'

    # Global average: find the level whose mean height is closest to 300 km
    pres_level = gy.height_to_pres_level(datasets, time, 300.0)
    print(f'300 km ≈ pressure level {pres_level}')

    # At a specific location
    pres_level = gy.height_to_pres_level(datasets, time, 300.0, latitude=0.0, longitude=45.0)
    print(f'300 km at equator, 45°E ≈ pressure level {pres_level}')
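The nearest-level lookup described above amounts to picking the level whose height is closest to the target. A minimal sketch, assuming the per-level mean heights have already been computed from the height field; nearest_level_sketch is a hypothetical helper and the level and height numbers are invented for illustration:

```python
def nearest_level_sketch(levs, mean_heights_km, target_height_km):
    """Return the level whose mean height is closest to the target height."""
    idx = min(range(len(levs)),
              key=lambda i: abs(mean_heights_km[i] - target_height_km))
    return levs[idx]


# Invented pressure levels and their mean geometric heights (km).
levs = [-2.0, 0.0, 2.0, 4.0, 6.0]
mean_heights_km = [120.0, 150.0, 210.0, 310.0, 450.0]

level_300 = nearest_level_sketch(levs, mean_heights_km, 300.0)
level_130 = nearest_level_sketch(levs, mean_heights_km, 130.0)
```

Because the result snaps to an existing level, the returned level can sit some kilometres away from the requested height when the vertical grid is coarse.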
Interpolate to Height
This function interpolates a 2D field from pressure levels to constant height surfaces using the model’s geometric height field. Supports both linear and exponential (log) interpolation.
- gcmprocpy.data_parse.interpolate_to_height(datasets, variable_values, levs, time, target_heights=None, n_heights=50, log_interp=False)[source]
Interpolate a field from pressure levels to constant height surfaces.
- Parameters:
datasets – Loaded datasets (to access ZG/Z3).
variable_values (np.ndarray) – 2D array (nlev, nlat) or (nlev, nlon) on pressure levels.
levs (np.ndarray) – Pressure level coordinate values matching axis 0 of variable_values.
time – Timestamp for height field lookup.
target_heights (np.ndarray, optional) – Desired height levels in km. If None, auto-generates n_heights levels spanning the data range.
n_heights (int) – Number of height levels if target_heights is None.
log_interp (bool) – If True, use exponential interpolation (for densities).
- Returns:
A tuple (interpolated_values, target_heights_km), where interpolated_values is a 2D array of shape (n_heights, n_spatial) and target_heights_km is a 1D array of height levels in km.
- Return type:
tuple
- Example:
Interpolate a latitude-altitude cross section from pressure to height coordinates.
    import numpy as np

    datasets = gy.load_datasets(directory, dataset_filter)
    time = '2022-01-01T12:00:00'

    # Extract lev vs lat data on pressure levels
    result = gy.arr_lev_lat(datasets, 'TN', time, selected_lon=0.0, plot_mode=True)

    # Interpolate to 40 evenly spaced height levels
    interp_values, heights_km = gy.interpolate_to_height(
        datasets, result.values, result.levs, time, n_heights=40)
    print(f'Height range: {heights_km[0]:.1f} – {heights_km[-1]:.1f} km')
    print(f'Interpolated shape: {interp_values.shape}')

    # Interpolate to specific heights
    target_heights = np.array([100, 200, 300, 400, 500])
    interp_values, _ = gy.interpolate_to_height(
        datasets, result.values, result.levs, time, target_heights=target_heights)

    # Use exponential interpolation for density-like variables
    ne_result = gy.arr_lev_lat(datasets, 'NE', time, selected_lon=0.0, plot_mode=True)
    interp_ne, heights = gy.interpolate_to_height(
        datasets, ne_result.values, ne_result.levs, time, log_interp=True)
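To see why log_interp is suggested for densities, it helps to compare the two schemes on a single column. The sketch below assumes that "exponential interpolation" means interpolating the logarithm of the values linearly in height; that reading, the interp_column helper, and the sample density profile are assumptions for illustration, not the library's code.

```python
import math
from bisect import bisect_right


def interp_column(heights_km, values, target_km, log_interp=False):
    """Interpolate one column to target_km, linearly or in log space.

    Targets outside the column are extrapolated from the nearest end segment.
    """
    i = min(max(bisect_right(heights_km, target_km) - 1, 0), len(heights_km) - 2)
    frac = (target_km - heights_km[i]) / (heights_km[i + 1] - heights_km[i])
    lo, hi = values[i], values[i + 1]
    if log_interp:
        # Linear in log(value): suits quantities that fall off exponentially.
        return math.exp(math.log(lo) + frac * (math.log(hi) - math.log(lo)))
    return lo + frac * (hi - lo)


# Invented density profile dropping two orders of magnitude per 100 km.
heights = [100.0, 200.0, 300.0]
density = [1.0e12, 1.0e10, 1.0e8]

linear_150 = interp_column(heights, density, 150.0)
log_150 = interp_column(heights, density, 150.0, log_interp=True)
```

For this profile the linear estimate at 150 km is about five times larger than the log-space estimate, which lands on the geometric mean of the two neighbouring levels.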
Height in Plot Functions
All plot functions that accept a level parameter also accept level_type to specify
whether the level value is a pressure level (default) or a height in km. When level_type='height',
the height is automatically converted to the nearest pressure level using the model’s geometric
height field (ZG for TIE-GCM, Z3 for WACCM-X).
All level-axis plots (plt_lev_var, plt_lev_lon, plt_lev_lat, plt_lev_time) also
accept y_axis='height' to display the vertical axis in km instead of pressure coordinates.
- Example:
Specify a level as height instead of pressure.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Lat-lon plot at 300 km altitude (automatically finds nearest pressure level)
    plot = gy.plt_lat_lon(datasets, 'TN', time='2022-01-01T12:00:00',
                          level=300.0, level_type='height')

    # Latitude vs time at 400 km altitude
    plot = gy.plt_lat_time(datasets, 'TN', level=400.0, level_type='height', longitude=0.0)

    # Longitude vs time at 250 km altitude
    plot = gy.plt_lon_time(datasets, 'TN', latitude=0.0, level=250.0, level_type='height')

    # Variable vs time at 300 km altitude
    plot = gy.plt_var_time(datasets, 'TN', latitude=0.0, longitude=0.0,
                           level=300.0, level_type='height')
- Example:
Plot vertical axis in km instead of pressure.
    datasets = gy.load_datasets(directory, dataset_filter)

    # Vertical profile with height axis
    plot = gy.plt_lev_var(datasets, 'TN', latitude=0.0, time='2022-01-01T12:00:00',
                          longitude=0.0, y_axis='height')

    # Longitude cross-section with height axis
    plot = gy.plt_lev_lon(datasets, 'TN', latitude=0.0, time='2022-01-01T12:00:00',
                          y_axis='height')

    # Latitude cross-section with height axis
    plot = gy.plt_lev_lat(datasets, 'TN', time='2022-01-01T12:00:00',
                          longitude=0.0, y_axis='height')

    # Level vs time with height axis
    plot = gy.plt_lev_time(datasets, 'TN', latitude=0.0, longitude=0.0, y_axis='height')