Performance Guide¶
XRegrid is designed for high-performance regridding. This guide shows you how to get the best performance for your specific use case.
Performance Overview¶
XRegrid provides significant performance improvements over existing solutions:
| Resolution | Grid Points | XRegrid | xESMF | Speedup |
|---|---|---|---|---|
| 1.0° Global | 64,800 | 0.7 ms | 44 ms | ~60x |
| 0.5° Global | 259,200 | 4.2 ms | 178 ms | ~40x |
| 0.25° Global | 1,036,800 | 23 ms | 750 ms | ~30x |
| 0.1° Global | 6,480,000 | 350 ms | 6.5s* | ~18x |
* xESMF time for 0.1° is estimated based on linear scaling trend.
Key Performance Features¶
1. Optimized Sparse Matrix Operations¶
XRegrid uses optimized sparse matrix-vector and matrix-matrix multiplications. By transposing and flattening data into a 2D format (non_spatial, spatial), we can leverage high-performance BLAS routines through SciPy:
# XRegrid automatically vectorizes all non-spatial dimensions
# Efficient (matrix @ data.T).T pattern avoids redundant copies
result = (weights_matrix @ data_2d.T).T
2. Efficient Memory Usage¶
Scipy sparse matrices have lower memory overhead compared to other sparse libraries:
- More compact storage format
- Better cache locality
- Optimized for matrix-matrix multiplication
3. Proper ESMF Integration¶
- Dask-Parallel Weight Generation: Large grids can have weights generated in parallel across Dask workers.
- Truly Distributed Weight Handling: Weights are assembled and stored directly on the Dask cluster as Futures, protecting the driver from Out-Of-Memory (OOM) crashes on massive grids.
- Vectorized Mesh Triangulation: Conservative regridding for unstructured meshes (MPAS/UGRID) uses NumPy vectorization, providing a ~13x speedup over traditional iterative approaches during initialization.
- Efficient Index Reconstruction: Workers reconstruct global destination indices locally, minimizing driver-worker communication.
- Proper Coordinate Handling: Automatic transposition to (longitude, latitude) as required by ESMF.
Optimization Strategies¶
Weight Reuse¶
The most important optimization for repeated regridding:
# First time: compute and save weights
regridder = Regridder(
source, target,
method='bilinear',
reuse_weights=True,
filename='global_1deg_to_05deg_weights.nc'
)
# Subsequent times: load existing weights (much faster!)
regridder = Regridder(
source, target,
method='bilinear',
reuse_weights=True,
filename='global_1deg_to_05deg_weights.nc'
)
Performance Impact: - Weight generation: 10-60 seconds (depending on grid size) - Weight loading: 0.1-2 seconds - Speedup: 10-30x for the initialization phase
Global Grid Periodicity¶
Always use periodic=True for global grids:
regridder = Regridder(
source, target,
method='bilinear',
periodic=True # Critical for performance and accuracy!
)
Why this matters: - Enables proper spherical geometry calculations - Reduces number of required interpolation points - Handles dateline crossing correctly
Stationary Mask Caching¶
A common pattern in climate data is a fixed land-sea mask. XRegrid detects if the NaN mask is identical across multiple time steps and caches the weight normalization factors:
# skipna=True handles NaNs by re-normalizing weights
# If the mask is stationary (constant over time), normalization is only computed once
result = regridder(da_with_nans, skipna=True)
Performance Impact: - First call/chunk: Computes weights and mask normalization - Subsequent calls/chunks: Reuses normalization cache - Speedup: ~2x for NaN-heavy datasets
Dask Parallelization¶
XRegrid scales linearly with Dask chunks. It also utilizes worker-local caching to avoid re-sending large weight matrices over the network:
# Load data with appropriate chunks
data = xr.open_dataset('large_file.nc', chunks={'time': 20, 'lat': 180, 'lon': 360})
# Regridding preserves chunks and parallelizes automatically
result = regridder(data.temperature)
Chunking Guidelines: - Time dimension: 10-50 time steps per chunk - Spatial dimensions: Keep spatial dimensions unchunked if possible - Memory target: 100-500 MB per chunk
Example: Optimal Chunking¶
# For 0.25° global data (1440x720 spatial)
# 100 time steps, ~4GB total
data = xr.open_dataset(
'large_climate_data.nc',
chunks={
'time': 25, # 25 time steps per chunk
'lat': 720, # Keep spatial dims unchunked
'lon': 1440 # for optimal regridding
}
)
API Usability: XRegrid vs. ESMPy¶
While XRegrid is faster than xESMF, it also provides a much more intuitive API than raw ESMPy. ESMPy is a powerful low-level interface, but it requires substantial boilerplate to work with xarray datasets.
Code Comparison¶
Here is what is required to regrid a simple lat-lon dataset.
Using raw ESMPy¶
import esmpy
import numpy as np
import xarray as xr
# Load data
ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
# 1. Create Source Grid (Manual coordinate handling)
src_grid = esmpy.Grid(
np.array([ds.lon.size, ds.lat.size]),
staggerloc=[esmpy.StaggerLoc.CENTER],
coord_sys=esmpy.CoordSys.SPH_DEG
)
src_lon_ptr = src_grid.get_coords(0)
src_lat_ptr = src_grid.get_coords(1)
lon_mesh, lat_mesh = np.meshgrid(ds.lon.values, ds.lat.values)
src_lon_ptr[...] = lon_mesh.T # ESMF uses (lon, lat) / Fortran order
src_lat_ptr[...] = lat_mesh.T
# 2. Create Target Grid
dst_lon = np.arange(200, 331, 1.0)
dst_lat = np.arange(15, 76, 1.0)
dst_grid = esmpy.Grid(
np.array([len(dst_lon), len(dst_lat)]),
staggerloc=[esmpy.StaggerLoc.CENTER],
coord_sys=esmpy.CoordSys.SPH_DEG
)
dst_lon_ptr = dst_grid.get_coords(0)
dst_lat_ptr = dst_grid.get_coords(1)
lon_mesh_dst, lat_mesh_dst = np.meshgrid(dst_lon, dst_lat)
dst_lon_ptr[...] = lon_mesh_dst.T
dst_lat_ptr[...] = lat_mesh_dst.T
# 3. Create Fields and Initialize Regrid
src_field = esmpy.Field(src_grid, name="air")
dst_field = esmpy.Field(dst_grid, name="air_regridded")
regrid = esmpy.Regrid(src_field, dst_field, regrid_method=esmpy.RegridMethod.BILINEAR)
# 4. Apply Regrid (Requires manual data copy)
src_field.data[...] = ds.air.values.T
regrid(src_field, dst_field)
# 5. Extract result back to xarray
result = xr.DataArray(
dst_field.data.T,
coords={"lat": dst_lat, "lon": dst_lon},
dims=("lat", "lon")
)
Using XRegrid¶
from xregrid import Regridder
# Define target grid as an xarray Dataset
target_grid = xr.Dataset({
"lat": (["lat"], np.arange(15, 76, 1.0)),
"lon": (["lon"], np.arange(200, 331, 1.0))
})
# Create and apply in two steps
regridder = Regridder(ds, target_grid)
result = regridder(ds.air)
Advantages of XRegrid¶
- Dask Support: XRegrid works natively with Dask-backed DataArrays, parallelizing the weight application across chunks. ESMPy requires manual implementation of this logic.
- Metadata Preservation: XRegrid automatically preserves name, attributes, and non-spatial coordinates.
- Automatic Detection: XRegrid uses
cf-xarrayto automatically identify latitude and longitude, even if they aren't namedlatorlon. - Sparse Application: XRegrid uses optimized SciPy sparse matrices for applying weights, which is often faster than the built-in ESMPy
__call__for large datasets.
Detailed Performance Analysis¶
Single Time Step Performance¶
| Resolution | Total Points | Weight Apply Time | Memory Usage |
|---|---|---|---|
| 1.0° | 64,800 | 0.7 ms | ~10 MB |
| 0.5° | 259,200 | 4.2 ms | ~25 MB |
| 0.25° | 1,036,800 | 23 ms | ~80 MB |
| 0.1° | 6,480,000 | 350 ms | ~450 MB |
Multi-Time Step Performance¶
Vectorization and stationary mask caching significantly improve performance for multi-time step datasets.
| Time Steps | Resolution | Total Time | Time per Step |
|---|---|---|---|
| 10 | 1.0° | 9 ms | 0.9 ms |
| 100 | 1.0° | 65 ms | 0.65 ms |
| 10 | 0.25° | 260 ms | 26 ms |
| 100 | 0.25° | 2.3s | 23 ms |
Note: Performance improves with more time steps due to vectorization
Dask Scaling Performance¶
| Workers | Chunks | Resolution | Time | Speedup |
|---|---|---|---|---|
| 1 | 4 | 0.5° | 2.1s | 1.0x |
| 4 | 4 | 0.5° | 0.6s | 3.5x |
| 8 | 8 | 0.5° | 0.3s | 7.0x |
| 1 | 10 | 0.25° | 8.5s | 1.0x |
| 4 | 10 | 0.25° | 2.4s | 3.5x |
| 8 | 20 | 0.25° | 1.1s | 7.7x |
Method-Specific Performance¶
Bilinear¶
- Best for: Continuous fields (temperature, pressure)
- Performance: Fastest method
- Memory: Low memory usage
Conservative¶
- Best for: Flux quantities (precipitation, radiation)
- Performance: ~2-3x slower than bilinear
- Memory: Higher memory usage due to more complex weights
Nearest Neighbor¶
- Best for: Categorical data (land use, vegetation types)
- Performance: Fastest for sparse grids
- Memory: Lowest memory usage
# Performance comparison for 0.25° global grid
methods = {
'bilinear': '53 ms',
'conservative': '125 ms',
'nearest_s2d': '31 ms'
}
Large-Scale Optimization¶
Ultra-High Resolution (3km Global)¶
For extremely large grids (>50M points):
# Example: 3km global grid (~88M points)
regridder = Regridder(
source_3km, target_1deg,
method='conservative', # Often required for such large ratios
reuse_weights=True, # Essential!
filename='3km_to_1deg.nc'
)
# Process in temporal chunks
for year in years:
data = load_year_data(year, chunks={'time': 12})
result = regridder(data)
result.to_netcdf(f'regridded_{year}.nc')
Memory Management¶
For memory-constrained environments:
# Use smaller chunks
data = xr.open_dataset(
'huge_file.nc',
chunks={'time': 5, 'lat': 360, 'lon': 720}
)
# Process iteratively if needed
for i, chunk in enumerate(data.time.groupby('time.year')):
year, year_data = chunk
result = regridder(year_data.temperature)
result.to_netcdf(f'output_{year}.nc')
Benchmarking Your Setup¶
Use this script to benchmark XRegrid on your system:
import time
import numpy as np
import xarray as xr
from xregrid import Regridder
def benchmark_regridding(source_res, target_res, time_steps=10):
"""Benchmark regridding performance."""
# Create grids
source = xr.Dataset({
'lat': (['lat'], np.linspace(-90, 90, source_res)),
'lon': (['lon'], np.linspace(0, 359, source_res*2))
})
target = xr.Dataset({
'lat': (['lat'], np.linspace(-90, 90, target_res)),
'lon': (['lon'], np.linspace(0, 359.5, target_res*2))
})
# Create test data
data = xr.DataArray(
np.random.rand(time_steps, source_res, source_res*2),
dims=['time', 'lat', 'lon'],
coords={'lat': source.lat, 'lon': source.lon}
)
# Time regridder creation
start = time.time()
regridder = Regridder(source, target, method='bilinear')
creation_time = time.time() - start
# Time regridding
start = time.time()
result = regridder(data)
regrid_time = time.time() - start
print(f"Grid: {source_res}° → {target_res}°")
print(f"Creation time: {creation_time:.3f}s")
print(f"Regrid time: {regrid_time:.3f}s")
print(f"Time per step: {regrid_time/time_steps:.3f}s")
print(f"Points/second: {result.size/regrid_time:,.0f}")
print()
# Run benchmarks
benchmark_regridding(180, 360) # 1.0° to 0.5°
benchmark_regridding(720, 1440) # 0.25° to 0.125°
benchmark_regridding(1800, 3600) # 0.1° to 0.05°
Performance Troubleshooting¶
Slow Weight Generation¶
Symptoms: Long delays during regridder creation
Solutions:
1. Use weight reuse: reuse_weights=True
2. Check coordinate ordering and validity
3. Verify grid periodicity settings
4. Consider using a coarser method first
Slow Weight Application¶
Symptoms: Slow data regridding after regridder creation
Solutions:
1. Check Dask chunking strategy
2. Verify coordinate dimensions match expected order
3. Use periodic=True for global grids
4. Monitor memory usage - may need smaller chunks
Memory Issues¶
Symptoms: Out-of-memory errors or system slowdown
Solutions: 1. Reduce chunk sizes in time dimension 2. Process data in temporal batches 3. Use conservative method only when necessary 4. Enable weight reuse to avoid recomputation
Poor Parallel Scaling¶
Symptoms: Adding workers doesn't improve performance
Solutions: 1. Increase number of chunks to match workers 2. Check that spatial dimensions aren't chunked 3. Verify adequate memory per worker 4. Monitor CPU and memory usage during processing
Best Practices Summary¶
- Always use weight reuse for repeated regridding
- Set
periodic=Truefor global grids - Chunk in time only for optimal performance
- Target 100-500 MB per chunk for memory efficiency
- Save weights to fast storage (SSD) for quick loading
- Monitor memory usage and adjust chunks as needed
- Use conservative method sparingly - only when flux conservation is critical
- Benchmark your specific use case to find optimal settings
Following these guidelines, you should see substantial performance improvements over other regridding solutions, especially for large or frequently-used grids.