Rasterio vs GDAL Python Bindings
Both Rasterio and the raw osgeo.gdal bindings sit on the same C library, but they offer very different developer experiences. Rasterio is Pythonic — NumPy arrays, context managers, named windows; the GDAL bindings mirror the C API directly. This guide compares them for raster work and explains when the lower-level bindings still earn their place. It is for anyone choosing how to script raster I/O. It sits under Raster Data Handling with Rasterio in Mastering Core Geospatial Python Libraries.
Why This Approach / What Goes Wrong
Rasterio and osgeo.gdal both call libgdal, so neither is faster at the metal — the difference is ergonomics and safety. Rasterio gives you with rasterio.open(...) as src, NumPy arrays from src.read(), an Affine transform object, and named Window reads; resources close automatically. The raw bindings require gdal.Open, band.ReadAsArray(), a six-tuple geotransform you index by hand, and explicit dataset = None to flush and release — forget it and you leak file handles or lose unwritten data. The raw bindings still win when you need GDAL features Rasterio doesn't wrap directly: VRT construction, certain warp options, or algorithms exposed only on the C API. For everyday reading, windowing, and band math, Rasterio is the safer default, as set out in Reading Multi-Band TIFFs with Rasterio.
Prerequisites
rasterio>=1.3gdal>=3.8(providesosgeo.gdal)numpy>=1.26
conda install -c conda-forge "rasterio=1.3.*" "gdal=3.8.*" "numpy=1.26.*"
Step-by-Step Implementation
1. Read a band and its georeferencing — the Rasterio way.
import rasterio
with rasterio.open("elevation.tif") as src:
dem = src.read(1) # NumPy array
transform = src.transform # Affine object
crs = src.crs # CRS object
nodata = src.nodata
# File closed automatically here
print(dem.shape, crs)
2. The same read with the raw GDAL bindings.
from osgeo import gdal
import numpy as np
ds = gdal.Open("elevation.tif")
band = ds.GetRasterBand(1)
dem = band.ReadAsArray() # NumPy array
gt = ds.GetGeoTransform() # (x0, dx, 0, y0, 0, dy) — index by position
wkt = ds.GetProjection() # CRS as WKT string
nodata = band.GetNoDataValue()
ds = None # MUST release explicitly
3. Write a derived raster — Rasterio clones the profile.
import rasterio
with rasterio.open("elevation.tif") as src:
profile = src.profile.copy()
slope = (src.read(1) * 0.1).astype("float32")
profile.update(dtype="float32", count=1)
with rasterio.open("slope.tif", "w", **profile) as dst:
dst.write(slope, 1)
4. Reach for raw GDAL when you need a feature Rasterio doesn't wrap — e.g. building a VRT mosaic.
from osgeo import gdal
# VRT mosaicking is a GDAL-native operation with no direct Rasterio API
gdal.BuildVRT("mosaic.vrt", ["tile_a.tif", "tile_b.tif", "tile_c.tif"])
Verification
Confirm both APIs read identical pixels and georeferencing from the same file.
import rasterio
from osgeo import gdal
import numpy as np
with rasterio.open("elevation.tif") as src:
rio_arr = src.read(1)
rio_transform = tuple(src.transform)[:6]
ds = gdal.Open("elevation.tif")
gdal_arr = ds.GetRasterBand(1).ReadAsArray()
gdal_gt = ds.GetGeoTransform()
ds = None
assert np.array_equal(rio_arr, gdal_arr), "Pixel arrays differ — same file, same bytes expected"
# Rasterio Affine is (a,b,c,d,e,f); GDAL gt is (c,a,b,f,d,e) — reordered but equivalent
print("Rasterio reads", rio_arr.shape, "| GDAL reads", gdal_arr.shape)
Edge Cases & Debugging
- Leaked file handles / truncated output. You forgot
dataset = Nonewith the raw bindings; Rasterio'swithblock prevents this. - Geotransform confusion. GDAL's six-tuple orders differently from Rasterio's
Affine; don't mix them — convert deliberately. ImportError: osgeo. GDAL Python bindings aren't installed; the conda-forgegdalpackage provides them.- Version mismatch. Rasterio bundles its own GDAL; a separate
osgeo.gdalmay be a different GDAL version — keep them aligned via conda-forge. - Need a feature Rasterio lacks. Use raw GDAL for that step only, then return to Rasterio — they read the same files.
- NumPy dtype surprises. Both return the band's native dtype; cast explicitly before math to avoid integer overflow.