Shapely 1.x vs Shapely 2.0 Vectorization

Shapely 2.0 was a ground-up rewrite around a NumPy-style vectorized API, and it changes how fast geometry operations run and how you should write them. Code that loops over geometries the 1.x way still works, but leaves most of the speedup on the table. This guide compares the two and shows the vectorized idioms. It is for anyone whose Shapely-heavy code feels slow. It sits under Shapely Geometry Operations in Mastering Core Geospatial Python Libraries.

Shapely 2.0 moves the loop into C: operations apply across arrays of geometries in a single call.

Why This Approach / What Goes Wrong

In Shapely 1.x, every operation went through a Python method on a single geometry object, so processing a million polygons meant a million Python-level GEOS calls. Shapely 2.0 exposes module-level functions (shapely.area, shapely.buffer, shapely.intersects) that accept NumPy arrays of geometries and execute the loop in C, often 5–100× faster on bulk work. The catch during migration: a handful of 1.x behaviors changed — iterating a multipart geometry, the .ctypes/array interface, and some cascaded_union-era names — so old code can break subtly. The biggest missed opportunity is writing 2.0 like 1.x: looping with .apply() or a Python for instead of passing the whole array. If you use GeoPandas DataFrames Explained, you already get 2.0 vectorization for column operations for free.

Prerequisites

shapely>=2.0
numpy>=1.26
geopandas>=0.14 (optional, for the DataFrame comparison)

conda install -c conda-forge "shapely=2.0.*" "numpy=1.26.*" "geopandas=0.14.*"

Step-by-Step Implementation

1. The 1.x idiom — a Python loop over objects (still valid, but slow at scale).

from shapely.geometry import Point

points = [Point(x, x) for x in range(1_000_000)]
# One GEOS round trip per object
buffers_loop = [p.buffer(10) for p in points]
areas_loop = [p.buffer(10).area for p in points]

2. The 2.0 idiom — operate on a NumPy array in one call.

import numpy as np
import shapely

pts = shapely.points(np.arange(1_000_000), np.arange(1_000_000))
buffers_vec = shapely.buffer(pts, 10)        # whole array, C-level loop
areas_vec = shapely.area(buffers_vec)        # returns a NumPy array

3. Vectorized predicates replace per-pair loops.

import numpy as np
import shapely

a = shapely.points(np.random.rand(100_000), np.random.rand(100_000))
boundary = shapely.box(0.2, 0.2, 0.8, 0.8)
inside = shapely.contains(boundary, a)       # boolean NumPy array, one call
print("Inside count:", int(inside.sum()))

4. Inside GeoPandas it is automatic — column operations already vectorize.

import geopandas as gpd

parcels = gpd.read_file("parcels.gpkg").to_crs(epsg=25832)
parcels["area_m2"] = parcels.geometry.area   # Shapely 2.0 vectorized under the hood

Verification

Confirm the vectorized path gives identical results and is materially faster.

import numpy as np, shapely, time
from shapely.geometry import Point

n = 200_000
xs = np.arange(n, dtype=float)

t0 = time.perf_counter()
loop_areas = np.array([Point(x, x).buffer(5).area for x in xs])
t_loop = time.perf_counter() - t0

t0 = time.perf_counter()
vec_areas = shapely.area(shapely.buffer(shapely.points(xs, xs), 5))
t_vec = time.perf_counter() - t0

assert np.allclose(loop_areas, vec_areas), "Vectorized result must match the loop"
print(f"loop: {t_loop:.2f}s   vectorized: {t_vec:.2f}s   speedup: {t_loop/t_vec:.0f}x")
# loop: 1.93s   vectorized: 0.07s   speedup: 28x

Edge Cases & Debugging

AttributeError after upgrading. Renamed/removed 1.x helpers (e.g. cascaded_union → shapely.union_all); update the import.
Iterating a multipolygon changed. Use shapely.get_parts() to access components in 2.0.
Still slow on 2.0. You're looping with .apply()/for; pass the whole array to a shapely.* function instead.
Mixing object methods and module functions. Both work, but only the array functions vectorize; be consistent in hot loops.
None/empty geometries in an array. Vectorized functions propagate None; filter or handle with shapely.is_missing/is_empty.
GeoPandas pins an older Shapely. Ensure the environment resolved shapely>=2.0; check shapely.__version__.