feat: NumPy-accelerated vector serialization in VectorType (huge improvements in some use cases) #793
Draft
mykaul wants to merge 3 commits into scylladb:master
Conversation
Author
CC @swasik
Force-pushed af88c0e to fba36f3
Add three fast paths to VectorType.serialize() for the common case of fixed-size numeric subtypes (float, double, int, bigint):

1. bytes/bytearray passthrough – skip all conversion when the caller already holds a correctly-sized blob (e.g. from serialize_numpy_bulk).
2. NumPy ndarray fast path – convert a 1-D numpy array to big-endian bytes via asarray(dtype=...).tobytes() instead of 768+ individual struct.pack + BytesIO.write calls.
3. serialize_numpy_bulk() classmethod – byte-swap an entire 2-D array (N rows × dim columns) once and slice the raw buffer, yielding one bytes object per row with zero per-element overhead.

Benchmarks on 768-dim float32 vectors show 70-300× speedups depending on the path, directly benefiting bulk-insert workloads such as loading embeddings from Parquet files (VectorDBBench use case). NumPy remains an optional dependency; all new code is guarded by try/except ImportError. Variable-size subtypes (smallint, tinyint, text, etc.) are excluded and continue to use the original element-by-element path. Unit tests cover correctness, round-trip fidelity, error handling, and fallback behavior for all three paths.
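The single-row ndarray fast path can be sketched as follows. This is an illustrative reconstruction, not the PR's actual code: the function names are hypothetical, and only the float32 (`'>f'` / `'>f4'`) case is shown.

```python
# Illustrative sketch: original element-by-element serialization vs. the
# NumPy big-endian fast path. Function names are hypothetical.
import struct
from io import BytesIO
import numpy as np

def serialize_elementwise(values):
    # Original path: one struct.pack('>f') + BytesIO.write per element.
    buf = BytesIO()
    for v in values:
        buf.write(struct.pack('>f', v))
    return buf.getvalue()

def serialize_ndarray(arr):
    # Fast path: a single dtype conversion to big-endian float32,
    # then one copy out of the array buffer.
    return np.asarray(arr, dtype='>f4').tobytes()

vec = np.arange(768, dtype=np.float32)
assert serialize_ndarray(vec) == serialize_elementwise(vec)
```

Both paths produce identical wire bytes; the NumPy version simply collapses the per-element Python loop into vectorized C.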
Standalone benchmark comparing the four serialization paths for VectorType across dimensions (128, 768, 1536) and batch sizes (1, 100, 10000):

- list (element-by-element) – baseline
- numpy (per-row ndarray) – single-row fast path
- bulk (serialize_numpy_bulk) – batch fast path
- bytes passthrough (bind) – pre-serialized blob

Includes auto-calibrated iteration counts and correctness verification.
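The auto-calibration idea can be sketched roughly like this; it is a hypothetical helper in the spirit of the benchmark script, not the script itself:

```python
# Hypothetical auto-calibrating micro-benchmark helper (not the PR's script):
# grow the iteration count until one timing run lasts about `target` seconds,
# then report the best per-call time over `repeats` runs.
import timeit

def best_per_call(fn, target=0.02, repeats=3):
    n = 1
    while timeit.timeit(fn, number=n) < target:
        n *= 10
    return min(timeit.timeit(fn, number=n) for _ in range(repeats)) / n
```

Taking the best of several repeats (rather than the mean) is the usual way to suppress scheduler noise in micro-benchmarks of this kind.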
Add a new section to docs/performance.rst covering the three fast serialization paths (single-row ndarray, bulk serialize_numpy_bulk, bytes passthrough) with usage examples and supported subtype list.
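Conceptually, the bulk path could look like the following sketch. These are assumed internals for illustration; the real classmethod is VectorType.serialize_numpy_bulk and may differ in detail.

```python
# Sketch of the bulk idea (assumed internals, not the driver's code):
# byte-swap the whole (N, dim) float32 batch to big-endian once,
# then slice the raw buffer into one bytes object per row.
import numpy as np

def bulk_serialize_sketch(matrix):
    be = np.ascontiguousarray(matrix, dtype='>f4')  # one conversion for all rows
    raw = be.tobytes()
    row = be.shape[1] * be.dtype.itemsize           # bytes per row (dim * 4)
    return [raw[i * row:(i + 1) * row] for i in range(be.shape[0])]

batch = np.ones((100, 768), dtype=np.float32)
blobs = bulk_serialize_sketch(batch)
assert len(blobs) == 100 and len(blobs[0]) == 768 * 4
```

Each resulting blob can then be bound directly, hitting the bytes-passthrough branch of serialize() with no further per-element work.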
Force-pushed fba36f3 to 9e4bf4a
Summary
- Fast serialization paths in VectorType for fixed-size numeric subtypes (float, double, int, bigint), delivering 70–300× speedups for high-dimensional vector inserts
- New serialize_numpy_bulk() classmethod for batch workloads (e.g. loading embeddings from Parquet files in VectorDBBench)
- NumPy remains an optional dependency, guarded by try/except ImportError

Motivation
The VectorDBBench use case (https://github.com/scylladb/VectorDBBench) pipelines data as Parquet → PyArrow/NumPy → wire. The existing VectorType.serialize() performed 768+ individual struct.pack('>f', val) + BytesIO.write() calls per vector, making serialization the bottleneck for bulk inserts.

What's new
Three fast paths in VectorType.serialize():

1. bytes/bytearray passthrough – if the caller already holds a correctly-sized blob (e.g. from serialize_numpy_bulk()), return it directly with zero conversion
2. NumPy ndarray fast path – np.asarray(v, dtype='>f4').tobytes() instead of per-element struct.pack
3. serialize_numpy_bulk() classmethod – byte-swap an entire 2-D array (N_rows × dim) once and slice the raw buffer into a list[bytes]

Benchmark results
768-dim float32, 100-row batches (Python 3.14, NumPy 2.3, best of 3):
In absolute terms: serializing 100 × 768-dim float32 vectors drops from ~6.2 ms (baseline) to ~60 µs (bulk + passthrough end-to-end), a 103× improvement. The bottleneck was 76,800 individual struct.pack('>f', val) + BytesIO.write() calls (768 elements × 100 rows); NumPy replaces all of that with a single array byte-swap + buffer slice.

768-dim float32, single vector (latency per call):
Per-vector latency: a single 768-dim vector goes from ~65 µs to under 1 µs via the NumPy path, or 237 ns if pre-serialized bytes are reused.
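Since the fast path must engage only for fixed-size numeric subtypes, its guard can be sketched as a small dtype map. This is a hypothetical reconstruction: the map name matches the PR's description, but the type-name keys and helper function are assumptions.

```python
# Hypothetical sketch of the fast-path guard (keys and helper are assumed):
# a dtype map covering only the four fixed-size numeric CQL subtypes;
# everything else returns None and falls back to the slow path.
_NUMPY_DTYPE_MAP = {
    'FloatType': '>f4',   # float  (4 bytes)
    'DoubleType': '>f8',  # double (8 bytes)
    'Int32Type': '>i4',   # int    (4 bytes)
    'LongType': '>i8',    # bigint (8 bytes)
}

def numpy_dtype_for(typename, serial_size):
    # Engage only when the subtype has a fixed wire size AND a known dtype.
    if serial_size is not None and typename in _NUMPY_DTYPE_MAP:
        return _NUMPY_DTYPE_MAP[typename]
    return None

assert numpy_dtype_for('FloatType', 4) == '>f4'
assert numpy_dtype_for('ShortType', None) is None  # smallint: excluded
```

Keeping the check this narrow is what makes the fast path safe to add: any subtype not in the map is untouched by the new code.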
Safety
- Variable-size subtypes (smallint, tinyint, text) are excluded from the fast path and continue to use the original element-by-element serialization
- Passing bytes for a variable-size subtype now raises TypeError explicitly (rather than silently falling through)
- _numpy_dtype is only set when subtype.serial_size() is not None and the typename is in the 4-entry _NUMPY_DTYPE_MAP

Commits
docs/performance.rst