Sub-segment reads for uncompressed flat layouts to reduce read amplification during take #6333

jiaqizho · 2026-02-06T10:29:15Z

Summary

I'm using Vortex 0.56. My workload involves take operations on a column with a fixed-size list type (e.g., FixedSizeList<f32, 768> for embedding vectors). The column is stored uncompressed because floating-point embeddings don't compress well.

A single segment is approximately 1 MB by default. When I need only one row via take (a single FixedSizeList<f32, 768> row is 768 × 4 bytes = 3,072 bytes, about 3 KB), the entire 1 MB segment is read from disk, deserialized, and then filtered in memory.
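To put a number on the read amplification described above, here is a minimal sketch of the arithmetic. It assumes a segment of exactly 1 MiB (the post only says "approximately 1 MB"); the function names are illustrative and not part of Vortex.

```rust
/// Bytes occupied by one row of a fixed-size list of fixed-width elements.
/// For FixedSizeList<f32, 768>: list_len = 768, elem_bytes = 4.
fn row_bytes(list_len: u64, elem_bytes: u64) -> u64 {
    list_len * elem_bytes
}

/// Read amplification: bytes fetched divided by bytes actually needed.
/// `bytes_needed` must be non-zero.
fn read_amplification(bytes_fetched: u64, bytes_needed: u64) -> f64 {
    bytes_fetched as f64 / bytes_needed as f64
}
```

With these assumptions, `read_amplification(1 << 20, row_bytes(768, 4))` is 1,048,576 / 3,072, or roughly 341x for a single-row take.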

Current Behavior

The read path always fetches the full segment:

FlatReader::array_future()
    segment_source.request(segment_id)   // reads entire segment blob
    ArrayParts::try_from(segment)        // deserializes everything
    array.slice() / filter()             // in-memory row selection

Key locations:

vortex-layout/src/layouts/flat/reader.rs:50-68 — array_future() requests the full segment
vortex-file/src/segments/source.rs:25-41 — FileSegmentSource::request() reads (offset, length) with no partial read support
vortex-array/src/serde.rs:419-470 — ArrayParts::try_from() expects the complete blob

The segment format packs data buffers and a trailing FlatBuffer metadata together, so the reader must fetch the entire blob to locate and parse the metadata before it can access any data.

Expected Behavior

For uncompressed, fixed-width data (e.g., primitives, fixed-size lists of primitives), it should be possible to read only the bytes corresponding to the requested rows, dramatically reducing IO.
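A minimal sketch of the offset calculation this would need, assuming the reader already knows where the uncompressed data buffer starts inside the segment and how wide each row is. `FixedWidthBuffer` and its fields are hypothetical and do not exist in Vortex; validity bitmaps and alignment padding are ignored here.

```rust
use std::ops::Range;

/// Hypothetical description of an uncompressed, fixed-width data buffer
/// stored inside a segment. Not a Vortex type.
struct FixedWidthBuffer {
    /// Byte offset of row 0, relative to the start of the segment
    /// (assumed to be known from metadata stored elsewhere).
    data_offset: u64,
    /// Bytes per row, e.g. 768 * 4 for FixedSizeList<f32, 768>.
    row_bytes: u64,
    /// Number of rows in the buffer.
    row_count: u64,
}

impl FixedWidthBuffer {
    /// Byte range within the segment covering the half-open row range `rows`.
    /// Returns None if the rows are out of bounds or the arithmetic overflows.
    fn byte_range(&self, rows: Range<u64>) -> Option<Range<u64>> {
        if rows.start > rows.end || rows.end > self.row_count {
            return None;
        }
        let start = self
            .data_offset
            .checked_add(rows.start.checked_mul(self.row_bytes)?)?;
        let end = self
            .data_offset
            .checked_add(rows.end.checked_mul(self.row_bytes)?)?;
        Some(start..end)
    }
}
```

A take of row 10 would then translate to a single 3,072-byte ranged read instead of a full-segment fetch. Whether that is a net win depends on how many rows are requested per segment, since many small ranged reads can cost more than one large one.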

onursatici · 2026-02-06T15:31:09Z

onursatici
Feb 6, 2026
Maintainer

Hi, one thing you can try is reducing the size of each segment by adjusting the repartition writer config. The default is most likely too large for rows of 768 elements. Adjusting it would give you smaller segments:

https://github.com/vortex-data/vortex/blob/develop/vortex-layout/src/layouts/repartition.rs#L30-L49
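As a rough way to reason about what a smaller segment target buys, the sketch below estimates how many rows land in one segment for a given byte budget. This is only back-of-the-envelope arithmetic: `rows_per_segment` is a hypothetical helper, and the real option names are in the linked file and are not reproduced here.

```rust
/// Estimated number of whole rows that fit in a segment of
/// `target_segment_bytes`, never less than one row.
/// Panics if `row_bytes` is zero.
fn rows_per_segment(target_segment_bytes: u64, row_bytes: u64) -> u64 {
    assert!(row_bytes > 0, "row_bytes must be non-zero");
    (target_segment_bytes / row_bytes).max(1)
}
```

For 3,072-byte rows, a 1 MiB target holds about 341 rows, while a 64 KiB target holds 21 rows (64,512 bytes), so a single-row take would fetch roughly 21 rows' worth of data instead of 341.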

5 replies

robert3005 Feb 6, 2026
Maintainer

We have also made an as-yet-unreleased improvement to the writer that automatically gives you smaller segments: #6180

jiaqizho Feb 8, 2026
Author

I'm just curious, why does Vortex never read less than one segment? What is the purpose behind this design? Wouldn't directly accessing a single row of data achieve less read amplification?

robert3005 Feb 8, 2026
Maintainer

The format right now is such that the whole segment is serialised as a unit, which means you need at least the segment header, but we don’t know how long that header is without reading the beginning. We are planning to change it to split out the header of the segment from the data buffer.

However, the assumption of the design was such that whether you read the segment or less it doesn’t make a significant difference in performance. For extremely sparse reads it could be beneficial to fetch part of the segment. We might enable that in the future

jiaqizho Feb 9, 2026
Author

> However, the assumption of the design was such that whether you read the segment or less it doesn’t make a significant difference in performance.

I don't think that assumption holds on object storage. I extended the random_access benchmark to track IO stats and support S3:

Added IO statistics reporting. After each benchmark run, we now print IOPS and bytes read. For Vortex this is tracked via VortexSession metrics (vortex.io.read.size histogram + io.requests.{individual,coalesced} counters). For Lance this uses lance_io::scheduler::{iops_counter, bytes_read_counter}.
Added S3 support. Two new CLI flags --s3-bucket and --s3-endpoint enable benchmarking against S3-compatible storage (e.g. MinIO). When provided, the benchmark generates taxi data locally, uploads it to S3 (skipping if already present), then runs the same random-access takes reading from S3 — using VortexOpenOptions::open_object_store() for Vortex, ParquetObjectReader for Parquet, and s3:// URLs for Lance.
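For readers unfamiliar with this kind of instrumentation, the idea behind the IO statistics can be sketched as a pair of atomic counters that every read call updates. This is a generic illustration only: `IoStats` is hypothetical and is neither the VortexSession metrics nor the lance_io counters used in the benchmark.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical IO counters: number of read requests and total bytes read.
#[derive(Default)]
struct IoStats {
    requests: AtomicU64,
    bytes_read: AtomicU64,
}

impl IoStats {
    /// Record one read request of `len` bytes.
    fn record(&self, len: u64) {
        self.requests.fetch_add(1, Ordering::Relaxed);
        self.bytes_read.fetch_add(len, Ordering::Relaxed);
    }

    /// Return (request count, total bytes read) observed so far.
    fn snapshot(&self) -> (u64, u64) {
        (
            self.requests.load(Ordering::Relaxed),
            self.bytes_read.load(Ordering::Relaxed),
        )
    }
}
```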

And the result is:

Local Disk Random Access Benchmark Results

Benchmark	vortex-file-compressed	lance	parquet
random-access/vortex-tokio-local-disk	3,409 μs (1.00x)	4,316 μs (1.27x)	236,362 μs (69.33x)

IO Statistics (last run)

Format	IOPS	Read
vortex-file-compressed	36	0.0 KB
lance	133	307.7 KB
parquet	(not tracked)	(not tracked)

S3 (MinIO) Random Access Benchmark Results

Benchmark	vortex-file-compressed	lance	parquet
random-access/vortex-tokio-s3	1,920,911 μs (1.00x)	641,619 μs (0.33x)	13,109,992 μs (6.82x)

IO Statistics (last run)

Format	IOPS	Read
vortex-file-compressed	18	6.17 MB
lance	129	476.2 KB
parquet	(not tracked)	(not tracked)

robert3005 Feb 11, 2026
Maintainer

I see. The assumption we usually held was that your segments are small, and that on object storage fetching 4 KB vs 1 MB wouldn't be materially different. However, there are a lot of toggles we can play with here. First, we need to move the header out of the segment and store it somewhere else; this is already in flight to support decoding Vortex files on GPUs. Second, we can always trade off performance for smaller IO in the layout readers. For instance, for fixed-width types we could calculate the offsets, and for variable-length types we could fetch the segment containing the lengths/offsets first, before downloading any values. We haven't properly looked into lazy fetching of elements in a list, but it's something we are interested in doing.
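The variable-length case mentioned above can be sketched as a two-step lookup: fetch the offsets first, then use them to derive the byte range of the values to download. The helper below covers only the second step and assumes the common layout where `offsets` has one more entry than there are rows; `value_range` is a hypothetical name, not a Vortex function.

```rust
use std::ops::Range;

/// Byte range in the values buffer for `row`, given an offsets array with
/// `rows + 1` monotonically non-decreasing entries.
/// Returns None if `row` is out of bounds or the offsets are inconsistent.
fn value_range(offsets: &[u64], row: usize) -> Option<Range<u64>> {
    let start = *offsets.get(row)?;
    let end = *offsets.get(row.checked_add(1)?)?;
    (start <= end).then_some(start..end)
}
```

The trade-off is an extra round trip: the offsets must arrive before the values request can be issued, which is the kind of latency-for-bandwidth exchange described in the reply.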
