Optimize frame reading throughput by wtn · Pull Request #24 · socketry/protocol-websocket

wtn · 2026-04-15T21:28:31Z

Reduce per-frame overhead on the read path.

On a local benchmark (25K text frames, ~83 B average payload, Unix socket pair, IO::Stream::Buffered), throughput improved by ~26%.

List of Changes

Read both header bytes in a single stream.read(2) instead of two separate reads
Combine extended length + mask into one read for masked frames (stream.read(6) or stream.read(12))
Use getbyte/unpack1 instead of unpack("C").first
Short-circuit unpack_frames for single-frame messages (to avoid map/join)
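
As a rough illustration of the first and third items, here is a minimal sketch of parsing both header bytes from one read. The `read_header` helper and its return shape are assumptions made for illustration, not the library's actual API; the bit layout follows RFC 6455, section 5.2.

```ruby
require "stringio"

# Hypothetical helper (not part of protocol-websocket): parse the two fixed
# header bytes of a WebSocket frame from a single read.
def read_header(stream)
  header = stream.read(2)
  return nil if header.nil? || header.bytesize < 2

  # getbyte returns an Integer directly; unpack("C").first would allocate
  # an intermediate Array for the same result.
  first = header.getbyte(0)
  second = header.getbyte(1)

  {
    finished: (first & 0b1000_0000) != 0,
    opcode: first & 0b0000_1111,
    masked: (second & 0b1000_0000) != 0,
    length: second & 0b0111_1111, # 126 and 127 signal an extended length field
  }
end

# A final, unmasked text frame (opcode 0x1) with a 5-byte payload:
stream = StringIO.new("\x81\x05Hello".b)
header = read_header(stream)
payload = stream.read(header[:length])
```

The sketch ignores extended lengths and masking, which the later items in the list address.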

Types of Changes

Performance improvement.

Contribution

I added tests for my changes.
I tested my changes locally.
I agree to the Developer's Certificate of Origin 1.1.

samuel-williams-shopify · 2026-04-16T02:58:28Z

We independently verified the changes in this PR by re-applying each commit
one at a time from main, running a reproducible benchmark after every step,
and confirming the improvement before proceeding to the next.

Environment

Ruby 4.0.2 (arm64-darwin)
Benchmark: in-process StringIO-backed framer, median of 7 repetitions × 40,000 frames per scenario (200 frames for large payloads)
Primary metric: median µs/frame, averaged across four common scenarios
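
The benchmark script itself is not included in this thread. The following is only a sketch of how a "median µs/frame over N repetitions" measurement could be structured; the helper names and the simplified read loop are assumptions, and the frame counts are reduced so it runs quickly.

```ruby
require "stringio"

# Hypothetical helper: median of a list of numbers.
def median(values)
  sorted = values.sort
  middle = sorted.size / 2
  sorted.size.odd? ? sorted[middle] : (sorted[middle - 1] + sorted[middle]) / 2.0
end

# Times the block `repetitions` times and returns the median cost in
# microseconds per frame. The block receives the number of frames to read.
def median_microseconds_per_frame(repetitions:, frames:)
  samples = Array.new(repetitions) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield frames
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    elapsed * 1_000_000 / frames
  end
  median(samples)
end

# One unmasked text frame with a 13-byte payload, repeated back to back.
frame = "\x81\x0D".b + "hello, world!".b
data = frame * 1_000

result = median_microseconds_per_frame(repetitions: 7, frames: 1_000) do |count|
  stream = StringIO.new(data)
  count.times do
    header = stream.read(2)
    stream.read(header.getbyte(1) & 0x7F) # payload length from the low 7 bits
  end
end
```

Using the median rather than the mean keeps a single slow repetition (for example, one interrupted by garbage collection) from skewing the reported figure.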

Scenarios

Scenario	Payload	Length encoding	Masked
small unmasked	13 B	1 byte	no
small masked	13 B	1 byte	yes
medium unmasked	200 B	2 bytes (126)	no
medium masked	200 B	2 bytes (126)	yes

Large frames (70 kB, length=127 encoding) were also exercised but excluded from
the primary metric as they are dominated by payload I/O rather than parsing overhead.
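
For reference, the "Length encoding" column follows the payload length rules of RFC 6455, section 5.2: lengths up to 125 fit in the second header byte, 126 announces a 16-bit big-endian length, and 127 announces a 64-bit big-endian length. A small sketch, where `encode_header` is a hypothetical helper for unmasked final frames and not a library method:

```ruby
# Hypothetical helper: header bytes for an unmasked, final frame.
def encode_header(length, opcode: 0x1)
  first = 0b1000_0000 | opcode # FIN bit set, plus opcode
  if length <= 125
    [first, length].pack("CC")          # length in the second byte
  elsif length <= 0xFFFF
    [first, 126, length].pack("CCn")    # 126 marker + 16-bit big-endian length
  else
    [first, 127, length].pack("CCQ>")   # 127 marker + 64-bit big-endian length
  end
end

small = encode_header(13)       # 2-byte header
medium = encode_header(200)     # 4-byte header
large = encode_header(70_000)   # 10-byte header
```

Masked frames carry an additional 4-byte masking key after the length field, which is why the masked scenarios read more header bytes than their unmasked counterparts.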

Results

Step	Change	µs/frame	vs `main`
Baseline (`main`)	—	0.644	—
1	`parse_header`: `getbyte(0)` instead of `unpack("C").first`; integer comparisons instead of `Range#include?`	0.561	−12.8%
2	`Framer#read_frame`: read both header bytes in a single `stream.read(2)` call; pass second byte directly to `Frame.read`, eliminating a redundant read	0.449	−30.3%
3	`Frame.read`: `unpack1` instead of `.unpack(...).first`; combined length+mask read for masked extended frames (one 6-byte or 12-byte read instead of two separate reads)	0.433	−32.7%
4	`Connection#unpack_frames`: fast-path for the common single-frame case, avoiding `map` + `join`	0.431	−33.1%
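
To make step 3 concrete, here is a sketch of the combined read for masked frames. It is a simplified illustration under stated assumptions, not the merged code: `read_length_and_mask` is a hypothetical name, and short reads and error handling are omitted.

```ruby
require "stringio"

# Hypothetical helper. `length_code` is the low 7 bits of the second header
# byte of a masked frame. Returns [payload_length, mask].
def read_length_and_mask(stream, length_code)
  case length_code
  when 126
    buffer = stream.read(6)   # 2-byte length + 4-byte mask in one read
    [buffer.unpack1("n"), buffer.byteslice(2, 4)]
  when 127
    buffer = stream.read(12)  # 8-byte length + 4-byte mask in one read
    [buffer.unpack1("Q>"), buffer.byteslice(8, 4)]
  else
    [length_code, stream.read(4)] # length already known; only the mask remains
  end
end

stream = StringIO.new([200].pack("n") + "\x01\x02\x03\x04".b)
length, mask = read_length_and_mask(stream, 126)
```

`unpack1` returns the first unpacked value directly, so no intermediate Array is built as with `.unpack(...).first`.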

Per-scenario breakdown (baseline vs final)

Scenario	`main`	This PR	Δ
small unmasked	0.572 µs	0.368 µs	−35.6%
small masked	0.635 µs	0.429 µs	−32.4%
medium unmasked	0.654 µs	0.436 µs	−33.4%
medium masked	0.714 µs	0.482 µs	−32.5%

Notes

Step 2 is the dominant win (roughly half of the total improvement: 0.112 of the
0.213 µs/frame saved). Eliminating one stream.read call per frame, by reading the
2-byte header in a single call and passing the second byte directly into
Frame.read, is the single biggest change.
All improvements were consistent across payload sizes and mask/no-mask variants,
indicating the gains are in parsing overhead rather than payload handling.
The test suite was extended with 5 new cases (masked medium/large messages,
pre-read second_byte, reserved opcode rejection) and all 92 tests pass.
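
For readers unfamiliar with step 4, the single-frame fast path can be sketched as follows. This is an illustrative stand-in, not the merged `Connection#unpack_frames`; `join_payloads` and the Struct used for frames are assumptions.

```ruby
# Hypothetical stand-in: works on any objects that respond to `payload`.
def join_payloads(frames)
  # Fast path: a single-frame message needs no intermediate Array from `map`
  # and no copied String from `join`.
  return frames.first.payload if frames.size == 1

  frames.map(&:payload).join
end

frame_class = Struct.new(:payload) # illustrative frame type
single = join_payloads([frame_class.new("hi")])
multiple = join_payloads([frame_class.new("hello "), frame_class.new("world")])
```

One trade-off of this shape is that the fast path returns the frame's own payload String rather than a fresh copy, so callers that mutate the result would also mutate the frame.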

Co-authored-by: Claude <[email protected]>

samuel-williams-shopify reviewed Apr 16, 2026

Comment thread lib/protocol/websocket/framer.rb

samuel-williams-shopify reviewed Apr 16, 2026

Comment thread lib/protocol/websocket/framer.rb Outdated

samuel-williams-shopify force-pushed the perf branch from b75c0f7 to fbb59fb · April 16, 2026 05:21

Optimize frame reading.

5e52f03

Co-authored-by: Claude <[email protected]>

samuel-williams-shopify force-pushed the perf branch from fbb59fb to 5e52f03 · April 16, 2026 05:23

samuel-williams-shopify added 3 commits April 16, 2026 15:33

Add missing require.

362809c

Remove redundant Frame.parse_header.

3cc82e1

Move all read logic to Framer.

f9fb1e2

samuel-williams-shopify force-pushed the perf branch from 0aa290a to f9fb1e2 · April 16, 2026 08:40

samuel-williams-shopify merged commit ae76bd5 into socketry:main Apr 16, 2026
18 of 21 checks passed

ioquatix added this to the v0.21.0 milestone Apr 16, 2026

wtn deleted the perf branch April 16, 2026 14:49

samuel-williams-shopify merged 4 commits into socketry:main from wtn:perf