Add array subset-check benchmark across Ruby 3.3, 3.4, 4.0 by etagwerker · Pull Request #237 · fastruby/fast-ruby

etagwerker · 2026-06-28T21:44:49Z

What

Adds an Array benchmark for subset checks (is every element of a1 also in a2?), comparing five approaches across Ruby 3.3.10, 3.4.7, and 4.0.0:

(a1 - a2).empty?
(a1 & a2) == a1
(a1 & a2).size == a1.size
a1.all? { |e| a2.include?(e) }
a1.to_set.subset?(a2.to_set)
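For readers who want to try the five expressions side by side, here is a minimal sketch that wraps each one in a lambda. The constant `SUBSET_CHECKS`, its labels, and the sample arrays are hypothetical names chosen for illustration; they are not taken from the benchmark file in this PR.

```ruby
require "set"

# Hypothetical labels and constant name, for illustration only.
SUBSET_CHECKS = {
  "difference"        => ->(a1, a2) { (a1 - a2).empty? },
  "intersection =="   => ->(a1, a2) { (a1 & a2) == a1 },
  "intersection size" => ->(a1, a2) { (a1 & a2).size == a1.size },
  "all? + include?"   => ->(a1, a2) { a1.all? { |e| a2.include?(e) } },
  "Set#subset?"       => ->(a1, a2) { a1.to_set.subset?(a2.to_set) }
}.freeze

# Sample data (assumed): a1 is duplicate-free and every element is in a2.
a1 = [1, 2, 3]
a2 = [1, 2, 3, 4, 5]

# Run every approach on the same input and collect the answers by label.
results = SUBSET_CHECKS.transform_values { |check| check.call(a1, a2) }
results.each { |label, answer| puts "#{label}: #{answer}" }
```

The sample `a1` is kept duplicate-free on purpose: `Array#&` drops duplicates, so the two intersection-based variants would answer differently from the others if `a1` repeated an element.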

Background

This revisits the comparison from #125 by @gabteles (now closed). That PR had two problems the reviewers (@mblumtritt, @Arcovion) hinted at back in 2017:

A correctness bug: Set#subset? had its arguments reversed (a2.to_set.subset?(a1.to_set)), so it returned false while every other method returned true. It wasn't measuring the same operation.
No stable winner: the result is highly data-dependent.

This version fixes the Set arguments, adds an equivalence guard so all five approaches must agree before the benchmark runs, and reports results across three modern Ruby versions.
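The guard and the argument-order bug can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the variable names and sample arrays are hypothetical, and the real guard may be structured differently.

```ruby
require "set"

# Sample data (assumed): a1 is a duplicate-free, true subset of a2.
a1 = [1, 2, 3]
a2 = [1, 2, 3, 4, 5]

results = {
  difference:        (a1 - a2).empty?,
  intersection_eq:   (a1 & a2) == a1,
  intersection_size: (a1 & a2).size == a1.size,
  all_include:       a1.all? { |e| a2.include?(e) },
  set_subset:        a1.to_set.subset?(a2.to_set)
}

# Equivalence guard: refuse to benchmark unless every approach agrees.
unless results.values.uniq.size == 1
  raise "approaches disagree: #{results.inspect}"
end

# The reversed call asks whether a2 is a subset of a1, which is a
# different question and is false for this input.
reversed = a2.to_set.subset?(a1.to_set)
puts "correct order: #{results[:set_subset]}, reversed: #{reversed}"
# prints "correct order: true, reversed: false"
```

With a guard of this kind in place, a reversed call like the one described above would abort the run before any timing starts.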

Findings

(a1 - a2).empty? is the consistent winner across 3.3, 3.4, and 4.0 for the common case where a1 really is a subset.
a1.all? { include? } is data-dependent: it short-circuits on the first miss (so it wins when a1 is not a subset), but it's O(n*m) and degrades badly on large true subsets. The README entry documents this caveat.
Set#subset? improved dramatically in Ruby 4.0: it is ~6.8x slower than the fastest approach on 3.3/3.4 (dominated by to_set allocation) but only ~1.7x slower on 4.0. If you already hold Sets or check repeatedly, it scales best.
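The short-circuit behaviour and the "already hold Sets" case can be illustrated with a small sketch. The helper `visited_by_all` and the sample sizes are hypothetical; the counts below show how many elements `all?` visits, not timings.

```ruby
require "set"

a2 = (1..1_000).to_a

# Hypothetical helper: returns the answer from all? together with the
# number of elements it visited before returning.
def visited_by_all(a1, a2)
  visited = 0
  answer = a1.all? do |e|
    visited += 1
    a2.include?(e) # linear scan of a2 for each visited element
  end
  [answer, visited]
end

true_subset = (1..100).to_a
early_miss  = [0] + (1..99).to_a # first element is not in a2

subset_result = visited_by_all(true_subset, a2) # => [true, 100]
miss_result   = visited_by_all(early_miss, a2)  # => [false, 1]

# If both collections are already Sets, the to_set conversion is paid
# once up front rather than on every check.
s1 = true_subset.to_set
s2 = a2.to_set
prebuilt = s1.subset?(s2) # => true
```

On a true subset every element of a1 is visited and each visit scans a2, which is where the O(n*m) cost comes from; on an early miss the loop stops after one element.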

Notes

Benchmarks were run via rbenv on each version (benchmark-ips installed per version).
The README block shows the full output for 3.3.10 and the Comparison: summary for 3.4.7 and 4.0.0 to keep it readable.

🤖 Generated with Claude Code

Revisits the comparison from the closed #125 (gabteles), fixing the reversed Set#subset? arguments so every approach returns the same result (guarded by an equivalence check), and benchmarks across modern Ruby versions.

Findings:
- (a1 - a2).empty? is the consistent winner for true-subset inputs.
- a1.all? { include? } only wins when a1 is NOT a subset (short-circuits) and is O(n*m) on large true subsets.
- Set#subset? (incl. to_set) went from ~6.8x slower on 3.3/3.4 to ~1.7x slower on 4.0, where Set got much faster.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

etagwerker wants to merge 1 commit into main from array-subset-check-benchmark
