Skip to content

blob: make blob URLs cross-realm and allow file-backed Blob cloning#63103

Open
jimmywarting wants to merge 2 commits intonodejs:mainfrom
jimmywarting:blob
Open

blob: make blob URLs cross-realm and allow file-backed Blob cloning#63103
jimmywarting wants to merge 2 commits intonodejs:mainfrom
jimmywarting:blob

Conversation

@jimmywarting
Copy link
Copy Markdown

Detailed description:

  • Add a global native blob URL registry so blob: URLs created in one realm are resolvable in another (e.g. main → worker). Registry entries store the DataQueue and metadata, are mutex‑protected, and are cleaned up via Environment cleanup hooks to avoid teardown races.
  • Extend the DataQueue / Entry reader API to accept an optional Environment*, and update FdEntry reader creation so file handles and reader resources are created in the target environment/realm. This enables structured cloning / transferring of file‑backed Blobs across workers.
  • Remove the JS-side prohibition on cloning file‑backed blobs so structuredClone() succeeds for file‑backed Blobs.
  • Wire up Blob native bindings to use the new registry when storing/getting/revoking blob URLs; ensure Blob::GetDataObject constructs the Blob in the receiving environment.
  • Add/update regression tests covering URL resolution in workers and cloning of file‑backed Blobs.

Files changed (high level):

  • node_blob.cc: add global blob URL registry, Store/Get/Revoke helpers, env cleanup hook integration, Blob::StoreDataObject/GetDataObject/RevokeObjectURL wiring.
  • queue.h, queue.cc: add optional Environment* parameter to Entry::get_reader, propagate env through DataQueue, implement FdEntry::ReaderImpl::Create(entry, env) to open file handles in the reader env.
  • blob.js: remove kNotCloneable flag for file‑backed Blobs so they can be cloned/ transferred.
  • test-blob-url-worker.js: new/updated regression test validating resolveObjectURL in a worker for a URL created on the main thread.
  • test-blob-file-backed.js: update test to assert structuredClone() succeeds for file‑backed Blobs and that cloned content/metadata match.

Rationale and tests:

  • These changes fix two related issues: blob URLs that only worked per‑realm, and file‑backed Blobs that were prevented from being cloned/transferred. Both were blocking cross‑realm Blob usage (main ↔ worker).

Detailed description:
- Add a global native blob URL registry so blob: URLs created in one realm are resolvable in another (e.g. main → worker). Registry entries store the DataQueue and metadata, are mutex‑protected, and are cleaned up via Environment cleanup hooks to avoid teardown races.
- Extend the DataQueue / Entry reader API to accept an optional `Environment*`, and update `FdEntry` reader creation so file handles and reader resources are created in the target environment/realm. This enables structured cloning / transferring of file‑backed Blobs across workers.
- Remove the JS-side prohibition on cloning file‑backed blobs so `structuredClone()` succeeds for file‑backed Blobs.
- Wire up Blob native bindings to use the new registry when storing/getting/revoking blob URLs; ensure Blob::GetDataObject constructs the Blob in the receiving environment.
- Add/update regression tests covering URL resolution in workers and cloning of file‑backed Blobs.

Files changed (high level):
- **node_blob.cc**: add global blob URL registry, Store/Get/Revoke helpers, env cleanup hook integration, Blob::StoreDataObject/GetDataObject/RevokeObjectURL wiring.
- **queue.h**, **queue.cc**: add optional `Environment*` parameter to `Entry::get_reader`, propagate env through DataQueue, implement `FdEntry::ReaderImpl::Create(entry, env)` to open file handles in the reader env.
- **blob.js**: remove `kNotCloneable` flag for file‑backed Blobs so they can be cloned/ transferred.
- **test-blob-url-worker.js**: new/updated regression test validating `resolveObjectURL` in a worker for a URL created on the main thread.
- **test-blob-file-backed.js**: update test to assert `structuredClone()` succeeds for file‑backed Blobs and that cloned content/metadata match.

Rationale and tests:
- These changes fix two related issues: blob URLs that only worked per‑realm, and file‑backed Blobs that were prevented from being cloned/transferred. Both were blocking cross‑realm Blob usage (main ↔ worker). All added/updated tests pass in the local test harness used during development.
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels May 3, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 90.56604% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.63%. Comparing base (b2f6aa3) to head (b48ce98).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
src/node_blob.cc 90.47% 1 Missing and 3 partials ⚠️
src/dataqueue/queue.cc 90.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #63103      +/-   ##
==========================================
- Coverage   89.65%   89.63%   -0.02%     
==========================================
  Files         712      712              
  Lines      220512   220841     +329     
  Branches    42289    42374      +85     
==========================================
+ Hits       197690   197953     +263     
- Misses      14663    14706      +43     
- Partials     8159     8182      +23     
Files with missing lines Coverage Δ
lib/internal/blob.js 99.44% <100.00%> (-0.01%) ⬇️
src/dataqueue/queue.cc 68.29% <90.00%> (+0.25%) ⬆️
src/node_blob.cc 70.00% <90.47%> (-3.43%) ⬇️

... and 55 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@aduh95
Copy link
Copy Markdown
Contributor

aduh95 commented May 4, 2026

Can you address the linter failure and fix the commit message please?

Comment thread src/node_blob.cc
Comment on lines +116 to +117
static std::mutex blob_url_registry_mutex;
static std::unordered_map<std::string, BlobURLEntry> blob_url_registry;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good use case for ExclusiveAccess from node_mutex.h

Comment thread src/node_blob.cc
std::move(type)};

std::string* uuid_copy = new std::string(uuid);
env->AddCleanupHook(BlobURLCleanupHook, uuid_copy);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really the right lifetime? i.e. a blob URL becomes inaccessible once its creating environment exits? That would mean, for example, that when a worker exits (and that could also be e.g. a crash due to an uncaught exception, i.e. something unpredictable), its parent environment would not be able to use the URL anymore

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • What the spec says:

    • There is one global blob URL store mapping "blob:..." → (blob object, creating environment).
    • createObjectURL records the current environment; revokeObjectURL removes the mapping.
    • When the creating environment is unloaded, entries it created must be removed.
    • Resolving a blob URL must check that the requester is authorized (same partition/storage context).
  • What my PR currently dose (src/node_blob.cc):

    • Has a process-wide blob_url_registry that stores the blob backing (data, length, type).
    • Registers an environment cleanup hook to erase URLs created by that env on teardown.
    • Blob backing uses shared_ptr so the raw data survives while any Blob references exist.
  • Differences / gaps:

    • Behavior is mostly spec-like: URLs are revoked when the creator env exits and backing data is refcounted.
    • Missing explicit creator-record and authorization check: entries do not store the creator Environment*, and GetDataObject does not enforce the spec’s same-partition authorization check.

What do you think we should do?

@jimmywarting jimmywarting changed the title Make blob: URLs cross‑realm and allow file‑backed Blob cloning blob: make blob URLs cross-realm and allow file-backed Blob cloning May 8, 2026
Reformat the get_reader method declaration in EntryImpl to split the parameter list across lines and align the Environment* env parameter. This is a non-functional style change to improve readability and comply with line-length/coding-style preferences.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants