53 commits
76d837f
feat: add CDP pipeline, cdpmonitor, and wire into API service
archandatta Apr 2, 2026
5c0f7ae
refactor: rename Pipeline to CaptureSession and delete pipeline.go
archandatta Apr 2, 2026
ad2dd63
review: fix request context leak in StartCapture, add missing categor…
archandatta Apr 2, 2026
365787c
review: clean up syntax
archandatta Apr 6, 2026
d086571
review: move CategoryFor to cdpmonitor package
archandatta Apr 7, 2026
3339cac
review: internalize ring buffer and file writer in CaptureSession con…
archandatta Apr 7, 2026
64e67b1
review: write logs under /var/log/kernel and ensure dir exists
archandatta Apr 7, 2026
048f9c6
review: add capture config to /events/start and OpenAPI spec
archandatta Apr 7, 2026
e57e58a
review: fix lifecycle context, stop-before-reset ordering, seq reset,…
archandatta Apr 7, 2026
886b893
fix: oapi version
archandatta Apr 7, 2026
06d2470
fix: Shutdown cancels context outside lock, racing with StartCapture
archandatta Apr 7, 2026
6bb5402
review: validate DetailLevel with generated Valid
archandatta Apr 7, 2026
cb45a55
fix: reset ring buffer on session restart to unstrand existing readers
archandatta Apr 7, 2026
02ee74e
chore: remove dead categoryFor function
archandatta Apr 7, 2026
3523994
review: guard zero-capacity ring buffer and fix reader reset after bu…
archandatta Apr 7, 2026
cb1a1a7
review: use t.TempDir in test helper, map-based ValidCategory, avoid …
archandatta Apr 7, 2026
c9e78a3
review: add captureConfigFrom and StartCapture/StopCapture handler tests
archandatta Apr 7, 2026
1cddf53
feat: refactor events API to resource-style capture sessions
archandatta Apr 8, 2026
c6dd362
review: update file writer to be internal to the package
archandatta Apr 9, 2026
ff8bddf
review: tighten to `Write(filename string, data []byte) error`
archandatta Apr 9, 2026
a8bdeaf
review: update panic -> error
archandatta Apr 9, 2026
8f88ed0
review: update oapi and remove detail level
archandatta Apr 9, 2026
3eaacb3
review: remove url
archandatta Apr 9, 2026
214858a
chore: restore server/api on branch
archandatta Apr 9, 2026
b06132a
review: harden captureConfigFromOAPI and clarify stop comment
archandatta Apr 9, 2026
8ebb5e3
review: unexport ringBuffer and drop AllCategories wrapper
archandatta Apr 9, 2026
4bcba48
review: replace event producers with cdp monitor in stop description
archandatta Apr 9, 2026
1295232
remove test line
archandatta Apr 9, 2026
a4fd0d6
review: update uuid to cuid2
archandatta Apr 10, 2026
d6b348b
Merge branch 'main' into archand/kernel-1116/cdp-pipeline
archandatta Apr 10, 2026
3da65c3
Merge branch 'main' into archand/kernel-1116/cdp-pipeline
archandatta Apr 10, 2026
a7b2e54
fix naming
archandatta Apr 10, 2026
85d570a
feat: add cdpmonitor foundation — types, util, computed state machine…
archandatta Apr 13, 2026
bed53f8
self review
archandatta Apr 13, 2026
d73793c
review: cursor feedback
archandatta Apr 13, 2026
348243a
[kernel-1116] CDP monitor core (#214)
archandatta Apr 13, 2026
a62c403
review: update types and sensitive interaction data
archandatta Apr 14, 2026
0605227
feat: add two-layer CDP decode, protocol-faithful types, then monitor…
archandatta Apr 14, 2026
33c07d3
Merge branch 'main' into archand/kernel-1116/cdp-pipeline
archandatta Apr 21, 2026
9c6e066
review: clean up monitor health and types
archandatta Apr 22, 2026
5dd9273
review: add chromium version
archandatta Apr 22, 2026
7550bc1
review: remove dead code
archandatta Apr 22, 2026
f3d3166
Merge branch 'archand/kernel-1116/cdp-pipeline' into archand/kernel-1…
archandatta Apr 22, 2026
1cfbc5e
fix injection script
archandatta Apr 22, 2026
2e0c4a0
review: remove sensitive data from inject
archandatta Apr 22, 2026
bf4b04c
review
archandatta Apr 22, 2026
90a3ae1
review: sensitive data audit interaction.js
archandatta Apr 22, 2026
4feef7e
review: reconnect failure leaks goroutines and deadlocks Stop
archandatta Apr 22, 2026
8e94162
Merge branch 'main' into archand/kernel-1116/cdp-foundation
archandatta Apr 22, 2026
5465e59
review: create cdpMonitorController
archandatta Apr 22, 2026
7c4c654
review: add ctx to monitor and update comment
archandatta Apr 23, 2026
bca495b
review: lift lifeMu to dispatch level to make ctx handling explicit
archandatta Apr 23, 2026
fd4d4d3
review: add readme for cdp monitor
archandatta Apr 23, 2026
13 changes: 11 additions & 2 deletions server/cmd/api/api/api.go
@@ -4,6 +4,7 @@ import (
"context"
"errors"
"fmt"
"log/slog"
"os"
"os/exec"
"sync"
@@ -20,6 +21,14 @@ import (
"github.com/kernel/kernel-images/server/lib/scaletozero"
)

type cdpMonitorController interface {
Start(ctx context.Context) error
Stop()
IsRunning() bool
}

var _ cdpMonitorController = (*cdpmonitor.Monitor)(nil)

type ApiService struct {
// defaultRecorderID is used whenever the caller doesn't specify an explicit ID.
defaultRecorderID string
@@ -73,7 +82,7 @@ type ApiService struct {

// CDP event pipeline and cdpMonitor.
captureSession *events.CaptureSession
cdpMonitor *cdpmonitor.Monitor
cdpMonitor cdpMonitorController
monitorMu sync.Mutex
lifecycleCtx context.Context
lifecycleCancel context.CancelFunc
@@ -103,7 +112,7 @@ func New(
return nil, fmt.Errorf("captureSession cannot be nil")
}

mon := cdpmonitor.New(upstreamMgr, captureSession.Publish, displayNum)
mon := cdpmonitor.New(upstreamMgr, captureSession.Publish, displayNum, slog.Default())
ctx, cancel := context.WithCancel(context.Background())

return &ApiService{
7 changes: 7 additions & 0 deletions server/cmd/api/api/capture_session_test.go
@@ -248,5 +248,12 @@ func newTestService(t *testing.T, mgr recorder.RecordManager) *ApiService {
t.Helper()
svc, err := New(mgr, newMockFactory(), newTestUpstreamManager(), scaletozero.NewNoopController(), newMockNekoClient(t), newCaptureSession(t), 0)
require.NoError(t, err)
svc.cdpMonitor = &stubCdpMonitor{}
return svc
}

type stubCdpMonitor struct{}

func (s *stubCdpMonitor) Start(_ context.Context) error { return nil }
func (s *stubCdpMonitor) Stop() {}
func (s *stubCdpMonitor) IsRunning() bool { return false }
79 changes: 79 additions & 0 deletions server/lib/cdpmonitor/README.md
@@ -0,0 +1,79 @@
# CDP Monitor

The monitor is the browser-facing layer of the kernel browser logging pipeline. It connects to Chrome's DevTools endpoint, tracks all page sessions via CDP's `Target.setAutoAttach`, and converts raw CDP notifications into typed `events.Event` values for downstream consumers.

## Overview

`cdpmonitor` manages a Chrome DevTools Protocol (CDP) WebSocket connection to a running Chrome browser. It subscribes to CDP events across all attached tabs, translates them into structured `events.Event` values, and publishes them via a caller-supplied `PublishFunc`. It also derives synthetic events from sequences of CDP events and takes screenshots on significant page activity.

Chrome can restart independently of the monitor. When that happens, `UpstreamProvider` pushes a new DevTools URL and the monitor reconnects automatically, emitting lifecycle events so consumers can track continuity.

## Event taxonomy

**CDP-derived** (1-to-1 with a CDP notification): `console_log`, `console_error`, `network_request`, `network_response`, `network_loading_failed`, `navigation`, `dom_content_loaded`, `page_load`, `layout_shift`

**Computed** (inferred from sequences of CDP events): `network_idle` (fires when in-flight requests drop to zero), `layout_settled` (1 s after `page_load` with no intervening layout shifts), `navigation_settled` (fires once `dom_content_loaded`, `network_idle`, and `layout_settled` have all fired for the same navigation).

**Interaction** (fired by `interaction.js` via `Runtime.bindingCalled`): `interaction_click`, `interaction_key`, `scroll_settled`

**Monitor lifecycle** (emitted by the monitor itself, not by Chrome): `screenshot`, `monitor_disconnected`, `monitor_reconnected`, `monitor_reconnect_failed`, `monitor_init_failed`
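
The `network_idle` rule from the computed group above (fire when in-flight requests drop to zero) can be sketched as a small counter. This is an illustrative sketch only, not the code in `computed.go`: the `idleTracker` type, its methods, and the `onIdle` callback are hypothetical names, and the real state machine may track more state, such as per-navigation resets.

```go
package main

import (
	"fmt"
	"sync"
)

// idleTracker is a hypothetical stand-in for the network_idle part of the
// computed state machine; it is not the type used in computed.go.
type idleTracker struct {
	mu       sync.Mutex
	inFlight int
	onIdle   func() // called each time in-flight requests drop to zero
}

// requestStarted records a new in-flight request.
func (t *idleTracker) requestStarted() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.inFlight++
}

// requestDone records a finished or failed request and fires onIdle when
// the in-flight count reaches zero.
func (t *idleTracker) requestDone() {
	t.mu.Lock()
	if t.inFlight == 0 {
		t.mu.Unlock()
		return // unmatched completion; ignore it
	}
	t.inFlight--
	idle := t.inFlight == 0
	t.mu.Unlock()
	if idle && t.onIdle != nil {
		t.onIdle() // invoked outside the lock so the callback cannot deadlock on mu
	}
}

func main() {
	fired := 0
	t := &idleTracker{onIdle: func() { fired++ }}
	t.requestStarted()
	t.requestStarted()
	t.requestDone()
	t.requestDone()
	fmt.Println("network_idle fired", fired, "time(s)")
}
```

Calling the callback after releasing the mutex is a design choice in this sketch: it keeps the publish path from running while the counter lock is held.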

## Responsibilities

| Concern | Where |
| --- | --- |
| WebSocket lifecycle (connect, read, reconnect) | `monitor.go` |
| CDP domain setup per session | `domains.go` |
| Event translation (CDP params to `events.Event`) | `handlers.go` |
| Synthetic event state machines | `computed.go` |
| Screenshot capture via ffmpeg | `screenshot.go` |
| CDP protocol types | `cdp_proto.go`, `types.go` |
| Interaction tracking injected into the page | `interaction.js` |
| Body/MIME capture sizing and text truncation helpers | `util.go` |

## Internals

### Reconnect model

`subscribeToUpstream` listens to `UpstreamProvider.Subscribe()` for new DevTools URLs. On each URL change (indicating Chrome restarted), `handleUpstreamRestart` tears down the existing connection, dials the new URL with capped-exponential backoff (250 ms → 500 ms → 1 s → 2 s, up to 10 attempts), then restarts `readLoop` and re-initializes all CDP sessions. `restartMu` serializes concurrent restart signals so rapid Chrome restarts do not produce overlapping reconnects.
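
The backoff schedule described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in `monitor.go`: `backoffDelay`, `dialWithBackoff`, and `simulate` are hypothetical names, the wait is assumed to happen after each failed attempt, and the injected `sleep` function exists only so the schedule can be exercised without real waiting. A real implementation would presumably also stop early when the lifecycle context is cancelled.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	initialDelay = 250 * time.Millisecond
	maxDelay     = 2 * time.Second
	maxAttempts  = 10
)

// backoffDelay returns the wait after failed attempt number `attempt`
// (0-based): 250 ms, 500 ms, 1 s, then 2 s for every later attempt.
func backoffDelay(attempt int) time.Duration {
	d := initialDelay
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

// dialWithBackoff tries dial up to maxAttempts times and reports how many
// attempts were made. sleep is injected so callers can fake the waiting.
func dialWithBackoff(dial func() error, sleep func(time.Duration)) (int, error) {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = dial(); err == nil {
			return attempt + 1, nil
		}
		if attempt < maxAttempts-1 {
			sleep(backoffDelay(attempt))
		}
	}
	return maxAttempts, fmt.Errorf("reconnect failed after %d attempts: %w", maxAttempts, err)
}

// simulate runs dialWithBackoff against a dialer that fails the first
// `failures` times and sums the delays instead of sleeping.
func simulate(failures int) (attempts int, waited time.Duration, err error) {
	calls := 0
	dial := func() error {
		calls++
		if calls <= failures {
			return errors.New("connection refused")
		}
		return nil
	}
	attempts, err = dialWithBackoff(dial, func(d time.Duration) { waited += d })
	return attempts, waited, err
}

func main() {
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Printf("after failed attempt %d wait %v\n", attempt, backoffDelay(attempt))
	}
	attempts, waited, err := simulate(3)
	fmt.Println("attempts:", attempts, "total wait:", waited, "err:", err)
}
```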

### Goroutines

| Goroutine | Lifetime | Tracked by |
| --- | --- | --- |
| `readLoop` | one per WebSocket connection | `done` channel |
| `subscribeToUpstream` | same as `lifecycleCtx` | `asyncWg` |
| `sweepPendingRequests` | same as `lifecycleCtx` | `asyncWg` |
| `initSession` | short-lived, one per connect or reconnect | `asyncWg` |
| `attachExistingTargets` wrapper | short-lived, one per existing target on reconnect | `asyncWg` |
| `enableDomains` + `injectScript` | short-lived, one per target attach | `asyncWg` |
| `fetchResponseBody` | one per completed network request | `asyncWg` |
| `captureScreenshot` | one per screenshot trigger | `asyncWg` |

`Stop()` cancels `lifecycleCtx`, waits for `readLoop` via `done`, then waits for all other goroutines via `asyncWg` before closing the connection.
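
The shutdown order above can be sketched with plain standard-library primitives. This is a simplified illustration, not the monitor's actual `Stop()`: `miniMonitor`, `start`, `stop`, and the recorded step strings are hypothetical, and the two goroutines merely stand in for `readLoop` and one `asyncWg`-tracked helper.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// miniMonitor is a hypothetical, stripped-down stand-in for the monitor.
type miniMonitor struct {
	cancel  context.CancelFunc
	done    chan struct{}  // closed when the read loop exits
	asyncWg sync.WaitGroup // tracks every other goroutine

	mu    sync.Mutex
	steps []string // records the order of shutdown steps for the demo
}

func (m *miniMonitor) record(step string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.steps = append(m.steps, step)
}

// start launches one read-loop stand-in and one tracked helper goroutine.
func start() *miniMonitor {
	ctx, cancel := context.WithCancel(context.Background())
	m := &miniMonitor{cancel: cancel, done: make(chan struct{})}

	go func() { // stands in for readLoop
		defer close(m.done)
		<-ctx.Done()
		m.record("readLoop exited")
	}()

	m.asyncWg.Add(1)
	go func() { // stands in for a helper such as sweepPendingRequests
		defer m.asyncWg.Done()
		<-ctx.Done()
		m.record("helper exited")
	}()
	return m
}

// stop follows the documented order: cancel, wait for the read loop,
// wait for the remaining goroutines, and only then close the connection.
func (m *miniMonitor) stop() []string {
	m.cancel()
	<-m.done
	m.asyncWg.Wait()
	m.record("connection closed")

	m.mu.Lock()
	defer m.mu.Unlock()
	return append([]string(nil), m.steps...)
}

func main() {
	for _, step := range start().stop() {
		fmt.Println(step)
	}
}
```

Closing the connection last means no goroutine in this sketch can still be using it when it goes away; the relative order of the two goroutine exits is not deterministic.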

### Lock ordering

Locks must be acquired left to right. Never acquire a lock while holding one that appears further to its right.

```
restartMu -> lifeMu -> pendReqMu -> computed.mu -> pendMu -> sessionsMu
```

`bindingRateMu` is independent of this ordering and is always acquired alone.
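
As a sketch of what nested acquisition looks like when it follows this order, the hypothetical helper below takes `pendReqMu` before `sessionsMu`. The `registry` type, its map shapes, and `countOrphans` are invented for illustration; the source does not say that the monitor ever holds these two locks together.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical struct holding two of the locks named above.
type registry struct {
	pendReqMu       sync.Mutex
	pendingRequests map[string]string // requestID -> sessionID (assumed shape)

	sessionsMu sync.Mutex
	sessions   map[string]bool // sessionID -> attached (assumed shape)
}

// countOrphans returns how many pending requests belong to sessions that
// are no longer attached. It needs both maps, so it takes pendReqMu first
// and sessionsMu second, matching the documented left-to-right order.
func (r *registry) countOrphans() int {
	r.pendReqMu.Lock()
	defer r.pendReqMu.Unlock()

	r.sessionsMu.Lock() // pendReqMu is already held; sessionsMu sits further right
	defer r.sessionsMu.Unlock()

	n := 0
	for _, sessionID := range r.pendingRequests {
		if !r.sessions[sessionID] {
			n++
		}
	}
	return n
}

func main() {
	r := &registry{
		pendingRequests: map[string]string{"req-1": "s1", "req-2": "s2"},
		sessions:        map[string]bool{"s1": true},
	}
	fmt.Println("orphaned requests:", r.countOrphans())
}
```

Because the deferred unlocks run in reverse, the locks are released right to left, which is the usual companion to a fixed acquisition order.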

| Lock | Protects |
| --- | --- |
| `restartMu` | Serializes `handleUpstreamRestart` to prevent overlapping reconnects from rapid Chrome restarts |
| `lifeMu` | `conn`, `lifecycleCtx`, `cancel`, `done`, `readReady` -- all fields that change during Start / Stop / reconnect |
| `pendReqMu` | `pendingRequests` (requestId -> `networkReqState`): in-flight network requests accumulating request/response metadata until `loadingFinished` |
| `computed.mu` | All `computedState` fields: counters and timers for the `network_idle`, `layout_settled`, and `navigation_settled` state machines |
| `pendMu` | `pending` (id -> reply channel): in-flight CDP commands waiting for a response from Chrome |
| `sessionsMu` | `sessions` (sessionID -> `targetInfo`): the set of currently attached CDP targets (tabs, iframes, workers) |
| `bindingRateMu` | `bindingLastSeen` (sessionID:eventType -> time): rate-limit state for `__kernelEvent` binding calls |

Fields that need no mutex use `sync/atomic`: `nextID`, `mainSessionID`, `running`, `lastScreenshotAt`, `screenshotInFlight`.
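
To illustrate how an atomic counter and a mutex-guarded map can work together, here is a hedged sketch of command and reply correlation in the style of `nextID` and `pending`. The `commander` type, its methods, and the `string` reply type are assumptions made for the example; the real monitor's types and signatures are not shown in this document.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// commander is a hypothetical reduction of the monitor's command bookkeeping.
type commander struct {
	nextID atomic.Int64 // allocated without a lock (requires Go 1.19+)

	pendMu  sync.Mutex
	pending map[int64]chan string // command id -> reply channel
}

func newCommander() *commander {
	return &commander{pending: make(map[int64]chan string)}
}

// register allocates a fresh command id and parks a buffered reply channel
// for it under pendMu.
func (c *commander) register() (int64, <-chan string) {
	id := c.nextID.Add(1)
	ch := make(chan string, 1)
	c.pendMu.Lock()
	c.pending[id] = ch
	c.pendMu.Unlock()
	return id, ch
}

// resolve hands a reply to the waiter for id, if there is one, and reports
// whether a waiter was found.
func (c *commander) resolve(id int64, reply string) bool {
	c.pendMu.Lock()
	ch, ok := c.pending[id]
	delete(c.pending, id)
	c.pendMu.Unlock()
	if ok {
		ch <- reply // buffered with capacity 1, so this send never blocks
	}
	return ok
}

func main() {
	c := newCommander()
	id1, reply1 := c.register()
	id2, reply2 := c.register()

	// Replies may arrive in any order; the id routes each to its waiter.
	c.resolve(id2, "second")
	c.resolve(id1, "first")
	fmt.Println(id1, <-reply1)
	fmt.Println(id2, <-reply2)
}
```

Removing the entry before sending means a duplicate reply for the same id finds no waiter and is dropped rather than blocking.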

### WebSocket concurrency

`coder/websocket` guarantees that one `Read` and one `Write` can safely run concurrently on the same connection. `readLoop` is the sole reader. All writes go through `send`, which calls `conn.Write` directly -- `conn.Write` is internally serialized by the library, so no external write mutex is needed.