Commit 1898b5f

More metrics documentation
1 parent 0a6da58 commit 1898b5f

3 files changed, +168 -3 lines changed
docs/insights/metrics.mdx

Lines changed: 6 additions & 0 deletions
@@ -9,6 +9,12 @@ In the Trigger.dev dashboard we have built-in dashboards and you can create your

 Metrics dashboards are powered by [TRQL queries](/insights/query) with widgets that can be displayed as charts, tables, or single values. They automatically refresh to show the latest data.

+### Available metrics data
+
+Trigger.dev automatically collects system metrics (CPU, memory, disk, network) and Node.js runtime metrics (event loop, heap) for all deployed tasks -- no configuration needed. You can also create custom metrics using the `otel.metrics` API from the SDK.
+
+All of this data is available in the `metrics` table for use in dashboard widgets. See [Logging, tracing & metrics](/logging#metrics) for the full list of automatic metrics and how to create custom ones, or the [Query page](/insights/query#metrics-table-columns) for the `metrics` table schema.
+
 ![The built-in Metrics dashboard](/images/metrics-built-in.png)

 ### Visualization types

docs/insights/query.mdx

Lines changed: 37 additions & 1 deletion
@@ -6,7 +6,43 @@ description: "Query allows you to write custom queries against your data using T
 ### Available tables

 - `runs`: contains all task run data including status, timing, costs, and metadata
-- `metrics`: contains metrics data for your runs including CPU, memory, and your custom metrics.
+- `metrics`: contains metrics data for your runs including CPU, memory, and your custom metrics
+
+### `metrics` table columns
+
+| Column | Type | Description |
+| :--- | :--- | :--- |
+| `metric_name` | string | Metric identifier (e.g., `process.cpu.utilization`) |
+| `metric_type` | string | `gauge`, `sum`, or `histogram` |
+| `value` | number | The observed value |
+| `bucket_start` | datetime | 10-second aggregation bucket start time |
+| `run_id` | string | Associated run ID |
+| `task_identifier` | string | Task slug |
+| `attempt_number` | number | Attempt number |
+| `machine_id` | string | Machine that produced the metric |
+| `machine_name` | string | Machine preset (e.g., `small-1x`) |
+| `worker_version` | string | Worker version |
+| `environment_type` | string | `PRODUCTION`, `STAGING`, `DEVELOPMENT`, `PREVIEW` |
+| `attributes` | json | Raw JSON attributes for custom data |
+
+See [Logging, tracing & metrics](/logging#automatic-system-and-runtime-metrics) for the full list of automatically collected metrics and how to create custom metrics.
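The `bucket_start` column is described as the start of a 10-second aggregation bucket. As a rough illustration of how a timestamp could map onto such a bucket, here is a minimal local sketch. It is not Trigger.dev's implementation: `bucketStart` is a hypothetical helper, and alignment of buckets to the Unix epoch is an assumption the source does not confirm.

```typescript
// Hypothetical helper: floor a timestamp to the start of its 10-second bucket.
// Assumption: buckets are aligned to the Unix epoch (not stated in the docs).
const BUCKET_MS = 10_000;

function bucketStart(timestamp: Date): Date {
  return new Date(Math.floor(timestamp.getTime() / BUCKET_MS) * BUCKET_MS);
}

const observedAt = new Date("2025-01-01T12:00:17.250Z");
console.log(bucketStart(observedAt).toISOString()); // 2025-01-01T12:00:10.000Z
```

Under this assumption, every sample observed between 12:00:10.000 and 12:00:19.999 would share one `bucket_start` value.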
+
+### `prettyFormat()`
+
+Use `prettyFormat()` to format metric values for display:
+
+```sql
+SELECT
+  timeBucket(),
+  prettyFormat(avg(value), 'bytes') AS avg_memory
+FROM metrics
+WHERE metric_name = 'process.memory.usage'
+GROUP BY timeBucket
+ORDER BY timeBucket
+LIMIT 1000
+```
+
+Available format types: `bytes`, `percent`, `duration`, `durationSeconds`, `quantity`, `costInDollars`.

 ## Using the Query dashboard

docs/logging.mdx

Lines changed: 125 additions & 2 deletions
@@ -1,6 +1,6 @@
 ---
-title: "Logging and tracing"
-description: "How to use the built-in logging and tracing system."
+title: "Logging, tracing & metrics"
+description: "How to use the built-in logging, tracing, and metrics system."
 ---

 ![The run log](/images/run-log.png)
@@ -77,3 +77,126 @@ export const customTrace = task({
   },
 });
 ```
+
+## Metrics
+
+Trigger.dev collects system and runtime metrics automatically for deployed tasks, and provides an API for recording custom metrics using OpenTelemetry.
+
+You can view metrics in the [Metrics dashboards](/insights/metrics), query them with [TRQL](/insights/query), and export them to external services via [telemetry exporters](/config/config-file#telemetry).
+
+### Custom metrics API
+
+Import `otel` from `@trigger.dev/sdk` and use the standard OpenTelemetry Metrics API to create custom instruments.
+
+Create instruments **at module level** (outside the task `run` function) so they are reused across runs:
+
+```ts /trigger/metrics.ts
+import { task, logger, otel } from "@trigger.dev/sdk";
+
+// Create a meter — instruments are created once at module level
+const meter = otel.metrics.getMeter("my-app");
+
+const itemsProcessed = meter.createCounter("items.processed", {
+  description: "Total number of items processed",
+  unit: "items",
+});
+
+const itemDuration = meter.createHistogram("item.duration", {
+  description: "Time spent processing each item",
+  unit: "ms",
+});
+
+const queueDepth = meter.createUpDownCounter("queue.depth", {
+  description: "Current queue depth",
+  unit: "items",
+});
+
+export const processQueue = task({
+  id: "process-queue",
+  run: async (payload: { items: string[] }) => {
+    queueDepth.add(payload.items.length);
+
+    for (const item of payload.items) {
+      const start = performance.now();
+
+      // ... process item ...
+
+      const elapsed = performance.now() - start;
+
+      itemsProcessed.add(1, { "item.type": "order" });
+      itemDuration.record(elapsed, { "item.type": "order" });
+      queueDepth.add(-1);
+    }
+
+    logger.info("Queue processed", { count: payload.items.length });
+  },
+});
+```
+
+#### Available instrument types
+
+| Instrument | Method | Use case |
+| :--- | :--- | :--- |
+| Counter | `meter.createCounter()` | Monotonically increasing values (items processed, requests sent) |
+| Histogram | `meter.createHistogram()` | Distributions of values (durations, sizes) |
+| UpDownCounter | `meter.createUpDownCounter()` | Values that go up and down (queue depth, active connections) |
+
+All instruments accept optional attributes when recording values. Attributes let you break down metrics by dimension (e.g., by item type, status, or region).
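To make "break down by dimension" concrete, the sketch below models a counter in memory and keeps one running total per distinct attribute set. It is only an illustration of the idea, not the OpenTelemetry SDK or Trigger.dev's storage: `TinyCounter` and `seriesKey` are hypothetical names introduced here.

```typescript
// Illustrative in-memory model, NOT the OpenTelemetry implementation.
type Attributes = Record<string, string | number | boolean>;

class TinyCounter {
  private series = new Map<string, number>();

  // Build a stable key from the attribute set so that key order does not matter.
  private seriesKey(attributes: Attributes): string {
    const sorted = Object.entries(attributes).sort(([a], [b]) => a.localeCompare(b));
    return JSON.stringify(sorted);
  }

  // Add to the running total for this exact attribute set.
  add(value: number, attributes: Attributes = {}): void {
    const key = this.seriesKey(attributes);
    this.series.set(key, (this.series.get(key) ?? 0) + value);
  }

  // Read the running total for one attribute set (0 if never recorded).
  get(attributes: Attributes = {}): number {
    return this.series.get(this.seriesKey(attributes)) ?? 0;
  }
}

const itemsProcessedModel = new TinyCounter();
itemsProcessedModel.add(1, { "item.type": "order" });
itemsProcessedModel.add(1, { "item.type": "order" });
itemsProcessedModel.add(1, { "item.type": "refund" });

console.log(itemsProcessedModel.get({ "item.type": "order" })); // 2
console.log(itemsProcessedModel.get({ "item.type": "refund" })); // 1
```

Each distinct combination of attribute values becomes its own series in this model, which is why a small, bounded set of values (types, statuses, regions) tends to work better as attributes than unbounded ones such as user IDs.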
+
+### Automatic system and runtime metrics
+
+Trigger.dev automatically collects the following metrics for deployed tasks. No configuration is needed.
+
+| Metric name | Type | Unit | Description |
+| :--- | :--- | :--- | :--- |
+| `process.cpu.utilization` | gauge | ratio | Process CPU usage (0-1) |
+| `process.cpu.time` | counter | seconds | CPU time consumed |
+| `process.memory.usage` | gauge | bytes | Process memory usage |
+| `system.memory.usage` | gauge | bytes | System memory usage |
+| `system.memory.utilization` | gauge | ratio | System memory utilization (0-1) |
+| `system.network.io` | counter | bytes | Network I/O |
+| `system.disk.io` | counter | bytes | Disk I/O (Linux only) |
+| `system.disk.operations` | counter | operations | Disk operations (Linux only) |
+| `system.filesystem.usage` | gauge | bytes | Filesystem usage (Linux only) |
+| `system.filesystem.utilization` | gauge | ratio | Filesystem utilization (Linux only) |
+| `nodejs.event_loop.utilization` | gauge | ratio | Event loop utilization (0-1) |
+| `nodejs.event_loop.delay.p95` | gauge | seconds | Event loop delay p95 |
+| `nodejs.event_loop.delay.max` | gauge | seconds | Event loop delay max |
+| `nodejs.heap.used` | gauge | bytes | V8 heap used |
+| `nodejs.heap.total` | gauge | bytes | V8 heap total |
+
+<Note>
+  In dev mode (`trigger dev`), `system.*` metrics are not collected to reduce noise. Only `process.*`, `nodejs.*`, and custom metrics are available during development.
+</Note>
+
+### Context attributes
+
+All metrics (both automatic and custom) are tagged with run context so you can filter and group them:
+
+- `run_id` — the run that produced the metric
+- `task_identifier` — the task slug
+- `attempt_number` — the attempt number
+- `machine_name` — the machine preset (e.g., `small-1x`)
+- `worker_version` — the deployed worker version
+- `environment_type` — `PRODUCTION`, `STAGING`, `DEVELOPMENT`, or `PREVIEW`
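Grouping by these context fields would normally happen in a TRQL query, but the same idea can be sketched on rows that have already been fetched. The example below assumes rows shaped like a subset of the `metrics` table columns. The `MetricRow` type, the `averageBy` helper, the `send-email` task and all sample values are made up for illustration.

```typescript
// Row shape assumed from the `metrics` table columns; only the fields used here are listed.
interface MetricRow {
  metric_name: string;
  value: number;
  task_identifier: string;
  machine_name: string;
}

// Hypothetical helper: average `value` per distinct value of one context column.
function averageBy(
  rows: MetricRow[],
  column: "task_identifier" | "machine_name"
): Map<string, number> {
  const totals = new Map<string, { sum: number; count: number }>();
  for (const row of rows) {
    const entry = totals.get(row[column]) ?? { sum: 0, count: 0 };
    entry.sum += row.value;
    entry.count += 1;
    totals.set(row[column], entry);
  }

  const averages = new Map<string, number>();
  totals.forEach(({ sum, count }, group) => averages.set(group, sum / count));
  return averages;
}

// Made-up sample rows, for illustration only.
const sampleRows: MetricRow[] = [
  { metric_name: "process.cpu.utilization", value: 0.25, task_identifier: "process-queue", machine_name: "small-1x" },
  { metric_name: "process.cpu.utilization", value: 0.75, task_identifier: "process-queue", machine_name: "small-1x" },
  { metric_name: "process.cpu.utilization", value: 0.125, task_identifier: "send-email", machine_name: "small-1x" },
];

const byTask = averageBy(sampleRows, "task_identifier");
console.log(byTask.get("process-queue")); // 0.5
console.log(byTask.get("send-email")); // 0.125
```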
+
+### Querying metrics
+
+Use [TRQL](/insights/query) to query metrics data. For example, to see average CPU utilization over time:
+
+```sql
+SELECT
+  timeBucket(),
+  avg(value) AS avg_cpu
+FROM metrics
+WHERE metric_name = 'process.cpu.utilization'
+GROUP BY timeBucket
+ORDER BY timeBucket
+LIMIT 1000
+```
+
+See the [Query page](/insights/query#metrics-table-columns) for the full `metrics` table schema.
+
+### Exporting metrics
+
+You can send metrics to external observability services (Axiom, Honeycomb, Datadog, etc.) by configuring [telemetry exporters](/config/config-file#telemetry) in your `trigger.config.ts`.
