Skip to content

Add same-metrics dataset for time series UI test scenarios#7

Open
kpatticha wants to merge 6 commits intosimianhacker:mainfrom
kpatticha:add-same-metric-dataset
Open

Add same-metrics dataset for time series UI test scenarios#7
kpatticha wants to merge 6 commits intosimianhacker:mainfrom
kpatticha:add-same-metric-dataset

Conversation

@kpatticha
Copy link
Contributor

@kpatticha kpatticha commented Feb 26, 2026

Add same-metrics dataset for time series UI test scenarios

Summary

Introduces a new dataset same-metrics that generates time series data for testing the Metrics experience in Discover. Data is written to eight time series data streams with clear naming so each scenario can be identified in the UI.

Stage changes

New dataset: same-metrics

  • CLI: ./forge --dataset same-metrics [--interval 1m] [--backfill now-1h] [--purge]
  • Purge: Deletes the eight same-metrics data streams by name.

Data streams and scenarios

Scenario Data streams
Different units timeseries-same-metric-different-unit-eur, timeseries-same-metric-different-unit-usd, timeseries-same-metric-different-unit-null
Different datastreams timeseries-same-metric-different-datastream-1, timeseries-same-metric-different-datastream-2
Different dimensions timeseries-same-metric-different-dimensions-has-1 (1 dim), timeseries-same-metric-different-dimensions-has-4 (4 dims)

Shared metric: metric.name = cost_total, field total_cost. Each document includes test.scenario and test.data_stream for filtering in Discover.


This PR was prepared with AI assistance.

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced Same Metrics dataset with cost tracking capabilities across multiple scenarios: different units, same metrics across data streams, and different dimensions.
    • Expanded CLI options for Docker Compose integration and enhanced authentication/API key configuration.
  • Documentation

    • Enhanced README with comprehensive usage guidance, additional dataset coverage, and expanded feature documentation.

- Introduced a new dataset for generating time series test scenarios with the same metric in two streams, supporting different units and dimensions.
- Implemented the SameMetricsSimulator, SameMetricsMetricsGenerator, and SameMetricsFormatter to handle the new dataset.
- Added configuration and template generation for same metrics data streams.
- Updated README.md to include details about the new dataset and usage examples.
@coderabbitai
Copy link

coderabbitai bot commented Feb 26, 2026

📝 Walkthrough

Walkthrough

A new "same-metrics" dataset is added to the simulator platform, introducing type definitions, a dedicated simulator with config and metrics generators, Elasticsearch template builders, and a formatter. The main entry point is updated to integrate and validate the new dataset throughout the initialization and backfill pipeline. Documentation is expanded with additional datasets and usage guidance.

Changes

Cohort / File(s) Summary
Documentation
README.md
Expands dataset list and key features, adds Same Metrics dataset documentation, updates installation commands with Docker Compose and auth options, and refines usage examples and formatting.
Type Definitions
src/types/same-metrics-types.ts
Defines SameMetricsScenarioKind, SameMetricsDimensions, SameMetricsConfig, SameMetricsMetrics, and SameMetricsDocument interfaces for type-safe same-metrics data modeling.
Simulators
src/simulators/same-metrics-simulator.ts, same-metrics-config-generator.ts, same-metrics-metrics-generator.ts
Introduces SameMetricsSimulator orchestrating config generation, metrics generation, and ES template setup. Config generator maps data streams to scenarios with dimensional data. Metrics generator produces deterministic counter values per stream and timestamp.
Elasticsearch Integration
src/simulators/same-metrics-template-builder.ts
Builds time-series index templates with metric mappings, routing paths, and optional dimensional fields for same-metrics data streams.
Formatter
src/formatters/same-metrics-formatter.ts
Converts SameMetricsMetrics to SameMetricsDocuments with @timestamp, metric.name (cost_total), total_cost, test metadata, and optional dimensions.
Main Integration
src/index.ts
Adds same-metrics simulator and DATA_STREAM_IDS imports, extends dataset/format validation, purges multiple data streams per dataset, initializes appropriate simulator based on selected dataset, and standardizes CLI options and quoting style.

Sequence Diagram

sequenceDiagram
    participant CLI as CLI/User
    participant Simulator as SameMetricsSimulator
    participant ConfigGen as ConfigGenerator
    participant MetricsGen as MetricsGenerator
    participant Formatter as Formatter
    participant ES as Elasticsearch

    CLI->>Simulator: initialize(dataset, format, options)
    Simulator->>ConfigGen: generateConfig(entityId)
    ConfigGen-->>Simulator: SameMetricsConfig
    
    Simulator->>MetricsGen: generateMetrics(config, timestamp)
    MetricsGen-->>Simulator: SameMetricsMetrics
    
    Simulator->>Formatter: formatMetrics(metrics)
    Formatter-->>Simulator: FormatterResult<SameMetricsDocument>[]
    
    Simulator->>ES: setupElasticsearchTemplates()
    ES-->>Simulator: templates created
    
    Simulator->>ES: bulk index documents
    ES-->>Simulator: acknowledgment
    Simulator-->>CLI: simulation complete
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hop, hop, hooray! A new dataset bounds,
With metrics that measure through many rounds,
Config and formatting dance side-by-side,
Templates and simulators stride with pride!
Same-metrics now shine in the burrow's new light! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add same-metrics dataset for time series UI test scenarios' directly and clearly describes the main change: introduction of a new same-metrics dataset for time series testing, which aligns perfectly with the PR's core objective and the majority of code changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/types/same-metrics-types.ts (1)

11-15: Consider making dimension fields optional for flexibility.

As noted in the config generator, SameMetricsDimensions is used with partial data (e.g., only host.name for some streams). Making these fields optional would align with actual usage:

 export interface SameMetricsDimensions {
-  'host.name': string;
-  region: string;
-  environment: string;
+  'host.name'?: string;
+  region?: string;
+  environment?: string;
 }

This mirrors the optional fields already defined in SameMetricsDocument.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/types/same-metrics-types.ts` around lines 11 - 15, Make the three
properties on the SameMetricsDimensions interface optional to match actual usage
and the existing SameMetricsDocument shape: update the SameMetricsDimensions
declaration so 'host.name', region, and environment are optional (e.g., mark
them as optional properties) to allow partial dimension objects when only some
fields (like host.name) are present.
src/formatters/same-metrics-formatter.ts (1)

18-22: Type-safe dimension assignment possible.

Since SameMetricsDimensions has fixed keys (host.name, region, environment) that match the optional fields in SameMetricsDocument, you could avoid the double cast by spreading or directly assigning:

-    if (metrics.dimensions) {
-      for (const [key, value] of Object.entries(metrics.dimensions)) {
-        (doc as unknown as Record<string, unknown>)[key] = value;
-      }
-    }
+    if (metrics.dimensions) {
+      Object.assign(doc, metrics.dimensions);
+    }

The current implementation works correctly, so this is purely a readability refinement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/formatters/same-metrics-formatter.ts` around lines 18 - 22, The loop that
assigns metrics.dimensions into the document uses a double-cast and can be
simplified for type-safety and readability: replace the manual for-loop that
iterates over metrics.dimensions (in same-metrics-formatter.ts) with a direct,
type-safe assignment or spread so SameMetricsDimensions (keys: "host.name",
"region", "environment") are merged into the SameMetricsDocument without
casting; specifically, merge metrics.dimensions into the existing doc object
(e.g., Object.assign(doc, metrics.dimensions) or doc = { ...doc,
...metrics.dimensions }) ensuring types align with
SameMetricsDocument/SameMetricsDimensions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/simulators/same-metrics-config-generator.ts`:
- Around line 25-31: The function buildDimensionsFromNames constructs a partial
dimensions object but unsafely casts it to SameMetricsDimensions; change the
types or the casting so callers can't assume all fields exist: either update
SameMetricsDimensions to make fields optional (host.name, region, environment)
and adjust SameMetricsConfig.dimensions accordingly, or keep
SameMetricsDimensions strict and change buildDimensionsFromNames to return
Partial<SameMetricsDimensions> (and update any consumer types/usages). Locate
buildDimensionsFromNames, DEFAULT_DIMENSION_VALUES, and
SameMetricsConfig/SameMetricsDimensions in the types/diff and ensure the
returned type and downstream type annotations are updated consistently to avoid
the unsafe cast.

In `@src/simulators/same-metrics-simulator.ts`:
- Around line 104-107: Replace the invalid "body" parameter passed to the
Elasticsearch client's indices.putIndexTemplate calls with the flattened
"template" parameter expected by `@elastic/elasticsearch` v9.0.2; locate the
putIndexTemplate invocations in same-metrics-simulator (the call using
dataStream and template), and make the argument { name: dataStream, template:
template } instead of { name: dataStream, body: template }, and apply the same
change to the identical putIndexTemplate calls in weather-simulator.ts and
histograms-simulator.ts so all simulators use the "template" parameter.

---

Nitpick comments:
In `@src/formatters/same-metrics-formatter.ts`:
- Around line 18-22: The loop that assigns metrics.dimensions into the document
uses a double-cast and can be simplified for type-safety and readability:
replace the manual for-loop that iterates over metrics.dimensions (in
same-metrics-formatter.ts) with a direct, type-safe assignment or spread so
SameMetricsDimensions (keys: "host.name", "region", "environment") are merged
into the SameMetricsDocument without casting; specifically, merge
metrics.dimensions into the existing doc object (e.g., Object.assign(doc,
metrics.dimensions) or doc = { ...doc, ...metrics.dimensions }) ensuring types
align with SameMetricsDocument/SameMetricsDimensions.

In `@src/types/same-metrics-types.ts`:
- Around line 11-15: Make the three properties on the SameMetricsDimensions
interface optional to match actual usage and the existing SameMetricsDocument
shape: update the SameMetricsDimensions declaration so 'host.name', region, and
environment are optional (e.g., mark them as optional properties) to allow
partial dimension objects when only some fields (like host.name) are present.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6bd1080 and 33df8e4.

📒 Files selected for processing (8)
  • README.md
  • src/formatters/same-metrics-formatter.ts
  • src/index.ts
  • src/simulators/same-metrics-config-generator.ts
  • src/simulators/same-metrics-metrics-generator.ts
  • src/simulators/same-metrics-simulator.ts
  • src/simulators/same-metrics-template-builder.ts
  • src/types/same-metrics-types.ts

Comment on lines +25 to +31
function buildDimensionsFromNames(dimNames: string[]): SameMetricsDimensions {
const result = {} as Record<string, string>;
for (const name of dimNames) {
result[name] = DEFAULT_DIMENSION_VALUES[name] ?? '';
}
return result as unknown as SameMetricsDimensions;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unsafe type cast for partial dimensions.

buildDimensionsFromNames creates an object with only the provided dimension names, but casts it to SameMetricsDimensions which requires all three fields (host.name, region, environment). For data streams like different-dimensions-has-1 with only ["host.name"], this produces a partial object that doesn't satisfy the interface.

Consider making SameMetricsDimensions have optional fields or using a mapped type:

Option 1: Update the types file to make dimensions optional

In src/types/same-metrics-types.ts:

 export interface SameMetricsDimensions {
-  'host.name': string;
-  region: string;
-  environment: string;
+  'host.name'?: string;
+  region?: string;
+  environment?: string;
 }
Option 2: Use Partial type here
-function buildDimensionsFromNames(dimNames: string[]): SameMetricsDimensions {
+function buildDimensionsFromNames(dimNames: string[]): Partial<SameMetricsDimensions> {
   const result = {} as Record<string, string>;
   for (const name of dimNames) {
     result[name] = DEFAULT_DIMENSION_VALUES[name] ?? '';
   }
-  return result as unknown as SameMetricsDimensions;
+  return result as Partial<SameMetricsDimensions>;
 }

This would require updating SameMetricsConfig.dimensions to Partial<SameMetricsDimensions> as well.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function buildDimensionsFromNames(dimNames: string[]): SameMetricsDimensions {
const result = {} as Record<string, string>;
for (const name of dimNames) {
result[name] = DEFAULT_DIMENSION_VALUES[name] ?? '';
}
return result as unknown as SameMetricsDimensions;
}
function buildDimensionsFromNames(dimNames: string[]): Partial<SameMetricsDimensions> {
const result = {} as Record<string, string>;
for (const name of dimNames) {
result[name] = DEFAULT_DIMENSION_VALUES[name] ?? '';
}
return result as Partial<SameMetricsDimensions>;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/simulators/same-metrics-config-generator.ts` around lines 25 - 31, The
function buildDimensionsFromNames constructs a partial dimensions object but
unsafely casts it to SameMetricsDimensions; change the types or the casting so
callers can't assume all fields exist: either update SameMetricsDimensions to
make fields optional (host.name, region, environment) and adjust
SameMetricsConfig.dimensions accordingly, or keep SameMetricsDimensions strict
and change buildDimensionsFromNames to return Partial<SameMetricsDimensions>
(and update any consumer types/usages). Locate buildDimensionsFromNames,
DEFAULT_DIMENSION_VALUES, and SameMetricsConfig/SameMetricsDimensions in the
types/diff and ensure the returned type and downstream type annotations are
updated consistently to avoid the unsafe cast.

Comment on lines +104 to +107
await this.elasticsearchClient.indices.putIndexTemplate({
name: dataStream,
body: template
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat package.json | jq '.dependencies["@elastic/elasticsearch"], .devDependencies["@elastic/elasticsearch"]'

Repository: simianhacker/simian-forge

Length of output: 82


🏁 Script executed:

rg -n "putIndexTemplate" --type ts -C 3

Repository: simianhacker/simian-forge

Length of output: 1642


🌐 Web query:

@elastic/elasticsearch 9.0.2 putIndexTemplate API documentation

💡 Result:

For @elastic/elasticsearch v9.0.2, the Node.js client method is:

  • client.indices.putIndexTemplate({ name, index_patterns, composed_of, template, data_stream, priority, version, _meta, ... }) (creates/updates a composable index template). [1]

Official docs:

  • JavaScript client API referenceclient.indices.putIndexTemplate (arguments + behavior). [1]
  • Elasticsearch v9 REST API → “Create or update an index template” (PUT /_index_template/{name}) with request/response examples. [2]

Example (Node.js):

await client.indices.putIndexTemplate({
  name: 'template_1',
  index_patterns: ['template*'],
  priority: 1,
  template: { settings: { number_of_shards: 2 } }
})

[1][2]


Use template parameter instead of body for v9.0.2 client compatibility.

The @elastic/elasticsearch v9.0.2 client uses flattened parameters. The body parameter is not valid; use template instead:

await this.elasticsearchClient.indices.putIndexTemplate({
  name: dataStream,
  template: template
});

This same issue appears in weather-simulator.ts (lines 82-85) and histograms-simulator.ts (lines 104-107) and should be fixed consistently across all simulators.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/simulators/same-metrics-simulator.ts` around lines 104 - 107, Replace the
invalid "body" parameter passed to the Elasticsearch client's
indices.putIndexTemplate calls with the flattened "template" parameter expected
by `@elastic/elasticsearch` v9.0.2; locate the putIndexTemplate invocations in
same-metrics-simulator (the call using dataStream and template), and make the
argument { name: dataStream, template: template } instead of { name: dataStream,
body: template }, and apply the same change to the identical putIndexTemplate
calls in weather-simulator.ts and histograms-simulator.ts so all simulators use
the "template" parameter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant