update remote config polling to be faster and centralized by rochdev · Pull Request #6851 · DataDog/system-tests

rochdev · 2026-05-04T17:24:00Z

Motivation

The main motivation behind this PR was to fix the APPSEC_RUNTIME_ACTIVATION, APPSEC_AUTO_EVENTS_RC and APPSEC_API_SECURITY_RC tests which are extremely slow. Turned out that most of the time was spent waiting on RC polling since it happens infrequently in production to avoid overloading our backend. This limitation does not exist for tests, so they should instead poll as quickly as possible. This makes for example APPSEC_RUNTIME_ACTIVATION go from ~10 minutes to ~2 minutes.

While working on this, I also realized that all tests using RC are already overriding the value, but each test does so on their own. This makes it very easy to forget to do it for a specific test (like the above) which ends up slowing the entire workflow. So I instead removed all the overrides and add a fixed value in a centralized location. This will thus benefit any future tests as well.

Changes

Update remote config polling to be faster and centralized.

I had originally opened #6808 which combined multiple fixes and didn't quite get to the root cause of remote config specifically, so it's superseded by this one and the other fixes will be later split up as well.

Workflow

⚠️ Create your PR as draft ⚠️
Work on you PR until the CI passes
Mark it as ready for review
- Test logic is modified? -> Get a review from RFC owner.
- Framework is modified, or non obvious usage of it -> get a review from R&P team

🚀 Once your PR is reviewed and the CI green, you can merge it!

🛟 #apm-shared-testing 🛟

Reviewer checklist

Anything but tests/ or manifests/ is modified ? I have the approval from R&P team
A docker base image is modified?
- the relevant build-XXX-image label is present
A scenario is added, removed or renamed?
- Get a review from R&P team

github-actions · 2026-05-04T17:24:29Z

CODEOWNERS have been resolved as:

tests/parametric/test_dynamic_configuration.py                          @DataDog/system-tests-core @DataDog/apm-sdk-capabilities
tests/parametric/test_ffe/test_dynamic_evaluation.py                    @DataDog/feature-flagging-and-experimentation-sdk @DataDog/system-tests-core
tests/parametric/test_tracer_flare.py                                   @DataDog/system-tests-core @DataDog/apm-sdk-capabilities
utils/_context/_scenarios/__init__.py                                   @DataDog/system-tests-core
utils/_context/_scenarios/debugger.py                                   @DataDog/system-tests-core
utils/_context/_scenarios/endtoend.py                                   @DataDog/system-tests-core
utils/_context/_scenarios/parametric.py                                 @DataDog/system-tests-core

datadog-datadog-prod-us1 · 2026-05-04T18:11:26Z

✨ Fix all issues with BitsAI or with Cursor

⚠️ Warnings

🧪 3 Tests failed

tests.appsec.test_remote_config_rule_changes.Test_UpdateRuleFileWithRemoteConfig.test_update_rules[poc] from system_tests_suite

(Fix with Cursor)

AssertionError: assert '1.15.0' == '2.71.8182'
  - 2.71.8182
  + 1.15.0

self = <tests.appsec.test_remote_config_rule_changes.Test_UpdateRuleFileWithRemoteConfig object at 0x7f834d07a900>

    def test_update_rules(self):
        expected_rules_version_tag = "_dd.appsec.event_rules.version"
        expected_version_regex = r"[0-9]+\.[0-9]+\.[0-9]+"
    
...

tests.appsec.test_remote_config_rule_changes.Test_UpdateRuleFileWithRemoteConfig.test_update_rules[poc] from system_tests_suite

(Fix with Cursor)

AssertionError: missing span meta tag \`_dd.appsec.event_rules.version\` in meta
assert '_dd.appsec.event_rules.version' in {'_dd.appsec.fp.http.endpoint': 'http-get-ce4235d7--', '_dd.appsec.fp.http.header': 'hdr-0010000000-dfb9a587-1-4740ae63', '_dd.appsec.fp.http.network': 'net-0-0000000000', '_dd.appsec.fp.session': 'ssn----b2d936d9', ...}

self = <tests.appsec.test_remote_config_rule_changes.Test_UpdateRuleFileWithRemoteConfig object at 0x7fa8b827c410>

    def test_update_rules(self):
        expected_rules_version_tag = "_dd.appsec.event_rules.version"
        expected_version_regex = r"[0-9]+\.[0-9]+\.[0-9]+"
    
        def validate_waf_rule_version_tag(span: DataDogLibrarySpan, appsec_data: dict):  # noqa: ARG001
...

tests.appsec.test_remote_config_rule_changes.Test_UpdateRuleFileWithRemoteConfig.test_update_rules[uds] from system_tests_suite

(Fix with Cursor)

AssertionError: assert '1.15.0' == '2.71.8182'
  - 2.71.8182
  + 1.15.0

self = <tests.appsec.test_remote_config_rule_changes.Test_UpdateRuleFileWithRemoteConfig object at 0x7f7e4c2174d0>

    def test_update_rules(self):
        expected_rules_version_tag = "_dd.appsec.event_rules.version"
        expected_version_regex = r"[0-9]+\.[0-9]+\.[0-9]+"
    
...

ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected

Useful? React with 👍 / 👎

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: a871fab | Docs | Datadog PR Page | Give us feedback!}

rochdev · 2026-05-04T19:18:50Z

PHP is failing everywhere, not related to this PR.

cbeauchesne · 2026-05-04T19:28:14Z

+        if rc_api_enabled:
+            # Low poll interval ensures RC config changes are picked up quickly during tests.
+            # Polling aggressively is safe because the RC backend is a mock, not a
+            # real Datadog backend that could be affected by high request rates.
+            self._weblog_env.setdefault("DD_REMOTE_CONFIG_POLL_INTERVAL_SECONDS", "0.2")


From a QA point of view, I advise to not change this globally : it set all end-to-end scenarios to a configuration different from the default configuration, which is a violation of a key principle of end-to-end testing : stay as close as possible from what your customers are using ; and triple check that every trade-off on this rule is more than worthy.

While working on this, I also realized that all tests using RC are already overriding the value, but each test does so on their own. This makes it very easy to forget to do it for a specific test (like the above) which ends up slowing the entire workflow. So I instead removed all the overrides and add a fixed value in a centralized location. This will thus benefit any future tests as well.

I'd say it was made on purpose, either to test this specific value, or to optimize the runtime, making a one-of compromise with the above rule.

This makes for example APPSEC_RUNTIME_ACTIVATION go from ~10 minutes to ~2 minutes.

Instead, I (strongly) advise to set this value for specific scenarios like this one where it makes a real difference, justifying a good tradeoff, with an explicit aproval from the team owning product impacted by this change.

Setting this as a blocking comment, until we have the approval form all impacted teams.

From a QA point of view, I advise to not change this globally : it set all end-to-end scenarios to a configuration different from the default configuration, which is a violation of a key principle of end-to-end testing : stay as close as possible from what your customers are using ; and triple check that every trade-off on this rule is more than worthy.

That would be true for a configuration that changes behaviour entirely, but not for something like polling. The only reason the default value is high is because we use short-polling, and you can't have too short of an interval with short-polling, otherwise you would overload the server. This is not an issue here, so we can use a much lower value. It still polls exactly the same way, but faster.

I'd say it was made on purpose, either to test this specific value, or to optimize the runtime, making a one-of compromise with the above rule.

It's because it makes everything slow, same issue.

Instead, I (strongly) advise to set this value for specific scenarios like this one where it makes a real difference, justifying a good tradeoff, with an explicit aproval from the team owning product impacted by this change.

The value is already set in many places, and would need to be set in many more now, and implementors of any future test would need to know to do it or we'd be back to square one. If something is needed everywhere always, it should be set once in a centralized location. There is absolutely no benefit to your suggestion. Polling should always be faster in test. That's true not only for RC but for any other polling we do.

cbeauchesne · 2026-05-04T19:29:32Z

        self._weblog_env = dict(weblog_env) if weblog_env else {}
+        if rc_api_enabled:
+            # Low poll interval ensures RC config changes are picked up quickly during tests.
+            # Polling aggressively is safe because the RC backend is a mock, not a


By default it's mocked, but it's an option. if mocked_backend is set to false, then it's not mocked. Could you add a condition on this value ?

We can still poll faster, we don't have nearly as many hosts sending requests than in production.

Very probably, But then the comment is not accurate, could you update it ?

cbeauchesne · 2026-05-04T19:30:42Z

+            # Low poll interval ensures RC config changes are picked up quickly during tests.
+            # Polling aggressively is safe because the RC backend is a mock, not a
+            # real Datadog backend that could be affected by high request rates.


If I'm not wrong, the parametric scenario does not use nay backend.

Suggested change

# Low poll interval ensures RC config changes are picked up quickly during tests.

# Polling aggressively is safe because the RC backend is a mock, not a

# real Datadog backend that could be affected by high request rates.

# Low poll interval ensures RC config changes are picked up quickly during tests.

# Polling aggressively is safe because parametric scenario does not use any backend

Maybe that was worded properly. The reason we can't poll aggressively by default is two-fold:

Way too many hosts in the real world, and we can't have all of them poll every 200ms.

Real application are long-lived, and the config shouldn't change often. This is not true in tests.

Yep, I get it. It's just to precise that it's safe to use this in the parametric scenario, not because we use a mocked backend, but because so it's safe to use this does not use any backend.

rochdev · 2026-05-04T22:46:45Z

Looks like there is a race condition in .NET, should it just be excluded for now?

rochdev · 2026-05-12T16:28:21Z

Superseded by #6879

rochdev added 2 commits May 4, 2026 12:45

update remote config polling to be faster and centralized

ca11652

update comment

a871fab

rochdev mentioned this pull request May 4, 2026

[nodejs] fix remote config polling slowing down multiple tests #6808

Closed

5 tasks

rochdev marked this pull request as ready for review May 4, 2026 19:18

rochdev requested review from a team as code owners May 4, 2026 19:18

rochdev requested review from dd-oleksii, greghuels and rachelyangdog and removed request for a team May 4, 2026 19:18

cbeauchesne requested changes May 4, 2026

View reviewed changes

rochdev mentioned this pull request May 7, 2026

Improve performances by reducing RC poll interval on all E2E scenarios on python and node.js #6879

Merged

1 task

rochdev marked this pull request as draft May 7, 2026 22:14

rochdev closed this May 12, 2026

cbeauchesne deleted the rochdev/faster-rc-poll branch May 12, 2026 19:23

Conversation

rochdev commented May 4, 2026

Motivation

Changes

Workflow

Reviewer checklist

Uh oh!

github-actions Bot commented May 4, 2026

Uh oh!

datadog-datadog-prod-us1 Bot commented May 4, 2026 • edited by datadog-prod-us1-6 Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Warnings

ℹ️ Info

Uh oh!

rochdev commented May 4, 2026

Uh oh!

cbeauchesne May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rochdev May 4, 2026

Choose a reason for hiding this comment

Uh oh!

cbeauchesne May 4, 2026

Choose a reason for hiding this comment

Uh oh!

rochdev May 4, 2026

Choose a reason for hiding this comment

Uh oh!

cbeauchesne May 5, 2026

Choose a reason for hiding this comment

Uh oh!

cbeauchesne May 4, 2026

Choose a reason for hiding this comment

Uh oh!

rochdev May 4, 2026

Choose a reason for hiding this comment

Uh oh!

cbeauchesne May 5, 2026

Choose a reason for hiding this comment

Uh oh!

rochdev commented May 4, 2026

Uh oh!

rochdev commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

datadog-datadog-prod-us1 Bot commented May 4, 2026 •

edited by datadog-prod-us1-6 Bot

Loading

cbeauchesne May 4, 2026 •

edited

Loading