CMP-3618 added chrony-wait fix #14649

Open
vickeybrown wants to merge 1 commit into ComplianceAsCode:master from vickeybrown:CMP-3618-chrony-wait-fix

Conversation

@vickeybrown

Description:

Fixed chrony-wait.service timeout failures when "cmdport 0" is configured by the "chronyd_no_chronyc_network" rule. The default chrony-wait.service uses -h 127.0.0.1,::1, which forces a network connection to chronyd's command port, but the STIG-required cmdport 0 setting disables network access. This causes chrony-wait.service to time out and fail, preventing time-sync.target from being reached.

The fix replaces the entire chrony-wait.service unit file to:

  1. Remove the -h flag so chronyc uses the Unix socket at /run/chrony/chronyd.sock
  2. Remove PrivateUsers=yes and other sandboxing restrictions that block Unix socket access

Rationale:

The "chronyd_no_chronyc_network" rule implements STIG requirements by setting "cmdport 0" to disable network access to chronyd's command port, while enabling local access via bindcmdaddress /run/chrony/chronyd.sock. However, chrony-wait.service bypasses the Unix socket by hard-coding network addresses, causing it to fail when cmdport is disabled. This breaks time synchronization verification on RHCOS nodes and can cause compliance remediation failures in OpenShift environments.
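
The conflict described here can be detected mechanically. The sketch below is illustrative only: the function name `chrony_wait_conflict` and the scratch files are hypothetical, and the regexes are one plausible way to spot "cmdport 0" combined with a chronyc -h invocation.

```shell
#!/bin/sh
# Sketch: report whether a chrony.conf / chrony-wait.service pair would hit the
# timeout described above. Function name and arguments are hypothetical.
chrony_wait_conflict() {
    conf="$1"   # path to chrony.conf
    unit="$2"   # path to chrony-wait.service
    # cmdport 0 disables chronyd's network command port.
    grep -Eq '^[[:space:]]*cmdport[[:space:]]+0[[:space:]]*$' "$conf" || return 1
    # An ExecStart that passes -h sends chronyc to the network instead of the socket.
    grep -Eq '^ExecStart=.*chronyc[[:space:]]+(.*[[:space:]])?-h[[:space:]]' "$unit"
}

# Demo against scratch copies so nothing on the host is touched.
tmp="$(mktemp -d)"
printf 'cmdport 0\nbindcmdaddress /run/chrony/chronyd.sock\n' > "$tmp/chrony.conf"
printf '[Service]\nExecStart=/usr/bin/chronyc -h 127.0.0.1,::1 waitsync 0 0.1 0.0 1\n' > "$tmp/stock.service"
printf '[Service]\nExecStart=/usr/bin/chronyc waitsync 0 0.1 0.0 1\n' > "$tmp/fixed.service"

for u in stock fixed; do
    if chrony_wait_conflict "$tmp/chrony.conf" "$tmp/$u.service"; then
        echo "$u: conflict (cmdport 0 with -h)"
    else
        echo "$u: ok"
    fi
done
```

With the sample files above, the stock unit is flagged as conflicting and the fixed unit is reported as ok.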

Review Hints:

Testing on OpenShift/RHCOS:

# Apply the chronyd_no_chronyc_network remediation
oc patch complianceremediation/<name> -n openshift-compliance --type=merge -p '{"spec":{"apply":true}}'

# After MachineConfig applies, verify on a node:
oc debug node/<worker-node>
chroot /host

# Verify chrony.conf has the required settings
grep -E '(cmdport|bindcmdaddress)' /etc/chrony.conf
# Should show:
# cmdport 0
# bindcmdaddress /run/chrony/chronyd.sock

# Verify the replacement service file exists
cat /etc/systemd/system/chrony-wait.service
# Should have: ExecStart=/usr/bin/chronyc waitsync 0 0.1 0.0 1 (no -h flag)

# Verify chrony-wait.service succeeds
systemctl status chrony-wait.service
# Should show: Active: active (exited)

@openshift-ci openshift-ci Bot added the needs-ok-to-test Used by openshift-ci bot. label Apr 13, 2026
@openshift-ci

openshift-ci Bot commented Apr 13, 2026

Hi @vickeybrown. Thanks for your PR.

I'm waiting for a ComplianceAsCode member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@yuumasato
Member

/ok-to-test

@openshift-ci openshift-ci Bot added ok-to-test Used by openshift-ci bot. and removed needs-ok-to-test Used by openshift-ci bot. labels Apr 14, 2026
Member

@Mab879 Mab879 left a comment

Thanks for the PR.

  1. I would suggest putting this fix into a new rule.

{{{ ansible_set_config_file(file=chrony_conf_path, parameter='bindcmdaddress', separator=' ', value='/run/chrony/chronyd.sock', create='yes', rule_title=rule_title) }}}

# Fix chrony-wait.service to use Unix socket instead of network socket
- name: Check if chrony-wait.service exists
Member

Suggested change
- name: Check if chrony-wait.service exists
- name: "{{{ rule_title }}} - Check if chrony-wait.service exists"

Author

Adjusted it

path: /usr/lib/systemd/system/chrony-wait.service
register: chrony_wait_service

- name: Replace chrony-wait.service to use Unix socket (KCS 7064388)
Member

Suggested change
- name: Replace chrony-wait.service to use Unix socket (KCS 7064388)
- name: "{{{ rule_title }}} - Replace chrony-wait.service to use Unix socket (KCS 7064388)"

Author

Adjusted

@jan-cerny jan-cerny added the OpenShift OpenShift product related. label Apr 15, 2026
@vickeybrown
Author

I broke the change into its own rule, and added it to the profiles that currently have the "chronyd_no_chronyc_network" rule since that was what was causing the issue - not sure if that's the move or not so let me know if it needs adjustment

Member

@Mab879 Mab879 left a comment

The CI failures are valid, please take a look.

Also, please organize commits before we merge this.

@Mab879 Mab879 added this to the 0.1.81 milestone Apr 17, 2026
@Mab879
Member

Mab879 commented Apr 17, 2026

/ok-to-test

Comment thread linux_os/guide/services/ntp/chronyd_configure_local_socket/kubernetes/shared.yml Outdated
Comment thread linux_os/guide/services/ntp/chronyd_configure_local_socket/rule.yml Outdated
@vickeybrown vickeybrown force-pushed the CMP-3618-chrony-wait-fix branch 2 times, most recently from 9d67a7e to e3becf9 Compare April 20, 2026 15:57
Comment thread linux_os/guide/services/ntp/chronyd_configure_local_socket/rule.yml Outdated
Comment thread shared/macros/10-kubernetes.jinja Outdated
Comment thread shared/macros/10-kubernetes.jinja Outdated
@vickeybrown vickeybrown force-pushed the CMP-3618-chrony-wait-fix branch from e3becf9 to 091a18a Compare April 21, 2026 16:22
@openshift-ci openshift-ci Bot added the needs-rebase Used by openshift-ci bot. label Apr 21, 2026
@vickeybrown vickeybrown force-pushed the CMP-3618-chrony-wait-fix branch from 091a18a to a43d97c Compare April 21, 2026 16:50
@Mab879
Member

Mab879 commented Apr 21, 2026

Looks like there is still one merge conflict.

@vickeybrown vickeybrown force-pushed the CMP-3618-chrony-wait-fix branch from a43d97c to 22f97a2 Compare April 21, 2026 17:17
@openshift-ci openshift-ci Bot removed the needs-rebase Used by openshift-ci bot. label Apr 21, 2026
Member

@Mab879 Mab879 left a comment

$ ./automatus.py rule --datastream ../build/ssg-rhel9-ds.xml [VM NAME] chronyd_configure_local_socket
Setting console output to log level INFO
INFO - The base image option has not been specified, choosing libvirt-based test environment.
INFO - Logging into /home/mburket/Developer/github.com/ComplianceAsCode/content/tests/logs/rule-custom-2026-04-21-1230/test_suite.log
INFO - xccdf_org.ssgproject.content_rule_chronyd_configure_local_socket
INFO - Script cmdport_zero.fail.sh using profile (all) OK
INFO - Script missing_marker.fail.sh using profile (all) OK
INFO - Script not_installed.pass.sh using profile (all) OK
ERROR - Rule 'chronyd_configure_local_socket' test setup script 'service_fixed.pass.sh' failed with exit code 2
ERROR - Environment failed to prepare, skipping test

I think the main issue is the leading indent in service_fixed.pass.sh.


<ind:textfilecontent54_object id="obj_chrony_wait_service_description" version="1">
<ind:filepath>/etc/systemd/system/chrony-wait.service</ind:filepath>
<ind:pattern operation="pattern match">^Description=.*KCS 7064388.*$</ind:pattern>
Member

It might be better to check the ExecStart for the correct flags.
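
One way to read this suggestion is to match the ExecStart line itself and require that chronyc is followed directly by waitsync, with no -h in between. A rough shell approximation using grep -E in place of an OVAL pattern match follows; the regex and the helper name `check_execstart` are assumptions for illustration, not the pattern the PR ended up with.

```shell
#!/bin/sh
# Hypothetical pattern: chronyc followed directly by waitsync, so no -h (or any
# other option) can sit in between. Not taken from the PR.
pattern='^ExecStart=/usr/bin/chronyc[[:space:]]+waitsync[[:space:]]'

check_execstart() {
    # Reads unit file content on stdin; succeeds only if an ExecStart line matches.
    grep -Eq "$pattern"
}

printf 'ExecStart=/usr/bin/chronyc waitsync 0 0.1 0.0 1\n' | check_execstart \
    && echo "fixed unit: pass"
printf 'ExecStart=/usr/bin/chronyc -h 127.0.0.1,::1 waitsync 0 0.1 0.0 1\n' | check_execstart \
    || echo "stock unit: fail"
```

Unlike a match on the Description marker, a check of this shape would still fail if someone kept the marker but reintroduced the -h flag.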

@vickeybrown vickeybrown force-pushed the CMP-3618-chrony-wait-fix branch 2 times, most recently from 1982e83 to ebbb2d8 Compare April 21, 2026 19:08
@openshift-ci

openshift-ci Bot commented Apr 21, 2026

@vickeybrown: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-openshift-node-compliance | ebbb2d8 | link | true | /test e2e-aws-openshift-node-compliance |

Full PR test history. Your PR dashboard.

adjustment

moved change into its own rule

added rule title to ansible file

added rule to components

removed bindcmdaddress
@vickeybrown vickeybrown force-pushed the CMP-3618-chrony-wait-fix branch from ebbb2d8 to 3c7772d Compare April 22, 2026 14:33
Labels

ok-to-test Used by openshift-ci bot. OpenShift OpenShift product related.

4 participants