Skip to content

Pim/feat/g1 groot wbc#1917

Open
Nabla7 wants to merge 6 commits intofeature/mustafa-write-low-level-adapter-for-g1-humanoidfrom
pim/feat/g1-groot-wbc
Open

Pim/feat/g1 groot wbc#1917
Nabla7 wants to merge 6 commits intofeature/mustafa-write-low-level-adapter-for-g1-humanoidfrom
pim/feat/g1-groot-wbc

Conversation

@Nabla7
Copy link
Copy Markdown
Collaborator

@Nabla7 Nabla7 commented Apr 24, 2026

Problem

Mustafa's branch lands ControlCoordinator + the G1 DDS adapter, but no humanoid policy runs through it yet. To get GR00T-WholeBodyControl driving the G1, dimos needs:

  • A ControlTask that hosts the balance + walk ONNX in the coordinator tick.
  • A simulation path. MujocoConnection's subprocess assumes the older "Twist in, ONNX baked into child" shape and has no way to pass per-joint commands from a parent-side policy.
  • A safe bring-up flow on real hw (arm/dry-run/ramp) — auto_start=True would otherwise drive motors at trained PD the moment the coordinator ticks.
  • Working teleop on macOS. The pygame keyboard_teleop calls NSWindow from a non-main thread; Cocoa rejects it.

Solution

  • GrootWBCTask runs the two ONNX models in the coordinator tick (86-dim obs × 6-frame history, 15-DOF action, decimation=10). Per-joint kp/kd from GR00T-WBC's g1_29dof_gear_wbc.yaml so real DDS LowCmd and sim PD share gains. Pairs with low-priority servo_arms.
  • Coordinator's _on_twist_command now subscribes when any task duck-types set_velocity_command, not only on HardwareType.BASE — otherwise /g1/cmd_vel is dropped.
  • Sim path (subprocess + shm): new joint_cmd/joint_state/imu regions; MujocoConnection accepts control_mode="low_level"; mujoco_process.py low-level branch loads GR00T-WBC's training MJCF (g1_gear_wbc.xml, torque actuators), reads commands from shm, computes PD per step, freezes on mj_forward until first command, steps 1:1 with wall-clock. New SimMujocoG1WholeBodyAdapter wraps MujocoConnection; blueprint flips adapter_type on --simulation. (Considered the in-process MujocoEngine pattern from MuJoCo for Manipulation #1639; went subprocess to reuse the existing mjpython auto-selection on macOS.)
  • Activation state machine on the task: _active / _armed / _arming / _dry_run, with RPC methods arm(ramp_seconds) / disarm() / set_dry_run(enabled). Coordinator gets activate / dry_run In[Bool] ports; TaskConfig gets auto_arm / auto_dry_run / default_ramp_seconds. Sim auto-arms with no ramp; real comes up unarmed + dry-run with a 10 s ramp.
  • Frontend: new ActivationPanel.tsx (Arm + Dry-Run toggle) in the command-center. WebsocketVisModule gains matching Out[Bool] ports + socket.io handlers. WASD captured in the browser DOM via the existing KeyboardControlPanel, sidestepping the Cocoa thread issue.
  • logger.exception(...) in adapter connect() for full traceback. Inline comment that CYCLONEDDS_HOME must be exported at runtime (wheel reads it; without it, topic creation fails with PRECONDITION_NOT_MET even though imports succeed). MUJOCO_LOG.TXT gitignored.

Breaking Changes

None. Additive — existing sim/real blueprints take unchanged paths; the new low-level branch only fires when control_mode="low_level" is explicitly passed.

How to Test

Unit (42/42):

pytest dimos/control/tasks/test_groot_wbc_task.py dimos/control/test_control.py

Sim:

dimos --simulation run unitree-g1-groot-wbc

MuJoCo window opens, G1 stands, dashboard auto-opens, sim auto-arms, W in the command-center iframe walks the robot.

Real hardware (suspend on a crane for first bring-up):

export CYCLONEDDS_HOME=~/cyclonedds/install   # required at runtime; add to ~/.zshrc
ROBOT_INTERFACE=en<N> dimos run unitree-g1-groot-wbc

DDS click, servo_arms moves to default, groot_wbc comes up unarmed + dry-run. In the dashboard: Arm (ramp to default) → 10 s ramp → confirm DRY-RUN |Δq|_max log lines look sane → flip Dry Run off → balance + WASD walk.

End-to-end verified in sim. On real hw: adapter connect + arm-ramp confirmed; full WBC walk also verified.


Contributor License Agreement

  • I have read and approved the CLA.

Nabla7 added 4 commits April 23, 2026 23:02
Integrates the GR00T decoupled whole-body controller as a proper
ControlTask driven by ControlCoordinator. Claims legs+waist (15 of 29
joints) at priority 50; arms held at configured pose by a lower-priority
servo task. ONNX obs/action and per-joint kp/kd transcribed from
g1-control-api's groot_wbc_backend + g1_groot_wbc.yaml.

Real-hardware path only in this commit; sim adapter (sim_mujoco_g1)
lands in a follow-up that mirrors Ruthwik's #1639 pattern.
Closes the architecture gap between ControlCoordinator and dimos's
MuJoCo sim infra so `dimos --simulation run unitree-g1-groot-wbc`
stands and walks end-to-end on macOS.

Sim path (low-level passthrough):
  - shared_memory: add joint_cmd/joint_state/imu regions so the parent
    coordinator can drive per-joint (q, kp, kd) → sim and read back
    (q, dq, tau) + IMU.
  - mujoco_process: low-level branch loads GR00T's `g1_gear_wbc.xml`
    (torque `<motor>` actuators the policy was trained against) via
    dimos's asset-dict pipeline, computes explicit PD torque each
    step, and freezes on `mj_forward` until the coordinator writes
    its first command — avoids the ~2 s free-fall before the tick
    loop starts.  Steps once per outer iter in low-level mode so sim
    runs at wall-clock rate (was 7× real-time → stale PD → seizure).
  - model: `skip_controller` lets the subprocess skip the baked ONNX
    policy install when an external controller owns `data.ctrl`.
  - mujoco_connection: `control_mode` + per-joint read/write helpers.
  - new SimMujocoG1WholeBodyAdapter wraps a MujocoConnection so the
    blueprint swaps `adapter_type="sim_mujoco_g1"` on `--simulation`;
    on real hardware, `unitree_g1` DDS path is unchanged.

Coordinator:
  - `_on_twist_command` now subscribes when any task exposes
    `set_velocity_command` (duck-typed), not only when there's a
    `HardwareType.BASE`.  Whole-body locomotion blueprints with no
    BASE hardware were silently dropping every Twist on /cmd_vel.

Task:
  - GrootWBCTask.start() seeds `_last_targets` with the default
    bent-knee pose so legs are commanded from tick 1; previously the
    first 9 decimation ticks returned None, producing a PD yank when
    the policy's first inference fired.

Blueprint:
  - Replaces pygame keyboard_teleop (broken on macOS — Cocoa rejects
    NSWindow from non-main threads) with WebsocketVisModule.  The
    bundled command-center React app has a KeyboardControlPanel that
    captures WASD in the browser, routed via socket.io → LCM
    /g1/cmd_vel → coordinator → GrootWBCTask.

Bundled MJCF `data/mujoco_sim/g1_gear_wbc.xml` (repacked into
`data/.lfs/mujoco_sim.tar.gz`) reuses menagerie's G1 meshes — every
mesh filename GR00T's MJCF references is present in menagerie's
assets dir, so no mesh bytes are duplicated into the repo.
Gives the GR00T WBC task a safety ritual for real-hardware bring-up:
it comes up unarmed + dry-run, so the robot sits quietly in dev-mode
damping while the operator verifies computed commands look sane.
An Activate button in the dashboard ramps the target from the
current pose to the default bent-knee stance over ~10 s, then hands
control to the ONNX policy.  Sim keeps the zero-touch UX by
auto-arming with no ramp — the MuJoCo subprocess already holds the
MJCF home pose until the coordinator starts commanding.

Task (GrootWBCTask):
  - State machine: _active / _armed / _arming / _dry_run flags.
  - compute() branches:
      not active        → None
      active, unarmed   → echo current joint positions (PD damping)
      arming            → lerp ramp_start → default_15 over duration
      armed + dry_run   → run policy, return None (throttled log)
      armed + live      → run policy, emit JointCommandOutput
  - RPC methods arm(ramp_seconds) / disarm() / set_dry_run(enabled)
    / state_snapshot() — duck-typed, invokable via the coordinator's
    activate/dry_run inputs or its task_invoke RPC.
  - start() honours config.auto_arm / auto_dry_run /
    default_ramp_seconds.

Coordinator:
  - New In[Bool] ports ``activate`` / ``dry_run`` subscribed when any
    task exposes arm() / set_dry_run() — mirrors the duck-typed
    set_velocity_command wiring added previously.
  - TaskConfig gains auto_arm / auto_dry_run / default_ramp_seconds,
    plumbed into GrootWBCTaskConfig in the factory.

WebsocketVisModule:
  - New Out[Bool] ports ``activate`` / ``dry_run`` (typed against
    dimos.msgs.std_msgs.Bool to match coordinator's In[Bool] ports).
  - socket.io events ``arm`` / ``disarm`` / ``set_dry_run`` publish
    to those ports.

Frontend (command-center-extension):
  - New ActivationPanel (arm toggle + dry-run toggle) wired into
    App.tsx; Connection.ts gets arm() / disarm() / setDryRun().
  - data/command_center.html and data/.lfs/command_center.html.tar.gz
    regenerated via ``npm run build:inline``.

Blueprint (unitree_g1_groot_wbc):
  - sim  → auto_arm=True,  auto_dry_run=False, ramp=0.0
  - real → auto_arm=False, auto_dry_run=True,  ramp=10.0
  - LCM transports for /g1/activate and /g1/dry_run.

Tests:
  - Existing fixture flipped to auto_arm=True so old policy tests
    still exercise the always-armed path.
  - New ``unarmed_task`` fixture + 6 tests cover hold-pose echo,
    immediate-arm, ramp interpolation, dry-run suppression, dry-run
    toggle-off, and disarm-returns-to-hold.  42/42 pass.

Also gitignores MUJOCO_LOG.TXT (runtime GL-warning dump).
logger.error("...{e}") only surfaces the message, which obscures which
DDS call inside ChannelFactoryInitialize / ChannelPublisher etc.
actually failed.  logger.exception(...) gives the full traceback —
much faster to diagnose real-hw bring-up issues.

Also adds a comment at the DDS-init site documenting that the
cyclonedds Python wheel reads CYCLONEDDS_HOME at runtime: without it
exported, DDS topic creation fails with PRECONDITION_NOT_MET on real
hardware even though the wheel imports cleanly.  The unitree_sdk2py
docs spell this out; missing it costs hours.

Bundle: data/.lfs/command_center.html.tar.gz repacked from the
ActivationPanel build added in the previous commit.
from dimos.core.global_config import GlobalConfig
from dimos.hardware.whole_body.registry import WholeBodyAdapterRegistry

logger = logging.getLogger(__name__)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use setup_logger

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR adds GR00T WholeBodyControl locomotion for the Unitree G1: a GrootWBCTask running balance/walk ONNX models in the coordinator tick loop, a MuJoCo low-level passthrough sim path via shared memory, an activation state machine (arm/disarm/dry-run), and a frontend ActivationPanel. All new paths are additive and gated by control_mode="low_level" or explicit blueprint opt-in, so existing blueprints are unaffected. The three open issues from previous review rounds — POS_STOP writing 0.0 instead of current position, the unacquired _write_lock, and arm() not guarding against a mid-ramp concurrent call — remain unresolved and are worth addressing before first real-hardware deployment.

Confidence Score: 4/5

Safe to merge for simulation validation; three pre-existing P1 findings (POS_STOP default, unacquired write lock, concurrent arm race) should be closed before first unsupervised real-hardware run.

No new P1/P0 issues found in this pass. The two new comments are P2 style issues (unused renderers in low-level mode, misleading read_joint_cmd docstring). Score is capped at 4 because the three P1 findings from previous review rounds remain open.

dimos/hardware/whole_body/mujoco/g1/adapter.py (POS_STOP fallback + unacquired lock), dimos/control/tasks/groot_wbc_task.py (concurrent arming race)

Important Files Changed

Filename Overview
dimos/control/tasks/groot_wbc_task.py New GrootWBCTask with activation state machine (active/armed/arming/dry_run), 86-dim obs building, 6-frame history, and ONNX model selection. Core logic is sound; the arm() guard missing a check for _arming (flagged in previous threads) is the main open concern.
dimos/simulation/mujoco/mujoco_process.py Adds low-level passthrough branch: loads GR00T MJCF, reads joint_cmd shm, applies PD per step, writes joint_state+IMU back. Renderers are unnecessarily allocated in low-level mode (P2).
dimos/simulation/mujoco/shared_memory.py Adds joint_cmd, joint_state, and imu shm regions for low-level passthrough. read_joint_cmd uses seq>0 (always-return) semantics unlike the edge-triggered read_command — misleading docstring but correct for continuous PD.
dimos/hardware/whole_body/mujoco/g1/adapter.py SimMujocoG1WholeBodyAdapter wrapping MujocoConnection in low_level mode. POS_STOP fallback writes 0.0 instead of current position (flagged previously). _write_lock declared but never acquired (flagged previously).
dimos/control/coordinator.py Adds activate/dry_run In[Bool] ports, duck-typed routing for set_velocity_command and arm()/disarm() across tasks, and GrootWBCTask creation from TaskConfig. Changes are additive and backward-compatible.
dimos/web/command-center-extension/src/ActivationPanel.tsx New arm/disarm/dry-run panel; armed state is local UI only (no server echo), unconditionally rendered for all blueprints (flagged previously).
dimos/web/websocket_vis/websocket_vis_module.py Adds activate/dry_run Out[DimosBool] ports and socket.io handlers for arm/disarm/set_dry_run. Guards transport availability before publishing; logs warning when transport not configured.
dimos/robot/unitree/g1/blueprints/basic/unitree_g1_groot_wbc.py Blueprint wiring GR00T WBC: sets auto_arm/auto_dry_run/default_ramp_seconds per sim vs real, flips adapter_type on --simulation flag.

Sequence Diagram

sequenceDiagram
    participant Dashboard as Dashboard (Browser)
    participant WsVis as WebsocketVisModule
    participant Coord as ControlCoordinator
    participant Task as GrootWBCTask
    participant HW as ConnectedWholeBody
    participant Adapter as SimMujocoG1Adapter
    participant Shm as SharedMemory
    participant Proc as mujoco_process.py

    Note over Dashboard,Proc: Startup / Arming
    Adapter->>Shm: create shm regions (joint_cmd, joint_state, imu)
    Adapter->>Proc: spawn subprocess(control_mode=low_level)
    Proc->>Shm: signal_ready()
    Adapter->>Shm: poll read_joint_state() until non-None

    Dashboard->>WsVis: socket arm
    WsVis->>Coord: activate.publish(Bool(True))
    Coord->>Task: arm()
    Task-->>Task: _arm_pending = True

    Note over Coord,Proc: Coordinator tick loop (500 Hz)
    loop Every tick
        HW->>Adapter: read_motor_states()
        Adapter->>Shm: read_joint_state()
        Shm-->>Adapter: (q, dq, tau) x 29
        HW->>Adapter: read_imu()
        Adapter->>Shm: read_imu()
        Coord->>Task: compute(CoordinatorState)
        alt Arming ramp phase
            Task-->>Coord: JointCommandOutput(lerp to default_15)
        else Armed inference tick every 10th
            Task->>Task: build_obs(gyro, gravity, q_29, dq_29)
            Task->>Task: ONNX balance/walk action
            Task-->>Coord: JointCommandOutput(target_q_15)
        end
        Coord->>HW: write_command(positions)
        HW->>Adapter: write_motor_commands(MotorCommand x 29)
        Adapter->>Shm: write_joint_cmd(q, kp, kd)
        Proc->>Shm: read_joint_cmd()
        Proc->>Proc: PD torque = kp*(q_tgt-q) - kd*dq
        Proc->>Proc: mj_step()
        Proc->>Shm: write_joint_state + write_imu
    end
Loading

Reviews (3): Last reviewed commit: "g1 wbc: cleanup pass on review feedback" | Re-trigger Greptile

Comment on lines +204 to +205
# instance is ever driven from multiple threads (coordinator tick + RPC).
_write_lock = threading.Lock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 _write_lock declared but never acquired

_write_lock = threading.Lock() is created at module level with a comment saying it guards concurrent shm writes from the coordinator tick and RPC threads, but write_motor_commands never acquires it. If two threads race to call write_motor_commands on the same adapter, the lock provides no protection.

Comment on lines +352 to +360
"""Write per-joint (q, kp, kd) into the subprocess shm.

Only meaningful when the connection was started with
``control_mode="low_level"``. The subprocess applies ``q`` to
``data.ctrl``; ``kp`` / ``kd`` are currently advisory (the MJCF
carries baked position-actuator gains).
"""
if self._is_cleaned_up or self.shm_data is None:
return
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Misleading docstring — kp/kd are not advisory

The docstring says kp/kd "are currently advisory (the MJCF carries baked position-actuator gains)", but mujoco_process.py explicitly uses the shm-sourced kp/kd values to compute PD torques (data.ctrl[:num_motors] = kp * (q_tgt - q) - kd * dq). The MJCF deliberately uses torque (<motor>) actuators — which is the whole point of the low-level passthrough mode. The docstring contradicts the actual behaviour.

Suggested change
"""Write per-joint (q, kp, kd) into the subprocess shm.
Only meaningful when the connection was started with
``control_mode="low_level"``. The subprocess applies ``q`` to
``data.ctrl``; ``kp`` / ``kd`` are currently advisory (the MJCF
carries baked position-actuator gains).
"""
if self._is_cleaned_up or self.shm_data is None:
return
"""Write per-joint (q, kp, kd) into the subprocess shm.
Only meaningful when the connection was started with
``control_mode="low_level"``. The subprocess computes PD torque
``kp * (q_tgt - q) - kd * dq`` and applies it to the torque-actuator
MuJoCo model each step.
"""

Comment on lines +170 to +171
sinp = 2.0 * (w * y - z * x)
pitch = math.copysign(math.pi / 2.0, sinp) if abs(sinp) >= 1.0 else math.asin(sinp)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 POS_STOP fallback writes 0.0, not current joint position

The inline comment says "write current state back as the target so the subprocess doesn't see a stale target drift", but the code writes q[i] = 0.0. For joints whose neutral position is far from 0 (e.g. a hip at −0.1 rad), this sends the subprocess a target of 0 with full kp, creating a torque spike. If the intent is genuinely to hold the current position, the current q_actual (available from read_motor_states) should be used instead.

Comment on lines 135 to +140
onSendMoveCommand={handleSendMoveCommand}
onStopMoveCommand={handleStopMoveCommand}
/>
<ActivationPanel
onArm={handleArm}
onDisarm={handleDisarm}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 ActivationPanel is unconditionally rendered for all blueprints

ActivationPanel is always mounted regardless of whether the running blueprint wires the activate/dry_run transports. Users of non-WBC blueprints will see Arm/Disarm controls that silently fail (the server logs a warning but the UI gives no indication). Consider gating the panel on a prop or connection-level capability flag, or at minimum showing a disabled/greyed state when the backend has not confirmed the transport is live.


# Module-global lock to guard concurrent shm writes if the same adapter
# instance is ever driven from multiple threads (coordinator tick + RPC).
_write_lock = threading.Lock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this used?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never used, dead code I am removing this.

Nabla7 and others added 2 commits April 24, 2026 23:54
- Sim adapter logger uses setup_logger (dimos convention) instead of
  logging.getLogger.  Per @paul-nechifor.
- Drop unused _write_lock + threading import from sim adapter.
  Per @paul-nechifor.
- Drop _ARM_DEFAULT_POSE from blueprint __all__ — private const, no
  external consumer.
- Update blueprint docstring Usage block: WASD now lives in the web
  dashboard, not pygame.  Document CYCLONEDDS_HOME runtime requirement.
- mujoco_process.py: low-level G1 z-spawn 0.8 -> 0.793 to match
  g1_gear_wbc.xml's pelvis pos (was dropping 7 mm at first mj_step).
- mujoco_process.py: simplify ``skip_controller and act_*adr is not None``
  guards to ``if skip_controller`` + assert; act_* are populated whenever
  skip_controller is True.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants