feat: Visual servoing#1902

Open
LuigiVan01 wants to merge 4 commits into dimensionalOS:main from LuigiVan01:visual-servoing

Conversation

@LuigiVan01

Files Created

dimos/manipulation/dynamic_tracking/__init__.py

Package init.

dimos/manipulation/dynamic_tracking/aruco_tracker.py

ArUco marker tracker module — completely rewritten for main's architecture:

  • Subscribes to In[Image] (color) and In[CameraInfo] (intrinsics) from RealSense
  • Detects ArUco markers, averages multiple marker poses
  • Publishes camera_optical_frame → aruco_avg transform to TF tree
  • Visual servo loop: Looks up marker in base_link frame via TF, computes reach pose with configurable offset, publishes PoseStamped on Out[cartesian_command]
  • The PoseStamped.frame_id is set to the CartesianIKTask name, so the coordinator routes it correctly
  • No more OrchestratorClient/RPC or internal IK — CartesianIKTask handles IK in the coordinator tick loop
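
The reach-pose step of the servo loop can be sketched with plain numpy. Function names, the stand-off convention (backing off along the marker's z axis), and the defaults below are illustrative assumptions, not the module's actual API:

```python
import numpy as np

def compute_reach_pose(T_base_marker: np.ndarray, reach_offset: float = 0.10) -> np.ndarray:
    """Target EE pose standing off from the marker along its +Z axis.

    T_base_marker: 4x4 homogeneous transform of the marker in base_link.
    reach_offset:  stand-off distance in meters (hypothetical default).
    """
    # Express the offset in the marker frame, then compose into base_link.
    T_marker_offset = np.eye(4)
    T_marker_offset[2, 3] = -reach_offset  # back off along the marker's z axis
    return T_base_marker @ T_marker_offset

def exceeds_min_move(T_prev: np.ndarray, T_new: np.ndarray, min_move: float = 0.005) -> bool:
    """Jitter filter: only command a new pose if it moved more than min_move meters."""
    return bool(np.linalg.norm(T_new[:3, 3] - T_prev[:3, 3]) > min_move)
```

A min-move filter like this keeps the coordinator from being flooded with near-identical PoseStamped commands when the marker detection jitters by a millimeter per frame.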

dimos/manipulation/dynamic_tracking/calibrate_hand_eye.py

Eye-in-hand calibration tool:

  • Interactive CLI: connects to XArm + RealSense, prompts user to move arm to different poses
  • Collects paired (EE pose, marker-in-camera pose) samples
  • Solves AX=XB using cv2.calibrateHandEye() (5 methods available)
  • Saves EEF→camera transform as JSON (translation + quaternion)
  • Includes load_calibration() and calibration_to_transform() helpers for blueprint use
  • Reports mean pairwise consistency error

dimos/manipulation/dynamic_tracking/blueprints.py

Three visual servoing blueprints:

| Blueprint | CLI command | Description |
| --- | --- | --- |
| aruco_tracker_realsense | dimos run aruco-tracker | Camera-only detection, no arm |
| aruco_servo_mock | dimos run aruco-servo-mock | Full servo loop with mock arm |
| aruco_servo_xarm6 | dimos run aruco-servo-xarm6 | Full servo loop with real XArm6 |

Each wires: RealSenseCamera → ArucoTracker → PoseStamped → ControlCoordinator (with CartesianIKTask using Pinocchio IK via
LfsPath("xarm_description/urdf/xarm6/xarm6.urdf"))

Files Modified

dimos/robot/all_blueprints.py

Added 3 new blueprint entries: aruco-servo-mock, aruco-servo-xarm6, aruco-tracker

TF Tree (Full Chain)

base_link → (FK from joint_state) → ee_link → camera_link → camera_color_frame → camera_color_optical_frame → aruco_avg

Architecture Decision

Rather than rebasing the 25-commit ruthwik_dynamic_tracking branch (which was 216 commits behind main and referenced a completely different ControlOrchestrator API), I ported the
core concepts onto a fresh branch from main. Main has evolved significantly: ControlOrchestrator became ControlCoordinator, and the new CartesianIKTask has built-in Pinocchio IK. The new implementation is ~50% less code because it leverages CartesianIKTask rather than doing IK internally.

One-line Test Commands

dimos run aruco-servo-mock    # mock arm + real RealSense camera
dimos run aruco-servo-xarm6   # real xArm6 + real RealSense camera                                                                                                                    

Gradual Testing

  1. Import check (no hardware needed)

Just verify the code loads without errors in your dimos environment:

python3 -c "from dimos.manipulation.dynamic_tracking.aruco_tracker import ArucoTracker; print('OK')"
python3 -c "from dimos.manipulation.dynamic_tracking.blueprints import aruco_servo_mock; print('OK')"

  2. Camera-only test (RealSense needed, no arm)

dimos run aruco-tracker

This runs detection only — verifies ArUco markers are detected and TF transforms are published. You'll need ArUco markers (DICT_4X4_50, 15mm) visible to the camera. Check the Rerun
viewer for annotated images and TF frames.

  3. Mock servo loop (RealSense needed, no arm)

dimos run aruco-servo-mock

Real camera + mock arm. Verifies the full pipeline: detection → TF lookup → PoseStamped → CartesianIKTask. The mock arm won't move physically but you can watch joint state changes in
Rerun or on the /coordinator/joint_state LCM topic.

  4. Full hardware test (RealSense + XArm6)

dimos run aruco-servo-xarm6

Real camera + real arm.

@LuigiVan01 LuigiVan01 marked this pull request as draft April 22, 2026 22:12
@LuigiVan01 LuigiVan01 marked this pull request as ready for review April 22, 2026 22:13

greptile-apps Bot commented Apr 22, 2026

Greptile Summary

This PR introduces eye-in-hand visual servoing using ArUco markers: a new ArucoTracker module, three dimos run blueprints (camera-only, mock arm, real xArm6), and an interactive hand-eye calibration tool — all cleanly ported onto the current ControlCoordinator/CartesianIKTask architecture.

  • P1 – calibration unit mismatch: The XArm SDK's Cartesian pose is typically in millimeters; OpenCV's estimatePoseSingleMarkers returns marker translations in meters (because marker_size is in meters). Passing mismatched units to cv2.calibrateHandEye silently produces a wrong t_cam2gripper, causing large real-world positioning errors.
  • P1 – unverified adapter API: calibrate_hand_eye.py calls adapter.connect(), adapter.disconnect(), and adapter.read_cartesian_pose() on XArmAdapter, but none of these methods exist anywhere in the repository, so the tool will raise AttributeError at runtime.

Confidence Score: 4/5

Safe to merge the tracking/blueprint code; the calibration tool has two P1 issues that should be resolved before it is used on hardware.

The ArucoTracker module and blueprints are well-structured and follow the existing architecture. However, calibrate_hand_eye.py has two P1 findings: a likely mm/meter unit mismatch that would silently corrupt the calibration result, and calls to XArmAdapter methods that don't exist in the repository. Both would cause incorrect or crashing behavior when the calibration tool is run against real hardware.

dimos/manipulation/dynamic_tracking/calibrate_hand_eye.py — unit mismatch and unverified adapter API methods.

Important Files Changed

| Filename | Overview |
| --- | --- |
| dimos/manipulation/dynamic_tracking/aruco_tracker.py | Core ArUco tracker module; well-structured with processing thread and visual servo loop, but uses the deprecated cv2.aruco.estimatePoseSingleMarkers API. |
| dimos/manipulation/dynamic_tracking/blueprints.py | Three visual-servo blueprints wiring RealSense → ArucoTracker → ControlCoordinator; depth stream is enabled but unused in all three, and the xArm6 blueprint has a hard-coded IP address. |
| dimos/manipulation/dynamic_tracking/calibrate_hand_eye.py | Interactive hand-eye calibration tool; two P1 concerns: likely mm/meter unit mismatch between XArm pose data and OpenCV aruco measurements, and unverified XArmAdapter method names (connect, disconnect, read_cartesian_pose). |
| dimos/manipulation/dynamic_tracking/__init__.py | Package init file with docstring; no issues. |
| dimos/robot/all_blueprints.py | Registers three new blueprint entries (aruco-servo-mock, aruco-servo-xarm6, aruco-tracker); straightforward addition, no issues. |

Sequence Diagram

sequenceDiagram
    participant RS as RealSenseCamera
    participant AT as ArucoTracker
    participant TF as TF Tree
    participant CC as ControlCoordinator
    participant HW as Hardware (mock/xArm6)

    RS->>AT: color_image (LCM)
    RS->>AT: camera_info (LCM)
    AT->>AT: detectMarkers() + estimatePoseSingleMarkers()
    AT->>TF: publish camera_optical_frame → aruco_avg
    AT->>TF: lookup base_link → aruco_avg
    TF-->>AT: aruco_wrt_base
    AT->>AT: apply reach_offset + min_move filter
    AT->>CC: cartesian_command PoseStamped (LCM)
    AT->>RS: annotated_image (LCM)
    CC->>CC: CartesianIKTask (Pinocchio IK)
    CC->>HW: joint commands
    CC->>TF: publish joint_state FK (base_link → ee_link)

Comments Outside Diff (4)

  1. dimos/manipulation/dynamic_tracking/calibrate_hand_eye.py, line 965-975 (link)

    P1 Likely unit mismatch: XArm returns mm, OpenCV uses meters

    The XArm SDK's Cartesian pose is typically returned in millimeters, while cv2.aruco.estimatePoseSingleMarkers produces tvecs in meters (because marker_size=0.015 is in meters). cv2.calibrateHandEye requires consistent units across both inputs — if the gripper translations are in mm and the marker translations are in meters, the solved t_cam2gripper will be silently wrong by ~1000×, producing a calibration result that causes large positioning errors on the robot.

    Verify (or document) that adapter.read_cartesian_pose() returns meters and not millimeters. If it returns mm, scale T[:3, 3] by 0.001 before returning.
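
    One way to make the fix explicit is a conversion helper at the adapter boundary. This is a sketch under the assumption that the adapter returns [x, y, z, roll, pitch, yaw] in millimeters and degrees with ZYX (yaw-pitch-roll) rotation order, as the raw xArm SDK's get_position() does; the helper name is hypothetical:

```python
import numpy as np

def xarm_pose_to_T(pose_mm_deg):
    """Convert an xArm [x, y, z, roll, pitch, yaw] pose (mm, degrees)
    into a 4x4 transform in meters/radians, ready for cv2.calibrateHandEye."""
    x, y, z, roll, pitch, yaw = pose_mm_deg
    r, p, w = np.radians([roll, pitch, yaw])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(r), -np.sin(r)],
                   [0, np.sin(r),  np.cos(r)]])
    Ry = np.array([[ np.cos(p), 0, np.sin(p)],
                   [0, 1, 0],
                   [-np.sin(p), 0, np.cos(p)]])
    Rz = np.array([[np.cos(w), -np.sin(w), 0],
                   [np.sin(w),  np.cos(w), 0],
                   [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx          # ZYX convention (assumed)
    T[:3, 3] = np.array([x, y, z]) * 0.001  # mm -> m: the P1 fix
    return T
```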

  2. dimos/manipulation/dynamic_tracking/calibrate_hand_eye.py, line 926-935 (link)

    P1 XArmAdapter API methods unverified against codebase

    The calibration script calls adapter.connect(), adapter.disconnect(), and adapter.read_cartesian_pose() on an XArmAdapter instance. No XArmAdapter class was found in the repository, and there's no existing usage of read_cartesian_pose anywhere in the codebase. If the actual adapter uses a different method name (e.g. from the raw xArm SDK's get_position()), this tool will raise an AttributeError at runtime without any indication of what went wrong.

    Confirm these methods exist on the intended adapter class before merging, or add a comment pointing to the correct import path.

  3. dimos/manipulation/dynamic_tracking/blueprints.py, line 470-472 (link)

    P2 Depth stream enabled but never consumed

    All three blueprints set enable_depth=True on realsense_camera, but ArucoTracker only subscribes to color_image and camera_info. No downstream component reads the depth frames. Enabling the depth stream unnecessarily doubles the USB/PCIe bandwidth consumed by the camera and imposes extra CPU load — especially noticeable at 848×480 @15 fps. Consider setting enable_depth=False unless depth is intended for a future use.

  4. dimos/manipulation/dynamic_tracking/blueprints.py, line 578-580 (link)

    P2 Hard-coded XArm IP address

    address="192.168.1.210" is baked into the blueprint, so the blueprint will fail out-of-the-box for any lab with a different network assignment. Consider accepting it as a function argument with this value as the default, or loading it from an environment variable.
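
    The environment-variable fallback could look like the following sketch; the XARM_IP variable name is a hypothetical convention, not one the repository already uses:

```python
import os

def xarm_address(default: str = "192.168.1.210") -> str:
    """Resolve the arm's IP from the environment, falling back to the current
    lab default so the blueprint still works out of the box."""
    return os.environ.get("XARM_IP", default)
```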

Reviews (1): Last reviewed commit: "blueprints"

Comment on lines +235 to +237

# =========================================================================
# Visual servo

P2 estimatePoseSingleMarkers is deprecated in OpenCV 4.7+

cv2.aruco.estimatePoseSingleMarkers was deprecated in OpenCV 4.7 and its use generates DeprecationWarning on newer builds; some CI environments treat warnings as errors. The same applies to line 738 of calibrate_hand_eye.py. The recommended replacement is to call cv2.solvePnP directly using the marker corners and the known 3D object points of the marker.
