feat(perception): monocular obstacle avoidance via optical flow 1775 #1908

Draft
jhengyilin wants to merge 3 commits into dimensionalOS:dev from jhengyilin:feature/optical-flow-obstacle-avoidance-1775

Conversation

@jhengyilin commented Apr 23, 2026

feat(perception): monocular obstacle avoidance via optical flow τ-estimation

Branch: feature/optical-flow-obstacle-avoidance-1775 → main


Summary

Implements OpticalFlowModule — real-time Time-to-Contact (τ) estimation from
sparse optical flow on a single RGB camera. No depth sensor, no calibration, no
GPU. Designed as a complement to LiDAR proximity sensing, not a replacement.
Closes #1775.

Answers to the issue

"Can optical flow be used for real-time obstacle avoidance?"
Yes, as a complement to LiDAR — not a replacement. It detects fast-approaching
obstacles via τ = 1/divergence with 93.9% precision when it fires, but recall
is 0.295, so it only catches ~30% of danger frames. It is blind to stationary
obstacles. Recommended: pair with LiDAR for static proximity; use optical flow
for dynamic and fast-approaching obstacles.

"How much does optical flow correlate with LiDAR?"
Weakly. Spearman r = +0.173 — correct direction but too noisy for a calibrated
timer. Use as a ranked urgency signal, not a precise TTC.

"What keypoint strategies? What does ORB-SLAM use?"
ORB-SLAM uses Oriented FAST for rotation-invariant BRIEF descriptor matching.
That orientation step is irrelevant for Lucas-Kanade, which tracks image patches
directly and has no orientation input. Plain FAST is the correct choice here.
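
For reference, a minimal sketch of that FAST + pyramidal Lucas-Kanade pairing using standard OpenCV calls (parameter values here are illustrative, not the module's actual defaults):

```python
import cv2
import numpy as np

fast = cv2.FastFeatureDetector_create(threshold=25)  # plain FAST: no orientation step needed

def track(prev_gray: np.ndarray, gray: np.ndarray):
    """Detect FAST corners in the previous frame and track them into the current one."""
    kps = fast.detect(prev_gray, None)
    if not kps:
        return None
    p0 = np.float32([kp.pt for kp in kps]).reshape(-1, 1, 2)
    # Pyramidal Lucas-Kanade tracks raw image patches; descriptor orientation is irrelevant.
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None,
                                                winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1
    return p0[good].reshape(-1, 2), p1[good].reshape(-1, 2)  # flow start/end points
```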

Problem

The Go2 uses LiDAR for obstacle proximity but cannot answer "how fast is it
approaching?" This adds a camera-based early-warning signal for fast-approaching
or dynamic obstacles, complementing LiDAR rather than replacing it.

What Was Built

```
dimos/perception/optical_flow/
├── __init__.py
├── tac_estimator.py           ← divergence → τ math
├── optical_flow_module.py     ← dimos Module: camera in, signals out
└── backends/lucas_kanade.py   ← FAST corners + Lucas-Kanade tracking

scripts/eval_optical_flow_gated.py   ← odom-gated evaluation vs LiDAR
scripts/visualize_optical_flow.py    ← live Rerun viewer (annotated frames + τ plot)
RESULTS.md                           ← theory, methodology, full results
```

Pipeline at a glance:

```mermaid
flowchart LR
    A["Camera RGB frame"] --> B["FAST keypoint detection"]
    B --> C["Lucas-Kanade tracking"]
    C --> D["Flow vectors (dx, dy)"]
    D --> E["5×5 grid divergence"]
    E --> F["τ = 1 / divergence"]
    F --> G{"τ < threshold?"}
    G -- yes --> H["DANGER signal"]
    G -- no --> I["CLEAR"]
    J["IMU ω (yaw rate)"] --> K{"|ω| > 0.3?"}
    K -- yes --> L["GATED — suppress danger"]
    K -- no --> G
```
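
To make the grid step concrete, here is a rough sketch of a per-cell divergence → τ computation. This is a simplified stand-in for tac_estimator.py, not its actual code; the least-squares affine fit shown is one standard way to estimate divergence from sparse flow:

```python
import numpy as np

def tau_grid(p0: np.ndarray, p1: np.ndarray, shape, n: int = 5, eps: float = 1e-6):
    """Per-cell time-to-contact from sparse flow (p0 -> p1) on an n x n grid.

    Divergence comes from a least-squares affine fit u ~ A x + b per cell;
    tau = 1 / divergence, in frames (multiply by the frame period for seconds).
    """
    h, w = shape[:2]
    flow = p1 - p0
    rows = np.clip((p0[:, 1] * n / h).astype(int), 0, n - 1)
    cols = np.clip((p0[:, 0] * n / w).astype(int), 0, n - 1)
    grid = np.full((n, n), np.inf)
    for r in range(n):
        for c in range(n):
            m = (rows == r) & (cols == c)
            if m.sum() < 6:                    # too few tracks for a stable fit
                continue
            X = np.hstack([p0[m], np.ones((m.sum(), 1))])    # design matrix [x, y, 1]
            A, *_ = np.linalg.lstsq(X, flow[m], rcond=None)
            div = A[0, 0] + A[1, 1]            # trace of the flow Jacobian
            if div > eps:                      # expanding flow = approaching
                grid[r, c] = 1.0 / div
    return grid
```

The minimum over cells then feeds the τ-threshold decision shown in the flowchart.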

Module Interface

| Stream | Direction | Type | Description |
| --- | --- | --- | --- |
| color_image | In | Image | RGB/BGR camera feed |
| angular_velocity | In | Any | Yaw rate in rad/s (optional) |
| danger_signal | Out | Bool | True when min τ < threshold AND not turning |
| tac_grid | Out | Any | 5×5 float array of per-cell τ values |
| flow_visualization | Out | Image | Debug overlay: flow vectors + grid |

When angular_velocity is connected and |ω| > omega_max (default 0.3 rad/s),
danger_signal is forced False. If the stream is left unconnected, _last_omega
stays 0 and the gate is transparent (backward compatible).
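
The gating rule itself is small; a sketch of the semantics described above (names mirror the table, but treat this as illustrative pseudocode rather than the module's actual implementation):

```python
import numpy as np

def danger_signal(tau_grid: np.ndarray, last_omega: float,
                  tau_threshold: float = 3.0, omega_max: float = 0.3) -> bool:
    """Danger iff the most urgent cell is below threshold and the robot is not turning."""
    if abs(last_omega) > omega_max:
        return False                 # GATED: rotation corrupts the translational flow field
    return bool(np.min(tau_grid) < tau_threshold)
```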

Evaluation Results

Latency: ~2.5 ms/frame

τ Correlation with LiDAR TTC (82 odom-gated frames: v_fwd > 0.15 m/s, |ω| < 0.15 rad/s)

| Pearson r | Spearman r | RMSE (s) |
| --- | --- | --- |
| +0.144 | +0.173 | 3.31 |

Correct sign — closer obstacle → smaller τ — but magnitude is noisy; use τ as a
ranked urgency signal, not a precise timer.

Binary Detection (GT: lidar_dist < 1.5 m, GT positive rate: 84%)

| τ threshold | Precision | Recall | F1 |
| --- | --- | --- | --- |
| 3.0 (default) | 0.939 | 0.295 | 0.449 |
| 5.8 (max F1) | 0.918 | 0.533 | 0.675 |

At τ=3.0, 93.9% of alarms are confirmed by LiDAR. Tune tau_threshold per use-case.
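
Tuning amounts to sweeping the threshold over the precision/recall trade-off. A hypothetical sketch with illustrative array names (the real numbers come from scripts/eval_optical_flow_gated.py):

```python
import numpy as np

def sweep(min_tau: np.ndarray, lidar_dist: np.ndarray, thresholds):
    """Precision/recall/F1 of 'min tau below threshold' vs LiDAR ground truth (< 1.5 m)."""
    gt = lidar_dist < 1.5
    for t in thresholds:
        pred = min_tau < t
        tp = np.sum(pred & gt)
        p = tp / max(pred.sum(), 1)       # precision: alarms confirmed by LiDAR
        r = tp / max(gt.sum(), 1)         # recall: danger frames actually caught
        f1 = 2 * p * r / max(p + r, 1e-9)
        print(f"tau<{t:.1f}: P={p:.3f} R={r:.3f} F1={f1:.3f}")
```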

Live visualization. scripts/visualize_optical_flow.py replays
unitree_office_walk through the same backend and streams annotated frames,
the tau grid, and a min_tau time-series into a Rerun viewer — scrub any
frame to see which cell fired and why.

Known Limitations

| Limitation | Consequence | Mitigation |
| --- | --- | --- |
| No relative motion means no flow | Blind to static walls at constant distance | LiDAR handles static proximity |
| τ value noisy (RMSE ≈ 3 s) | Not a precise collision timer | Ranked alarm; wire tac_grid to planner |
| Low-texture surfaces yield few keypoints | Reduced grid coverage | Edge-enhancing preprocessing (sketch below) |
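
On the last mitigation: one possible (untested here) preprocessing step is an unsharp mask before FAST, which amplifies local gradients on weakly textured surfaces:

```python
import cv2

def enhance_edges(gray):
    """Unsharp mask: subtract a blurred copy to amplify edges before FAST detection."""
    blur = cv2.GaussianBlur(gray, (0, 0), sigmaX=3)
    return cv2.addWeighted(gray, 1.5, blur, -0.5, 0)
```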

Files Changed

New — production code: dimos/perception/optical_flow/ (5 files)
New — evaluation: scripts/eval_optical_flow_gated.py, RESULTS.md
New — demo: scripts/visualize_optical_flow.py (Rerun viewer)
No existing files modified.

  Monocular Time-to-Contact detection via FAST keypoints + Lucas-Kanade flow.
  Follows dimos Config pattern; rotation-gated via optional angular_velocity
  stream. Closes dimensionalOS#1775.
  Eval: P=0.939 at τ=3.0 on unitree_office_walk. Visualization streams annotated frames +
  τ timeline to Rerun.
@leshy (Contributor) commented Apr 24, 2026

can you elaborate on why a mobile robot cannot avoid stationary obstacles with optical flow?

how can I test this? how did you test this? for example,

"How much does optical flow correlate with LiDAR?"
Weakly. Spearman r = +0.173 — correct direction but too noisy for a calibrated

can you show me how to reproduce this number?

@jhengyilin (Author) commented

Testing

I evaluated the module offline against LiDAR ground truth on the unitree_office_walk dataset (a 51 s indoor sequence). The eval script is scripts/eval_optical_flow_gated.py. It runs every frame through the optical flow backend, keeps only frames where the robot is moving forward (speed > 0.15 m/s) and not turning (|ω| < 0.15 rad/s) to isolate pure translation, and scores optical flow against the LiDAR distance on those frames as ground truth. It computes (a minimal sketch of the gating and scoring follows the list):

  • Correlation — Spearman r = +0.173 between optical flow τ and LiDAR TTC
  • Binary detection — Precision = 0.939, Recall = 0.295 at τ = 3.0 (ground truth: LiDAR distance < 1.5 m)
  • Latency — ~2.5 ms/frame
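
A minimal sketch of that gating and scoring, assuming hypothetical per-frame arrays (speed, omega, min_tau, lidar_ttc, lidar_dist) already extracted from the dataset:

```python
import numpy as np
from scipy.stats import spearmanr

def evaluate(speed, omega, min_tau, lidar_ttc, lidar_dist):
    """Odom-gated comparison of optical-flow tau against LiDAR, per frame."""
    gate = (speed > 0.15) & (np.abs(omega) < 0.15)   # forward motion, no turning
    rho, _ = spearmanr(min_tau[gate], lidar_ttc[gate])
    pred = min_tau[gate] < 3.0                        # danger at default threshold
    gt = lidar_dist[gate] < 1.5                       # ground truth: obstacle within 1.5 m
    tp = np.sum(pred & gt)
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    return rho, precision, recall
```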

To reproduce, first place the unitree_office_walk dataset folder under data/, then run from repo root:

Evaluation:

```
PYTHONPATH=.venv/lib/python3.12/site-packages/rerun_sdk \
    uv run python3 scripts/eval_optical_flow_gated.py
```

Visualization (opens a Rerun viewer where you can scrub the timeline and see flow arrows, τ grid, and DANGER/CLEAR labels frame-by-frame):

```
PYTHONPATH=.venv/lib/python3.12/site-packages/rerun_sdk \
    uv run python3 scripts/visualize_optical_flow.py
```

Results land in results/optical_flow_gated.json. Full methodology and numbers are in RESULTS.md.


Stationary obstacles and optical flow

A mobile robot can avoid stationary obstacles with optical flow — as long as two conditions are met:

  1. Relative motion exists — the robot is moving toward the obstacle. The robot's own forward motion produces an expanding flow field in the image (pixels diverge outward from the focus of expansion), which gives a finite τ = 1/divergence. The obstacle doesn't need to move on its own.

  2. The obstacle has trackable texture — FAST needs to detect keypoints on the obstacle's surface (edges, corners, texture). A featureless surface may yield too few keypoints for a reliable divergence estimate in that grid cell.

The only case where optical flow truly can't help is when there is zero relative motion between the camera and the obstacle — e.g., the robot is stationary, or moving parallel to a wall at constant distance. No pixel displacement means zero divergence and no alarm. That's the specific case where LiDAR is still needed for proximity sensing.
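
To put point 1 on a formal footing (a standard pinhole-camera result, not specific to this module): for pure forward motion at speed v toward a fronto-parallel surface at depth Z, projection gives x = fX/Z with Ẋ = 0 and Ż = −v, so

$$
\dot x = -\frac{f X \dot Z}{Z^{2}} = \frac{v}{Z}\,x,
\qquad
\frac{\partial \dot x}{\partial x} = \frac{v}{Z},
\qquad
\tau = \frac{Z}{v} = \left(\frac{\partial \dot x}{\partial x}\right)^{-1}.
$$

A static wall therefore yields a finite, correct τ whenever v > 0; the divergence only vanishes when v = 0 or the motion is parallel to the surface, which is exactly the LiDAR case above.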

@leshy (Contributor) commented Apr 24, 2026

  • you need to understand all code written and all decisions made.
  • you can do visual inspection; you don't need fancy LiDAR correlations. It should be obvious whether optical flow is detecting obstacles that are visible on screen or not
  • if doing fancy LiDAR correlation, make sure it's actually correct: visualize the expected obstacles on the camera image (according to lidar). Small timestamp differences between odom and video, and the way you take camera lens distortion into account, are critical. (This isn't necessary at all for the first pass; but if you do it, make sure it's correct. It's better to have no data than wrong data.)
  • do we need this grid at all? I think there should be a nicer algo that gives you a gradient of the rate of divergence of keypoints across the image? it's ok if it's not easy etc, just asking as I assume there is such a thing
  • the keypoint detection algo used is heavy and sparse; you want light (easy to compute) and non-sparse keypoints. idk if this is a different setting for your existing keypoint algo, or an entirely different algo (one possible sketch follows below)
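
One concrete reading of that last point, as a hypothetical sketch rather than anything in this PR: skip detection entirely and track a fixed uniform grid of points with pyramidal LK, which is cheap and dense by construction. Textureless cells will return unreliable flow, so the status/error outputs still need filtering:

```python
import cv2
import numpy as np

def uniform_grid_points(shape, step: int = 16) -> np.ndarray:
    """A fixed, dense grid of track points: zero detector cost, uniform coverage."""
    h, w = shape[:2]
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    return np.float32(np.dstack([xs, ys])).reshape(-1, 1, 2)

# Feed the same grid to LK every frame instead of FAST corners; drop points whose
# status is 0 or whose error is large (aperture problem on textureless regions).
# p1, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, uniform_grid_points(gray.shape), None)
```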
