AutoHorizon


AutoHorizon is a test-time method that automatically and dynamically determines the execution horizon (how many predicted actions to execute before replanning) for flow-based vision-language-action (VLA) models. This repo contains example usage and a PyTorch implementation of AutoHorizon for the Pi0.5 model on the LIBERO benchmark.

Installation

# Clone the repository
git clone https://github.com/hatchetproject/AutoHorizon.git
cd AutoHorizon

# Create environment and install dependencies
uv sync

# Download the JAX pretrained LIBERO checkpoint (gs://openpi-assets/checkpoints/pi05_libero)
# and convert it to PyTorch format
uv run examples/convert_jax_model_to_pytorch.py \
    --checkpoint_dir /path/to/jax/checkpoint \
    --config_name <config name> \
    --output_path /path/to/converted/pytorch/checkpoint

# Double-check that you have transformers 4.53.2 installed
uv pip show transformers

# Patch transformers with custom SigLIP/Gemma modifications
cp -r ./src/openpi/models_pytorch/transformers_replace/* \
    .venv/lib/python3.11/site-packages/transformers/

# Finally, install the LIBERO evaluation environment following examples/libero/README.md
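
The version check and the patch destination above can also be resolved from Python, which avoids hard-coding the `python3.11` path segment. This is a minimal sketch, not part of the repository: the helper names `installed_version` and `package_dir` are made up for illustration, and it assumes you run it inside the project environment (for example via `uv run python`).

```python
from importlib import metadata, util
from typing import Optional

EXPECTED_TRANSFORMERS = "4.53.2"  # version named in the installation steps


def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def package_dir(package: str) -> Optional[str]:
    """Return the directory of an importable package without importing it."""
    spec = util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]


found = installed_version("transformers")
if found == EXPECTED_TRANSFORMERS:
    # This directory is where the transformers_replace files would be copied.
    print("transformers", found, "found at", package_dir("transformers"))
else:
    print("expected transformers", EXPECTED_TRANSFORMERS, "but found", found)
```

If the printed directory differs from `.venv/lib/python3.11/site-packages/transformers/`, use the printed path as the `cp` destination.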

Usage

Step 1: Start the policy server

bash run/serve/serve_libero_horizon.sh

Step 2: Run LIBERO evaluation

bash run/eval/eval_libero_horizon.sh

The script sequentially evaluates Static Oracle, Random, and AutoHorizon on all LIBERO benchmarks, repeating each experiment 3 times, so a full run can take a long time. Adjust the script to fit your needs.
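
If you prefer to drive both steps from one process, the two scripts can be wrapped as follows. This is a rough sketch under stated assumptions: `run_with_server` is a hypothetical helper, and the fixed startup wait is a crude stand-in for a real readiness check, since the README does not say how the server signals that it is ready.

```python
import subprocess
import time
from typing import Sequence


def run_with_server(
    serve_cmd: Sequence[str],
    eval_cmd: Sequence[str],
    startup_wait_s: float = 60.0,  # assumed wait; tune for your machine
) -> int:
    """Start the server, run the evaluation, then stop the server.

    Returns the exit code of the evaluation command.
    """
    server = subprocess.Popen(serve_cmd)
    try:
        time.sleep(startup_wait_s)
        if server.poll() is not None:
            raise RuntimeError(
                f"server exited early with code {server.returncode}"
            )
        return subprocess.run(eval_cmd, check=False).returncode
    finally:
        # Always stop the server, even if the evaluation failed.
        if server.poll() is None:
            server.terminate()
            try:
                server.wait(timeout=10)
            except subprocess.TimeoutExpired:
                server.kill()
                server.wait()


# Example call (run from the repository root):
# run_with_server(
#     ["bash", "run/serve/serve_libero_horizon.sh"],
#     ["bash", "run/eval/eval_libero_horizon.sh"],
# )
```

Note that `terminate()` signals only the `bash` process it started; if the serve script spawns children that outlive it, they may need to be stopped separately.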

Replanning strategies

| Strategy       | Flag             | Description                                                        |
|----------------|------------------|--------------------------------------------------------------------|
| Elastic        | --elastic        | Attention-based soft-pointer horizon (ours)                        |
| Fixed          | --replan_steps N | Execute exactly N steps then replan                                |
| Random         | --random         | Execute a random number of steps from the chunk                    |
| Action trigger | --action_trigger | Replan when consecutive action delta exceeds threshold             |
| Uncertainty    | --uncertainty    | Draw multiple samples; replan when per-step std exceeds threshold  |
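
To make the baseline descriptions concrete, the sketch below expresses each one as a function that returns how many actions of a predicted chunk to execute before replanning. It is illustrative only and is not the repository's code: the function names, the L2 norm for the action delta, the averaging of the std over action dimensions, and the threshold values are all assumptions.

```python
import math
import random
import statistics
from typing import List, Sequence

Chunk = Sequence[Sequence[float]]  # one action vector per time step


def fixed_horizon(chunk: Chunk, replan_steps: int) -> int:
    """Fixed: execute exactly N steps (capped at the chunk length)."""
    return max(1, min(replan_steps, len(chunk)))


def random_horizon(chunk: Chunk, rng: random.Random) -> int:
    """Random: execute a random number of steps from the chunk."""
    return rng.randint(1, len(chunk))


def action_trigger_horizon(chunk: Chunk, threshold: float) -> int:
    """Action trigger: stop before the first large jump between actions."""
    for t in range(1, len(chunk)):
        delta = math.dist(chunk[t - 1], chunk[t])  # assumed L2 distance
        if delta > threshold:
            return t  # execute steps 0 .. t-1, then replan
    return len(chunk)


def uncertainty_horizon(samples: List[Chunk], threshold: float) -> int:
    """Uncertainty: stop at the first step where sampled chunks disagree."""
    n_steps = len(samples[0])
    for t in range(n_steps):
        # Std across samples for each action dimension, averaged over dims.
        per_dim = zip(*(sample[t] for sample in samples))
        step_std = statistics.fmean(statistics.pstdev(d) for d in per_dim)
        if step_std > threshold:
            return max(1, t)  # always execute at least one action
    return n_steps


chunk = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [1.5, 0.0], [1.6, 0.0]]
print(fixed_horizon(chunk, 2))             # 2
print(action_trigger_horizon(chunk, 0.5))  # 3
```

In this framing every strategy reduces to choosing one integer per chunk, which is what makes them interchangeable behind command-line flags.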

Hyper-parameters

AutoHorizon is much more stable across hyper-parameter choices than the baselines, and the defaults are usually sufficient for reasonable performance. To maximize performance, tune the hyper-parameters in src/openpi/models_pytorch/pi0_pytorch.py, including attn_step_count, hold_thr, and max_entropy_q, or switch between pick_horizon_softpointer() and bidir_soft_pointer().
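
For intuition about the term "soft pointer", here is a generic sketch of the idea: treat a set of non-negative weights over the steps of an action chunk as a distribution, take its expected index as the pointer, and fall back to a short horizon when the distribution is too diffuse. This is not the repository's `pick_horizon_softpointer()` or `bidir_soft_pointer()`, which may work quite differently; `soft_pointer`, `horizon_from_weights`, and `entropy_limit` are hypothetical names and do not correspond to the hyper-parameters listed above.

```python
import math
from typing import Sequence, Tuple


def soft_pointer(weights: Sequence[float]) -> Tuple[float, float]:
    """Return (expected step index, entropy in nats) of normalized weights."""
    total = sum(weights)
    probs = [w / total for w in weights]
    pointer = sum(i * p for i, p in enumerate(probs))
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return pointer, entropy


def horizon_from_weights(
    weights: Sequence[float],
    entropy_limit: float,  # hypothetical cap on how diffuse weights may be
    fallback: int = 1,     # hypothetical horizon used when weights are diffuse
) -> int:
    """Map a weight profile over chunk steps to a number of steps to execute."""
    pointer, entropy = soft_pointer(weights)
    if entropy > entropy_limit:
        return fallback
    # Index k means steps 0..k are executed, hence the +1.
    return max(1, min(len(weights), round(pointer) + 1))


print(horizon_from_weights([0.0, 0.0, 1.0, 0.0], entropy_limit=1.0))  # 3
print(horizon_from_weights([1.0, 1.0, 1.0, 1.0], entropy_limit=1.0))  # 1
```

A sharply peaked profile yields a horizon at the peak, while a uniform profile over four steps has entropy ln 4 (about 1.39 nats), which exceeds the limit of 1.0 and triggers the fallback.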

Citation

@misc{wang2026vlaknowslimits,
      title={VLA Knows Its Limits},
      author={Haoxuan Wang and Gengyu Zhang and Yan Yan and Ramana Rao Kompella and Gaowen Liu},
      year={2026},
      eprint={2602.21445},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2602.21445},
}

Acknowledgments

Built on OpenPI.
