Summary
I want to confirm whether the current version combination in this repository is expected to work as-is.
From my run, it looks like there may be an API mismatch:
- uv.lock pins lerobot to git rev 0cf864870cf29f4738d3ade893e6fd13fbd7cdb5 (version 0.1.0)
- src/openpi/training/mixture_dataset.py calls LeRobotDataset(..., load_video=False)
In pinned lerobot revision 0cf864..., LeRobotDataset.__init__ does not appear to accept load_video (it uses download_videos instead).
The norm-stats script fails at dataset construction with:
TypeError: LeRobotDataset.__init__() got an unexpected keyword argument 'load_video'
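A mismatch like this can be checked before constructing the dataset by comparing the keyword arguments against the constructor signature. The sketch below is only an illustration: StandInDataset is a hypothetical class that mirrors the two parameter names mentioned in this issue, and accepted_kwargs is a helper name introduced here, not part of lerobot or openpi.

```python
import inspect


class StandInDataset:
    """Hypothetical stand-in; it only mirrors parameter names reported in this issue."""

    def __init__(self, repo_id, download_videos=True):
        self.repo_id = repo_id
        self.download_videos = download_videos


def accepted_kwargs(cls, kwargs):
    """Split kwargs into those cls.__init__ accepts and those it would reject."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter would accept any keyword, so nothing is rejected then.
    takes_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    accepted, rejected = {}, {}
    for name, value in kwargs.items():
        if takes_var_kw or name in params:
            accepted[name] = value
        else:
            rejected[name] = value
    return accepted, rejected


accepted, rejected = accepted_kwargs(
    StandInDataset, {"repo_id": "demo", "load_video": False}
)
print("accepted:", accepted)
print("rejected:", rejected)
```

Running the same helper against the real LeRobotDataset class in the synced environment would show directly whether load_video is part of the installed signature.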
Also, scripts/compute_norm_stats_sim.py currently uses bare except: and prints only the path, which hides the traceback unless the script is changed locally.
Evidence
From this repo (policy/openpi-InternData-A1/uv.lock):
- name = "lerobot" at line ~2002; git source contains rev 0cf864870cf29f4738d3ade893e6fd13fbd7cdb5
- name = "datasets" version 3.6.0 at line ~603
- name = "pyarrow" version 20.0.0 at line ~3638
From this repo (policy/openpi-InternData-A1/src/openpi/training/mixture_dataset.py):
- load_video=False is passed to LeRobotDataset (around line ~672)
From pinned upstream lerobot revision (0cf864...):
- LeRobotDataset.__init__(..., download_videos=True, ...) (no load_video parameter)
From this repo (policy/openpi-InternData-A1/scripts/compute_norm_stats_sim.py):
- bare except: in two loops (around lines ~294 and ~313)
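For comparison, here is a minimal sketch of a loop that keeps going after a failure but still shows the traceback. It is not the code from compute_norm_stats_sim.py; process_all and fake_process are hypothetical names used only to illustrate the pattern.

```python
import traceback


def process_all(paths, process_one):
    """Run process_one on each path and report failures with full tracebacks."""
    failed = []
    for path in paths:
        try:
            process_one(path)
        except Exception:
            # Unlike a bare except:, this lets KeyboardInterrupt and SystemExit through.
            print(f"failed on {path}")
            traceback.print_exc()
            failed.append(path)
    return failed


def fake_process(path):
    # Hypothetical worker that fails on one input so the output is visible.
    if path.endswith("bad"):
        raise TypeError("unexpected keyword argument 'load_video'")


failed = process_all(["ep_ok", "ep_bad"], fake_process)
print("failed paths:", failed)
```

With this pattern the first run would already have printed the TypeError instead of only the path.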
Reproduction
- Create/sync env from this repo lock (policy/openpi-InternData-A1/uv.lock).
- Run norm stats script with a valid dataset root.
- Observe crash at dataset construction stage.
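To confirm which package versions are actually active in the synced environment, a small check like the following can be run first. It uses only the standard library; installed_versions is a helper name introduced here. Note that for a git-sourced package the reported version string (for example 0.1.0) does not by itself identify the git revision.

```python
from importlib import metadata


def installed_versions(names):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions(["lerobot", "datasets", "pyarrow"])
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```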
Troubleshooting Timeline (What was tried and what happened)
- Initial run showed 0it and no effective processing.
- After correcting invocation/path usage, the script started iterating but still did not expose the root cause because compute_norm_stats_sim.py had bare except:.
- After making the traceback visible locally, the first concrete failure was:
TypeError: LeRobotDataset.__init__() got an unexpected keyword argument 'load_video'
- After bypassing that argument mismatch locally for debugging, the next failure was:
TypeError: stack(): argument 'tensors' (position 1) must be tuple of Tensors, not Column
- Then the environment was aligned back toward the repository's original lock combination (datasets==3.6.0, pyarrow==20.0.0) to compare behavior, and a new schema-level error appeared:
ValueError: Feature type 'List' not found. Available feature types: ['Value', 'ClassLabel', 'Translation', 'TranslationVariableLanguages', 'LargeList', 'Sequence', 'Array2D', 'Array3D', 'Array4D', 'Array5D', 'Audio', 'Image', 'Video', 'Pdf', 'VideoFrame']
This made it unclear whether the current dataset schema, script assumptions, and pinned dependency set are expected to be mutually compatible.
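One way to narrow down the last error is to list the feature type names used in the dataset's stored schema and compare them with the types the installed datasets version reports as available. The sketch below assumes the schema is available as nested dicts in which each feature carries a "_type" key; that layout and the example_schema contents are assumptions for illustration, not data taken from the actual dataset. The AVAILABLE set is copied from the error message above.

```python
def collect_feature_types(node, found=None):
    """Recursively collect every "_type" string found in a nested schema."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        type_name = node.get("_type")
        if isinstance(type_name, str):
            found.add(type_name)
        for value in node.values():
            collect_feature_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_feature_types(item, found)
    return found


# Feature types listed as available in the ValueError above.
AVAILABLE = {
    "Value", "ClassLabel", "Translation", "TranslationVariableLanguages",
    "LargeList", "Sequence", "Array2D", "Array3D", "Array4D", "Array5D",
    "Audio", "Image", "Video", "Pdf", "VideoFrame",
}

# Hypothetical schema fragment; the real one would come from the dataset metadata.
example_schema = {
    "action": {"_type": "List", "feature": {"_type": "Value", "dtype": "float32"}},
    "episode_index": {"_type": "Value", "dtype": "int64"},
}

unsupported = collect_feature_types(example_schema) - AVAILABLE
print("unsupported feature types:", sorted(unsupported))
```

If the real schema contains a type outside the available set, one possible explanation (not confirmed here) is that the dataset metadata was written with a different datasets release than the one pinned in uv.lock.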
Questions
- Is load_video=False expected to be valid with the currently pinned lerobot revision in uv.lock?
- If yes, am I using the script in a wrong way, or is there an environment/version assumption not documented yet?
- Is the current behavior of swallowing exceptions in compute_norm_stats_sim.py intentional?
- For the same data, is ValueError: Feature type 'List' not found ... expected under the repository's original dependency stack, or does it indicate a known schema-version mismatch?