Question: Is the current `uv.lock` combination expected to work for `compute_norm_stats_sim.py`? #3

## Summary
I want to confirm whether the current version combination in this repository is expected to work as-is.

From my run, it looks like there may be an API mismatch:

- `uv.lock` pins `lerobot` to git rev `0cf864870cf29f4738d3ade893e6fd13fbd7cdb5` (version `0.1.0`)
- `src/openpi/training/mixture_dataset.py` calls `LeRobotDataset(..., load_video=False)`

In the pinned `lerobot` revision `0cf864...`, `LeRobotDataset.__init__` does **not** appear to accept `load_video` (it uses `download_videos` instead).

The norm-stats script fails at dataset construction with:

```text
TypeError: LeRobotDataset.__init__() got an unexpected keyword argument 'load_video'
```

Also, `scripts/compute_norm_stats_sim.py` currently wraps dataset processing in a bare `except:` and prints only the path, so the traceback stays hidden unless the script is edited locally.

## Evidence
From this repo (`policy/openpi-InternData-A1/uv.lock`):

- `name = "lerobot"` at line ~2002
- git source contains rev `0cf864870cf29f4738d3ade893e6fd13fbd7cdb5`
- `name = "datasets"` version `3.6.0` at line ~603
- `name = "pyarrow"` version `20.0.0` at line ~3638
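
To compare the lock file against what is actually installed, a small check like the following can help. This is a minimal sketch using only the standard library; the helper name `installed_versions` is illustrative and not part of this repo.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the active environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["lerobot", "datasets", "pyarrow"]).items():
        print(f"{name}: {version}")
```

Note that this reports only the version string (for example `0.1.0` for `lerobot`), not the git revision, so it cannot by itself confirm that the pinned rev is the one installed.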

From this repo (`policy/openpi-InternData-A1/src/openpi/training/mixture_dataset.py`):

- `load_video=False` is passed to `LeRobotDataset` (around line 672)

From pinned upstream `lerobot` revision (`0cf864...`):

- `LeRobotDataset.__init__(..., download_videos=True, ...)`
- no `load_video` parameter
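
The signature mismatch can be checked without running the full script. The sketch below uses `inspect.signature`; `FakeDataset` is a hypothetical stand-in that only mimics the parameters listed above, and `accepts_kwarg` is an illustrative helper name. In a real environment the same check could be pointed at the installed `LeRobotDataset` class.

```python
import inspect


def accepts_kwarg(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs parameter would accept any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class FakeDataset:
    """Hypothetical stand-in with a `download_videos` parameter and no `load_video`."""

    def __init__(self, repo_id, root=None, download_videos=True):
        self.repo_id = repo_id
        self.root = root
        self.download_videos = download_videos


if __name__ == "__main__":
    for kwarg in ("load_video", "download_videos"):
        print(kwarg, accepts_kwarg(FakeDataset, kwarg))
```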

From this repo (`policy/openpi-InternData-A1/scripts/compute_norm_stats_sim.py`):

- bare `except:` in two loops (around lines 294 and 313)
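
For reference, this is roughly the local change that made the traceback visible. It is a sketch of the pattern only, not the script's actual code: `process_all` and `process_one` are hypothetical names standing in for the two loops and their per-dataset work.

```python
import traceback


def process_all(paths, process_one):
    """Run `process_one` on each path, printing full tracebacks for failures."""
    failures = []
    for path in paths:
        try:
            process_one(path)
        except Exception:  # unlike bare `except:`, lets KeyboardInterrupt/SystemExit through
            print(f"Failed on: {path}")
            traceback.print_exc()  # show the root cause instead of only the path
            failures.append(path)
    return failures
```

Returning the failed paths keeps the loop going over the remaining datasets while still making it obvious at the end that something went wrong.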

## Reproduction
1. Create or sync the environment from this repo's lock file (`policy/openpi-InternData-A1/uv.lock`).
2. Run the norm-stats script with a valid dataset root.
3. Observe the crash at the dataset construction stage.

## Troubleshooting Timeline (What was tried and what happened)
1. The initial run showed `0it` and did no effective processing.
2. After correcting the invocation and path usage, the script started iterating but still did not expose the root cause, because `compute_norm_stats_sim.py` had a bare `except:`.
3. After making the traceback visible locally, the first concrete failure was:

```text
TypeError: LeRobotDataset.__init__() got an unexpected keyword argument 'load_video'
```

4. After bypassing that argument mismatch locally for debugging, the next failure was:

```text
TypeError: stack(): argument 'tensors' (position 1) must be tuple of Tensors, not Column
```

5. The environment was then aligned back toward the repository's original lock combination (`datasets==3.6.0`, `pyarrow==20.0.0`) to compare behavior, and a new schema-level error appeared:

```text
ValueError: Feature type 'List' not found. Available feature types: ['Value', 'ClassLabel', 'Translation', 'TranslationVariableLanguages', 'LargeList', 'Sequence', 'Array2D', 'Array3D', 'Array4D', 'Array5D', 'Audio', 'Image', 'Video', 'Pdf', 'VideoFrame']
```

This made it unclear whether the current dataset schema, script assumptions, and pinned dependency set are expected to be mutually compatible.
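
One way to narrow down the schema question is to list which feature types a dataset's metadata declares and compare them with the types named in the error above. The sketch below assumes the features are available as a nested dict in which each feature carries a `_type` key; how that dict is obtained is left open, and the sample dict is hypothetical, not taken from the actual dataset.

```python
# Feature types listed as available in the ValueError above.
KNOWN_TYPES = {
    "Value", "ClassLabel", "Translation", "TranslationVariableLanguages",
    "LargeList", "Sequence", "Array2D", "Array3D", "Array4D", "Array5D",
    "Audio", "Image", "Video", "Pdf", "VideoFrame",
}


def collect_feature_types(node):
    """Recursively collect every `_type` string found in a nested features structure."""
    found = set()
    if isinstance(node, dict):
        type_name = node.get("_type")
        if isinstance(type_name, str):
            found.add(type_name)
        for value in node.values():
            found |= collect_feature_types(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_feature_types(item)
    return found


def unsupported_types(features):
    """Return declared feature types that the installed stack did not list as available."""
    return collect_feature_types(features) - KNOWN_TYPES


if __name__ == "__main__":
    # Hypothetical features dict, for illustration only.
    sample = {
        "action": {"feature": {"dtype": "float32", "_type": "Value"}, "_type": "List"},
        "index": {"dtype": "int64", "_type": "Value"},
    }
    print(unsupported_types(sample))
```

A non-empty result would suggest the data was written with a `datasets` version that knows feature types the pinned one does not.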

## Questions
1. Is `load_video=False` expected to be valid with the currently pinned `lerobot` revision in `uv.lock`?
2. If yes, am I invoking the script incorrectly, or is there an environment/version assumption that is not documented yet?
3. Is the current behavior of swallowing exceptions in `compute_norm_stats_sim.py` intentional?
4. For the same data, is `ValueError: Feature type 'List' not found ...` expected under the repository's original dependency stack, or does it indicate a known schema-version mismatch?
