Implement a small caching layer for compiled kernels, downloaded model weights, and any expensive preprocessing artifacts, keyed by configuration and model version, to reduce wall‑clock time for repeated runs.
Why
- Many workflows (especially templates and CI) repeat similar runs with only minor config changes.
- Recomputing or re-downloading heavy artifacts each time wastes GPU and wall-clock time.
- A lightweight cache improves iteration speed without changing any single-run semantics.
What to do
- Define a cache directory convention (e.g., ~/.cache/disco-diffusion or a configurable path) and a basic versioning scheme.
- Cacheable items could include:
  - Downloaded model weights / checkpoints.
  - Compiled CUDA kernels or other JIT artifacts, where applicable.
  - Precomputed embeddings or other expensive preprocessing outputs that are safe to reuse.
- Add hooks in the existing run path to:
  - Check for cached assets before recomputing or re-downloading.
  - Populate the cache on first computation.
- Ensure there is a simple way to invalidate or clear the cache when upgrading models or code.
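A minimal sketch of the key derivation and the get-or-compute hook described above. The cache directory, the `DISCO_CACHE_DIR` environment variable, and the pickle-on-disk layout are illustrative assumptions, not settled design; the real implementation would pick its own names and storage format.

```python
import hashlib
import json
import os
import pickle
from pathlib import Path

# Hypothetical default location; the env var name is an assumption for this sketch.
CACHE_DIR = Path(os.environ.get("DISCO_CACHE_DIR",
                                Path.home() / ".cache" / "disco-diffusion"))


def cache_key(config: dict, model_version: str) -> str:
    """Derive a stable key from the run configuration and model version."""
    payload = json.dumps({"config": config, "model_version": model_version},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:16]


def get_or_compute(config: dict, model_version: str, compute):
    """Return a cached artifact if present; otherwise compute it and populate the cache."""
    path = CACHE_DIR / f"{cache_key(config, model_version)}.pkl"
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    artifact = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(artifact, f)
    return artifact
```

Because the model version is part of the key, upgrading a checkpoint naturally misses the old entries instead of reusing stale ones.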
Acceptance criteria
- Repeated runs with the same (or compatible) configuration reuse cached artifacts and are measurably faster.
- Cache behavior is transparent to users for single runs; correctness is unchanged.
- There is a documented way to clear or disable the cache if needed (e.g., env var or CLI flag).