Description
Git commit
git: 636d3cb
Operating System & Version
Ubuntu 24.04.3 LTS
GGML backends
CUDA
Command-line arguments used
[ "--diffusion-model", "/home/charles/coding/models/flux2/flux-2-klein-4b.safetensors", "--vae", "/home/charles/coding/models/flux2/flux2-vae.safetensors", "--llm", "/home/charles/coding/models/flux2/Qwen3-4B-Q6_K.gguf", "--cfg-scale", "1.0", "--steps", "4", "-v", "--diffusion-fa", "--width", "1024", "--height", "1024", "-p", "\"A cat holding a beachball on the river bank.\"" ]
Steps to reproduce
Get the latents exported by diffusers from latents_3.zip.
Add the following code to load the repro data:
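LOG_INFO("decoding %zu latents", final_latents.size());
// Repro hack: replace the first latent with the raw dump exported by diffusers.
FILE* f = fopen("/home/charles/coding/wishingwell/scripts/latents_3.raw", "rb");
GGML_ASSERT(f != NULL);
fseek(f, 0, SEEK_END);
long fsize = ftell(f);
fseek(f, 0, SEEK_SET);
// The dump must have exactly the byte size of the latent tensor it replaces.
GGML_ASSERT(fsize >= 0 && (size_t)fsize == ggml_nbytes(final_latents[0]));
float* data = (float*)malloc(fsize);
GGML_ASSERT(data != NULL);
size_t nread = fread(data, 1, fsize, f);
fclose(f);
GGML_ASSERT(nread == (size_t)fsize);
// ne[] holds int64_t values, so print them with PRId64 rather than %zu.
LOG_INFO("latent: %" PRId64 " %" PRId64 " %" PRId64 " %" PRId64, final_latents[0]->ne[0], final_latents[0]->ne[1], final_latents[0]->ne[2], final_latents[0]->ne[3]);
auto hack_latent = ggml_dup_tensor(work_ctx, final_latents[0]);
memcpy(hack_latent->data, data, fsize);
free(data);
final_latents[0] = hack_latent;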
Insert it in stable-diffusion.cpp, directly after this line:
LOG_INFO("generating %" PRId64 " latent images completed, taking %.2fs", final_latents.size(), (t3 - t1) * 1.0f / 1000);
(The Python script used to generate this file is also attached.)
simple_inference.py
latents_3.zip
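For anyone who wants to reuse the loading step outside of this one-off hack, it could be factored into a small standalone helper along these lines. This is only a sketch: `load_raw_f32` is a hypothetical name, not an existing stable-diffusion.cpp or ggml function, and it assumes the dump is headerless float data laid out exactly like the tensor it replaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not part of stable-diffusion.cpp): reads a headerless
// float dump and returns its values. Returns an empty vector when the file
// cannot be opened, its size differs from expected_bytes, or the read is short.
static std::vector<float> load_raw_f32(const char* path, size_t expected_bytes) {
    assert(expected_bytes % sizeof(float) == 0);
    std::vector<float> values;
    FILE* f = std::fopen(path, "rb");
    if (f == nullptr) {
        return values;  // missing file or no read permission
    }
    std::fseek(f, 0, SEEK_END);
    long fsize = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    if (fsize >= 0 && static_cast<size_t>(fsize) == expected_bytes) {
        values.resize(expected_bytes / sizeof(float));
        size_t n = std::fread(values.data(), sizeof(float), values.size(), f);
        if (n != values.size()) {
            values.clear();  // short read: treat as failure
        }
    }
    std::fclose(f);
    return values;
}
```

In the repro above, `expected_bytes` would be `ggml_nbytes(final_latents[0])`, and the returned values would then be copied into the duplicated tensor's `data` buffer.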
What you expected to happen
When working with flux2.klein, I noticed artifacts that I had not seen when using the Python diffusers version. I suspected the VAE, so I made a quick repro in which diffusers exports the raw latents (the VAE input), which are then loaded into stable-diffusion.cpp for decoding. The results are worse; in particular, notice the gradients on the ball.
This seems to go beyond the expected difference between the two implementations and leads to unusable quality.
What actually happened
Noticeable quality degradation; see the images. The same kind of artifacts appears when generating images directly from prompts. They are mostly visible in smooth gradient areas.
This one is generated using stock stable-diffusion.cpp
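To put a number on the difference between the two decodes rather than relying on visual inspection alone, something like the following could be used. It is a rough sketch, not something from the repro: `compare_images` and `DiffStats` are hypothetical names, and it assumes both decoded images are available as float buffers of identical size with values in [0, peak]. A global PSNR can understate banding that is confined to gradient regions, so the maximum absolute difference is reported as well.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Summary of the per-value difference between two images.
struct DiffStats {
    double max_abs;  // largest absolute difference
    double mse;      // mean squared error
    double psnr_db;  // peak signal-to-noise ratio in dB (infinity if identical)
};

// Hypothetical helper: compares two same-sized images with values in [0, peak].
static DiffStats compare_images(const std::vector<float>& a,
                                const std::vector<float>& b,
                                double peak = 1.0) {
    assert(a.size() == b.size() && !a.empty());
    double max_abs = 0.0;
    double sum_sq  = 0.0;
    for (size_t i = 0; i < a.size(); ++i) {
        double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        max_abs  = std::max(max_abs, std::fabs(d));
        sum_sq += d * d;
    }
    double mse  = sum_sq / static_cast<double>(a.size());
    double psnr = mse > 0.0 ? 10.0 * std::log10(peak * peak / mse)
                            : std::numeric_limits<double>::infinity();
    return {max_abs, mse, psnr};
}
```

Running this on the full image and again on a crop around the ball would show whether the error is concentrated in the gradient areas.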
Logs / error messages / stack trace
None
Additional context / environment details
NVIDIA GB10 Driver Version: 580.95.05 CUDA Version: 13.0