Skip to content

Feat/cherry pick flux2 ppt#967

Merged
helloyongyang merged 4 commits intomainfrom
feat/cherry-pick-flux2-ppt
Mar 31, 2026
Merged

Feat/cherry pick flux2 ppt#967
helloyongyang merged 4 commits intomainfrom
feat/cherry-pick-flux2-ppt

Conversation

@wangshankun
Copy link
Copy Markdown
Collaborator

No description provided.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request transitions the Flux2 Klein model to use flashinfer for RoPE and introduces support for image-to-image editing and multi-image inputs via directory loading. Key updates include a new calculate_dimensions helper, a get_custom_shape method for handling aspect ratios, and scheduler logic for latent interpolation in the 'edit' task variant. Review feedback highlights a critical logic error in the image-to-image interpolation where the reference image is effectively ignored, an undefined self.resolution attribute, and inconsistent default values for the VAE scale factor. Suggestions were also made to avoid side effects during configuration updates, restore dynamic channel selection for better flexibility, and correct a typo in the scheduler.

if custom_shape is not None:
width, height = custom_shape
else:
calculated_width, calculated_height, _ = calculate_dimensions(self.resolution * self.resolution, 16 / 9)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The attribute self.resolution is used here but does not appear to be defined within Flux2KleinRunner or its parent DefaultRunner. If self.resolution is undefined or not a numeric type, this will cause a runtime error. Please ensure it is properly initialized.

ref_img_latent = ref_img_latent.unsqueeze(0).to(AI_DEVICE, dtype=self.dtype)

image_latents = self._encode_image(input_image)
self.latents = (1 - self.sigmas[0]) * ref_img_latent + self.sigmas[0] * self.latents
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The interpolation logic for the edit task variant appears to be a no-op. Since self.sigmas[0] is 1.0 (as defined in set_timesteps), the expression (1 - self.sigmas[0]) * ref_img_latent + self.sigmas[0] * self.latents simplifies to self.latents, effectively ignoring ref_img_latent. This means the first image in the edit task is lost and the process starts from pure noise. If the intention was to support image-to-image editing with a specific strength, the logic needs to be adjusted.

width = round(width / 32) * 32
height = round(height / 32) * 32

return width, height, None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The third return value None in calculate_dimensions is unused by the caller and appears unnecessary. Consider removing it to simplify the function signature.

input_image = image_path

vae_scale_factor = self.config.get("vae_scale_factor", 16)
vae_scale_factor = self.config.get("vae_scale_factor", 8)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The default value for vae_scale_factor here is 8, but in __init__ (line 34) it is set to 16. This inconsistency can lead to unexpected behavior if the configuration key is missing. It's better to use a consistent default value throughout the class.

Comment on lines +229 to +230
as_maps = self.config.get("aspect_ratios", {})
as_maps.update(default_aspect_ratios)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using as_maps.update(default_aspect_ratios) may modify the configuration dictionary in-place if self.config.get returns a reference to the internal dictionary. This can lead to side effects where the configuration is permanently altered. It is safer to create a new dictionary merging the two.

Suggested change
as_maps = self.config.get("aspect_ratios", {})
as_maps.update(default_aspect_ratios)
as_maps = {**self.config.get("aspect_ratios", {}), **default_aspect_ratios}

packed_w = width // 2
packed_h = height // multiple_of
packed_w = width // multiple_of
packed_channels = 128
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding packed_channels = 128 reduces the flexibility of the runner. The previous implementation dynamically retrieved this value from the model or configuration. Consider restoring that logic to support different model variants.

Comment on lines +172 to +173
imagge_latent = self._encode_image(img)
image_latents.append(imagge_latent)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There is a typo in the variable name imagge_latent. It should be image_latent for consistency and readability.

Suggested change
imagge_latent = self._encode_image(img)
image_latents.append(imagge_latent)
image_latent = self._encode_image(img)
image_latents.append(image_latent)

@helloyongyang helloyongyang merged commit f26e28b into main Mar 31, 2026
2 checks passed
@helloyongyang helloyongyang deleted the feat/cherry-pick-flux2-ppt branch March 31, 2026 04:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants