Conversation
There was a problem hiding this comment.
Code Review
This pull request transitions the Flux2 Klein model to use flashinfer for RoPE and introduces support for image-to-image editing and multi-image inputs via directory loading. Key updates include a new calculate_dimensions helper, a get_custom_shape method for handling aspect ratios, and scheduler logic for latent interpolation in the 'edit' task variant. Review feedback highlights a critical logic error in the image-to-image interpolation where the reference image is effectively ignored, an undefined self.resolution attribute, and inconsistent default values for the VAE scale factor. Suggestions were also made to avoid side effects during configuration updates, restore dynamic channel selection for better flexibility, and correct a typo in the scheduler.
| if custom_shape is not None: | ||
| width, height = custom_shape | ||
| else: | ||
| calculated_width, calculated_height, _ = calculate_dimensions(self.resolution * self.resolution, 16 / 9) |
There was a problem hiding this comment.
| ref_img_latent = ref_img_latent.unsqueeze(0).to(AI_DEVICE, dtype=self.dtype) | ||
|
|
||
| image_latents = self._encode_image(input_image) | ||
| self.latents = (1 - self.sigmas[0]) * ref_img_latent + self.sigmas[0] * self.latents |
There was a problem hiding this comment.
The interpolation logic for the edit task variant appears to be a no-op. Since self.sigmas[0] is 1.0 (as defined in set_timesteps), the expression (1 - self.sigmas[0]) * ref_img_latent + self.sigmas[0] * self.latents simplifies to self.latents, effectively ignoring ref_img_latent. This means the first image in the edit task is lost and the process starts from pure noise. If the intention was to support image-to-image editing with a specific strength, the logic needs to be adjusted.
| width = round(width / 32) * 32 | ||
| height = round(height / 32) * 32 | ||
|
|
||
| return width, height, None |
| input_image = image_path | ||
|
|
||
| vae_scale_factor = self.config.get("vae_scale_factor", 16) | ||
| vae_scale_factor = self.config.get("vae_scale_factor", 8) |
| as_maps = self.config.get("aspect_ratios", {}) | ||
| as_maps.update(default_aspect_ratios) |
There was a problem hiding this comment.
Using as_maps.update(default_aspect_ratios) may modify the configuration dictionary in-place if self.config.get returns a reference to the internal dictionary. This can lead to side effects where the configuration is permanently altered. It is safer to create a new dictionary merging the two.
| as_maps = self.config.get("aspect_ratios", {}) | |
| as_maps.update(default_aspect_ratios) | |
| as_maps = {**self.config.get("aspect_ratios", {}), **default_aspect_ratios} |
| packed_w = width // 2 | ||
| packed_h = height // multiple_of | ||
| packed_w = width // multiple_of | ||
| packed_channels = 128 |
| imagge_latent = self._encode_image(img) | ||
| image_latents.append(imagge_latent) |
There was a problem hiding this comment.
No description provided.