Issue with running Flux.1.dev on iGPU #2926


Open
stsxxx opened this issue May 8, 2025 · 14 comments

Comments

@stsxxx

stsxxx commented May 8, 2025

Describe the bug
I'm trying to run the Flux.1.dev model using fp16 on the integrated GPU of an Intel® Core™ Ultra 7 165H. I attempted to generate an image with 50 inference steps, but one single step takes an extremely long time (forever) to complete.
Additionally, I ran Stable Diffusion XL in fp16 under the same setup (50 steps, 1024×1024), and it completed in about 3 minutes per image, which is significantly faster than Flux.1.dev but still slow.

Expected behavior
I want to verify whether my Intel integrated GPU (iGPU) is correctly activated and being used during inference. Could you guide me on how to check its status and ensure it's being utilized properly in my setup? Or is the performance I'm seeing simply expected for this hardware?

Screenshots
The code I am using:
Image

**Installation instructions**
I am not using the notebook.
Models have all been converted using `optimum-cli export openvino`.
GPU and NPU can be detected:
Image

**Environment information**
openai==1.77.0
opencv-python==4.11.0.86
openvino==2025.1.0
openvino-genai==2025.1.0.0
optimum==1.25.0.dev0
optimum-intel==1.23.0.dev0+590692f

@brmarkus

brmarkus commented May 8, 2025

Instead of a screenshot, could you show the code as text, please? That would make reproduction much easier.
Is the code based on another sample or on a Jupyter notebook?

Can you provide more details about your environment (operating system, version information)?

@stsxxx
Author

stsxxx commented May 8, 2025

Thank you for your reply. The code is shown below.
It's based on a simple diffusion model example from the OpenVINO repository.
I'm using Ubuntu 24.04.2 LTS, Python 3.12.3, torch 2.7.0


```python
import argparse
import sys

import openvino_genai
from PIL import Image
from tqdm import tqdm

seed = 42
num_inference_steps = 50
random_generator = openvino_genai.TorchGenerator(seed)
pbar = tqdm(total=num_inference_steps)

def callback(step, num_steps, latent):
    if num_steps != pbar.total:
        pbar.reset(num_steps)
    pbar.update(1)
    sys.stdout.flush()
    return False

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('model_dir')
    parser.add_argument('prompt')
    args = parser.parse_args()

    pipe = openvino_genai.Text2ImagePipeline(args.model_dir, device="GPU")
    print(pipe)

    result = pipe.generate(args.prompt, num_inference_steps=num_inference_steps,
                           generator=random_generator, callback=callback,
                           height=1024, width=1024)
    pbar.close()

    final_image = Image.fromarray(result.data[0])
    final_image.save("output.png")

if __name__ == '__main__':
    main()
```

command: `python3 test.py "model directory" "cyberpunk cityscape like Tokyo and New York with tall buildings at dusk, golden hour, cinematic lighting"`

@brmarkus

brmarkus commented May 8, 2025

I just tried to run the Jupyter notebook under
https://github.com/openvinotoolkit/openvino_notebooks/blob/e5a8aa127c9464a356a6767d2fb62b88ed21be3c/notebooks/flux.1-image-generation/flux.1-image-generation.ipynb

but it fails for me with

!huggingface-cli download {ov_model_id} --local-dir {model_dir}

requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/OpenVINO/FLUX.1-dev-int4-ov/revision/main
Repository Not Found for url: https://huggingface.co/api/models/OpenVINO/FLUX.1-dev-int4-ov/revision/main.

(being logged into HuggingFace, accepted license agreement, access-token provided)

EDIT: According to issue #2792, it was working in March this year.

@eaidova was the model recently moved, renamed, or removed, do you know? Or do my Hugging Face credentials (I'm located in Europe/Germany) not allow access to the model?
On Hugging Face, I see "Gated model. You have been granted access to this model".

@eaidova
Collaborator

eaidova commented May 8, 2025

@brmarkus the dev model was never uploaded to the Hugging Face Hub under the OpenVINO account; unfortunately its license agreement does not allow that. We only have schnell:
https://huggingface.co/OpenVINO/FLUX.1-schnell-int4-ov
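For reference, the pre-converted schnell IR can also be fetched without the notebook; a minimal sketch using `huggingface_hub` (assumes the package is installed and, for gated access, that you are logged in):

```python
from huggingface_hub import snapshot_download

# Download the pre-converted OpenVINO IR of FLUX.1-schnell (INT4)
# into a local folder and print where it landed.
model_dir = snapshot_download(
    repo_id="OpenVINO/FLUX.1-schnell-int4-ov",
    local_dir="FLUX.1-schnell-int4-ov",
)
print(model_dir)
```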

@stsxxx
Author

stsxxx commented May 8, 2025

I was using `optimum-cli export openvino --model black-forest-labs/FLUX.1-dev --task text-to-image --weight-format fp16 ov_model_flux/` to convert the Hugging Face model into OpenVINO IR format, and loaded the model from the `ov_model_flux/` directory by calling `pipe = openvino_genai.Text2ImagePipeline('ov_model_flux/', device="GPU")`.

@brmarkus

brmarkus commented May 8, 2025

> @brmarkus dev model was never uploaded to huggingface hub under openvino account, it has license agreement limitations for that unfortunately, we have only schnell https://huggingface.co/OpenVINO/FLUX.1-schnell-int4-ov

OK, thank you. I have now initiated download, conversion, and compression using the Jupyter notebook for the model "black-forest-labs/FLUX.1-schnell".
This is going to take a while.

Then I will try to reproduce inference on CPU and GPU on my MS Win11 Pro system (64 GB RAM, Intel Core Ultra 7 155H), using the prompt "cyberpunk cityscape like Tokyo and New York with tall buildings at dusk, golden hour, cinematic lighting".

@brmarkus

brmarkus commented May 8, 2025

With the default parameters:

Pipeline settings
Input text: cyberpunk cityscape like Tokyo and New York with tall buildings at dusk, golden hour, cinematic lighting
Image size: 256 x 256
Seed: 42
Number of steps: 4

With the default checkbox "Use compressed models" activated.

Using the CPU
Progress-bar:
100% 4/4 [01:44<00:00, 22.83s/it]

Image

@brmarkus

brmarkus commented May 8, 2025

Using the GPU, same parameters, same checkboxes, using "black-forest-labs/FLUX.1-schnell":

Progress-bar:
100% 4/4 [00:21<00:00,  2.98s/it]

Image

Task-Manager showing GPU-utilization:

Image

@stsxxx your code uses `num_inference_steps = 50`, while the Jupyter notebook uses only 4 steps.

Is there a bigger difference between "FLUX.1-schnell" and "FLUX.1-dev"?
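A back-of-envelope comparison, scaling the per-iteration times from the schnell runs above to a 50-step run (this assumes the per-step cost is similar between schnell and dev, which is only a rough approximation):

```python
cpu_s_per_it = 22.83  # CPU progress bar above: 4/4 [01:44<00:00, 22.83s/it]
gpu_s_per_it = 2.98   # GPU progress bar above: 4/4 [00:21<00:00, 2.98s/it]

for name, s_per_it in [("CPU", cpu_s_per_it), ("GPU", gpu_s_per_it)]:
    print(f"{name}: 4 steps ~ {4 * s_per_it:.0f} s, "
          f"50 steps ~ {50 * s_per_it / 60:.1f} min")
```

On these numbers, 50 steps would take roughly 19 minutes on CPU and about 2.5 minutes on GPU, so even a healthy iGPU run at 50 steps is far from instant, but nowhere near "forever".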

@brmarkus

brmarkus commented May 8, 2025

I think the term "Non-Commercial Use Only" wouldn't allow me to use "FLUX.1-dev"...

@stsxxx do you see similar values in your environment when using "FLUX.1-schnell" instead, to compare and check regarding your initial question "how to check its status and ensure it's being utilized properly in my setup"?

@stsxxx
Author

stsxxx commented May 8, 2025

@brmarkus Thank you for your reply.
The reason I’m using FLUX.1-dev in FP16 with 50 total steps is due to my experimental setup. I’ll give FLUX.1-schnell with INT4 a try, but it would be ideal if FLUX.1-dev is supported.

Also, if possible, could you try running Stable Diffusion XL in FP16 with image size 1024x1024? Since I was able to run it successfully, we could use it as a comparison to check whether my GPU is being utilized correctly.

It’s late night on my end, so I’ll run FLUX.1-schnell tomorrow. Thank you again for your help!

@stsxxx
Author

stsxxx commented May 9, 2025

@brmarkus
Hi, I tried to run FLUX.1-schnell according to the code provided here: https://huggingface.co/OpenVINO/FLUX.1-schnell-fp16-ov. I used only 4 steps, but the issue persists: it hasn't completed even after 30 minutes.

@brmarkus

brmarkus commented May 9, 2025

Would you have a chance to run the Jupyter notebook
https://github.com/openvinotoolkit/openvino_notebooks/blob/e5a8aa127c9464a356a6767d2fb62b88ed21be3c/notebooks/flux.1-image-generation/flux.1-image-generation.ipynb
?
The notebook uses a compressed and quantized variant in OpenVINO IR format. That could really make a difference!

On my CPU (and MS-Win11-Pro, 64GB RAM), I got "4/4 [01:44<00:00, 22.83s/it]" - less than 2 minutes for 4 iterations.
And on the GPU I got "4/4 [00:21<00:00,  2.98s/it]" - less than 30 seconds for 4 iterations.

@stsxxx
Author

stsxxx commented May 9, 2025

I'll give it a try. Does this mean that other model formats aren't supported here? Even the provided notebook mentions that you can use FLUX.1-dev by simply switching models. I'm using the weights from your model hub, but it's not working, so I believe there may be another issue at play.

@brmarkus

OpenVINO supports different formats (like ONNX and others), but there is also an optimized format called Intermediate Representation ("IR").
In this case INT4 is used, but INT8, FP16, FP32, or BF16 could be used as well.
In addition, the model was compressed.
You could use tools like "model_analyzer" to compare the different variants.
Depending on the underlying hardware, specific (CPU) instructions are used.
In the IR format, there are two files, an XML file and a BIN file. Please replace both when switching models.
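The XML/BIN pairing can be sanity-checked with a few lines; this is a sketch, and `check_ir_model` is a hypothetical helper, not part of OpenVINO:

```python
from pathlib import Path

def check_ir_model(model_dir: str) -> None:
    """Verify that every IR topology file (.xml) has its matching weights file (.bin)."""
    for xml in sorted(Path(model_dir).rglob("*.xml")):
        bin_file = xml.with_suffix(".bin")
        status = "ok" if bin_file.exists() else "MISSING matching .bin"
        print(f"{xml.relative_to(model_dir)}: {status}")

# Directory name taken from the export command earlier in this thread.
check_ir_model("ov_model_flux")
```

A mixed directory, where the XML of one model sits next to the BIN of another, can load but behave incorrectly, which is why both files should always be replaced together.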
