Description
I am trying to run Flux dev training on my v5p-16 setup and hit a `device_put` ValueError. Below are my XPK run command and the output.
- Run the workload
xpk workload create --cluster ${CLUSTER_NAME} --docker-image us-docker.pkg.dev/northam-ce-mlai-tpu/pmotgi-artifacts-tpu/tpu-genai-maxdiffusion2 --workload diffusion2-sdxl-v5p-7 --tpu-type v5p-16 --num-slices=${NUM_SLICES} --command "huggingface-cli login --token && python src/maxdiffusion/train_flux.py src/maxdiffusion/configs/base_flux_dev.yml run_name="test-flux-train" output_dir="gs://pmotgi-v5p-8-cp/" save_final_checkpoint=True jax_cache_dir='/tmp/ckpt' ici_data_parallelism=1 ici_fsdp_parallelism=4 ici_tensor_parallelism=2 enable_single_replica_ckpt_restoring=True"
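For context, the parallelism flags above request a 1 x 4 x 2 ('data', 'fsdp', 'tensor') mesh over the 8 chips of a v5p-16, which matches the mesh printed in the logs below. A minimal sketch of that mesh shape (my own illustration, not the maxdiffusion code path; the XLA flag only fakes 8 CPU devices so the snippet runs anywhere):

```python
import os
# Assumption for illustration only: fake 8 CPU devices so this runs on any host.
os.environ["XLA_FLAGS"] = "--xla_force_host_platform_device_count=8"

import numpy as np
import jax
from jax.sharding import Mesh

# ici_data_parallelism=1 * ici_fsdp_parallelism=4 * ici_tensor_parallelism=2 = 8 devices
devices = np.array(jax.devices()).reshape(1, 4, 2)
mesh = Mesh(devices, ("data", "fsdp", "tensor"))
print(mesh)  # 1 x 4 x 2 device mesh over ('data', 'fsdp', 'tensor')
```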
- Error logs:
XPK Start: Fri Oct 17 15:56:53 UTC 2025
⚠️ Warning: 'huggingface-cli login' is deprecated. Use 'hf auth login' instead.
The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `hf` CLI if you want to set the git credential as well.
Token is valid (permission: fineGrained).
The token `prem-gcloud-token` has been saved to /root/.cache/huggingface/stored_tokens
Your token has been saved to /root/.cache/huggingface/token
Login successful. The current active token is: `prem-gcloud-token`
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
0it [00:00, ?it/s]
0it [00:00, ?it/s]
INFO:2025-10-17 15:57:09,773:jax._src.distributed:161: Connecting to JAX distributed service on diffusion2-sdxl-v5p-7-slice-job-0-0.diffusion2-sdxl-v5p-7:8476
I1017 15:57:09.773838 134623746173824 distributed.py:161] Connecting to JAX distributed service on diffusion2-sdxl-v5p-7-slice-job-0-0.diffusion2-sdxl-v5p-7:8476
Config param activations_dtype: bfloat16
Config param adam_b1: 0.9
Config param adam_b2: 0.999
Config param adam_eps: 1e-08
Config param adam_weight_decay: 0
Config param allow_split_physical_axes: False
Config param attention: flash
Config param base_output_directory:
Config param base_shift: 0.5
Config param cache_latents_text_encoder_outputs: True
Config param caption_column: text
Config param center_crop: False
Config param checkpoint_dir: gs://pmotgi-v5p-8-cp/test-flux-train/checkpoints/
Config param checkpoint_every: -1
Config param clip_model_name_or_path: ariG23498/clip-vit-large-patch14-text-flax
Config param compile_topology_num_slices: -1
Config param controlnet_conditioning_scale: 0.5
Config param controlnet_from_pt: True
Config param controlnet_image: https://upload.wikimedia.org/wikipedia/commons/thumb/c/c1/Google_%22G%22_logo.svg/1024px-Google_%22G%22_logo.svg.png
Config param controlnet_model_name_or_path: diffusers/controlnet-canny-sdxl-1.0
Config param data_sharding: (('data', 'fsdp', 'tensor'),)
Config param dataset_config_name:
Config param dataset_name: diffusers/pokemon-gpt4-captions
Config param dataset_save_location: /tmp/pokemon-gpt4-captions_xl
Config param dataset_type: tf
Config param dcn_data_parallelism: 1
Config param dcn_fsdp_parallelism: -1
Config param dcn_tensor_parallelism: 1
Config param diffusion_scheduler_config: {'_class_name': 'FlaxEulerDiscreteScheduler', 'prediction_type': 'epsilon', 'rescale_zero_terminal_snr': False, 'timestep_spacing': 'trailing'}
Config param do_classifier_free_guidance: True
Config param enable_data_shuffling: True
Config param enable_mllog: False
Config param enable_profiler: False
Config param enable_single_replica_ckpt_restoring: True
Config param flash_block_sizes: {}
Config param flux_name: flux-dev
Config param from_pt: True
Config param gcs_metrics: False
Config param global_batch_size_to_load: 8
Config param global_batch_size_to_train_on: 8
Config param guidance_rescale: 0.0
Config param guidance_scale: 3.5
Config param hardware: tpu
Config param hf_access_token: None
Config param hf_data_dir:
Config param hf_train_files: None
Config param ici_data_parallelism: 1
Config param ici_fsdp_parallelism: 4
Config param ici_tensor_parallelism: 2
Config param image_column: image
Config param jax_cache_dir: /tmp/ckpt
Config param jit_initializers: True
Config param learning_rate: 1e-05
Config param learning_rate_schedule_steps: 1500
Config param lightning_ckpt:
Config param lightning_from_pt: True
Config param lightning_repo:
Config param log_period: 100
Config param logical_axis_rules: (('batch', 'data'), ('activation_batch', ('data', 'fsdp')), ('activation_heads', 'tensor'), ('activation_kv', 'tensor'), ('mlp', 'tensor'), ('embed', 'fsdp'), ('heads', 'tensor'), ('conv_batch', ('data', 'fsdp')), ('out_channels', 'tensor'), ('conv_out', 'fsdp'), ('layers_per_stage', None))
Config param lora_config: {'lora_model_name_or_path': [], 'weight_name': [], 'adapter_name': [], 'scale': [], 'from_pt': []}
Config param max_grad_norm: 1.0
Config param max_sequence_length: 512
Config param max_shift: 1.15
Config param max_train_samples: -1
Config param max_train_steps: 1500
Config param mesh_axes: ['data', 'fsdp', 'tensor']
Config param metrics_dir: gs://pmotgi-v5p-8-cp/test-flux-train/metrics/
Config param metrics_file:
Config param negative_prompt: purple, red
Config param norm_num_groups: 32
Config param num_inference_steps: 50
Config param num_slices: 1
Config param num_train_epochs: 1
Config param offload_encoders: True
Config param output_dir: gs://pmotgi-v5p-8-cp/
Config param per_device_batch_size: 1
Config param precision: DEFAULT
Config param pretrained_model_name_or_path: black-forest-labs/FLUX.1-dev
Config param profiler:
Config param profiler_steps: 10
Config param prompt: A magical castle in the middle of a forest, artistic drawing
Config param prompt_2: A magical castle in the middle of a forest, artistic drawing
Config param quantization:
Config param quantization_local_shard_count: 1
Config param random_flip: False
Config param resolution: 1024
Config param reuse_example_batch: False
Config param revision: refs/pr/95
Config param run_name: test-flux-train
Config param save_config_to_gcs: False
Config param save_final_checkpoint: True
Config param scale_lr: False
Config param seed: 0
Config param skip_first_n_steps_for_profiler: 5
Config param skip_jax_distributed_system: False
Config param snr_gamma: -1.0
Config param split_head_dim: True
Config param t5xxl_model_name_or_path: ariG23498/t5-v1-1-xxl-flax
Config param tensorboard_dir: gs://pmotgi-v5p-8-cp/test-flux-train/tensorboard/
Config param text_encoder_learning_rate: 4.25e-06
Config param time_shift: True
Config param timestep_bias: {'strategy': 'none', 'multiplier': 1.0, 'begin': 0, 'end': 1000, 'portion': 0.25}
Config param timing_metrics_file:
Config param tokenize_captions_num_proc: 4
Config param tokenizer_model_name_or_path: black-forest-labs/FLUX.1-dev
Config param total_train_batch_size: 8
Config param train_data_dir:
Config param train_new_flux: False
Config param train_split: train
Config param train_text_encoder: False
Config param transform_images_num_proc: 4
Config param unet_checkpoint:
Config param use_qwix_quantization: False
Config param warmup_steps_fraction: 0.1
Config param weights_dtype: bfloat16
Config param write_metrics: True
Config param write_timing_metrics: True
Found 8 devices.
2025-10-17 15:57:17.569584: E external/local_xla/xla/stream_executor/cuda/cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
I1017 15:57:20.708552 134623746173824 font_manager.py:1639] generated new fontManager
I1017 15:57:21.558723 134623746173824 config.py:112] TensorFlow version 2.20.0 available.
I1017 15:57:21.559335 134623746173824 config.py:125] JAX version 0.7.2 available.
Devices: [TpuDevice(id=0, process_index=0, coords=(0,0,0), core_on_chip=0), TpuDevice(id=1, process_index=0, coords=(1,0,0), core_on_chip=0), TpuDevice(id=2, process_index=0, coords=(0,1,0), core_on_chip=0), TpuDevice(id=3, process_index=0, coords=(1,1,0), core_on_chip=0), TpuDevice(id=4, process_index=1, coords=(0,0,1), core_on_chip=0), TpuDevice(id=5, process_index=1, coords=(1,0,1), core_on_chip=0), TpuDevice(id=6, process_index=1, coords=(0,1,1), core_on_chip=0), TpuDevice(id=7, process_index=1, coords=(1,1,1), core_on_chip=0)] (num_devices: 8)
Decided on mesh: [[[TpuDevice(id=0, process_index=0, coords=(0,0,0), core_on_chip=0)
TpuDevice(id=1, process_index=0, coords=(1,0,0), core_on_chip=0)]
[TpuDevice(id=4, process_index=1, coords=(0,0,1), core_on_chip=0)
TpuDevice(id=5, process_index=1, coords=(1,0,1), core_on_chip=0)]
[TpuDevice(id=2, process_index=0, coords=(0,1,0), core_on_chip=0)
TpuDevice(id=3, process_index=0, coords=(1,1,0), core_on_chip=0)]
[TpuDevice(id=6, process_index=1, coords=(0,1,1), core_on_chip=0)
TpuDevice(id=7, process_index=1, coords=(1,1,1), core_on_chip=0)]]]
Creating checkpoing manager...
checkpoint dir: gs://pmotgi-v5p-8-cp/test-flux-train/checkpoints/
item_names: ('flux_state', 'flux_config', 'vae_state', 'vae_config', 'scheduler', 'scheduler_config', 'text_encoder_2_state', 'text_encoder_2_config')
I1017 15:57:21.784265 134623746173824 checkpoint_manager.py:694] [process=0][thread=MainThread] CheckpointManager init: checkpointers=None, item_names=('flux_state', 'flux_config', 'vae_state', 'vae_config', 'scheduler', 'scheduler_config', 'text_encoder_2_state', 'text_encoder_2_config'), item_handlers=None, handler_registry=None
I1017 15:57:21.784764 134623746173824 composite_checkpoint_handler.py:237] Deferred registration for item: "metrics". Adding handler <orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonCheckpointHandler object at 0x7a57d46ccd70>
for item "metrics" and save args <class 'orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonSaveArgs'>
and restore args <class 'orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonRestoreArgs'>
to _handler_registry
.
I1017 15:57:21.784846 134623746173824 composite_checkpoint_handler.py:505] Initialized registry DefaultCheckpointHandlerRegistry({('metrics', <class 'orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonSaveArgs'>): <orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonCheckpointHandler object at 0x7a57d46ccd70>, ('metrics', <class 'orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonRestoreArgs'>): <orbax.checkpoint._src.handlers.json_checkpoint_handler.JsonCheckpointHandler object at 0x7a57d46ccd70>}).
I1017 15:57:21.785199 134623746173824 abstract_checkpointer.py:35] orbax-checkpoint version: 0.11.25
I1017 15:57:21.785294 134623746173824 async_checkpointer.py:177] [process=0][thread=MainThread] Using barrier_sync_fn: <function get_barrier_sync_fn.._fn at 0x7a579c3b9080> timeout: 600 secs and primary_host=0 for async checkpoint writes
I1017 15:57:22.035987 134623746173824 checkpoint_manager.py:1757] Found 0 checkpoint steps in gs://pmotgi-v5p-8-cp/test-flux-train/checkpoints
Checkpoint manager created!
Restoring stable diffusion configs
.....
Downloading shards: 0%| | 0/2 [00:00<?, ?it/s]
Downloading shards: 50%|█████ | 1/2 [00:04<00:04, 4.77s/it]
Downloading shards: 100%|██████████| 2/2 [00:08<00:00, 4.02s/it]
Downloading shards: 100%|██████████| 2/2 [00:08<00:00, 4.13s/it]
.....
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
/opt/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py:945: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
The config attributes {'force_upcast': True, 'latents_mean': None, 'latents_std': None, 'mid_block_add_attention': True, 'shift_factor': 0.1159} were passed to FlaxAutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.
/opt/venv/lib/python3.12/site-packages/maxdiffusion/configuration_utils.py:262: FutureWarning: It is deprecated to pass a pretrained model name or path to `from_config`. If you were trying to load a model, please use <class 'maxdiffusion.models.flux.transformers.transformer_flux_flax.FluxTransformer2DModel'>.load_config(...) followed by <class 'maxdiffusion.models.flux.transformers.transformer_flux_flax.FluxTransformer2DModel'>.from_config(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
  deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
Load and port flux on TFRT_CPU_0
setup_initial_state for vae_state
loading state for vae_state
Could not find the item in orbax, creating state...
Generating train split: 0%| | 0/833 [00:00<?, ? examples/s]
Generating train split: 12%|█▏ | 100/833 [00:00<00:01, 473.91 examples/s]
Generating train split: 100%|██████████| 833/833 [00:00<00:00, 3576.54 examples/s]
Parameter 'function'=functools.partial(<function FluxTrainer.tokenize_captions at 0x7a579c3b8860>, caption_column='text', encoder=functools.partial(<bound method FluxPipeline.encode_prompt of FluxPipeline {
"_class_name": "FluxPipeline",
"_diffusers_version": "0.22.0.dev0",
"clip_encoder": [
"transformers",
"FlaxCLIPTextModel"
],
"clip_tokenizer": [
"transformers",
"CLIPTokenizer"
],
"flux": [
"maxdiffusion",
"FluxTransformer2DModel"
],
"scheduler": [
null,
null
],
"t5_encoder": [
"transformers",
"FlaxT5EncoderModel"
],
"t5_tokenizer": [
"transformers",
"T5TokenizerFast"
],
"vae": [
"maxdiffusion",
"FlaxAutoencoderKL"
]
}
, clip_tokenizer=CLIPTokenizer(name_or_path='ariG23498/clip-vit-large-patch14-text-flax', vocab_size=49408, model_max_length=77, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<|startoftext|>', 'eos_token': '<|endoftext|>', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>'}, clean_up_tokenization_spaces=True, added_tokens_decoder={
49406: AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True),
49407: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
), t5_tokenizer=T5TokenizerFast(name_or_path='ariG23498/t5-v1-1-xxl-flax', vocab_size=32100, model_max_length=512, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'eos_token': '', 'unk_token': '', 'pad_token': '', 'additional_special_tokens': ['<extra_id_0>', '<extra_id_1>', '<extra_id_2>', '<extra_id_3>', '<extra_id_4>', '<extra_id_5>', '<extra_id_6>', '<extra_id_7>', '<extra_id_8>', '<extra_id_9>', '<extra_id_10>', '<extra_id_11>', '<extra_id_12>', '<extra_id_13>', '<extra_id_14>', '<extra_id_15>', '<extra_id_16>', '<extra_id_17>', '<extra_id_18>', '<extra_id_19>', '<extra_id_20>', '<extra_id_21>', '<extra_id_22>', '<extra_id_23>', '<extra_id_24>', '<extra_id_25>', '<extra_id_26>', '<extra_id_27>', '<extra_id_28>', '<extra_id_29>', '<extra_id_30>', '<extra_id_31>', '<extra_id_32>', '<extra_id_33>', '<extra_id_34>', '<extra_id_35>', '<extra_id_36>', '<extra_id_37>', '<extra_id_38>', '<extra_id_39>', '<extra_id_40>', '<extra_id_41>', '<extra_id_42>', '<extra_id_43>', '<extra_id_44>', '<extra_id_45>', '<extra_id_46>', '<extra_id_47>', '<extra_id_48>', '<extra_id_49>', '<extra_id_50>', '<extra_id_51>', '<extra_id_52>', '<extra_id_53>', '<extra_id_54>', '<extra_id_55>', '<extra_id_56>', '<extra_id_57>', '<extra_id_58>', '<extra_id_59>', '<extra_id_60>', '<extra_id_61>', '<extra_id_62>', '<extra_id_63>', '<extra_id_64>', '<extra_id_65>', '<extra_id_66>', '<extra_id_67>', '<extra_id_68>', '<extra_id_69>', '<extra_id_70>', '<extra_id_71>', '<extra_id_72>', '<extra_id_73>', '<extra_id_74>', '<extra_id_75>', '<extra_id_76>', '<extra_id_77>', '<extra_id_78>', '<extra_id_79>', '<extra_id_80>', '<extra_id_81>', '<extra_id_82>', '<extra_id_83>', '<extra_id_84>', '<extra_id_85>', '<extra_id_86>', '<extra_id_87>', '<extra_id_88>', '<extra_id_89>', '<extra_id_90>', '<extra_id_91>', '<extra_id_92>', '<extra_id_93>', '<extra_id_94>', '<extra_id_95>', '<extra_id_96>', '<extra_id_97>', '<extra_id_98>', '<extra_id_99>']}, clean_up_tokenization_spaces=False, added_tokens_decoder={
0: AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
1: AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
2: AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
32000: AddedToken("<extra_id_99>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32001: AddedToken("<extra_id_98>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32002: AddedToken("<extra_id_97>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32003: AddedToken("<extra_id_96>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32004: AddedToken("<extra_id_95>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32005: AddedToken("<extra_id_94>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32006: AddedToken("<extra_id_93>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32007: AddedToken("<extra_id_92>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32008: AddedToken("<extra_id_91>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32009: AddedToken("<extra_id_90>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32010: AddedToken("<extra_id_89>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32011: AddedToken("<extra_id_88>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32012: AddedToken("<extra_id_87>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32013: AddedToken("<extra_id_86>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32014: AddedToken("<extra_id_85>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32015: AddedToken("<extra_id_84>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32016: AddedToken("<extra_id_83>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32017: AddedToken("<extra_id_82>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32018: AddedToken("<extra_id_81>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32019: AddedToken("<extra_id_80>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32020: AddedToken("<extra_id_79>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32021: AddedToken("<extra_id_78>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32022: AddedToken("<extra_id_77>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32023: AddedToken("<extra_id_76>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32024: AddedToken("<extra_id_75>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32025: AddedToken("<extra_id_74>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32026: AddedToken("<extra_id_73>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32027: AddedToken("<extra_id_72>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32028: AddedToken("<extra_id_71>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32029: AddedToken("<extra_id_70>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32030: AddedToken("<extra_id_69>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32031: AddedToken("<extra_id_68>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32032: AddedToken("<extra_id_67>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32033: AddedToken("<extra_id_66>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32034: AddedToken("<extra_id_65>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32035: AddedToken("<extra_id_64>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32036: AddedToken("<extra_id_63>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32037: AddedToken("<extra_id_62>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32038: AddedToken("<extra_id_61>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32039: AddedToken("<extra_id_60>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32040: AddedToken("<extra_id_59>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32041: AddedToken("<extra_id_58>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32042: AddedToken("<extra_id_57>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32043: AddedToken("<extra_id_56>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32044: AddedToken("<extra_id_55>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32045: AddedToken("<extra_id_54>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32046: AddedToken("<extra_id_53>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32047: AddedToken("<extra_id_52>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32048: AddedToken("<extra_id_51>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32049: AddedToken("<extra_id_50>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32050: AddedToken("<extra_id_49>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32051: AddedToken("<extra_id_48>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32052: AddedToken("<extra_id_47>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32053: AddedToken("<extra_id_46>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32054: AddedToken("<extra_id_45>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32055: AddedToken("<extra_id_44>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32056: AddedToken("<extra_id_43>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32057: AddedToken("<extra_id_42>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32058: AddedToken("<extra_id_41>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32059: AddedToken("<extra_id_40>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32060: AddedToken("<extra_id_39>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32061: AddedToken("<extra_id_38>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32062: AddedToken("<extra_id_37>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32063: AddedToken("<extra_id_36>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32064: AddedToken("<extra_id_35>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32065: AddedToken("<extra_id_34>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32066: AddedToken("<extra_id_33>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32067: AddedToken("<extra_id_32>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32068: AddedToken("<extra_id_31>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32069: AddedToken("<extra_id_30>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32070: AddedToken("<extra_id_29>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32071: AddedToken("<extra_id_28>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32072: AddedToken("<extra_id_27>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32073: AddedToken("<extra_id_26>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32074: AddedToken("<extra_id_25>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32075: AddedToken("<extra_id_24>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32076: AddedToken("<extra_id_23>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32077: AddedToken("<extra_id_22>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32078: AddedToken("<extra_id_21>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32079: AddedToken("<extra_id_20>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32080: AddedToken("<extra_id_19>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32081: AddedToken("<extra_id_18>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32082: AddedToken("<extra_id_17>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32083: AddedToken("<extra_id_16>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32084: AddedToken("<extra_id_15>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32085: AddedToken("<extra_id_14>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32086: AddedToken("<extra_id_13>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32087: AddedToken("<extra_id_12>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32088: AddedToken("<extra_id_11>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32089: AddedToken("<extra_id_10>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32090: AddedToken("<extra_id_9>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32091: AddedToken("<extra_id_8>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32092: AddedToken("<extra_id_7>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32093: AddedToken("<extra_id_6>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32094: AddedToken("<extra_id_5>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32095: AddedToken("<extra_id_4>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32096: AddedToken("<extra_id_3>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32097: AddedToken("<extra_id_2>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32098: AddedToken("<extra_id_1>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
32099: AddedToken("<extra_id_0>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=True),
}
), clip_text_encoder=<transformers.models.clip.modeling_flax_clip.FlaxCLIPTextModel object at 0x7a57d420bf80>, t5_text_encoder=<transformers.models.t5.modeling_flax_t5.FlaxT5EncoderModel object at 0x7a524019c140>, max_sequence_length=512, encode_in_batches=True, encode_batch_size=16)) of the transform datasets.arrow_dataset.Dataset._map_single couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only shown once. Subsequent hashing failures won't be shown.
Running tokenizer on train dataset: 0%| | 0/833 [00:00<?, ? examples/s]
Running tokenizer on train dataset: 100%|██████████| 833/833 [01:26<00:00, 9.58 examples/s]
Running tokenizer on train dataset: 100%|██████████| 833/833 [02:00<00:00, 6.89 examples/s]
Transforming images: 0%| | 0/833 [00:00<?, ? examples/s]
Transforming images: 100%|██████████| 833/833 [06:27<00:00, 2.15 examples/s]
Transforming images: 100%|██████████| 833/833 [06:49<00:00, 2.03 examples/s]
Saving the dataset (0/8 shards): 0%| | 0/833 [00:00<?, ? examples/s]
Saving the dataset (0/8 shards): 13%|█▎ | 105/833 [00:00<00:01, 442.96 examples/s]
Saving the dataset (1/8 shards): 13%|█▎ | 105/833 [00:00<00:01, 442.96 examples/s]
Saving the dataset (1/8 shards): 25%|██▌ | 209/833 [00:00<00:01, 446.23 examples/s]
Saving the dataset (2/8 shards): 25%|██▌ | 209/833 [00:00<00:01, 446.23 examples/s]
Saving the dataset (2/8 shards): 38%|███▊ | 313/833 [00:00<00:01, 451.94 examples/s]
Saving the dataset (3/8 shards): 38%|███▊ | 313/833 [00:00<00:01, 451.94 examples/s]
Saving the dataset (3/8 shards): 50%|█████ | 417/833 [00:00<00:00, 455.77 examples/s]
Saving the dataset (4/8 shards): 50%|█████ | 417/833 [00:00<00:00, 455.77 examples/s]
Saving the dataset (4/8 shards): 63%|██████▎ | 521/833 [00:01<00:00, 452.50 examples/s]
Saving the dataset (5/8 shards): 63%|██████▎ | 521/833 [00:01<00:00, 452.50 examples/s]
Saving the dataset (5/8 shards): 75%|███████▌ | 625/833 [00:01<00:00, 452.24 examples/s]
Saving the dataset (6/8 shards): 75%|███████▌ | 625/833 [00:01<00:00, 452.24 examples/s]
Saving the dataset (6/8 shards): 88%|████████▊ | 729/833 [00:01<00:00, 451.05 examples/s]
Saving the dataset (7/8 shards): 88%|████████▊ | 729/833 [00:01<00:00, 451.05 examples/s]
Saving the dataset (7/8 shards): 100%|██████████| 833/833 [00:01<00:00, 452.55 examples/s]
Saving the dataset (8/8 shards): 100%|██████████| 833/833 [00:01<00:00, 452.55 examples/s]
Saving the dataset (8/8 shards): 100%|██████████| 833/833 [00:01<00:00, 451.72 examples/s]
Load and port flux on TFRT_CPU_0
setup_initial_state for flux_state
loading state for flux_state
Could not find the item in orbax, creating state...
Traceback (most recent call last):
File "/opt/venv/lib/python3.12/site-packages/maxdiffusion/train_utils.py", line 205, in transformer_engine_context
from transformer_engine.jax.sharding import global_shard_guard, MeshResource
ModuleNotFoundError: No module named 'transformer_engine'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/jax-ai-image/maxdiffusion/src/maxdiffusion/train_flux.py", line 45, in
app.run(main)
File "/opt/venv/lib/python3.12/site-packages/absl/app.py", line 316, in run
_run_main(main, args)
File "/opt/venv/lib/python3.12/site-packages/absl/app.py", line 261, in _run_main
sys.exit(main(argv))
^^^^^^^^^^
File "/jax-ai-image/maxdiffusion/src/maxdiffusion/train_flux.py", line 41, in main
train(config)
File "/jax-ai-image/maxdiffusion/src/maxdiffusion/train_flux.py", line 33, in train
trainer.start_training()
File "/opt/venv/lib/python3.12/site-packages/maxdiffusion/trainers/flux_trainer.py", line 114, in start_training
flux_state, flux_state_mesh_shardings, flux_learning_rate_scheduler = self.create_flux_state(
^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.12/site-packages/maxdiffusion/checkpointing/flux_checkpointer.py", line 108, in create_flux_state
flux_state = jax.device_put(flux_state, state_mesh_shardings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.12/site-packages/jax/_src/api.py", line 2729, in device_put
out_flat = dispatch._batched_device_put_impl(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.12/site-packages/jax/_src/dispatch.py", line 566, in _batched_device_put_impl
y = _device_put_impl(x, device=device, src=src, copy=cp, aval=aval)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.12/site-packages/jax/_src/dispatch.py", line 553, in _device_put_impl
return _device_put_sharding_impl(x, aval, device, copy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.12/site-packages/jax/_src/dispatch.py", line 486, in _device_put_sharding_impl
raise ValueError(
ValueError: device_put's second argument must be a Device or a Sharding which represents addressable devices, but got NamedSharding(mesh=Mesh('data': 1, 'fsdp': 4, 'tensor': 2, axis_types=(Auto, Auto, Auto)), spec=PartitionSpec('tensor',), memory_kind=device). Please pass device or Sharding which represents addressable devices.
XPK End: Fri Oct 17 16:13:44 UTC 2025
EXIT_CODE=1
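For reference on the error itself: the failing call in the traceback is `jax.device_put(flux_state, state_mesh_shardings)`, and `device_put` only accepts a Device or a Sharding whose devices are all addressable by the calling process; on a multi-host v5p-16 mesh that is not the case. A rough single-host sketch of the distinction (my own illustration under that assumption, not a maxdiffusion fix; `jax.make_array_from_callback` is the usual multi-process-friendly way to build a globally sharded array from host-local data):

```python
import numpy as np
import jax
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Toy ('data', 'fsdp', 'tensor') mesh over whatever devices this process can address.
devices = np.array(jax.devices()).reshape(1, -1, 1)
mesh = Mesh(devices, ("data", "fsdp", "tensor"))
sharding = NamedSharding(mesh, P("tensor"))

host_data = np.arange(8, dtype=np.float32)

# Fine when every device in the mesh belongs to this process; in a multi-host
# job the same call raises the "addressable devices" ValueError shown above.
x = jax.device_put(host_data, sharding)

# Multi-process-friendly alternative: each process supplies only its local shards.
y = jax.make_array_from_callback(host_data.shape, sharding, lambda idx: host_data[idx])
print(x.sharding, y.sharding)
```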