Unify server-side and model-side Config #2862

Open
YuanRisheng wants to merge 12 commits into base: develop

Conversation

@YuanRisheng (Collaborator) commented on Jul 16, 2025

This PR does the following:

  • Unifies the global Config and deletes the fastdeploy/engine/config.py file
  • Reorganizes the fields of the various Config classes into their proper homes, and adds MultiModalConfig and ObservabilityConfig following vLLM (a rough sketch follows)
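
For orientation only, a minimal sketch of the shape this reorganization implies. MultiModalConfig and ObservabilityConfig are named in this PR; every field below and the aggregating UnifiedConfig class are illustrative assumptions, not the actual implementation.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MultiModalConfig:
    # Hypothetical fields: multimodal settings split out of the former global config.
    enable_mm: bool = False
    limit_mm_per_prompt: Optional[dict] = None

@dataclass
class ObservabilityConfig:
    # Hypothetical fields: tracing/metrics settings, loosely modeled on vLLM.
    enable_trace: bool = False
    trace_endpoint: Optional[str] = None

@dataclass
class UnifiedConfig:
    # Hypothetical aggregate standing in for the unified config that replaces
    # fastdeploy/engine/config.py; each concern lives in its own sub-config.
    multi_modal_config: MultiModalConfig = field(default_factory=MultiModalConfig)
    observability_config: ObservabilityConfig = field(default_factory=ObservabilityConfig)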

paddle-bot (bot) commented on Jul 16, 2025

Thanks for your contribution!

@YuanRisheng requested a review from Copilot on Jul 21, 2025, 12:42
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR unifies the server-side and model-side configuration system by consolidating the various config classes and eliminating the fastdeploy/engine/config.py file. The changes reorganize configuration fields across different specialized config classes (SchedulerConfig, CacheConfig, DeviceConfig, etc.) and introduce new config types (MultiModalConfig, ObservabilityConfig) following the vLLM pattern.

Key changes:

  • Unified configuration structure eliminating duplicate config handling
  • Added new specialized config classes for better organization
  • Migrated fields from ParallelConfig to appropriate specialized configs
  • Updated all references throughout the codebase to use the new config structure (a small access-pattern sketch follows)
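
As a concrete illustration of the field migration: the model_config.dtype and device_config.device_ids accesses below mirror the per-file notes in the table that follows, while the function and the fd_config parameter are assumed for the sketch.

def select_device_and_dtype(fd_config):
    # Before this PR (hypothetical): both values were read off parallel_config, e.g.
    #   fd_config.parallel_config.dtype, fd_config.parallel_config.device_ids
    # After this PR: each field is read from the specialized config that owns it,
    # as in fastdeploy/worker/xpu_worker.py per the summary below.
    dtype = fd_config.model_config.dtype
    device_ids = fd_config.device_config.device_ids
    return dtype, device_ids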

Reviewed Changes

Copilot reviewed 41 out of 41 changed files in this pull request and generated 3 comments.

File Description
fastdeploy/worker/xpu_worker.py Updated to use model_config.dtype and device_config.device_ids instead of parallel_config
fastdeploy/worker/xpu_model_runner.py Migrated parallel_config references to scheduler_config and model_config appropriately
fastdeploy/worker/worker_process.py Added new config imports and updated field access patterns
fastdeploy/worker/worker_base.py Added cache_config reference
fastdeploy/worker/model_runner_base.py Added references to new specialized config objects
fastdeploy/worker/iluvatar_worker.py Updated config field access patterns
fastdeploy/worker/iluvatar_model_runner.py Migrated config references and updated observability_config usage
fastdeploy/worker/gpu_worker.py Updated device and cache config field access
fastdeploy/worker/gpu_model_runner.py Comprehensive config migration and field access updates
fastdeploy/worker/gcu_worker.py Updated device config field access
fastdeploy/worker/gcu_model_runner.py Migrated config references and added duplicate log statement
fastdeploy/worker/dcu_worker.py Updated cache config field access
fastdeploy/splitwise/splitwise_connector.py Updated scheduler_config field access patterns
fastdeploy/spec_decode/mtp.py Updated config field access and removed kv_cache_config reference
fastdeploy/spec_decode/base.py Updated config field access patterns
fastdeploy/scheduler/config.py Major refactoring to add config fields and validation logic
fastdeploy/rl/dynamic_weight_manager.py Updated model config field access
fastdeploy/output/token_processor.py Updated model_config field access
fastdeploy/model_executor/pre_and_post_process.py Updated import path
fastdeploy/model_executor/model_loader.py Updated config field access
fastdeploy/model_executor/layers/moe/fused_moe_backend_base.py Updated model_config field access
fastdeploy/model_executor/layers/backends/gcu/attention/mem_efficient_attn_backend.py Updated config field access patterns
fastdeploy/model_executor/layers/backends/gcu/attention/flash_attn_backend.py Updated config field access patterns
fastdeploy/model_executor/layers/attention/ (multiple files) Updated config field access patterns across attention backends
fastdeploy/model_executor/guided_decoding/ (multiple files) Updated config field access patterns
fastdeploy/input/preprocess.py Updated import path and ModelConfig usage
fastdeploy/entrypoints/llm.py Updated config field access
fastdeploy/engine/expert_service.py Updated config field access patterns throughout
fastdeploy/engine/engine.py Comprehensive config field access updates
fastdeploy/engine/config.py File deleted (975 lines removed)
fastdeploy/engine/args_utils.py Updated imports and config creation logic
Comments suppressed due to low confidence (1)

fastdeploy/scheduler/config.py:94

  • The field 'max_model_len' is being set in the constructor but later referenced as 'self.scheduler_config.max_model_len'. Consider ensuring consistent field naming and access patterns.
    Context (line 94): Configuration class for GlobalScheduler (Redis-based).
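
To make the concern concrete, a toy sketch: the docstring matches the quoted context, while the constructor and the access paths are assumed for illustration.

class SchedulerConfig:
    """Configuration class for GlobalScheduler (Redis-based)."""

    def __init__(self, max_model_len: int = 8192, **kwargs):
        # The value lives on the scheduler config object ...
        self.max_model_len = max_model_len

# ... so downstream code should read it through one consistent path, e.g.
# engine.scheduler_config.max_model_len, rather than also keeping a separate
# copy such as engine.max_model_len that can drift out of sync.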

batch_size=batch_size,
in_capturing=True,
expected_decode_len=expected_decode_len,
)
logger.info(f"Warm up the model with the batch size:{batch_size}, num tokens:{expected_decode_len}")
logger.info(f"Warm up the model with the batch size:{batch_size}, num tokens:{expected_decode_len}")
Copilot AI commented on Jul 21, 2025

Duplicate log statement detected. This line appears to be a copy-paste error as it's identical to line 810.

Suggested change (remove the duplicated line):
-logger.info(f"Warm up the model with the batch size:{batch_size}, num tokens:{expected_decode_len}")


@@ -23,7 +23,7 @@

 try:
     from paddle.nn.functional.flash_attention import flash_attention_v3_varlen
-except:
+except Exception:
Copilot AI commented on Jul 21, 2025

'except Exception:' is still too broad here. Consider catching 'ImportError' specifically, since this block handles an import failure.

Suggested change
-except Exception:
+except ImportError:
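
A self-contained sketch of the import-guard pattern the suggestion points at; the fallback flag and helper function below are illustrative additions, not part of the PR.

try:
    from paddle.nn.functional.flash_attention import flash_attention_v3_varlen
except ImportError:
    # The FA3 varlen kernel is optional; record its absence explicitly instead of
    # masking unrelated errors with a broad except clause.
    flash_attention_v3_varlen = None

def flash_attn_v3_available() -> bool:
    """Return True if the optional flash_attention_v3_varlen kernel was imported."""
    return flash_attention_v3_varlen is not None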


Comment on lines +355 to 356
except Exception:
    pass
Copilot AI commented on Jul 21, 2025

Catching 'Exception' and silently passing is too broad. Consider catching the specific exceptions that can occur during process termination.

Suggested change
-except Exception:
-    pass
+except ProcessLookupError:
+    llm_logger.warning(f"Process {p.pid} does not exist.")
+except PermissionError:
+    llm_logger.error(f"Permission denied while trying to kill process {p.pid}.")
+except Exception as e:
+    llm_logger.exception(f"Unexpected error while killing process {p.pid}: {e}")
+    raise
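
For context, a minimal sketch of how the suggested handlers might sit inside a cleanup loop; the loop, the os.kill call, and the function name are assumptions, while llm_logger and p.pid mirror the suggestion above.

import os
import signal

def kill_worker_processes(processes, llm_logger):
    # Illustrative cleanup loop: terminate each worker and report precisely why a kill failed.
    for p in processes:
        try:
            os.kill(p.pid, signal.SIGKILL)
        except ProcessLookupError:
            llm_logger.warning(f"Process {p.pid} does not exist.")
        except PermissionError:
            llm_logger.error(f"Permission denied while trying to kill process {p.pid}.")
        except Exception as e:
            llm_logger.exception(f"Unexpected error while killing process {p.pid}: {e}")
            raise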

