# Add fmperf Library and Update Dependencies #42
Conversation
Signed-off-by: Chen Wang <Chen.Wang1@ibm.com>
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wangchen615

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
Thanks for sending this out! Having the ability to deploy the model server and benchmark with different configurations makes sense. It would be good to get this working with the inference-perf library and clean up additional logic like report generation that is handled separately by inference-perf.
from kubernetes import client
from fmperf.ModelSpecs import ModelSpec, TGISModelSpec, vLLMModelSpec
Should we call this library something else instead of fmperf? Maybe a name that makes it clear that it simplifies deployment of the model server and the benchmarking tool?
@achandrasekar, what would be a good library name?
@achandrasekar , how about deployer?
Deployer sounds good to me.
from fmperf.Cluster import DeployedModel

class WorkloadSpec:
Can we have this deploy the inference-perf tool instead?
Will take a look at your tool. Thanks, @achandrasekar
pd.set_option("future.no_silent_downcasting", True)

def parse_results(results, print_df=False, print_csv=False):
Would be good to replace this with the reportgen in inference-perf.
- Combined dependencies from both branches
- Preserved fmperf package configuration
- Updated to latest upstream changes, including new features and bug fixes
- Resolved conflicts in pyproject.toml and pdm.lock
I am closing this for now to wait for the community's decision on whether we still want to expand inference-perf to include orchestration, given that llm-d-benchmark already provides a harness for orchestration.
This PR adds the fmperf library and updates project dependencies to support it.

## Changes

### Added

- `fmperf` library with its core components:
  - `Cluster.py`: Kubernetes cluster management
  - `ModelSpecs.py`: model specification handling
  - `WorkloadSpecs.py`: workload configuration
  - `utils/`: utility modules for benchmarking, logging, and data processing
- Updated project dependencies in `pyproject.toml`:
  - `pandas>=2.2.0` for data processing
  - `kubernetes>=29.0.0` for cluster management
  - `pyyaml>=6.0.1` for configuration handling
- Fixed code quality issues:
  - `__all__` exports for better module organization

## Features

## Testing

## Notes
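For reference, the dependency bumps listed under Changes would land in `pyproject.toml` roughly as follows (the exact table layout is an assumption, not copied from the PR):

```toml
[project]
dependencies = [
    "pandas>=2.2.0",      # data processing
    "kubernetes>=29.0.0", # cluster management
    "pyyaml>=6.0.1",      # configuration handling
]
```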