[Enhancement]: optional model_name for endpoint configuration production variants #40644

@isaac-smothers

Description

Currently, you cannot use Terraform to create SageMaker endpoints that work with SageMaker inference components. The TL;DR is that inference components must be attached to an endpoint whose production variant was deployed without a model, and this isn't possible because model_name is a required property of an endpoint configuration's production variant. It shouldn't be.

Long-winded version:
Terraform cannot yet create inference components (there is a year-old request for them, however), so they have to be created by other means. These inference components get attached to existing endpoints. If you try to attach one to an endpoint that already has a running model, you get: Invalid request provided: Inference Components are not supported in this Endpoint. Please make sure this endpoint can deploy inference components.

I did some research and found that in order to attach an inference component to an endpoint, the endpoint must have been configured without any models. This is done by not specifying a model_name when creating a production variant inside an endpoint configuration. You can do this via boto (although not through the UI), as ModelName is not a required property (see here). However, Terraform's AWS provider marks it as required. This blocks you from attaching an inference component to a Terraform-created endpoint.
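For reference, here is a minimal sketch of the boto route described above, assuming boto3 is installed and AWS credentials are configured. The helper names, the role ARN, and the decision to pass ExecutionRoleArn on the endpoint configuration are my assumptions for illustration, not something the provider or this issue prescribes. The point is only that the production variant carries no ModelName key.

```python
from typing import Any


def build_endpoint_config_request(
    config_name: str,
    variant_name: str,
    instance_type: str,
    initial_instance_count: int,
    execution_role_arn: str,
) -> dict[str, Any]:
    """Build kwargs for create_endpoint_config with a model-less variant.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return {
        "EndpointConfigName": config_name,
        # Assumption: with no model attached, the role is supplied on the
        # endpoint configuration itself.
        "ExecutionRoleArn": execution_role_arn,
        "ProductionVariants": [
            {
                "VariantName": variant_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": initial_instance_count,
                # "ModelName" is deliberately omitted.
            }
        ],
    }


def create_model_less_endpoint_config(**kwargs: Any) -> dict[str, Any]:
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("sagemaker")
    return client.create_endpoint_config(**build_endpoint_config_request(**kwargs))
```

Calling create_model_less_endpoint_config(config_name="my-endpoint-config", variant_name="variant-1", instance_type="ml.t2.medium", initial_instance_count=1, execution_role_arn="<your role ARN>") mirrors the Terraform configuration proposed in this issue. Whether a given instance type actually supports inference components is a separate question that this sketch does not check.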

Affected Resource(s) and/or Data Source(s)

aws_sagemaker_endpoint_configuration

Potential Terraform Configuration

resource "aws_sagemaker_endpoint_configuration" "ec" {
  name = "my-endpoint-config"

  production_variants {
    variant_name           = "variant-1"
    initial_instance_count = 1
    instance_type          = "ml.t2.medium"
  }

  tags = {
    Name = "foo"
  }
}

References

https://aws.amazon.com/blogs/machine-learning/easily-deploy-and-manage-hundreds-of-lora-adapters-with-sagemaker-efficient-multi-adapter-inference/

Would you like to implement a fix?

Yes

Metadata

    Labels

    enhancement: Requests to existing resources that expand the functionality or scope.
    service/sagemaker: Issues and PRs that pertain to the sagemaker service.
