This repository was archived by the owner on Jan 14, 2025. It is now read-only.

Spike - IaaS-based Compute #593

@heoelri

This documents replacing AKS with IaaS as the compute platform used for Azure Mission-Critical. Some specific scenarios might require the use of IaaS VMs instead of PaaS services. Potential reasons are:

  • Lack of knowledge and skills (for example with containers or Kubernetes)
  • Legacy workloads that require OS-level access or specific drivers and configurations
  • Performance requirements that cannot be fulfilled in containers or PaaS services
  • Lack of support for 3rd-party workloads

Changes required compared to Mission-Critical-Online:

  • Removed AKS and replaced it with Virtual Machine Scale Sets (VMSS)
    • Requires a replacement for ingress, e.g. Application Gateway (AppGw) (or Front Door?) - AppGw might make sense here, potentially with a Private Link service (PLS) in front to expose it via AFD Premium
    • Requires a different rollout process for the workload
    • Two VMSS: one for the frontend (exposed via AppGw) and one for the backend (not exposed, hosting the backend processing)
  • Removed ACR
  • Added a shared image gallery (as a global service for now) to store images
Scenarios to address:

  • Scalable / stateless workloads -> Virtual Machine Scale Sets
  • Static / stateful workloads -> Virtual Machines in an AV-Set
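
The mapping above can be written down as a small decision helper. This is only a sketch: the `Workload` type, its fields, and the `compute_option` function are hypothetical names introduced for illustration, and treating every workload that is not both stateless and scalable as a candidate for an availability set is an assumption (whether stateful workloads fit VMSS is still an open question in this spike).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a workload; not part of the reference implementation."""
    name: str
    stateless: bool
    scalable: bool


def compute_option(workload: Workload) -> str:
    """Return the compute option suggested by the two scenarios listed above."""
    # Scalable / stateless workloads -> Virtual Machine Scale Sets
    if workload.stateless and workload.scalable:
        return "Virtual Machine Scale Sets"
    # Assumption: everything else is treated as static / stateful
    # -> Virtual Machines in an availability set
    return "Virtual Machines in an AV-Set"


if __name__ == "__main__":
    for w in (
        Workload("frontend", stateless=True, scalable=True),
        Workload("legacy-backend", stateless=False, scalable=False),
    ):
        print(f"{w.name}: {compute_option(w)}")
```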

Open questions / findings:

  • Boot diagnostics storage for VMSS does not support ZRS
  • Shared image gallery as a global service or per stamp?
  • Can stateful workloads be hosted in VMSS in a meaningful way?
  • What's the recommended (and most reliable) way to roll out software to (Windows) VMs?
  • Where to store application/workload components (the counterpart to ACR in a more cloud-native scenario)? Storage accounts?
  • How to deal with dependencies like AD DS, WSFC, ...?
  • Database backends (on VMs): in or out of scope?

Recommendations:

  • Security
    • Disable username / password authentication when using Linux
    • Store VMSS credentials in Azure Key Vault
  • Compute
    • The same zone considerations apply: spread across zones if possible, or consolidate in fewer than three zones if proximity is required and/or latency is a concern
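
The zone recommendation can be illustrated with a minimal sketch. The function name `select_zones`, its parameters, and the default of a single zone when proximity is required are assumptions made for this example; the resulting list would then be passed to whatever deployment tooling provisions the scale set.

```python
def select_zones(
    available: list[str],
    proximity_required: bool,
    proximity_zones: int = 1,
) -> list[str]:
    """Pick availability zones following the recommendation above.

    available          -- zone identifiers offered by the region, e.g. ["1", "2", "3"]
    proximity_required -- True if latency or proximity is a concern
    proximity_zones    -- how many zones to consolidate into (assumed default: 1)
    """
    zones = sorted(set(available))
    if not zones:
        raise ValueError("region offers no availability zones")
    if proximity_required:
        # Consolidate in fewer than three zones
        if not 1 <= proximity_zones < 3:
            raise ValueError("proximity_zones must be 1 or 2")
        return zones[:proximity_zones]
    # Otherwise spread across every zone that is available
    return zones


if __name__ == "__main__":
    print(select_zones(["1", "2", "3"], proximity_required=False))
    print(select_zones(["1", "2", "3"], proximity_required=True))
```

Consolidating into fewer zones trades resilience against zonal failures for lower latency between instances, so that choice is best made per workload.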
