Better energy observability #78

@jaywonchung

Description

Depends on #30 (Prometheus metric exporter integration).

Energy metrics: How much energy is being consumed? How do users measure savings?

  • Grafana dashboard showing cluster-wide energy usage, with breakdowns down to individual training jobs that are integrated with Zeus
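
To make the dashboard item more concrete, here is a rough sketch of the data shape it might need: per-job energy samples summed into per-job totals and rendered in the Prometheus text exposition format. It is not Zeus's actual exporter. The metric name `example_job_energy_joules_total`, the `job_id` label, and both helper functions are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical metric name, chosen for this sketch only.
METRIC = "example_job_energy_joules_total"


def aggregate_energy(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum (job_id, joules) samples into one energy total per job."""
    per_job: Dict[str, float] = defaultdict(float)
    for job_id, joules in samples:
        per_job[job_id] += joules
    return dict(per_job)


def escape_label(value: str) -> str:
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_prometheus(per_job: Dict[str, float], metric: str = METRIC) -> str:
    """Render per-job totals as Prometheus text exposition lines."""
    lines = [
        f"# HELP {metric} Energy consumed by each training job, in joules.",
        f"# TYPE {metric} counter",
    ]
    for job_id in sorted(per_job):
        label = escape_label(job_id)
        lines.append(f'{metric}{{job_id="{label}"}} {per_job[job_id]}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    totals = aggregate_energy([("job-a", 10.0), ("job-b", 2.5), ("job-a", 5.0)])
    print(render_prometheus(totals), end="")
```

With a per-job series like this, a Grafana panel could presumably show the cluster-wide figure by summing over the `job_id` label (for example, `sum(example_job_energy_joules_total)` in PromQL) and the per-job breakdown by leaving the label in place.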

Experiment managers: Each training experiment can be associated with its energy consumption, both in aggregate and over time.

  • Weights & Biases
  • MLflow
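
One possible shape for the experiment-manager integration, as a sketch only: a tracker-agnostic helper that takes per-step energy readings and reports both the over-time value and the running aggregate through an injected logging callable. The helper `log_energy` and the metric keys are hypothetical. The only assumption about the trackers is that `wandb.log` and `mlflow.log_metrics` both accept a metrics dict plus a `step` keyword, so either could be passed as `log_fn`.

```python
from typing import Callable, Iterable, Tuple

# Any callable of the form log_fn(metrics_dict, step=int).
LogFn = Callable[..., None]


def log_energy(readings: Iterable[Tuple[int, float]], log_fn: LogFn) -> float:
    """Log per-step and cumulative energy; return the aggregate in joules.

    `readings` yields (step, joules) pairs, where joules is the energy
    consumed during that step's measurement window.
    """
    total = 0.0
    for step, joules in readings:
        total += joules
        log_fn(
            {
                "energy/step_joules": joules,  # over-time series
                "energy/cumulative_joules": total,  # running aggregate
            },
            step=step,
        )
    return total


if __name__ == "__main__":
    # Stand-in logger so the sketch runs without any tracker installed.
    def print_logger(metrics, step):
        print(step, metrics)

    log_energy([(0, 10.0), (1, 5.5)], print_logger)
```

Keeping the tracker behind a plain callable means the energy bookkeeping stays in one place, and supporting another experiment manager only requires a thin adapter.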
