Summary and Per Request Report Generation #56

Open
Bslabe123 opened this issue Apr 25, 2025 · 2 comments

Comments

@Bslabe123
Contributor

Bslabe123 commented Apr 25, 2025

We should define the shape of the output report and how to configure what gets reported. Keep in mind that the output report may be extremely large or unreadable if all info is included by default, such as per-request metrics, per-message inputs/outputs, etc.

ReportConfig API Proposal:

ReportConfig:
  Requests:
    Summary: null                   # If included, reports summary metrics across all requests (mean, p50, p90, p99 for TPOT, input_len, output_len, etc.)
    PerRequest:                     # If included, reports per-request metrics (start_time, end_time, input_len, output_len)
      IncludeInputs: boolean        # replace input_len with the full input body
      IncludeOutputs: boolean       # replace output_len with the full output body
  Prometheus:
    Summary: null                   # If included, report Prometheus metric query results over a window equal to the total experiment time
    PerStage: null                  # If included, report Prometheus metric query results for each stage
    Periodic:
      Interval: uint                # Scrape metrics every {interval} seconds, recording each scrape's results with a timestamp
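
For concreteness, here is a minimal sketch of how this proposal could be modeled in Python dataclasses. The class and field names simply mirror the YAML above; none of this is an existing API.

from dataclasses import dataclass
from typing import Optional

@dataclass
class PerRequestConfig:
    include_inputs: bool = False     # replace input_len with the full input body
    include_outputs: bool = False    # replace output_len with the full output body

@dataclass
class RequestsConfig:
    summary: bool = True                             # summary metrics across all requests
    per_request: Optional[PerRequestConfig] = None   # per-request metrics if set

@dataclass
class PrometheusConfig:
    summary: bool = False                     # query results over the whole experiment window
    per_stage: bool = False                   # query results for each stage
    periodic_interval: Optional[int] = None   # scrape every N seconds if set

@dataclass
class ReportConfig:
    requests: Optional[RequestsConfig] = None
    prometheus: Optional[PrometheusConfig] = None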

Related issues:

@Bslabe123 changed the title from "Proposal: ReportConfig API Design" to "Summary and Per Request Report Generation" on May 12, 2025
@achandrasekar
Contributor

It'd be good to see the report format with filled-in values to get an idea of how we expect this to look. Also, for the metrics, let's name them using widely used and agreed-upon terminology. For example, you can refer to https://cloud.google.com/kubernetes-engine/docs/concepts/machine-learning/inference#performance.

@Bslabe123
Contributor Author

The current report looks like this; will make changes accordingly:

{
  "observed": {
    "summary": {
      "load_summary": {
        "count": 121
      },
      "successes": {
        "count": 121,
        "time_per_request": {
          "mean": 2.07875813514548,
          "min": 0.166026983002666,
          "p10": 1.54146612802288,
          "p50": 1.86517220595852,
          "p90": 3.43353097402724,
          "max": 5.26778485300019
        },
        "prompt_len": {
          "mean": 428.785123966942,
          "min": 2,
          "p10": 11,
          "p50": 101,
          "p90": 1271,
          "max": 3836
        },
        "output_len": {
          "mean": 29.396694214876,
          "min": 0,
          "p10": 23,
          "p50": 32,
          "p90": 32,
          "max": 32
        },
        "per_token_latency": {
          "mean": 0.0728202762444499,
          "min": 0,
          "p10": 0.0535703550631297,
          "p50": 0.0595776435311564,
          "p90": 0.114333859532053,
          "max": 0.513822042674292
        }
      },
      "failures": {
        "count": 0,
        "time_per_request": null
      }
    }
  }
}
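
For reference, a summary block like the one above can be derived from per-request samples roughly as follows. This is a sketch assuming numpy; summarize and the sample list are hypothetical, not the tool's actual code.

import numpy as np

def summarize(values):
    """Return the mean/min/percentile summary shape used in the report."""
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        return None  # mirrors "time_per_request": null when there are no failures
    return {
        "mean": float(arr.mean()),
        "min": float(arr.min()),
        "p10": float(np.percentile(arr, 10)),
        "p50": float(np.percentile(arr, 50)),
        "p90": float(np.percentile(arr, 90)),
        "max": float(arr.max()),
    }

# e.g. summarize([r.end_time - r.start_time for r in successes])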
