📊 Issue — req2run: cost/tokens/memory/turns metrics

## 1) Summary (EN)

Bring the **Agentic Programming Survey** metrics into req2run so experiments report resource usage and dialogue structure alongside success rates. This enables fairer comparisons and budgeting for Planning/Coding/Verifying agents.

---

## 2) Scope & Deliverables (EN)

* **Schema changes**: `metrics.json` to include `{ tokens: {prompt, completion, tool}, cost, memoryHitRatio, turns: {count, avgLen}, latency }`.
* **Instrumentation**: per‑phase, per‑agent logging hooks in ae CLI.
* **Reporters**: CLI table + JSON + CSV export.
* **Docs**: how to interpret metrics; budgeting examples.

---

## 3) Tasks / Checklist (EN)

* [ ] Define new JSON schema and version it (e.g., `v2`).
* [ ] Add token/cost trackers (OpenAI/others), memory lookups, turn counters.
* [ ] Implement exporters (stdout table / JSON / CSV).
* [ ] Add sample experiment configs and expected outputs.
* [ ] Update `docs/benchmarking.md` with guidelines.

---

## 4) Definition of Done (EN)

* req2run outputs metrics for **at least 3** sample tasks with reproducible numbers.
* CI validates schema and runs a smoke benchmark.

---

## 5) Risks & Mitigations (EN)

* **Provider variance** → normalize by rate cards and include raw token counts.
* **Privacy** → redact content while keeping counts; opt‑out flag.

---

## 6) References (EN)

* AI Agentic Programming Survey (2024)

---

# （日本語）req2run：cost/tokens/memory/turns メトリクス拡張

## 1) 概要（JP）

Agentic Programming Survey で提案された指標を **req2run** に取り込み、成功率だけでなく **資源消費と対話構造** を定量化します。Planning/Coding/Verifying 各エージェントの配分設計に役立てます。

## 2) 対象範囲と成果物（JP）

* **スキーマ変更**：`metrics.json` を拡張（tokens/cost/memoryHitRatio/turns/latency）。
* **計測**：各フェーズ・各エージェント単位でフックを追加。
* **レポート**：CLI 表示＋JSON/CSV 出力。
* **ドキュメント**：読み方・予算化例を記述。

## 3) タスク（JP）

* [ ] JSON スキーマの策定とバージョン付け（例：`v2`）。
* [ ] トークン/コスト、メモリ参照、ターン数の計測実装。
* [ ] エクスポータ（stdout/JSON/CSV）。
* [ ] サンプル実験と期待出力の追加。
* [ ] `docs/benchmarking.md` 更新。

## 4) 完了条件（JP）

* サンプル 3 件以上で再現可能なメトリクスが出力される。
* CI がスキーマ検証とスモークベンチを実行する。

## 5) リスクと対策（JP）

* **プロバイダ差** → レートカード基準の正規化＋生トークン数の併記。
* **プライバシ** → 内容はマスクしカウントのみ保存、オプトアウトを用意。

## 6) 参考文献（JP）

* Agentic Programming Survey（2024）

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

📊 Issue — req2run: cost/tokens/memory/turns metrics #1069

1) Summary (EN)

2) Scope & Deliverables (EN)

3) Tasks / Checklist (EN)

4) Definition of Done (EN)

5) Risks & Mitigations (EN)

6) References (EN)

（日本語）req2run：cost/tokens/memory/turns メトリクス拡張

1) 概要（JP）

2) 対象範囲と成果物（JP）

3) タスク（JP）

4) 完了条件（JP）

5) リスクと対策（JP）

6) 参考文献（JP）

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

📊 Issue — req2run: cost/tokens/memory/turns metrics #1069

Description

1) Summary (EN)

2) Scope & Deliverables (EN)

3) Tasks / Checklist (EN)

4) Definition of Done (EN)

5) Risks & Mitigations (EN)

6) References (EN)

（日本語）req2run：cost/tokens/memory/turns メトリクス拡張

1) 概要（JP）

2) 対象範囲と成果物（JP）

3) タスク（JP）

4) 完了条件（JP）

5) リスクと対策（JP）

6) 参考文献（JP）

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions