Commit 6ebbcbd

Merge branch 'main' into open-ai-compat
2 parents 9313e8f + 11a7ad9 commit 6ebbcbd

12 files changed: 171 additions, 246 deletions

docs/agents/index.mdx

Lines changed: 42 additions & 35 deletions
@@ -152,7 +152,7 @@ Tools are controlled via an explicit **whitelist**. The `tools` array lists patt

 **Inheritance:** Use `base` to inherit behavior from another agent:

-- `base: plan` — Plan-mode behaviors (enables `ask_user_question`, `propose_plan`)
+- `base: plan` — Plan-mode behaviors and built-in planning guidance (enables `ask_user_question`, `propose_plan`)
 - `base: exec` — Exec-mode behaviors (standard coding workflow)
 - `base: <custom-agent-id>` — Inherit from any custom agent

@@ -598,44 +598,51 @@ tools:

 You are in Plan Mode.

-- Every response MUST produce or update a plan—no exceptions.
-- Simple requests deserve simple plans; a straightforward task might only need a few bullet points. Match plan complexity to the problem.
-- Keep the plan scannable; put long rationale in `<details>/<summary>` blocks.
-- Plans must be **self-contained**: include enough context, goals, constraints, and the core "why" so a new assistant can implement without needing the prior chat.
-- When Plan Mode is requested, assume the user wants the actual completed plan; do not merely describe how you would devise one.
+- Every response MUST produce or update a plan.
+- Match the plan's size and structure to the problem.
+- Keep the plan self-contained and scannable.
+- Assume the user wants the completed plan, not a description of how you would make one.

-## Investigation step (required)
+## Investigate only what you need

-Before proposing a plan, identify what you must verify and delegate repo investigation to Explore
-sub-agents. Do not guess.
+Before proposing a plan, figure out what you need to verify and gather that evidence.

-- Use Explore tasks for repo investigation (files, callsites, patterns, feasibility checks)
-whenever delegation is available.
-- Do not inspect repo files yourself to verify, enrich, or second-guess an Explore report.
-- If reports conflict, feel incomplete, or leave a specific gap, spawn another narrowly focused
-Explore task for that discrepancy.
-- If task delegation is unavailable in this workspace, use the narrowest read-only repo
-investigation needed to close that specific gap.
+- When delegation is available, use Explore sub-agents for repo investigation. In Plan Mode, only
+spawn `agentId: "explore"` tasks.
+- Give each Explore task specific deliverables, and parallelize them when that helps.
+- Trust completed Explore reports for repo facts. Do not re-investigate just to second-guess them.
+If something is missing, ambiguous, or conflicting, spawn another focused Explore task.
+- If task delegation is unavailable, do the narrowest read-only investigation yourself.
 - Reserve `file_read` for the plan file itself, user-provided text already in this conversation,
-and that narrow fallback—not for normal repo investigation.
-
-When you do read the plan file itself, prefer `file_read` over `bash cat`: long bash output may be
-compacted, which can hide the middle of a document. Use `file_read` with offset/limit to page
-through larger files.
-
-## Plan format
-
-- Context/Why: Briefly restate the request, goals, and the rationale or user impact so the
-plan stands alone for a fresh implementer.
-- Evidence: List sources consulted (file paths, tool outputs, or user-provided info) and
-why they are sufficient. If evidence is missing, still produce a minimal plan and add a
-Questions section listing what you need to proceed.
-
-- Implementation details: List concrete edits (file paths + symbols) in the order you would implement them.
-- Where it meaningfully reduces ambiguity, include **reasonably sized** code snippets (fenced code blocks) that show the intended shape of the change.
-- Keep snippets focused (avoid whole-file dumps); elide unrelated context with `...`.
-
-Detailed plan mode instructions (plan file path, sub-agent delegation, propose_plan workflow) are provided separately.
+and that narrow fallback. When reading the plan file, prefer `file_read` over `bash cat` so long
+plans do not get compacted.
+- Wait for any spawned Explore tasks before calling `propose_plan`.
+
+## Write the plan
+
+- Use whatever structure best fits the problem: a few bullets, phases, workstreams, risks, or
+decision points are all fine.
+- Include the context, constraints, evidence, and concrete path forward somewhere in that
+structure.
+- Name the files, symbols, or subsystems that matter, and order the work so an implementer can
+follow it.
+- Keep uncertainty brief and local to the relevant step. Use `ask_user_question` when you need the
+user to decide something.
+- Include small code snippets only when they materially reduce ambiguity.
+- Put long rationale or background into `<details>/<summary>` blocks.
+
+## Questions and handoff
+
+- If you need clarification from the user, use `ask_user_question` instead of asking in chat or
+adding an "Open Questions" section to the plan.
+- Ask up to 4 questions at a time (2–4 options each; "Other" remains available for free-form
+input).
+- After you get answers, update the plan and then call `propose_plan` when it is ready for review.
+- After calling `propose_plan`, do not paste the plan into chat or mention the plan file path.
+- If the user wants edits to other files, ask them to switch to Exec mode.
+
+Workspace-specific runtime instructions (plan file path, edit restrictions, nesting warnings) are
+provided separately.
 ```

 </Accordion>

src/browser/features/Messages/ChatBarrier/RetryBarrier.test.tsx

Lines changed: 20 additions & 0 deletions
@@ -136,6 +136,26 @@ describe("RetryBarrier", () => {
     globalThis.document = undefined as unknown as Document;
   });

+  test("uses delayed-start copy while the first response is still starting", () => {
+    currentWorkspaceState = createWorkspaceState({
+      isStreamStarting: true,
+      messages: [
+        {
+          type: "user",
+          id: "user-1",
+          historyId: "user-1",
+          content: "Hello",
+          historySequence: 1,
+        },
+      ],
+    });
+
+    const view = render(<RetryBarrier workspaceId="ws-1" />);
+
+    expect(view.getByText("Response startup is taking longer than expected")).toBeTruthy();
+    expect(view.queryByText("Stream interrupted")).toBeNull();
+  });
+
   test("shows error details when manual resume fails before stream events", async () => {
     resumeStreamResult = {
       success: false,

src/browser/features/Messages/ChatBarrier/RetryBarrier.tsx

Lines changed: 13 additions & 1 deletion
@@ -242,11 +242,23 @@ export const RetryBarrier: React.FC<RetryBarrierProps> = (props) => {
   const lastMessage = getLastNonDecorativeMessage(workspaceState.messages);
   const lastStreamError = lastMessage?.type === "stream-error" ? lastMessage : null;
   const interruptionReason = lastStreamError?.errorType === "rate_limit" ? "Rate limited" : null;
+  const isWaitingForInitialResponse =
+    lastMessage?.type === "user" && workspaceState.isStreamStarting;

   let statusIcon: React.ReactNode = (
     <AlertTriangle aria-hidden="true" className="text-warning h-4 w-4 shrink-0" />
   );
-  let statusText: React.ReactNode = <>{interruptionReason ?? "Stream interrupted"}</>;
+  let statusText: React.ReactNode = (
+    <>
+      {interruptionReason ??
+        // A trailing user message means the backend has not emitted stream-start yet.
+        // Long init hooks (for example over SSH) can legitimately keep us here, so avoid
+        // claiming the stream was interrupted until we have evidence that it actually was.
+        (isWaitingForInitialResponse
+          ? "Response startup is taking longer than expected"
+          : "Stream interrupted")}
+    </>
+  );
   let actionButton: React.ReactNode = (
     <button
       className="bg-warning font-primary text-background cursor-pointer rounded border-none px-4 py-2 text-xs font-semibold whitespace-nowrap transition-all duration-200 hover:-translate-y-px hover:brightness-120 active:translate-y-0 disabled:cursor-not-allowed disabled:opacity-50"
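
The status-text precedence in this hunk can be read as a small pure function. The sketch below is illustrative only and is not code from the commit: `MessageSketch`, `WorkspaceStateSketch`, and `pickStatusText` are hypothetical names, the message shapes are simplified, and it takes the last array element directly instead of calling `getLastNonDecorativeMessage`.

```typescript
// Hypothetical, simplified shapes; the real workspace state carries more fields.
type MessageSketch =
  | { type: "user"; id: string; content: string }
  | { type: "stream-error"; id: string; errorType: string };

interface WorkspaceStateSketch {
  isStreamStarting: boolean;
  messages: MessageSketch[];
}

// Mirrors the precedence in the diff: a known interruption reason wins, then the
// "still starting" case, and only then the generic "Stream interrupted" copy.
function pickStatusText(state: WorkspaceStateSketch): string {
  // Assumption: the last array element stands in for the last non-decorative message.
  const lastMessage: MessageSketch | undefined =
    state.messages[state.messages.length - 1];

  const interruptionReason =
    lastMessage?.type === "stream-error" && lastMessage.errorType === "rate_limit"
      ? "Rate limited"
      : null;

  // A trailing user message plus isStreamStarting means no stream-start event yet.
  const isWaitingForInitialResponse =
    lastMessage?.type === "user" && state.isStreamStarting;

  return (
    interruptionReason ??
    (isWaitingForInitialResponse
      ? "Response startup is taking longer than expected"
      : "Stream interrupted")
  );
}
```

Under these assumptions, a trailing user message with `isStreamStarting: true` yields the delayed-start copy, while the same message with `isStreamStarting: false` falls back to "Stream interrupted".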

src/common/utils/tools/taskToolTypeGuards.ts

Lines changed: 0 additions & 47 deletions
This file was deleted.
Lines changed: 9 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -1,34 +1,13 @@
11
import { getPlanFileHint, getPlanModeInstruction } from "./modeUtils";
22

33
describe("getPlanModeInstruction", () => {
4-
it("provides plan file path context", () => {
5-
const instruction = getPlanModeInstruction("/tmp/plan.md", false);
4+
it("threads the exact plan file path through both creation and resume flows", () => {
5+
const newPlanInstruction = getPlanModeInstruction("/tmp/plan.md", false);
6+
const existingPlanInstruction = getPlanModeInstruction("/tmp/plan.md", true);
67

7-
expect(instruction).toContain("Plan file path: /tmp/plan.md");
8-
expect(instruction).toContain("No plan file exists yet");
9-
expect(instruction).toContain("file_edit_* tools");
10-
});
11-
12-
it("routes ambiguous Explore reports to another narrow Explore task", () => {
13-
const instruction = getPlanModeInstruction("/tmp/plan.md", false);
14-
15-
expect(instruction).toContain("Anti-pattern: using `file_read` or `bash` in Plan Mode");
16-
expect(instruction).toContain("spawn another narrowly focused Explore task instead");
17-
expect(instruction).toContain("plan file itself");
18-
expect(instruction).toContain("user-provided text already in this conversation");
19-
expect(instruction).toContain("task delegation is unavailable in this workspace");
20-
expect(instruction).toContain("use the narrowest read-only investigation needed");
21-
expect(instruction).not.toContain(
22-
"only re-check if the report is ambiguous or contradicts other evidence"
23-
);
24-
});
25-
26-
it("indicates when plan file already exists", () => {
27-
const instruction = getPlanModeInstruction("/tmp/existing-plan.md", true);
28-
29-
expect(instruction).toContain("Plan file path: /tmp/existing-plan.md");
30-
expect(instruction).toContain("A plan file already exists");
31-
expect(instruction).toContain("read it to determine if it's relevant");
8+
expect(newPlanInstruction).toContain("/tmp/plan.md");
9+
expect(existingPlanInstruction).toContain("/tmp/plan.md");
10+
expect(newPlanInstruction).not.toEqual(existingPlanInstruction);
3211
});
3312
});
3413

@@ -37,13 +16,10 @@ describe("getPlanFileHint", () => {
3716
expect(getPlanFileHint("/tmp/plan.md", false)).toBeNull();
3817
});
3918

40-
it("includes post-compaction guidance and an ignore escape hatch", () => {
19+
it("returns a non-null hint keyed to the saved plan path", () => {
4120
const hint = getPlanFileHint("/tmp/plan.md", true);
4221

43-
if (!hint) throw new Error("expected non-null hint");
44-
45-
expect(hint).toContain("A plan file exists at: /tmp/plan.md");
46-
expect(hint).toContain("compaction/context reset");
47-
expect(hint).toContain("If it is unrelated to the current request, ignore it.");
22+
expect(hint).not.toBeNull();
23+
expect(hint).toContain("/tmp/plan.md");
4824
});
4925
});
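
The rewritten `getPlanFileHint` tests pin down a contract (null when no plan exists, otherwise a string that contains the path) and no longer assert on exact prose. A minimal sketch of a function that would satisfy that contract is shown below. `planFileHintSketch` and its wording are hypothetical; the real implementation is not part of this diff.

```typescript
// Hypothetical stand-in for getPlanFileHint. Only the null/non-null behavior and the
// presence of the path are taken from the tests; the sentence itself is illustrative.
function planFileHintSketch(planFilePath: string, planExists: boolean): string | null {
  if (!planExists) {
    // No saved plan: there is nothing to hint at.
    return null;
  }
  return `A saved plan is available at ${planFilePath}.`;
}

// The checks below mirror the loosened assertions: they survive any rewording of the
// hint as long as the path is still threaded through.
const missingPlanHint = planFileHintSketch("/tmp/plan.md", false);
const savedPlanHint = planFileHintSketch("/tmp/plan.md", true);
console.log(missingPlanHint === null);
console.log(savedPlanHint !== null && savedPlanHint.includes("/tmp/plan.md"));
```

Asserting on the path rather than on full sentences appears to be the design choice here: the tests keep guarding the workspace-specific fact while leaving the copy free to change.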

src/common/utils/ui/modeUtils.ts

Lines changed: 7 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
11
/**
2-
* Generate the system instruction for Plan Mode with file path context.
2+
* Generate runtime-only instructions for plan-like agents.
33
*
4-
* This provides comprehensive plan mode behavioral instructions that are
5-
* injected for ALL plan-like agents (built-in Plan agent or custom agents
6-
* that enable propose_plan). The instructions are injected as
7-
* <additional-instructions> in the system prompt.
4+
* These instructions carry workspace-specific facts that agent specs cannot encode,
5+
* such as the exact plan file path, whether a plan already exists, and the
6+
* non-overridable file-edit restrictions enforced by plan mode.
7+
* Opinionated planning guidance lives in the agent spec so users can override it.
88
*/
99
export function getPlanModeInstruction(planFilePath: string, planExists: boolean): string {
1010
const exactPlanPathRule = planFilePath.startsWith("~/")
@@ -22,35 +22,10 @@ Build your plan incrementally by writing to or editing this file.
2222
NOTE: The plan file is the only file you are allowed to edit. Other than that you may only take READ-ONLY actions.
2323
${exactPlanPathRule}
2424
25-
Keep the plan crisp and focused on actionable recommendations:
26-
- Put historical context, alternatives considered, or lengthy rationale into collapsible \`<details>/<summary>\` blocks so the core plan stays scannable.
27-
- When listing implementation details, include **reasonably sized** code snippets (fenced code blocks) for key changes—enough to remove ambiguity, but avoid whole-file dumps. Use ellipses (...) to omit unrelated context.
28-
- **Aggressively prune completed or irrelevant content.** When sections become outdated—tasks finished, approaches abandoned, questions answered—delete them entirely rather than moving them to an appendix or marking them done. The plan should reflect current state, not accumulate history.
29-
- Each revision should leave the plan shorter or unchanged in scope, never longer unless the actual work grew.
30-
31-
If you need investigation (codebase exploration, tracing callsites, locating patterns, feasibility checks) before you can produce a good plan, delegate it to Explore sub-agents via the \`task\` tool:
32-
- In Plan Mode, you MUST ONLY spawn \`agentId: "explore"\` tasks. Do NOT spawn \`agentId: "exec"\` tasks in Plan Mode.
33-
- Use \`agentId: "explore"\` for read-only repo/code exploration and optional web lookups when relevant.
34-
- In each task prompt, specify explicit deliverables (what questions to answer, what files/symbols to locate, and the exact output format you want back).
35-
- Prefer running multiple Explore tasks in parallel with \`run_in_background: true\`, then use \`task_await\` (optionally with \`task_ids\`) until all spawned tasks are \`completed\`.
36-
- Trust Explore sub-agent reports as authoritative for repo facts (paths/symbols/callsites). Treat them as sufficient evidence for the plan.
37-
- Anti-pattern: using \`file_read\` or \`bash\` in Plan Mode to verify, enrich, or second-guess an Explore report. If a report is ambiguous, incomplete, or conflicts with another report, spawn another narrowly focused Explore task instead. This anti-pattern does not apply to reading or editing the plan file itself, to user-provided text already in this conversation, or to the narrowest read-only repo investigation needed when task delegation is unavailable in this workspace.
38-
- While Explore tasks run, and after they complete, do NOT perform repo exploration yourself if delegation is available. If task tools are disabled in this workspace, use the narrowest read-only investigation needed to close the specific gap, then synthesize the plan in this session.
39-
- Do NOT call \`propose_plan\` until you have awaited and incorporated sub-agent reports.
40-
41-
If you need clarification from the user before you can finalize the plan, you MUST use the ask_user_question tool.
42-
- Do not ask questions in a normal chat message.
43-
- Do not include an "Open Questions" section in the plan.
44-
- Ask up to 4 questions at a time (each with 2–4 options; "Other" is always available for free-form input).
45-
- After you receive answers, update the plan file and only then call propose_plan.
46-
- After calling propose_plan, do not repeat/paste the plan contents in chat; the UI already renders the full plan.
47-
- After calling propose_plan, do not say "the plan is ready at <path>" or otherwise mention the plan file location; it's already shown in the Plan UI.
48-
49-
When you have finished writing your plan and are ready for user approval, call the propose_plan tool.
5025
Do not make other edits in plan mode. You may have tools like bash but only use them for read-only operations.
5126
Read-only bash means: no redirects/heredocs, no rm/mv/cp/mkdir/touch, no git add/commit, and no dependency installs.
52-
53-
If the user suggests that you should make edits to other files, ask them to switch to Exec mode first!
27+
When the plan is ready for user review, call \`propose_plan\`.
28+
After calling \`propose_plan\`, do not paste the plan into chat or mention the plan file path.
5429
`;
5530
}
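
The updated doc comment splits responsibilities: the agent spec holds overridable planning guidance, and this function carries only workspace-specific facts. A rough sketch of that runtime-only shape follows. `runtimePlanInstructionSketch` and its wording are hypothetical and stand in for the parts of `getPlanModeInstruction` that the diff does not show.

```typescript
// Hypothetical sketch of a runtime-only instruction builder. It encodes just the
// facts an agent spec cannot know ahead of time: the plan path, whether a plan
// already exists, and the edit restriction. All wording here is illustrative.
function runtimePlanInstructionSketch(planFilePath: string, planExists: boolean): string {
  const fileStatus = planExists
    ? `A plan file already exists at ${planFilePath}.`
    : `No plan file exists yet; create it at ${planFilePath}.`;

  return [
    fileStatus,
    "The plan file is the only file you may edit; every other action must be read-only.",
    "When the plan is ready for user review, call `propose_plan`.",
  ].join("\n");
}

// Same path in both flows, but the text differs between creation and resume,
// which is the property the rewritten test asserts.
const creationInstruction = runtimePlanInstructionSketch("/tmp/plan.md", false);
const resumeInstruction = runtimePlanInstructionSketch("/tmp/plan.md", true);
console.log(creationInstruction !== resumeInstruction);
```

Keeping this function free of planning style advice means a custom agent that overrides the spec still receives the path and the edit restriction unchanged.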
5631

0 commit comments

Comments
 (0)