refactor: use ContextPipeline to initialize BasicCrawler's context idiomatically #3388
Pull request overview
This pull request refactors the context initialization logic in Crawlee's crawler architecture by moving all CrawlingContext setup into the ContextPipeline. This change provides tighter control over context construction and prepares the codebase for the upcoming session pool exclusivity changes in PR #3380.
Changes:
- Introduces a new `buildContextPipeline()` method in `BasicCrawler` that handles all core context initialization (helpers, request fetching, session management, etc.)
- Moves context pipeline invocation from `runRequestHandler()` to the `runTaskFunction` level in `AutoscaledPool`
- Updates subclasses (`HttpCrawler`, `BrowserCrawler`, `FileDownload`) to call `super.buildContextPipeline()` and extend the pipeline idiomatically
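The extension pattern described in the bullets above can be sketched as follows. This is a minimal illustration, not the real Crawlee `ContextPipeline` API: the class shape, the step signatures, and the `*Sketch` names are invented for the example, and the real `buildContextPipeline()` is protected rather than public.

```typescript
// Minimal sketch of a composable context pipeline (not Crawlee's real API).
class ContextPipeline<Initial, Current> {
    private constructor(private readonly steps: Array<(context: any) => Promise<any>>) {}

    static create<T extends object>(): ContextPipeline<T, T> {
        return new ContextPipeline<T, T>([]);
    }

    // Each compose() call appends a step and widens the context type it produces.
    compose<Out extends Current>(step: (context: Current) => Promise<Out>): ContextPipeline<Initial, Out> {
        return new ContextPipeline<Initial, Out>([...this.steps, step]);
    }

    // Runs every step in order, then hands the fully built context to the consumer.
    async call(initial: Initial, consumer: (finalContext: Current) => Promise<void>): Promise<void> {
        let context: any = initial;
        for (const step of this.steps) {
            context = await step(context);
        }
        await consumer(context);
    }
}

// The base crawler owns the core initialization steps (public here for the
// sketch; the PR makes the real method protected on BasicCrawler)...
class BasicCrawlerSketch {
    buildContextPipeline() {
        return ContextPipeline.create<{}>()
            .compose(async (ctx) => ({ ...ctx, request: { url: 'https://example.com' } }))
            .compose(async (ctx) => ({ ...ctx, session: { id: 'session-1' } }));
    }
}

// ...and subclasses extend the inherited pipeline instead of rebuilding it.
class HttpCrawlerSketch extends BasicCrawlerSketch {
    override buildContextPipeline() {
        return super.buildContextPipeline()
            .compose(async (ctx) => ({ ...ctx, response: { statusCode: 200 } }));
    }
}
```

Because each subclass composes onto `super.buildContextPipeline()` rather than constructing a fresh pipeline, the base class keeps tight control over the core steps while subclasses only declare what they add on top.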
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/basic-crawler/src/internals/basic-crawler.ts | Adds buildContextPipeline() method for idiomatic context initialization; refactors runTaskFunction to invoke the pipeline at a higher level with improved error handling |
| packages/browser-crawler/src/internals/browser-crawler.ts | Updates to call super.buildContextPipeline() and adds override keyword for type safety |
| packages/http-crawler/src/internals/http-crawler.ts | Updates to call super.buildContextPipeline() instead of creating a new pipeline; moves ContextPipeline import to type-only import |
| packages/http-crawler/src/internals/file-download.ts | Updates to call this.buildContextPipeline() for consistency with the new architecture |
| packages/playwright-crawler/src/internals/adaptive-playwright-crawler.ts | Updates to apply result-bound helpers after pipeline execution to avoid being overwritten by base crawler helpers |
janbuchar left a comment:
this is more of a refactor, I'd say...
…extHelpers

The `enqueueLinks` helper was accidentally removed from the `resultBoundContextHelpers`, causing links not to be enqueued correctly through the `RequestHandlerResult` in the adaptive crawler.
…line building

Start context pipelines from `{}` instead of lying about an empty object being a `CrawlingContext`. The pipeline gradually extends the type through `compose()` calls until it reaches the final `CrawlingContext` shape.
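The typing change in this commit message can be illustrated with a small sketch. The shapes below are invented for the example, not Crawlee's real `CrawlingContext`:

```typescript
// Illustrative context shape, invented for this sketch.
interface CrawlingContext {
    request: { url: string };
    session: { id: string };
}

// Before: an empty object is cast to the final type, so the compiler cannot
// catch a step that forgets to fill a property in.
const pretended = {} as CrawlingContext; // pretended.request is undefined at runtime

// After: start from {} and let each step widen the type with what it adds.
async function addRequest<T extends object>(ctx: T) {
    return { ...ctx, request: { url: 'https://example.com' } };
}

async function addSession<T extends object>(ctx: T) {
    return { ...ctx, session: { id: 'session-1' } };
}

async function buildContext(): Promise<CrawlingContext> {
    // Only the fully composed result satisfies CrawlingContext.
    return addSession(await addRequest({}));
}
```

With the honest starting type, forgetting a step becomes a compile error instead of an `undefined` property at runtime.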
janbuchar left a comment:
Just a bunch of nits, good stuff overall!
janbuchar left a comment:
Only three comments, two of them are fairly important.
```ts
    })
    .compose({
        action: async (context) => {
            // AdaptivePlaywrightCrawler passes edited request directly into the pipeline, we don't want to override that
```
Huh. It feels somewhat wrong to mention a subclass here. I think we should be more upfront about the `contextPipeline` understanding some properties. Couldn't it start with `{ request?: Request }`?

Speaking of `AdaptivePlaywrightCrawler`, it also wants to override context helpers that touch storages, and that might be worth adding to the initial context type of the `ContextPipeline` too...
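The reviewer's suggestion can be sketched as follows. Everything here is hypothetical (the `InitialContext` type, the `fetchNextRequest` and `resolveRequest` names, and the URLs are invented to show the idea, not taken from the PR):

```typescript
// Hypothetical request shape for the sketch.
interface Request { url: string }

// The pipeline's initial type admits an optional pre-filled request,
// so no subclass needs to be special-cased in the base step.
type InitialContext = { request?: Request };

// Stand-in for fetching the next request from the queue.
async function fetchNextRequest(): Promise<Request> {
    return { url: 'https://example.com/from-queue' };
}

// The base step keeps a caller-provided request and only fetches otherwise.
async function resolveRequest(ctx: InitialContext): Promise<{ request: Request }> {
    return { ...ctx, request: ctx.request ?? (await fetchNextRequest()) };
}
```

This way the "a caller may pass a request in" contract lives in the pipeline's type instead of in a comment naming a specific subclass.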
```ts
        async (finalContext) => await this.requestHandler(finalContext),
    );
    await this.staticContextPipeline.call(subCrawlerContext, async (finalContext) => {
        Object.assign(finalContext, resultBoundContextHelpers);
```
This means that pre-navigation hooks, for example, will have direct access to the storage, which is a regression.
Ultimately, I'd like to implement the storage interception on a different level, without relying on shady monkey patching, so maybe we can get away with leaving this. Thoughts?
Yeah, looking deeper into the AdaptivePlaywrightCrawler impl, it could use some clean up (using ContextPipeline or not).
I can almost see the last contextPipelineBuilder step "idiomatically" overriding request and the helpers... but not quite (at least not without some refactoring).
Brother, I already cleaned it up. But of course, there is always more room for improvement.
Extracts all `CrawlingContext` initialization to `ContextPipeline` steps to tighten the control over the `CrawlingContext` contents.

Blocks #3380