Closed
Labels
- enhancement: New feature or request.
- t-tooling: Issues with this label are in the ownership of the tooling team.
Description
When using the file system storage client, Crawlee for Python is significantly slower than Crawlee for JavaScript.
Processing 1000 requests to a local HTTP server (pydantic models are loaded in advance):
- Crawlee TS - memory: ~1.8s
- Crawlee TS - FS: ~3s
- Scrapy - memory*: 1.9s
- Crawlee Py - new memory: ~1.5s
- Crawlee Py - new FS: ~13.6s
Optimize the file system storage client.
Ideas:
- atomic writes: several FS operations per single write
- an index file for handled/sequence/forefront state
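
To make the two ideas concrete, here is a minimal sketch, not Crawlee's actual implementation. `atomic_write_text` shows why an atomic write costs several FS operations (create temp file, write, close, rename). `QueueIndex` is a hypothetical single-file index that keeps handled state and ordering in memory and persists them with one atomic write. The class name, the `__index__.json` file name, and the JSON layout are all assumptions made for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, data: str) -> None:
    """Write `data` to `path` via a temp file plus rename.

    One logical write costs several FS operations:
    create temp file, write, close, rename.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(data)
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


class QueueIndex:
    """Hypothetical single-file index of request order and handled state."""

    def __init__(self, directory: Path) -> None:
        self._path = directory / "__index__.json"  # assumed file name
        self.pending: list[str] = []  # request ids; index 0 is processed next
        self.handled: list[str] = []
        if self._path.exists():
            state = json.loads(self._path.read_text(encoding="utf-8"))
            self.pending = state["pending"]
            self.handled = state["handled"]

    def add(self, request_id: str, *, forefront: bool = False) -> None:
        # Forefront requests jump to the head of the queue.
        if forefront:
            self.pending.insert(0, request_id)
        else:
            self.pending.append(request_id)

    def mark_handled(self, request_id: str) -> None:
        self.pending.remove(request_id)
        self.handled.append(request_id)

    def flush(self) -> None:
        """Persist the whole index with a single atomic write."""
        state = {"pending": self.pending, "handled": self.handled}
        atomic_write_text(self._path, json.dumps(state))
```

With an index like this, finding the next request would not require scanning every request file, and `flush()` could be called once per batch instead of once per request, which would amortize the cost of the atomic write. Whether that trade-off is acceptable depends on how much state may be lost on a crash.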