Potential problem with reading batches when batches are deleted only on consumer close? #15

@hermit-crab

Description

Good day. Let's say we have a million requests inside a slot, and the consumer defines either HCF_CONSUMER_MAX_REQUESTS = 15000 or HCF_CONSUMER_MAX_BATCHES = 150, or it simply closes itself after N hours. It also sets HCF_CONSUMER_DELETE_BATCHES_ON_STOP = True, so it only purges batches upon exiting.

In this case, since as far as I can tell there is no pagination for scrapycloud_frontier_slot.queue.iter(mincount), won't the consumer keep iterating over only the initial MAX_NEXT_REQUESTS requests, reading them over and over until it hits either the max requests, the max batches, or its self-enforced time limit?
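To make the concern concrete, here is a minimal sketch of a consumer reading loop. The `slot.queue.iter()` / `slot.queue.delete()` calls follow the identifier mentioned above and the python-scrapinghub frontier slot API; the loop structure itself is my assumption about how the consumer works, not the actual hcf-backend code:

```python
def read_batches(slot, max_next_requests, delete_on_read):
    """Illustrative consumer loop over an HCF slot (not the real implementation)."""
    while True:
        batch_ids = []
        requests = []
        # Without pagination, iter() always starts from the oldest undeleted batches.
        for batch in slot.queue.iter(mincount=max_next_requests):
            batch_ids.append(batch["id"])
            requests.extend(batch["requests"])
        if not requests:
            break
        yield requests
        if delete_on_read:
            # Deleting consumed batches is what lets the next iter() call return new ones.
            slot.queue.delete(batch_ids)
        # With HCF_CONSUMER_DELETE_BATCHES_ON_STOP = True the delete above never happens,
        # so every pass re-reads the same leading batches until a max-requests /
        # max-batches / time limit finally stops the consumer.
```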
