Description
Thanks for the quick response. How exactly can those slots be leveraged to influence the order in which requests are processed?
I have one spider with both producer and consumer settings, as shown below:
{
    'HCF_AUTH': _scrapy_cloud_key,
    'HCF_PROJECT_ID': project_id,
    'HCF_PRODUCER_FRONTIER': 'frontier',
    'HCF_PRODUCER_NUMBER_OF_SLOTS': 1,
    'HCF_PRODUCER_BATCH_SIZE': 300,
    'HCF_PRODUCER_SLOT_PREFIX': 'links',
    'HCF_CONSUMER_FRONTIER': 'frontier',
    'HCF_CONSUMER_SLOT': 'links0',
}
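My current understanding, which may well be wrong, is that slots partition the frontier rather than prioritize it: the producer hashes each request fingerprint to pick a slot, and within a slot batches are served roughly first-in, first-out. Here is a minimal sketch of that assumed partitioning scheme (`slot_for` is a hypothetical helper for illustration, not taken from hcf-backend's source):

```python
import hashlib

# Assumption: requests are spread across slots by hashing the request
# fingerprint modulo the slot count. This illustrates my reading of the
# settings, not hcf-backend's actual implementation.
NUM_SLOTS = 1            # HCF_PRODUCER_NUMBER_OF_SLOTS
SLOT_PREFIX = 'links'    # HCF_PRODUCER_SLOT_PREFIX

def slot_for(fingerprint: str) -> str:
    """Map a request fingerprint to a slot name such as 'links0'."""
    digest = hashlib.md5(fingerprint.encode('utf-8')).hexdigest()
    return f'{SLOT_PREFIX}{int(digest, 16) % NUM_SLOTS}'

# With NUM_SLOTS == 1, every request lands in the same slot:
print(slot_for('https://example.com/some/page'))  # -> 'links0'
```

If that reading is right, adding slots would only change which queue a request lands in, not the order within a queue, so I am unsure how slots could influence ordering.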
I am running the spider for an interval of 10 minutes at depth 1. For some URLs it finishes within 10 minutes, but others take longer, so the consumer part of the crawler does not consume all of the links. The issue I am facing is that when I run the spider more than once, it does not start from the start URL but from a URL that was previously saved to the frontier (it reads the pending frontier batch before the start URL). Also, how many slots can be created inside a frontier?
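As a workaround I have been considering clearing the consumer slot between runs, so that a fresh run starts from the start URLs again. A sketch of what I mean with the python-scrapinghub client follows; the exact calls are my assumption from the client docs, and the API key and project ID are placeholders:

```python
from scrapinghub import ScrapinghubClient

client = ScrapinghubClient('<scrapy-cloud-api-key>')  # HCF_AUTH
project = client.get_project(12345)                   # HCF_PROJECT_ID (placeholder)
frontier = project.frontiers.get('frontier')          # HCF_CONSUMER_FRONTIER
slot = frontier.get('links0')                         # HCF_CONSUMER_SLOT

# Peek at the pending batches the consumer would drain before any newly
# produced start URLs (I assume each batch carries 'id' and 'requests'
# fields, as in the HCF HTTP API).
for batch in slot.q.iter():
    print(batch['id'], len(batch['requests']))

# Deleting the slot drops the leftover batches entirely (use with care).
slot.delete()
```

Is clearing the slot like this the expected way to reset state between runs?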
Originally posted by @Nishant-Bansal-777 in #26 (comment)