How to Set a Different IP for Different Commands of a Single scrapy.Spider?
I have a bunch of pages to scrape, about 200,000. I usually use Tor and a Polipo proxy to hide my spiders' behavior, even though they are polite; you never know. So if I log in, this is useless.
Solution 1:
If you have a proxy list, you can add 'scrapy_proxies.RandomProxy' to DOWNLOADER_MIDDLEWARES to choose a random proxy from the list for every new page.
In the settings of your spider:
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.retry.RetryMiddleware': 90,
    'scrapy_proxies.RandomProxy': 100,
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
}
PROXY_LIST = 'path/proxylist.txt'
PROXY_MODE = 0
With this method, there is nothing to add to the spider script itself; the proxy rotation is handled entirely through the settings.
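As a rough sketch of what the surrounding configuration can look like (the retry values and proxy entries below are illustrative placeholders, not part of the original answer), scrapy_proxies reads a plain-text file with one proxy URL per line, and PROXY_MODE controls how a proxy is picked:

# settings.py -- sketch, values are placeholders
RETRY_TIMES = 10                                    # standard Scrapy setting: retry failed requests
RETRY_HTTP_CODES = [500, 503, 504, 400, 403, 408]   # retry on these response codes

PROXY_LIST = 'path/proxylist.txt'   # one proxy per line, e.g.:
                                    #   http://host1:8080
                                    #   http://user:password@host2:8080

# PROXY_MODE (per the scrapy_proxies README):
#   0 - pick a random proxy from PROXY_LIST for every request
#   1 - pick one random proxy and keep it for the whole crawl
#   2 - use the single proxy given in CUSTOM_PROXY
PROXY_MODE = 0

Enabling the RetryMiddleware alongside the proxy middleware, as in the settings above, is useful because a dead proxy in the list simply triggers a retry with a different one.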