
How To Set Different IP According To Different Commands Of One Single Scrapy.spider?

I have a bunch of pages to scrape, about 200,000. I usually use Tor and the Polipo proxy to hide my spiders' behavior: even if they are polite, we never know. So if I log in this is usel

Solution 1:

If you have a proxy list, you can add 'scrapy_proxies.RandomProxy' to DOWNLOADER_MIDDLEWARES to choose a random proxy from the list for every new page.

In the settings of your spider:

DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.retry.RetryMiddleware': 90,
    'scrapy_proxies.RandomProxy': 100,
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
}

PROXY_LIST = 'path/proxylist.txt'
# 0 = use a different random proxy for every request
PROXY_MODE = 0

With this method, there is nothing to add to the spider script itself.
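To make the idea concrete, here is a minimal sketch of what per-request rotation amounts to: pick one entry from the list and attach it to the outgoing request. It uses only the standard library and is not the scrapy_proxies implementation. The helper names `load_proxies` and `pick_proxy`, the sample addresses, and the rule of skipping blank and `#` lines are all assumptions made for illustration. In Scrapy itself, the built-in HttpProxyMiddleware reads the proxy URL from `request.meta['proxy']`, which is the key the sketch mimics with a plain dict.

```python
import random


def load_proxies(lines):
    """Return the usable entries of a proxy list (hypothetical helper).

    Blank lines and lines starting with '#' are skipped; this rule is an
    assumption of the sketch, not documented scrapy_proxies behavior.
    """
    proxies = []
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            proxies.append(entry)
    return proxies


def pick_proxy(proxies, rng=random):
    """Pick one proxy URL at random, as per-request rotation would."""
    if not proxies:
        raise ValueError("proxy list is empty")
    return rng.choice(proxies)


# Made-up sample entries; real ones would be read from path/proxylist.txt.
SAMPLE_LINES = [
    "http://10.0.0.1:8080",
    "",
    "# disabled entry",
    "http://user:secret@10.0.0.2:3128",
]

PROXIES = load_proxies(SAMPLE_LINES)

# Stand-in for request.meta: a middleware would set this key on each request.
request_meta = {"proxy": pick_proxy(PROXIES)}
```

Because the choice is made per request, a spider that needs one fixed IP for some requests (for example, a logged-in session) could set that key itself on those requests instead of relying on random selection.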
