Bypassing Cloudflare DDoS in Scrapy

While scraping the web I came across a website that has implemented Cloudflare DDoS (Distributed Denial of Service) protection. A DDoS attack is an attempt by multiple sources to overwhelm a target host, commonly to bring it down (Wikipedia). Cloudflare, apart from being a CDN, also provides security features to websites. One of which is the …


Process CSV files with multiprocessing in Pandas

Pandas lets you read a large CSV file in chunks using an iterator, so you don't have to load the full file into memory before you start processing. My objective was to extract, transform and load (ETL) CSV files that are around 15 GB. Here is the code snippet that can be …
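As a rough illustration of the idea (not the snippet from the original post), chunked reading can be combined with a process pool along these lines. The names `transform_chunk` and `process_csv`, the `dropna` transform, and the default chunk size and worker count are all assumptions made for this sketch.

```python
import multiprocessing as mp

import pandas as pd


def transform_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transform step: drop rows that contain missing values."""
    return chunk.dropna()


def _write_chunks(results, dst: str) -> int:
    """Append each transformed chunk to `dst` and return the rows written."""
    rows = 0
    for i, df in enumerate(results):
        first = i == 0
        # The first chunk creates the file with a header; later chunks append.
        df.to_csv(dst, mode="w" if first else "a", header=first, index=False)
        rows += len(df)
    return rows


def process_csv(src: str, dst: str, chunksize: int = 100_000, workers: int = 4) -> int:
    """Read `src` in chunks, transform them, and write the result to `dst`."""
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of loading the whole file at once.
    reader = pd.read_csv(src, chunksize=chunksize)
    if workers <= 1:
        # Serial path, handy for debugging the transform.
        return _write_chunks(map(transform_chunk, reader), dst)
    with mp.Pool(workers) as pool:
        # imap yields results in input order, so the output keeps row order.
        return _write_chunks(pool.imap(transform_chunk, reader), dst)
```

A call such as `process_csv("input.csv", "output.csv")` (placeholder file names) would run the pipeline. On platforms that start worker processes with spawn, that call belongs under an `if __name__ == "__main__":` guard. Note that `imap` pulls chunks from the reader in a background thread, so if the workers are slower than the reader, several chunks can be queued in memory at once; a smaller `chunksize` keeps that bounded.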