Speaking HTTP2

Understanding HTTP First. Many of us who work on business application development rarely get a chance to go deeper into the networking side of computer science. This is perfectly acceptable, since the whole idea of software development is abstraction. We tend to develop things on top of technologies and methodologies that have been previously … Continue reading Speaking HTTP2


Bypassing Cloudflare DDoS in Scrapy

While doing web scraping I came across a website that has implemented Cloudflare DDoS (Distributed Denial of Service) protection. A DDoS attack is an attempt in which a target host is attacked from multiple sources, usually to bring it down (Wikipedia). Cloudflare, apart from being a CDN, also provides security features to websites. One of which is the … Continue reading Bypassing Cloudflare DDoS in Scrapy

Process CSV files with multiprocessing in Pandas

Pandas gives you the ability to read a large CSV file in chunks using an iterator. This way you don't have to load the full CSV file into memory before you start processing. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html My objective was to extract, transform and load (ETL) CSV files of around 15 GB. Here is the code snippet that can be … Continue reading Process CSV files with multiprocessing in Pandas
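
As a rough illustration of the idea in this excerpt (not the snippet from the full post), here is a minimal sketch that reads a CSV in chunks and hands each chunk to a worker process. The names `transform` and `process_csv`, the drop-missing-rows transform, and the default chunk size and worker count are all assumptions made for the example.

```python
import multiprocessing as mp

import pandas as pd


def transform(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transform step: drop rows that contain missing values."""
    return chunk.dropna()


def process_csv(path: str, chunksize: int = 100_000, workers: int = 2) -> int:
    """Read `path` in chunks, transform each chunk in a worker process,
    and return the number of rows that survive the transform."""
    total = 0
    # Passing chunksize makes read_csv return an iterator of DataFrames
    # instead of loading the whole file at once.
    reader = pd.read_csv(path, chunksize=chunksize)
    with mp.Pool(workers) as pool:
        # imap yields results in the same order as the input chunks.
        for result in pool.imap(transform, reader):
            total += len(result)  # a real "load" step would go here
    return total
```

When running this as a script, the call (for example `process_csv("data.csv")`, where the path is a placeholder) should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Each chunk is pickled on its way to a worker and back, so the approach probably only pays off when the transform is noticeably more expensive than that copying.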