All posts tagged BigData

Using Spark For Data Exploration

Spark is actively supported by the Apache open source community and is used in production by many well-known companies. This post focuses on productionizing Apache Spark. I will discuss the use cases of Spark and how to enable each of them in a production environment. Currently, Spark has 2 deployment modes (Client, Cluster) with 3 »

Hive, a must-know tool for any data engineer

Hive is a data warehouse system built on top of Hadoop for querying and managing data sets. Who? Hive was created by Facebook and is now widely adopted by many firms, including Netflix, Facebook and Bookings. Why? Not everyone is fond of writing Java programs for every problem they have, especially data analysts. Hive provides a high level »

Redis: Installation and configuration

Redis is a popular caching layer and in-memory database used in many large-scale projects, including at Twitter, GitHub, Pinterest, Snapchat, Stack Overflow and Flickr. It supports data structures such as strings, hashes, lists, sets, sorted sets, bitmaps and geospatial indexes with radius queries. Some common usage scenarios can be found here. Installation: Whether you are »