Amazon Redshift

Redshift is a petabyte-scale data warehousing solution. It’s a column-based database designed for analytical workloads. Typically, a relational store such as RDS handles OLTP workloads (e.g., frequent small queries, inserts, updates, and deletes), while Redshift handles OLAP workloads (e.g., large-scale retrieval and analytics). Data from multiple operational databases is loaded into a data warehouse such as Redshift, where it can be analyzed together.
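To make the row-based versus column-based distinction concrete, here is a minimal Python sketch. It is a toy illustration only, not how Redshift stores data internally; the `rows` sample data and both function names are made up for this example. It shows why an aggregate over a single field touches less data when values are grouped by column.

```python
# Toy illustration only: hypothetical data, not Redshift's actual storage format.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: one list per field, built from the same records.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def total_amount_row_store(records):
    """Row layout: every whole record is visited to read one field."""
    return sum(r["amount"] for r in records)


def total_amount_column_store(cols):
    """Column layout: only the 'amount' column is read."""
    return sum(cols["amount"])


print(total_amount_row_store(rows))        # 250.0
print(total_amount_column_store(columns))  # 250.0
```

Both functions return the same total. The difference is that the column version never looks at `order_id` or `region`, which is the general idea behind using columnar storage for analytical queries.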


Amazon Elastic MapReduce (EMR)

Amazon Elastic MapReduce (EMR) is a service for large-scale parallel processing of big data workloads. It’s based on the Apache Hadoop framework and is delivered as a managed cluster of EC2 instances. EMR is used for large-scale log analysis, indexing, machine learning, financial analysis, simulations, bioinformatics, and many other data-intensive applications.
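The MapReduce model that Hadoop implements can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases applied to a tiny log-analysis task; it does not use EMR or Hadoop APIs, and the log lines and function names are assumptions made up for the example. On a real cluster, each phase would be distributed across many nodes.

```python
from collections import defaultdict

# Hypothetical sample log lines: "<method> <path> <status>".
LOG_LINES = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
    "GET /index.html 200",
]


def map_phase(line):
    """Emit a (status_code, 1) pair for one log line."""
    status = line.split()[-1]
    return [(status, 1)]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


print(run_job(LOG_LINES))  # {'200': 3, '404': 1}
```

Because each call to `map_phase` depends only on its own line and each call to `reduce_phase` depends only on its own key, the work can be split across machines, which is what makes the model suitable for very large inputs.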
