Rafael Vera Marañón, Senior Data Engineer & Data Architect

Project

AWS EMR and Apache Spark data engineering project

Practical Spark processing setup on Amazon EMR.

Context
Public AWS and Spark project documented on Medium.
Problem
Distributed Spark processing needs a configured EMR cluster and a repeatable execution guide.
Solution
Documented the setup of an Amazon EMR cluster with Spark and supporting deployment steps.
Impact
Shows hands-on distributed processing experience with managed Spark infrastructure.

Stack

AWS, EMR, Apache Spark, S3

Links

This is listed as a project because it includes a public repository and deployment notes.