Rafael Vera Marañón Senior Data Engineer & Data Architect

Project

Spark UI performance troubleshooting lab

Reproducible Apache Spark lab for diagnosing performance issues through Spark Web UI evidence.

Context
Public Spark performance lab documented on Medium.
Problem
Spark performance work often fails when teams cannot tie slow jobs, physical plans and runtime metrics to concrete evidence in the Spark UI.
Solution
Built a Docker-based lab with Spark Standalone, Spark History Server, Scala cases, persistent event logs and optional Redpanda streaming cases.
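Persistent event logs are what let the Spark History Server replay a run's UI after the application has exited. The following is a minimal sketch of how a Scala case might enable them, not the lab's actual code. The object name, app name and the `file:///opt/spark/events` path are hypothetical. The sketch assumes the directory already exists, is mounted into both the Spark and History Server containers, and matches the History Server's `spark.history.fs.logDirectory`.

```scala
import org.apache.spark.sql.SparkSession

object EventLogSketch {
  // Hypothetical path: assumed to be a volume shared with the History Server.
  val eventLogDir = "file:///opt/spark/events"

  // Settings that make a run persist its UI data after the application exits.
  def eventLogConf(dir: String): Map[String, String] = Map(
    "spark.eventLog.enabled" -> "true",
    "spark.eventLog.dir" -> dir
  )

  def main(args: Array[String]): Unit = {
    // The master URL is not set here; it is assumed to come from spark-submit.
    val builder = SparkSession.builder().appName("case-01-baseline")
    val spark = eventLogConf(eventLogDir)
      .foldLeft(builder) { case (b, (key, value)) => b.config(key, value) }
      .getOrCreate()

    // Label the job so it is easy to locate in the Jobs tab.
    spark.sparkContext.setJobDescription("baseline: wide aggregation")
    spark.range(0L, 1000000L)
      .selectExpr("id % 100 AS key", "id AS value")
      .groupBy("key")
      .sum("value")
      .collect()

    spark.stop()
  }
}
```

Keeping the settings in a plain `Map` means the same configuration can be reused across cases and checked without starting a cluster.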
Impact
Provides a practical way to compare baseline and optimized runs across jobs, stages, SQL plans, storage, executors and streaming evidence.
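As an illustration of a baseline-versus-optimized pair, the sketch below runs the same join twice: once with automatic broadcasting disabled and once with an explicit broadcast hint. It is a hypothetical case, not one taken from the lab, and the object name, dataset sizes and column names are assumptions. In the SQL tab the baseline should show a sort-merge join with shuffle exchanges, while the optimized run should show a broadcast hash join.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.broadcast

object JoinComparisonSketch {
  // Builds the same logical join twice: (baseline, optimized).
  def joins(spark: SparkSession): (DataFrame, DataFrame) = {
    val facts = spark.range(0L, 5000000L).selectExpr("id", "id % 1000 AS dim_id")
    val dims = spark.range(0L, 1000L)
      .selectExpr("id AS dim_id", "cast(id AS string) AS name")

    val baseline = facts.join(dims, "dim_id")
    // The explicit hint requests a broadcast even with the threshold disabled.
    val optimized = facts.join(broadcast(dims), "dim_id")
    (baseline, optimized)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("join-baseline-vs-broadcast")
      // Disable automatic broadcast so the baseline uses a shuffle-based join.
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()

    val (baseline, optimized) = joins(spark)

    // Job descriptions make the two runs easy to tell apart in the Jobs tab.
    spark.sparkContext.setJobDescription("baseline: shuffle join")
    baseline.explain("formatted")
    baseline.count()

    spark.sparkContext.setJobDescription("optimized: broadcast join")
    optimized.explain("formatted")
    optimized.count()

    spark.stop()
  }
}
```

Running both variants in one application keeps the two SQL plans side by side in a single UI session. Splitting them into separate applications would give cleaner per-run executor metrics.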

Stack

Apache Spark, Spark UI, Scala, Docker, Redpanda, Structured Streaming

Links

This belongs in Projects because it has a public repository, reproducible local execution, technical documentation and a complete lab structure.