Project

PySpark ETL with S3 persistence

A PySpark ETL project that applies a SOLID-oriented structure and persists its output to AWS S3.

Context: Public PySpark project documented on Medium.
Problem: ETL examples can become hard to maintain when extraction, transformation and persistence concerns are mixed together.
Solution: Structured a PySpark ETL pipeline so that extraction, transformation and persistence each have a clearly separated responsibility, with output persisted to S3.
Impact: Demonstrates attention to maintainable code structure in data engineering workflows.
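
As a rough illustration of the separation described above, the following is a minimal sketch and not the code from the repository. All class names (CsvExtractor, DropNullRows, S3ParquetLoader, EtlPipeline) and the s3a:// path are hypothetical, and the sketch assumes a SparkSession that is already configured with S3 credentials and the hadoop-aws connector.

```python
from dataclasses import dataclass
from typing import Any, Protocol


class Extractor(Protocol):
    def extract(self) -> Any: ...


class Transformer(Protocol):
    def transform(self, df: Any) -> Any: ...


class Loader(Protocol):
    def load(self, df: Any) -> None: ...


@dataclass
class CsvExtractor:
    """Reads a CSV file into a Spark DataFrame (hypothetical example stage)."""
    spark: Any  # assumed to be a pyspark.sql.SparkSession
    path: str

    def extract(self) -> Any:
        return self.spark.read.csv(self.path, header=True, inferSchema=True)


@dataclass
class DropNullRows:
    """Drops rows containing null values (stand-in for real business rules)."""

    def transform(self, df: Any) -> Any:
        return df.dropna()


@dataclass
class S3ParquetLoader:
    """Writes a DataFrame as Parquet to an S3 location."""
    target: str  # e.g. "s3a://example-bucket/output/" (hypothetical path)
    mode: str = "overwrite"

    def load(self, df: Any) -> None:
        df.write.mode(self.mode).parquet(self.target)


@dataclass
class EtlPipeline:
    """Coordinates the stages; depends only on the three protocols above."""
    extractor: Extractor
    transformer: Transformer
    loader: Loader

    def run(self) -> None:
        df = self.extractor.extract()
        df = self.transformer.transform(df)
        self.loader.load(df)
```

Because EtlPipeline depends only on the three small interfaces, a stage can be replaced (for example, a loader that writes to a local directory during tests) without changing the other two, which is the maintainability benefit the project aims to show.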

Stack

PySpark, AWS S3, Python, ETL, SOLID

This project is included because the Medium article points to a public implementation repository.