Project
PySpark ETL with S3 persistence
A PySpark ETL project that applies a SOLID-oriented structure and persists its output to AWS S3.
- Context: Public PySpark project documented on Medium.
- Problem: ETL examples can become hard to maintain when extraction, transformation and persistence concerns are mixed together.
- Solution: Structured a PySpark ETL pipeline with clearer responsibilities and persistence to S3.
- Impact: Demonstrates attention to maintainable code structure in data engineering workflows.
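The separation of extraction, transformation and persistence described above can be sketched roughly as follows. This is a minimal illustration, not the code from the project's repository: every class name here (`Extractor`, `Transformer`, `Loader`, `CsvExtractor`, `DropNullRows`, `S3ParquetLoader`, `EtlPipeline`) is hypothetical, and the choice of CSV input and Parquet output is an assumption. Only standard PySpark calls (`spark.read.csv`, `DataFrame.dropna`, `DataFrame.write.mode(...).parquet(...)`) are used.

```python
from abc import ABC, abstractmethod
from typing import TYPE_CHECKING, Sequence

if TYPE_CHECKING:  # used for type hints only, so the module loads without Spark
    from pyspark.sql import DataFrame, SparkSession


class Extractor(ABC):
    """Single responsibility: produce a DataFrame from some source."""

    @abstractmethod
    def extract(self) -> "DataFrame": ...


class Transformer(ABC):
    """Single responsibility: map one DataFrame to another."""

    @abstractmethod
    def transform(self, df: "DataFrame") -> "DataFrame": ...


class Loader(ABC):
    """Single responsibility: persist a DataFrame somewhere."""

    @abstractmethod
    def load(self, df: "DataFrame") -> None: ...


class CsvExtractor(Extractor):
    """Hypothetical extractor that reads a CSV file with a header row."""

    def __init__(self, spark: "SparkSession", path: str) -> None:
        self.spark = spark
        self.path = path

    def extract(self) -> "DataFrame":
        return self.spark.read.csv(self.path, header=True, inferSchema=True)


class DropNullRows(Transformer):
    """Hypothetical transformation step: drop rows containing any null."""

    def transform(self, df: "DataFrame") -> "DataFrame":
        return df.dropna()


class S3ParquetLoader(Loader):
    """Hypothetical loader that writes Parquet to an S3 URI (assumed format)."""

    def __init__(self, s3_uri: str, mode: str = "overwrite") -> None:
        self.s3_uri = s3_uri
        self.mode = mode

    def load(self, df: "DataFrame") -> None:
        df.write.mode(self.mode).parquet(self.s3_uri)


class EtlPipeline:
    """Depends only on the abstractions, so each stage can be swapped or faked."""

    def __init__(
        self,
        extractor: Extractor,
        transformers: Sequence[Transformer],
        loader: Loader,
    ) -> None:
        self.extractor = extractor
        self.transformers = list(transformers)
        self.loader = loader

    def run(self) -> None:
        df = self.extractor.extract()
        for step in self.transformers:  # apply transformations in order
            df = step.transform(df)
        self.loader.load(df)
```

With a live `SparkSession` named `spark`, a run might look like `EtlPipeline(CsvExtractor(spark, "input.csv"), [DropNullRows()], S3ParquetLoader("s3a://example-bucket/output/")).run()`, where the file path and bucket name are placeholders. Writing to an `s3a://` URI assumes the Spark environment already has the Hadoop S3A connector and AWS credentials configured. Because the pipeline only knows the three abstract interfaces, each stage can be unit-tested with in-memory stand-ins and no Spark cluster.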
Stack
PySpark, AWS S3, Python, ETL, SOLID
Links
This project is included because the Medium article points to a public implementation repository.