Job Description
Design, build, and maintain scalable automated data pipelines (ETL/ELT) using tools such as Airflow, dbt, and Spark.
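To illustrate the extract-transform-load shape these pipelines take, here is a minimal pure-Python sketch (the orchestration would normally live in a tool like Airflow; the stage functions, field names, and sample data are all hypothetical):

```python
import csv
import io

# Hypothetical three-stage ETL pipeline: each stage is a plain function,
# analogous to the tasks an orchestrator such as Airflow would schedule.

def extract(raw_csv: str) -> list[dict]:
    # Pull rows from a source; an in-memory CSV stands in for an API or DB.
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(rows: list[dict]) -> list[dict]:
    # Clean and reshape: normalize casing, cast types, drop incomplete rows.
    return [
        {"user": r["user"].strip().lower(), "amount": float(r["amount"])}
        for r in rows
        if r.get("user") and r.get("amount")
    ]

def load(rows: list[dict], target: list) -> None:
    # Append to the destination; in practice this writes to a warehouse.
    target.extend(rows)

warehouse: list[dict] = []
source = "user,amount\nAlice ,10.5\nBOB,3.25\n,99\n"
load(transform(extract(source)), warehouse)
print(warehouse)  # → [{'user': 'alice', 'amount': 10.5}, {'user': 'bob', 'amount': 3.25}]
```

Keeping each stage a small, testable function is the property that carries over to any orchestrator, whichever tool schedules the tasks.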
Create and optimize data warehouse schemas (star, snowflake) to ensure performant querying for downstream users.
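As a sketch of the star-schema pattern, the following builds one fact table keyed to two dimension tables in an in-memory SQLite database (table and column names are hypothetical; a real warehouse would use Snowflake, BigQuery, or similar):

```python
import sqlite3

# Minimal star schema: a central fact table referencing small dimension
# tables, so analytical queries join narrow dimensions to wide facts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        amount REAL
    );
""")
conn.execute("INSERT INTO dim_date VALUES (1, '2024-01-01')")
conn.execute("INSERT INTO dim_product VALUES (1, 'widget')")
conn.execute("INSERT INTO fact_sales VALUES (1, 1, 9.99), (1, 1, 5.01)")

# A typical downstream query: aggregate facts, label with dimension attributes.
row = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p USING (product_key)
    GROUP BY p.name
""").fetchone()
print(row)  # → ('widget', 15.0)
```

A snowflake schema differs only in that the dimensions themselves are further normalized into sub-dimension tables.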
Implement and manage cloud-based data storage solutions (e.g., Snowflake, BigQuery, Redshift).
Develop validation frameworks to ensure data integrity, accuracy, and security across all layers.
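One common shape for such a validation framework is a list of independent row-level checks whose failures are collected rather than raised one at a time; a minimal sketch (check names and fields are hypothetical):

```python
# Each check returns an error message or None; validate() runs every check
# and collects all failures instead of stopping at the first one.

CHECKS = [
    lambda r: None if r.get("id") is not None else "missing id",
    lambda r: None if isinstance(r.get("amount"), (int, float)) else "amount not numeric",
    lambda r: None if r.get("amount", 0) >= 0 else "negative amount",
]

def validate(row: dict) -> list[str]:
    return [err for check in CHECKS if (err := check(row)) is not None]

print(validate({"id": 7, "amount": 12.5}))  # → []
print(validate({"amount": -3}))             # → ['missing id', 'negative amount']
```

Collecting every failure per row makes the results easy to aggregate into data-quality dashboards or alerts across pipeline layers.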
Monitor and tune system performance, identifying bottlenecks in complex queries or ingestion processes.
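A concrete instance of bottleneck hunting is inspecting a query plan before and after adding an index; this sketch uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for the warehouse's own plan inspector (the `events` table is hypothetical):

```python
import sqlite3

# Compare the plan for a filtered query before and after indexing the
# filter column: full scan vs. index search.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, payload TEXT)")

query = "SELECT * FROM events WHERE user_id = 42"

# Without an index, SQLite must scan every row of the table.
before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(before)

conn.execute("CREATE INDEX idx_events_user ON events(user_id)")

# With the index in place, the plan switches to an index search.
after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(after)
```

The same before/after discipline applies to warehouse engines (e.g. `EXPLAIN` in BigQuery or Redshift), where scan-heavy plans usually mark the ingestion or query bottleneck.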
Partner with software engineers to integrate data from internal applications, and with business stakeholders to translate requirements into technical specifications.