Job Description
We are looking for a skilled and detail-oriented Senior AI Data Engineer with expertise in Python and data engineering frameworks to design, develop, and optimize scalable data solutions. The ideal candidate has hands-on experience in data processing, ETL pipeline development, data integration, and analytics platforms, and has worked with modern cloud-based AI and data ecosystems. You will be responsible for the architecture of our data pipelines and the deployment of production-grade AI models while mentoring a high-performing team of engineers. You will also oversee the design of scalable data architectures and AI workflows, and apply your deep Python background to ensure code quality, maintainability, and architectural integrity.
KEY RESPONSIBILITIES
Design, develop, and maintain scalable data pipelines and ETL/ELT workflows.
Work with large datasets for data extraction, transformation, validation, and loading.
Develop and optimize data engineering solutions using Python and related libraries.
Implement and manage solutions using Microsoft Fabric components, including Data Factory, Lakehouse, Warehouse, and Power BI integration.
Build data ingestion and transformation workflows from multiple structured and unstructured data sources.
Collaborate with business stakeholders, analysts, and development teams to understand data requirements and deliver scalable solutions.
Ensure data quality, governance, security, and performance optimization across platforms.
Perform troubleshooting, debugging, and enhancement of existing data workflows.
Create and maintain technical documentation for data models, pipelines, and processes.
Support automation and reporting initiatives through efficient data engineering practices.
TECHNOLOGIES
Languages
Expert-level Python (core language, multiprocessing, and asynchronous programming).
Data Tools
Spark, Airflow, MS Fabric, Snowflake/Databricks, or NoSQL equivalents.
AI/ML
Azure AI Foundry, Amazon Bedrock, TensorFlow, PyTorch, Scikit-learn, and experience with LLM orchestration.
Infrastructure
AWS/Azure, Docker, Kubernetes, and CI/CD for Data/ML.