Lead Data Engineering
Kachiguda, Hyderabad
3 years 10 months
Purpose:
We are seeking a hands-on Data Engineer with a strong focus on data ingestion to support the delivery of high-quality, reliable, and scalable data pipelines across our Data & AI ecosystem. This role is essential in enabling downstream analytics, machine learning, and business intelligence solutions by ensuring robust and automated data acquisition from various internal and external sources.
________________________________________
Key Responsibilities:
• Design, build, and maintain scalable and reusable data ingestion pipelines to onboard structured and semi-structured data from APIs, flat files, databases, and external systems.
• Work with Azure-native services (e.g., Data Factory, Azure Data Lake, Event Hubs) and tools like Databricks or Apache Spark for data ingestion and transformation.
• Develop and manage metadata-driven ingestion frameworks to support dynamic and automated onboarding of new sources.
• Collaborate closely with source system owners, analysts, and data stewards to define data ingestion specifications and implement monitoring/alerting on ingestion jobs.
• Ensure data quality, lineage, and governance principles are embedded into ingestion processes.
• Optimize ingestion processes for performance, reliability, and cloud cost efficiency.
• Support batch and real-time ingestion needs, including streaming data pipelines where applicable.
________________________________________
Key Qualifications:
• 3+ years of hands-on experience in data engineering; a specific focus on data ingestion or integration is a bonus.
• Hands-on experience with Azure Data Services (e.g., ADF, Databricks, Synapse, ADLS) or equivalent cloud-native tools.
• Experience with Python (PySpark) for data processing tasks (bonus: SQL knowledge).
• Experience with ETL frameworks, orchestration tools, and API-based data ingestion.
• Familiarity with data quality and validation strategies, including schema enforcement and error handling.
• Good understanding of CI/CD practices, version control (e.g., Git), and infrastructure-as-code (e.g., Terraform).
• Bonus: Experience with streaming ingestion (e.g., Kafka, Event Hubs, Spark Structured Streaming).
Azure Data Factory, Azure Data Lake, Azure IoT Hub, Azure Databricks (+16 more)