Sourced Job
Role: Data Engineer
Location: Marathahalli, Bangalore
Experience: 3 years
Skills: Data Modeling, ETL Testing, Big Data Processing, Cloud Services, Databricks, Apache Airflow, Python Basics, SQL, Scala (Programming Language), Version Control, CI/CD, Unit Testing, System Monitoring, Process Optimization, SRE, MLOps (Machine Learning Operations), Containerization, GDPR Compliance, Video Streaming
Job Description:
Responsibilities
- Design and maintain scalable data pipelines and ETL/ELT processes for structured and unstructured data.
- Build robust data models, data lakes, and data warehouses to support analytics and ML use cases.
- Collaborate with software engineers, data scientists, and business teams to deliver production-ready data solutions.
- Ensure data quality, governance, security, and lineage across all data assets.
- Automate data workflows and implement monitoring/alerting for operational excellence.
- Apply engineering best practices such as CI/CD, testing, code reviews, and performance optimization.
- Work with ML engineering teams to develop MLOps pipelines for model deployment, monitoring, and scalability.
- Optimize cloud-based data platforms (AWS, Azure, GCP) for cost, performance, and security.
- Create documentation, standards, and guidelines to improve data engineering processes.
Required Skills & Qualifications
- Strong understanding of data engineering fundamentals: data modeling, pipelines, batch/streaming systems.
- Experience with Kafka, Spark, Flink, or similar streaming technologies.
- Proficient in cloud services (AWS/Azure/GCP) and cloud-native tools such as BigQuery, Redshift, Synapse, Databricks, and Snowflake.
- Skilled with modern orchestration frameworks (Airflow, Prefect, Dagster).
- Strong coding ability in Python, SQL, and at least one additional language (Scala or Java preferred).
- Experience with CI/CD, version control, testing, observability, and performance tuning.
- Knowledge of operational excellence (SRE principles, monitoring, alerting).
- Familiarity with MLOps tools such as MLflow, Kubeflow, Vertex AI, or Azure ML.
- Understanding of Docker, Kubernetes, and containerized workflows.
- Knowledge of data governance, security, access control, and GDPR/PII compliance.
Preferred Qualifications
- Experience designing large-scale data platforms for analytics and ML workloads.
- Exposure to real-time streaming architectures.
- Understanding of DevOps principles applied to data and ML pipelines.
- Strong problem-solving skills focused on scalability and reliability.
- Excellent communication skills and experience working with global cross-functional teams.