This role is for one of Weekday’s clients
Min Experience: 5 years
Location: Remote (India)
Job Type: Full-time
Requirements:
- Proficient in:
- Programming languages: Python, PySpark, Scala
- Azure Environment: Azure Data Factory, Databricks, Key Vault, Azure DevOps (CI/CD)
- Storage/Databases: ADLS Gen2, Azure SQL DB, Delta Lake
- Data Engineering: Apache Spark, Hadoop, optimization, performance tuning, data modeling
- Experience working with data sources such as Kafka and MongoDB is preferred.
- Experience with automation of test cases for Big Data and ETL pipelines.
- Familiarity with Agile methodology.
- Basic understanding of ETL pipelines.
- A strong understanding of AI, machine learning, and data science concepts is highly beneficial.
- Strong analytical and problem-solving skills with attention to detail.
- Ability to work independently and as part of a team in a fast-paced environment.
- Excellent communication skills, able to collaborate with both technical and non-technical stakeholders.
- Experience designing and implementing scalable, optimized data architectures that follow best practices.
- Strong understanding of data warehousing concepts, data lakes, and data modeling.
- Familiarity with data governance, data quality, and privacy regulations.
Key Responsibilities:
- Data Pipeline Development: Design, develop, and maintain scalable and efficient data pipelines to collect, process, and store data from various sources (e.g., databases, APIs, third-party services).
- Data Integration: Integrate and transform raw data into clean, usable formats for analytics and reporting, ensuring consistency, quality, and integrity.
- Data Warehousing: Build and optimize data warehouses to store structured and unstructured data, ensuring data is organized, reliable, and accessible.
- ETL Processes: Develop and manage ETL (Extract, Transform, Load) processes for data ingestion, cleaning, transformation, and loading into databases or data lakes.
- Performance Optimization: Monitor and optimize data pipeline performance to handle large volumes of data with low latency, ensuring reliability and scalability.
- Collaboration: Work closely with other product teams, TSO, and business stakeholders to understand data requirements and ensure that data infrastructure supports analytical needs.
- Data Quality & Security: Ensure that data systems meet security and privacy standards, and implement best practices for data governance, monitoring, and error handling.
- Automation & Monitoring: Automate data workflows and establish monitoring systems to detect and resolve data issues proactively.
- Understand the broad architecture of GEP's entire system as well as Analytics.
- Take full accountability for the role, your own development, and results.