Data Engineer

Weekday AI
Full-time
On-site

This role is for one of Weekday’s clients
Min Experience: 5 years
Location: Remote (India)
Job Type: Full-time

Requirements

  • Proficient in the following (an illustrative sketch of this stack follows the requirements list):
    • Programming languages: Python, PySpark, Scala
    • Azure environment: Azure Data Factory, Databricks, Key Vault, Azure DevOps CI/CD
    • Storage/databases: ADLS Gen2, Azure SQL Database, Delta Lake
    • Data engineering: Apache Spark, Hadoop, performance tuning and optimization, data modelling
    • Experience working with data sources such as Kafka and MongoDB is preferred.
  • Experience automating test cases for big data and ETL pipelines, and working within Agile methodology
  • Basic understanding of ETL pipelines
  • A strong understanding of AI, machine learning, and data science concepts is highly beneficial.
  • Strong analytical and problem-solving skills with attention to detail.
  • Ability to work independently and as part of a team in a fast-paced environment.
  • Excellent communication skills, able to collaborate with both technical and non-technical stakeholders.
  • Experience designing and implementing scalable, optimized data architectures that follow best practices.
  • Strong understanding of data warehousing concepts, data lakes, and data modeling.
  • Familiarity with data governance, data quality, and privacy regulations.
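
To make the stack above concrete, here is a minimal illustrative sketch of the kind of PySpark job this toolset implies on Databricks: reading raw JSON from ADLS Gen2, applying light cleansing, and writing a partitioned Delta Lake table. It is not part of the role description, and the storage account, containers, dataset, and column names (examplestorage, raw, curated, orders, order_id, order_ts) are placeholders.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-bronze-to-silver").getOrCreate()

# Hypothetical ADLS Gen2 locations (abfss://<container>@<account>.dfs.core.windows.net/...)
raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/orders/"
silver_path = "abfss://curated@examplestorage.dfs.core.windows.net/orders_silver/"

# Read raw JSON landed in the lake, apply light cleansing, and write a Delta table.
orders = (
    spark.read.json(raw_path)
    .dropDuplicates(["order_id"])                          # basic de-duplication on the key
    .withColumn("order_ts", F.to_timestamp("order_ts"))    # normalise the event timestamp type
    .withColumn("order_date", F.to_date("order_ts"))       # derive a partition column
    .filter(F.col("order_id").isNotNull())                 # drop rows without a usable key
)

(
    orders.write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save(silver_path)
)
```

On Databricks the Delta Lake format is available out of the box; outside Databricks the delta-spark package would typically need to be installed and configured.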

Key Responsibilities:

  • Data Pipeline Development: Design, develop, and maintain scalable and efficient data pipelines to collect, process, and store data from various sources (e.g., databases, APIs, third-party services).
  • Data Integration: Integrate and transform raw data into clean, usable formats for analytics and reporting, ensuring consistency, quality, and integrity.
  • Data Warehousing: Build and optimize data warehouses to store structured and unstructured data, ensuring data is organized, reliable, and accessible.
  • ETL Processes: Develop and manage ETL (Extract, Transform, Load) processes for data ingestion, cleaning, transformation, and loading into databases or data lakes.
  • Performance Optimization: Monitor and optimize data pipeline performance to handle large volumes of data with low latency, ensuring reliability and scalability.
  • Collaboration: Work closely with other product teams, TSO, and business stakeholders to understand data requirements and ensure that data infrastructure supports analytical needs.
  • Data Quality & Security: Ensure that data systems meet security and privacy standards, and implement best practices for data governance, monitoring, and error handling (an illustrative data-quality check is sketched after this list).
  • Automation & Monitoring: Automate data workflows and establish monitoring systems to detect and resolve data issues proactively.
  • Understand the broad architecture of GEP's entire system as well as Analytics.
  • Take full accountability for your role, your own development, and your results.
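
As a rough illustration of the Data Quality & Security and Automation & Monitoring responsibilities above, the sketch below runs a few basic checks against a curated Delta table and fails the job if any check trips, so an orchestrator such as Azure Data Factory can flag the run. The table path, key column, and checks are hypothetical and not taken from this posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-dq-checks").getOrCreate()

# Hypothetical curated Delta table produced by an upstream pipeline.
silver_path = "abfss://curated@examplestorage.dfs.core.windows.net/orders_silver/"
df = spark.read.format("delta").load(silver_path)

# A few simple data-quality checks of the kind a scheduled job might run.
row_count = df.count()
null_keys = df.filter(F.col("order_id").isNull()).count()
duplicate_keys = row_count - df.select("order_id").distinct().count()

failures = []
if row_count == 0:
    failures.append("table is empty")
if null_keys > 0:
    failures.append(f"{null_keys} rows with a null order_id")
if duplicate_keys > 0:
    failures.append(f"{duplicate_keys} duplicate order_id values")

# Failing loudly lets the orchestrator surface the issue and stop downstream loads.
if failures:
    raise ValueError("Data quality checks failed: " + "; ".join(failures))
print(f"Data quality checks passed for {row_count} rows.")
```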