Sr. Developer - 00065252281 - Site Reliability Engineering (SRE) Lead with Azure

Cognizant
Full-time
On-site
Arkansas
Digital

Role Overview:

We are looking for a proactive, detail-oriented Site Reliability Engineer (SRE) to support the reliability and performance of our systems. You will be responsible for monitoring, incident response, and continuous improvement of our infrastructure and services.

Key Responsibilities:

  • Perform day-to-day operations monitoring and incident management.
  • Respond to and drive resolution for P1/P2 incidents.
  • Implement observability and monitoring tools to ensure system health.
  • Support internal and external users with performance monitoring, troubleshooting, and root cause analysis.
  • Build and maintain dashboards for operational and user metrics using Grafana.
  • Contribute to automation efforts to reduce manual tasks and improve incident response.
  • Participate in testing during the Handover to Support process.
  • Assist with cloud administration across Azure, WCNP, and Edge environments.
  • Collaborate with cross-functional teams to enhance system stability and reliability.
  • Troubleshoot and resolve software issues efficiently.
  • Embrace and apply SRE best practices to improve platform resilience.

Required Skills:

  • Solid understanding of SRE methodologies.
  • Experience with Java, the MVC pattern, JDBC, RESTful APIs, and Spring Boot.
  • Familiarity with Azure Cloud and Python scripting.
  • Knowledge of observability tools.
  • Proficiency with ServiceNow, Slack/Teams, xMatters, and Grafana.
  • Strong analytical and communication skills.