Site Reliability Engineer

Sur
Full-time
On-site

Our US-based client is looking for a mission-driven Site Reliability Engineer to support and scale the infrastructure powering their secure, mission-critical SaaS platform.

You must be confident operating and debugging both modern infrastructure (cloud-native, containerized services) and classic Windows production environments (IIS, SQL Server AlwaysOn, Service Broker), and able to respond to incidents quickly, support ongoing automation, and scale systems reliably.

Responsibilities

  1. Be part of the team that owns the uptime and performance of our core backend infrastructure (Windows + Linux)
  2. Maintain and enhance observability across systems using Kibana, CloudWatch, and custom telemetry
  3. Manage CI/CD pipelines, infrastructure as code (Terraform, Ansible), and deployment automation
  4. Support and maintain production Windows environments:
    • .NET Framework/Core apps running in IIS
    • SQL Server with AlwaysOn replication and Service Broker-based messaging
  5. Support and operate cloud-native services:
    • AWS Lambda, DynamoDB, Postgres/Aurora, Redshift, Redis, and containerized workloads in Docker
  6. Participate in the on-call rotation and incident response
  7. Collaborate closely with engineering teams to improve system reliability and deployment workflows

Requirements

  1. 5+ years of SRE, DevOps, or WebOps experience supporting production SaaS systems
  2. Strong experience with Windows Server, IIS, and .NET applications in production
  3. Hands-on experience with SQL Server administration, including AlwaysOn and Service Broker
  4. Proficiency in AWS operations, including Lambda, DynamoDB, CloudWatch, and IAM
  5. Familiarity with Postgres, Redis, Kibana/Elasticsearch, and centralized logging
  6. Experience with Docker, Terraform, and Ansible for infrastructure management
  7. Strong scripting skills (PowerShell, Python)
  8. Experience running and debugging containerized and distributed systems in production
  9. Excellent incident response and debugging skills

Benefits

Salary: $6,000 USD/month + holidays

Unlimited PTO