Infrastructure/DevOps (Remote - Canada or US)

Jobgether
Full-time
Remote

This position is posted by Jobgether on behalf of Abnormal AI. We are currently looking for an Infrastructure/DevOps Engineer in Canada or the US.

In this fully remote role, you will play a pivotal part in enabling AI software engineers to innovate quickly by designing, building, and maintaining secure, scalable, and reliable infrastructure. You’ll collaborate closely with IT, security, and AI/ML engineering teams to ensure the systems powering AI experimentation, deployment, and monitoring are efficient and future-ready. The position combines systems engineering expertise with AI platform enablement, offering the opportunity to solve complex operational challenges and boost productivity across the organization. This role is perfect for a professional who values automation, operational excellence, and delivering measurable impact in a fast-moving, collaborative environment.

Accountabilities

  • Architect, manage, and optimize infrastructure supporting AI/ML pipelines, tools, and data platforms.
  • Implement and maintain containerization (Docker) and orchestration (Kubernetes) environments.
  • Develop CI/CD systems integrated with ML workflows to ensure reproducible experiments.
  • Collaborate with security and compliance teams to meet data protection and regulatory standards.
  • Automate provisioning and deployment using Infrastructure as Code tools (Terraform, Pulumi, or Ansible).
  • Monitor and troubleshoot systems using observability tools such as Prometheus, Grafana, and the ELK stack.
  • Work with AI and software engineers to enhance platform performance and resource efficiency.
  • Produce and maintain clear, accessible documentation to support knowledge sharing.

Requirements

  • 4+ years’ experience in DevOps, Site Reliability, or Infrastructure Engineering.
  • Proficiency in cloud platforms (AWS preferred), Kubernetes, and Docker.
  • Skilled with Infrastructure as Code tools such as Terraform, Ansible, or Pulumi.
  • Strong scripting abilities in Python, Bash, or similar languages.
  • Hands-on experience with CI/CD systems like GitHub Actions, Jenkins, or CircleCI.
  • Solid understanding of networking, security, and identity management in cloud environments.
  • Experience supporting ML workloads and GPU-based infrastructure.
  • Proven ability to troubleshoot complex distributed systems and work cross-functionally.
  • Bonus: familiarity with MLOps tools (MLflow, Kubeflow, SageMaker), AI platform infrastructure, or data platforms (Snowflake, Databricks, Hadoop); an AWS certification; or experience in high-growth tech environments.

Benefits

  • Base salary range: $127,500 – $150,000 USD.
  • Eligibility for performance bonuses and Restricted Stock Units (RSUs).
  • Flexible, fully remote work across Canada or the US.
  • Comprehensive health coverage.
  • Equity and compensation philosophy designed to recognize skills, experience, and impact.
  • Opportunity to work with cutting-edge AI infrastructure in a high-impact role.


Jobgether is a Talent Matching Platform that partners with companies worldwide to connect top talent with the right opportunities through AI-driven matching.

When you apply, your profile goes through our AI-powered screening process, designed to identify top talent efficiently and fairly:
🔍 Our AI reviews your CV and LinkedIn profile in depth, analyzing skills, experience, and achievements.
📊 It compares your profile to the role’s requirements and past success patterns to determine a match score.
🎯 We automatically shortlist the top 3 candidates who best match the role.
🧠 If needed, our team conducts a manual review to ensure no outstanding profiles are overlooked.

This process is transparent, skills-based, and free of bias — focusing solely on your fit for the role. Once the shortlist is complete, it is shared directly with the hiring company, which then decides on next steps such as interviews or assessments.

Thank you for your interest!
