ServiceNow

Senior Staff Machine Learning Engineer - DevOps/Site Reliability Engineer

Join ServiceNow in Santa Clara as a Senior Staff Machine Learning Engineer. Leverage ServiceNow skills to enhance AI infrastructure and reliability. 8+ years in DevOps/SRE required. Benefits include competitive pay, equity, and flexible time off.

The Mothership

Senior

ServiceNow Role Type:

Implementer

ServiceNow Modules:

DevOps

Predictive Intelligence

Job description

Posted on:

July 21, 2025

Join ServiceNow's PLATO group as a Senior Staff Machine Learning Engineer - Site Reliability Engineer. Contribute to the design, development, and implementation of infrastructure, platform, deployment, and observability features that power AI workloads. Collaborate with researchers, AI engineers, and infrastructure teams to ensure GPU clusters perform efficiently and remain reliable.

Requirements

Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving
Proficient in prompt engineering and developing LLM-based features
Experience with methods of training and fine-tuning large language models, such as distillation, supervised fine-tuning, and policy optimization
Experience in using AI productivity tools such as Cursor, Windsurf, etc.
8+ years of experience with infrastructure and platform operations, deployments, SRE, and DevOps with a continued focus on improving Platform health
6+ years of experience operating highly-available distributed workloads on Kubernetes following a DevOps approach
6+ years of development experience with Python, GoLang, Java, or similar languages
Experience with DevOps tooling (e.g. Helm / Ansible / Kubernetes / Prometheus / Splunk / GitLab CI)
Strong working experience operating distributed systems built on Linux and J2EE
Experience with software-defined networking, infrastructure as code, and configuration management
Experience building software for compliance and security in regulated environments

Benefits

base pay of $197,800 - $346,200
equity (when applicable)
variable/incentive compensation
health plans, including flexible spending accounts
a 401(k) Plan with company match
ESPP
matching donations
a flexible time away plan
family leave programs

Requirements Summary

8+ years of experience with infrastructure and platform operations, deployments, SRE, and DevOps; 6+ years of experience operating distributed workloads on Kubernetes; proficiency in programming languages such as Python, GoLang, or Java
