Marriott

FLEX Site Reliability Engineer - Service Availability Manager

Join Marriott as a FLEX Site Reliability Engineer in Bethesda, MD. Leverage ServiceNow for ITSM, ensuring service availability and performance. 5+ years IT experience required. Benefits include health, 401(k), and discounts.

Direct Hire
Mid-Level
ServiceNow Role Type:
ServiceNow Modules:
DevOps
IT Service Management
Incident Management
Integration Hub
ServiceNow Certifications (nice to have):
Certified Implementation Specialist - IT Service Management

Job description

Posted on: March 28, 2025

The SRE Service Availability Manager plays a key role in ensuring the peak performance and availability of our Enterprise IT infrastructure and services. This position combines proactive site reliability engineering with adept incident command to lead our efforts in minimizing service disruptions and enhancing our technology landscape.

Requirements

  • 5+ years of experience in an information technology environment
  • 3 years of experience in information technology focused on IT Operations, including troubleshooting complex network, server, storage, and/or application issues.
  • At least 2 years of operations experience in incident, problem, change, and release management, including leading calls and documenting outcomes.
  • Undergraduate degree or equivalent experience/certification.
  • Ability to cover shifts and on-call responsibilities in a 24x7x365 environment.
  • Proficiency in scripting languages (Python, Shell) and familiarity with automation tools (such as Ansible, Jenkins).
  • Experience with cloud platforms (AWS, Azure, GCP), infrastructure as code, and containerization technologies.
  • Experience in incident command or incident management in a technology environment.
  • Strong problem-solving, organizational, and analytical skills.
  • ITIL Foundations v3+ Certification.
  • Demonstrated experience with ITSM suites, e.g., ServiceNow.
  • Demonstrated experience with various monitoring, performance, or capacity tools.
  • Experience with continuous integration/continuous deployment (CI/CD) pipelines and DevOps practices.
  • Familiarity with Site Reliability Engineering principles and concepts.
  • Strong leadership qualities, including decisiveness and the ability to motivate teams, along with the ability to manage stressful situations calmly and effectively.
  • Ability to create constructive relationships, influence, and communicate with varying levels of associates and management.
  • Ability to solve complex, cross-functional issues.
  • Strong knowledge of Server, Storage, Network, Middleware, Application and Cloud technologies.
  • A high degree of curiosity and a drive to seek more efficient ways of delivering service.

Benefits

  • Medical, dental, vision, health care flexible spending account, dependent care flexible spending account, life insurance, disability insurance, accident insurance, adoption expense reimbursements, paid parental leave, 401(k) plan, stock purchase plan, discounts at Marriott properties, commuter benefits, employee assistance plan, and childcare discounts.

Requirements Summary

5+ years of experience in IT; strong problem-solving and leadership skills; proficiency in scripting languages and automation tools; and experience with cloud platforms and incident management.