We are looking for a Lead Site Reliability Engineer to join our team. The candidate will work with multiple engineering teams and will shift between engineering platforms as the work, vision, and/or criticality of the projects demand. The role will focus on maximizing availability, observability, reliability, security, and performance for Nike Digital Experiences.
Requirements
- Ability to observe, diagnose, and develop fixes for production issues quickly and efficiently
 - Ability to develop and drive real-time monitoring solutions that provide visibility into site health and key performance indicators
 - Strong communication skills (written and verbal)
 - Confident and capable in reporting and communicating high-value metrics to leadership
 - Working understanding of IT service management (Incident, Problem, Change and Knowledge management)
 - Ability to work across teams (business and technical) to continuously analyze system performance in production, troubleshoot consumer reported issues, and proactively identify areas in need of optimization
 - Practical experience in managing and leading application reliability practices for consumer-facing web and mobile experiences
 - Demonstrated negotiation and influencing skills
 - Passion for coaching, teaching, mentoring and learning
 - Bachelor’s degree in Computer Science, Information Systems, Business, or another relevant subject area
 - 7+ years of professional experience in software development, operations, or support
 - Strong design and development experience with Java
 - Proficient with JavaScript on the frontend (React, Angular, etc.) and backend (Node.js) components
 - Working knowledge of and experience with Kubernetes
 - Experience in other modern enterprise languages (functional or other – Scala, Python, Golang, etc.) is preferred
 - Basic understanding of DNS, networking, virtualization, and Linux
 - Expertise in designing, building, and supporting scalable cloud-based microservices
 - Experience with Docker and/or Serverless patterns
 - Experience with at least one NoSQL database such as DynamoDB or Cassandra
 - Good understanding of RESTful APIs
 - Basic understanding of common tools for service management, agile, and observability: ServiceNow, Jira, Jenkins, Splunk, New Relic, SignalFx
 - Background with ITIL or Lean is a plus