The Lead Site Reliability Engineer will be hands-on and will mentor other team members on core SRE principles and tools. They will participate in the end-to-end operational aspects of the production environment, work on cloud systems, networks, and databases, and drive incident lifecycle management. The role requires a highly skilled technology professional with excellent communication skills, a strategic mindset, and strong analytical and troubleshooting skills on the AWS Cloud Platform.
Requirements
- Bachelor's or Master's degree in a Computer Science discipline
- 5+ years of experience focused on Site Reliability Engineering or a related position on the AWS Cloud Platform
- At least 2 AWS certifications are a must (AWS SysOps Administrator and Solutions Architect certifications preferred)
- Experience working with SQL, Windows Server, load balancers, and Linux
- Deep experience with AWS, Docker, Kubernetes, CloudFormation, CloudWatch, CodeDeploy, DynamoDB, Lambda, SQS, Amazon FSx, Elasticsearch, and networking concepts
- Ability to program at a high level in at least one language, such as Java, C#, JavaScript, Python, or Ruby
- Integration experience with PagerDuty, ServiceNow, Datadog, and CloudWatch
- Good understanding of Site Reliability Engineering (SRE) philosophies, technologies, platforms and tools, SLO management, incident resolution, and automation
Benefits
- Hybrid Work Model
- Flexibility & Work-Life Balance
- Career Development and Growth
- Industry-Competitive Benefits
- Culture: Globally recognized, award-winning reputation for inclusion and belonging, flexibility, work-life balance, and more
- Social Impact: Make an impact in your community with our Social Impact Institute
- Making a Real-World Impact: We are one of the few companies globally that helps its customers pursue justice, truth, and transparency