The Lead Site Reliability Engineer will be hands-on, providing mentorship to a growing SRE team on core SRE principles and tools. The individual will participate in end-to-end operational aspects of the Production environment, work on cloud systems, networks, databases, and help drive incident lifecycle management.
Requirements
- Skilled in cloud operations/administration on AWS.
- Tax/Accounting domain experience
- Bachelor’s or Master’s degree in a Computer Science discipline.
- 5+ years’ experience focused on Site Reliability Engineering or a related position on the AWS cloud platform.
- At least 2 AWS certifications are a must (AWS SysOps Administrator and Solutions Architect certifications preferred).
- Experience working with SQL, Windows Server, load balancers, and Linux
- Deep experience with AWS, Docker and Kubernetes, CloudFormation, CloudWatch, CodeDeploy, DynamoDB, Lambda, SQS, Amazon FSx, Elasticsearch, and networking concepts is a must.
- Ability to program at a high level in at least one language, such as Java, C#, JavaScript, Python, or Ruby.
- Integration experience with PagerDuty, ServiceNow, Datadog, CloudWatch.
- Good understanding of Site Reliability Engineering (SRE) philosophies, technologies, platforms, and tools, including SLO management, incident resolution, and automation
- Ability to explain technical concepts in clear, non-technical language
- Working knowledge of infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage, and networks)
- Knowledge of security and compliance standards such as SOC/PCI is a plus
Benefits
- Hybrid Work Model
- Flexibility & Work-Life Balance
- Career Development and Growth
- Industry Competitive Benefits
- Culture
- Social Impact
- Making a Real-World Impact