We are seeking a skilled Observability Engineer to design, implement, and support observability solutions that enhance the reliability, performance, and visibility of cloud infrastructure and applications. This role will focus on integrating tools like New Relic and PagerDuty, enabling proactive monitoring, alerting, and incident response.
Requirements
- Hands-on experience with observability tools such as New Relic (preferred), Datadog, Prometheus, Grafana, or similar.
- Experience with incident management platforms such as PagerDuty (preferred) or ServiceNow ITOM.
- Experience with cloud platforms such as AWS (preferred), Azure, or GCP, and with cloud-native architectures.
- Proficiency in scripting and automation (Python, Bash, or PowerShell).
- Understanding of SRE principles, including SLOs, SLIs, and error budgets.
- Working knowledge of internal developer platforms (IDPs) and developer enablement tools.
- Practical experience using GitHub Copilot for automation and code generation.
- Strong troubleshooting and analytical skills.
Benefits
- Flexible work
- Healthcare, including dental, vision, mental health, and well-being programs
- Financial well-being programs such as 401(k) and Employee Share Ownership Plan
- Paid time off and paid holidays
- Paid parental leave
- Family building benefits like adoption assistance, surrogacy, and cryopreservation
- Social well-being benefits like subsidized back-up child/elder care and tutoring
- Mentoring, coaching, and learning programs
- Employee Resource Groups
- Disaster relief