Thomson Reuters is seeking a Site Reliability Engineer to join the Case Center product within our Service Management, Technology team. The Site Reliability Engineer will support the reliability, performance, and operability of customer environments by contributing to routine change, incident, and problem management processes and by driving continuous improvements in observability and automation across production and non-production environments.
Responsibilities
- Lead proactive monitoring and health management for production and non-production environments;
- Own incident response for complex cases, including triage, stabilisation, root-cause analysis, post-incident review, and knowledge capture;
- Plan and execute standard installations, upgrades, migrations, configuration, and maintenance activities;
- Develop, configure, and support tooling for system monitoring, troubleshooting, and automation to improve repeatability and time-to-restore;
- Maintain and evolve observability (alerts, dashboards, runbooks) to reduce noise and improve mean-time-to-detect and mean-time-to-restore;
- Liaise with application development, content, customer service, and software/hardware support teams to manage escalations and coordinate change;
- Contribute to the development of automation and internal tooling (e.g., packaging, checks, and deployment pipelines) to increase operational throughput and consistency;
- Produce and maintain operational documentation, standards, and implementation patterns to support secure, repeatable, and compliant operations;
- Maintain accurate, auditable change, deployment, and security records across environments;
- Participate in a collaborative on-call rotation.
Benefits
- Paid leave
- Two company-wide Mental Health Days off
- Access to the Headspace app
- Retirement savings
- Tuition reimbursement
- Employee incentive programs
- Resources for mental, physical, and financial wellbeing