Intact

Advisor - IT systems monitoring

Join Intact in Montréal as an Advisor - IT Systems Monitoring. Leverage ServiceNow for IT reliability, performance management, and cloud solutions. Benefits include flexible work, wellness programs, and a share purchase plan.

Direct Hire
Mid-Level
ServiceNow Role Type:
ServiceNow Modules:
DevOps
Event Management
IT Operations Management
Integration Hub
Service Portal
ServiceNow Certifications (nice to have):
Certified Implementation Specialist - Event Management

Job description

Posted on: May 13, 2025

We are seeking an experienced SRE specialist to join our SRE Practice team. As a key contributor to the reliability strategy, you will be responsible for implementing and maintaining a comprehensive reliability solution for on-premises and cloud applications and services. You will work collaboratively with various IT teams to enable and guide them in achieving site reliability.

Requirements

  • Solid expertise in IT reliability.
  • Extensive experience with application performance management, IT infrastructure monitoring, and user experience monitoring.
  • Technical leadership experience.
  • Enterprise application, systems, and network monitoring expertise for on-premises and cloud applications.
  • Hands-on experience with Dynatrace, Elasticsearch, and ServiceNow, including instrumenting applications end-to-end with minimal supervision.
  • Solid knowledge of AIOps, anomaly detection, and event correlation solutions.
  • Comfortable with scripting or programming languages (Java, C++, Go, Python).
  • Experience with OpenTelemetry.
  • Good knowledge of infrastructure protocols to gather element-level event data.
  • Good knowledge of open-source monitoring technologies.
  • Proficient with data lifecycles and aggregation, reporting, and web dashboards.
  • Proficient in ITIL event management, with a good grounding in ITIL foundational concepts.
  • Hands-on experience with continuous integration tools.
  • Deep knowledge of reliability and Site Reliability Engineering (SRE).
  • Infrastructure and Networking: The candidate should be familiar with advanced networking tools such as F5, Citrix, and Cloudflare, and be able to design custom hardware and software networking solutions.
  • Troubleshooting: The candidate should be proficient with advanced log analysis tools like Dynatrace and be able to develop and maintain automated testing and deployment tools.
  • Cloud Computing and Virtualization: The candidate should have hands-on experience with AWS, GCP, Azure, VirtualBox, Docker, Kubernetes and advanced cloud infrastructure tools like Terraform, Puppet, or Chef.
  • Distributed Systems and Scalability: The candidate should have knowledge of distributed systems platforms such as Kubernetes and service meshes, and of distributed data tools such as Cassandra, Hadoop, or Spark.
  • Security and Compliance: The candidate should have knowledge of advanced security tools such as HashiCorp Vault, AWS KMS, or Azure Key Vault, as well as security best practices, including firewalls, encryption, and SSL/TLS.

Benefits

  • A financial rewards program that recognizes your success
  • An industry-leading Employee Share Purchase Plan; we match 50% of net shares purchased
  • An extensive flex pension and benefits package, with access to virtual healthcare
  • Flexible work arrangements
  • Possibility to purchase up to 5 extra days off per year
  • An annual wellness account that promotes an active and healthy lifestyle
  • Access to tools and resources to support physical and mental health, embracing change and connecting with colleagues
  • A dynamic workplace learning ecosystem complete with learning journeys, interactive online content, and inspiring programs
  • Inclusive employee-led networks to educate, inspire, amplify voices, build relationships and provide development opportunities
  • Inspiring leaders and colleagues who will lift you up and help you grow
  • A Community Impact program, because what you care about is a part of what makes you different. And how you contribute to your community should be just as unique.

Requirements Summary

3-5 years of experience in IT reliability, application performance management, IT infrastructure monitoring, and user experience monitoring, with a strong focus on technical leadership and site reliability engineering