Citi

Lead Application Reliability Engineer

Lead Application Reliability Engineer at Citi, Irving, TX. 6+ years required. Manage threat-modeling platform reliability using ServiceNow ITSM, Linux, Kubernetes, Python, PostgreSQL, security compliance.

ServiceNow Role Type:
System Administrator
ServiceNow Certifications (nice to have):
Certified Implementation Specialist - IT Service Management
Certified System Administrator

Job description

Posted on: November 13, 2025

Citi is seeking a Lead Application Reliability Engineer to ensure the continuous availability, optimal performance, and security of a critical threat-modeling application. The selected candidate will be the key engineer supporting and advancing the platform used for the threat-modeling process at Citi.

Requirements

  • 6+ years of relevant experience in an Engineering role, preferably in Financial Services or a large, complex, and/or global environment.
  • Experience managing and troubleshooting Linux operating systems (e.g., Red Hat Enterprise Linux (RHEL), CentOS, Ubuntu), including system administration tasks such as user management, service restarts, and file system checks.
  • Proficiency in scripting for automation (e.g., Bash, Python) and with configuration management tools (e.g., Ansible, Puppet, Chef) for system administration and infrastructure automation.
  • Experience with container orchestration using Helm and Kubernetes on platforms like AWS EKS, GCP GKE, or OpenShift.
  • Working knowledge of relational databases (e.g., PostgreSQL), including basic querying.
  • Proven track record of maintaining applications and their technology stacks compliant with security and configuration requirements, including successfully passing internal and external security audits by demonstrating secure configuration of applications and infrastructure (e.g., implementing least privilege access, hardening OS, managing firewall rules) and ensuring continuous compliance with regulatory standards (e.g., SOX, GDPR) through automated checks and reporting.
  • Demonstrated adherence to strict change control procedures, executing all changes (e.g., code deployments, infrastructure updates) through a formalized change management process (e.g., ITSM, ServiceNow) with proper documentation and approvals.
  • Experience with ticketing systems (e.g., Jira, ServiceNow).
  • Working understanding of middleware components (e.g., Nginx, Tomcat, or equivalents).
  • Familiarity with development concepts (e.g., Git, CI/CD pipelines, SDLC).
  • Strong communication skills, both written and verbal, for technical and non-technical audiences.
  • Demonstrated analytical and diagnostic skills, with an ability to identify process improvements and best practices.
  • Ability to work independently, manage multiple tasks, take ownership of initiatives, and operate effectively in a matrixed environment under pressure and tight deadlines.

Benefits

  • Medical, dental & vision coverage
  • 401(k)
  • Life, accident, and disability insurance
  • Wellness programs
  • Paid time off packages
  • Paid holidays

Requirements Summary

6+ years of experience in an Engineering role, proficiency in scripting, and experience with container orchestration and database management.