Lead Site Reliability Engineer
Job Description
Mastercard is seeking a Lead Site Reliability Engineer to drive reliability, scalability, and security for mission-critical financial services platforms. You will design and optimize cloud-native, highly available systems, implement SRE best practices, and lead incident response and postmortems. Partnering with IT and Cybersecurity teams, you’ll automate deployments, observability, and resilience testing while mentoring engineers. Ideal candidates bring deep experience with cloud, CI/CD, infrastructure-as-code, and securing large-scale, distributed systems in a regulated environment.
Responsibilities
- Lead design and operation of highly available, secure, and scalable financial services platforms.
- Define and implement SRE best practices, including SLOs, SLIs, and error budgets.
- Architect and maintain cloud-native infrastructure using infrastructure-as-code and automation.
- Own incident response, root cause analysis, and postmortems for critical production issues.
- Drive observability across systems with robust monitoring, logging, and alerting solutions.
- Collaborate closely with IT and Cybersecurity teams to embed security and compliance into the stack.
- Optimize performance, capacity planning, and cost management for large-scale distributed systems.
- Mentor and guide engineers on SRE principles, tooling, and operational excellence.
- Continuously improve CI/CD pipelines and deployment strategies for safer, faster releases.
- Champion a culture of reliability, innovation, and continuous improvement within the team.
Skills
- Site Reliability Engineering (SRE)
- Public cloud platforms (AWS, GCP, or Azure)
- Kubernetes and container orchestration
- Linux systems engineering and administration
- Infrastructure as Code (Terraform, CloudFormation, or similar)
- CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, or similar)
- Monitoring and observability (Prometheus, Grafana, Datadog, Splunk, etc.)
- Scripting/programming (Python, Go, or similar)
- Security and compliance for financial/regulated environments
- Incident management and on-call operations
Benefits
- Medical Insurance
- Dental Insurance
- Vision Insurance
- Life Insurance
- 401k
Certifications
- AWS Certified Solutions Architect – Professional or equivalent cloud certification
- CKA/CKAD (Kubernetes)
- Security certification (e.g., CISSP, CCSP, or equivalent)
- ITIL or SRE-focused training (e.g., Google SRE, SREcon coursework)
Important Notice
This listing was syndicated from Adzuna. We strive to keep information accurate, but do not assume responsibility for the content of this posting.