Reliability Lead
Your Role
- Defining and maintaining reliability standards (SLOs, SLIs, resilience patterns, monitoring requirements).
- Coordinating incident prevention and resolution across engineering, platform, and DevOps teams.
- Ensuring observability tooling and dashboards provide actionable visibility into platform health.
- Reviewing changes for operational impact and ensuring readiness before deployment.
- Guiding the Reliability DevOps Engineers and aligning reliability practices across all streams.
Your Profile
- 7+ years of experience in reliability, SRE, platform operations, or cloud infrastructure roles.
- Strong understanding of monitoring, logging, alerting and resilience engineering practices.
- Hands‑on experience with cloud platforms (preferably GCP) and container orchestration.
- Excellent coordination skills and the ability to work with engineering, infrastructure, and DevOps teams.
- Proven experience managing engineering teams in AI/ML, platform, or cloud environments.
What you'll love about working with Capgemini
- We value flexibility and support a healthy work-life balance through remote and hybrid work options.
- We offer career development programs and certifications in cloud technologies.
- A diverse and inclusive workplace that fosters innovation and collaboration.
Mumbai (ex Bombay), IN