Cloud Infrastructure Engineer
At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities. Where you can make a difference. Where no two days are the same.
Job Description
Your role:
- Manage, troubleshoot, and optimize containerized applications and infrastructure deployed on Kubernetes, Red Hat OpenShift, and OpenStack platforms.
- Serve as the Subject Matter Expert (SME) for core cloud infrastructure technologies, including advanced Linux (CentOS) system administration, Docker/Containers, and complex networking configurations.
- Lead the investigation and resolution of complex, high-severity customer issues, applying strong analytical knowledge to quickly diagnose problems across the entire cloud stack.
- Utilize your expertise to quickly identify root causes and implement effective, durable solutions for customer incidents.
- Prepare and conduct rigorous Root Cause Analysis (RCA) for critical incidents to identify systemic issues and prevent recurrence.
- Develop, test, and maintain robust automation scripts using Python and Ansible to streamline daily operational tasks and improve overall service efficiency.
- Identify and implement automation opportunities to reduce manual effort in maintenance and deployment activities.
- Provide end-to-end Escalation, Monitoring, and Emergency (EME) support, acting as a final escalation point to ensure service availability and meet SLAs.
- Liaise directly with customer teams and internal teams to understand requirements and deliver tailored technical solutions.
- Stay current with industry best practices and emerging technologies in cloud and containerization.
Required:
- Linux Expertise: Strong knowledge and proven hands-on experience with Linux administration.
- Networking Foundations: Strong knowledge of core networking principles (TCP/IP, routing, load balancing, firewalls) in a cloud environment.
- Containerization & Virtualization: Strong knowledge of Kubernetes orchestration, OpenStack platforms, and Docker/Containerization.
- Problem-Solving Mindset: Possess sharp troubleshooting skills combined with an analytical mindset to dissect and address complex challenges.
- Scripting and Automation: Solid Python scripting skills for task automation and system management.
- Configuration Management: Hands-on experience with Ansible for configuration management.
- Root Cause Analysis (RCA): Expertise in preparation and implementation of RCAs.
- Escalation and Monitoring: Proven experience with EME (Escalation, Monitoring, and Emergency) management processes.
One or more certifications from the list below will be considered an added advantage:
- Red Hat Certified Specialist in Cloud Infrastructure (EX210).
- Red Hat Certified Engineer (RHCE) in Red Hat OpenStack (EX310).
- RHCSA, RHCE, CKA.
- EX280 (Red Hat Certified Specialist in OpenShift Administration).
- EX380 (Red Hat Certified Specialist in OpenShift Automation and API Management).
Capgemini is a global business and technology transformation partner, helping organizations accelerate their dual transition to a digital and sustainable world while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With a heritage of more than 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.
Bogota, CO