Data Engineer (PySpark)
Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
Your Team
Insights & Data delivers state-of-the-art Data solutions. Our expertise primarily lies in Cloud & Big Data engineering, where we develop robust systems capable of processing extensive and complex datasets, utilizing specialized Cloud Data services across platforms like AWS, Azure, and GCP. We oversee the entire Software Development Life Cycle (SDLC) of these solutions, which involves not only leveraging ETL and other data processing tools but also extensive programming in languages like Python, Scala, or Java, coupled with the adoption of DevOps tools and best practices. The processed data is then made accessible to downstream systems through APIs or outbound interfaces, or visualized via comprehensive reports and dashboards. Additionally, within our AI Center of Excellence, we undertake Data Science and Machine Learning projects with a focus on cutting-edge areas such as Generative AI, Natural Language Processing (NLP), Anomaly Detection, and Computer Vision.
Your Task
- Design and implement robust data pipelines using PySpark.
- Process and transform large datasets efficiently in distributed environments.
- Collaborate with data architects and analysts to deliver high-quality data solutions.
- Ensure data quality, consistency, and performance across systems.
- Participate in code reviews and contribute to technical improvements.
Your Profile
- You have hands-on experience in data engineering and can independently handle moderately complex tasks.
- You are proficient in PySpark and understand distributed data processing.
- You are comfortable working with Python for data transformation and automation.
- You have experience with relational and/or NoSQL databases.
- You communicate clearly and effectively in English.
Nice to Have
- Familiarity with cloud platforms (AWS, Azure, or GCP).
- Solid SQL skills and understanding of data modeling.
- Experience with orchestration tools (e.g., Airflow, Prefect).
- Exposure to CI/CD pipelines and DevOps practices.
- Knowledge of streaming technologies (e.g., Kafka, Spark Streaming).
- Experience working with Databricks in a production or development environment.
- Relevant certifications in data engineering or big data technologies.
What You'll Love About Working Here
- Well-being culture: medical care with Medicover, private life insurance, and a Sports card. We also went one step further by creating our own Capgemini Helpline, which offers therapeutic support if needed, and the educational podcast "Let's talk about wellbeing", which you can listen to on Spotify.
- Access to over 70 training tracks with certification opportunities (e.g., GenAI, Architects, Google) on our NEXT platform. Dive into a world of knowledge with free access to the Education First language platform, Pluralsight, TED Talks, Coursera, and Udemy Business materials and training.
- Award-Winning Stability & Culture: Become part of an organization celebrated as "Top Employer Poland 2024". In the audit, our stable and supportive work environment scored 100%!
- Cutting-Edge Technology: Position yourself at the forefront of IT innovation, working with the latest technologies and platforms. Capgemini partners with top global enterprises, including 145 Fortune 500 companies.
Get To Know Us
Capgemini is committed to diversity and inclusion, ensuring fairness in all employment practices. We evaluate individuals based on qualifications and performance, not personal characteristics, striving to create a workplace where everyone can succeed and feel valued. Do you want to get to know us better? Check our Instagram (@capgeminipl) or visit our Facebook profile (Capgemini Polska). You can also find us on YouTube.
About Capgemini
Capgemini is an AI-powered global business and technology transformation partner, delivering tangible business value. We imagine the future of organizations and make it real with AI, technology and people. With our strong heritage of nearly 60 years, we are a responsible and diverse group of over 420,000 team members in more than 50 countries. We deliver end-to-end services and solutions with our deep industry expertise and strong partner ecosystem, leveraging our capabilities across strategy, technology, design, engineering and business operations. The Group reported 2025 global revenues of €22.5 billion.
Gdańsk, PL; Warszawa, PL; Kraków, PL; Wrocław, PL; Poznań, PL