Data Engineer - PySpark
As a Data Engineer - PySpark within the Customer Digital and Data division, you will be responsible for building and maintaining systems that collect, store, process, and analyze data. You will work on data pipelines, data warehouses, and data lakes to ensure all data is accurate, accessible, and secure while driving innovation in the digital landscape.
Your primary responsibilities include hands-on development in Python and PySpark for data processing, building ETL pipelines with Ab Initio, and writing complex SQL queries. You will work in Unix environments and deliver as an individual contributor against the team's goals. A core part of the role is building and maintaining the data pipelines and architecture that enable the transfer and processing of durable, complete, and consistent data.
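By way of illustration only, the following is a minimal sketch of the kind of PySpark pipeline work this role involves. The paths, table names, and columns are hypothetical, not an actual schema used by the team.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("customer-events-etl").getOrCreate()

# Extract: raw JSON landed by an upstream feed (path is illustrative).
raw = spark.read.json("/data/raw/customer_events/")

# Transform: drop malformed rows, normalize types, derive a load date,
# and de-duplicate so downstream data stays complete and consistent.
clean = (
    raw.filter(F.col("customer_id").isNotNull())
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("load_date", F.current_date())
       .dropDuplicates(["customer_id", "event_ts"])
)

# Load: append to a curated, partitioned table in the lake.
clean.write.mode("append").partitionBy("load_date").parquet(
    "/data/curated/customer_events/"
)
```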
You will design and implement data warehouses and data lakes that handle the required data volumes and velocity while adhering to the necessary security measures. You will be expected to develop processing and analysis algorithms appropriate to the complexity of the data, and to collaborate with data scientists to build and deploy machine learning models.
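As a flavour of that analytical work, here is a hedged sketch of aggregating a curated lake table into a feature set that could be handed to data scientists. The table, columns, and output path are assumptions for illustration.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("customer-activity-features").getOrCreate()

# Hypothetical curated table registered in the metastore; the rolling
# window summarizes each customer's recent activity for model training.
features = spark.sql("""
    WITH daily AS (
        SELECT customer_id, load_date, COUNT(*) AS daily_events
        FROM curated.customer_events
        GROUP BY customer_id, load_date
    )
    SELECT customer_id,
           load_date,
           daily_events,
           SUM(daily_events) OVER (
               PARTITION BY customer_id
               ORDER BY load_date
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS events_rolling_7d
    FROM daily
""")

# Persist as a partitioned feature table for the data science handoff.
features.write.mode("overwrite").parquet("/data/features/customer_activity/")
```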
Experience with AI tools and agentic AI, as well as with cloud platforms (particularly AWS), is a highly valued additional skill. You will take responsibility for managing risk and strengthening controls in relation to your work, ensuring compliance with enterprise-wide risk management frameworks.