Data Engineering Manager at Penguin Random House UK, leading a team of 5 talented data engineers. Combining Data Science and Engineering expertise to build robust pipelines and drive growth. Passionate about unlocking the power of data to inform data-driven strategies. Prior experience at Palantir and at Santander.
BIMA100 (Young Trailblazer in Tech)
Technical Skills
Soft Skills/Business Skills
Interests/Hobbies
• Developing data pipelines using Python, Spark and SQL, and debugging pipeline issues on Palantir's Foundry platform.
• Promoting a goal-setting, results-oriented attitude within the team using agile development techniques. The role also includes developing the technical data engineering framework for the healthcare data pipelines covered by Palantir's contract with the NHS.
• Liaising with technical and non-technical teams within Palantir to understand client requirements for the data pipelines, and using Python-based ETL with Git and CI/CD to ensure that the data scientists can operate.
• Chatbot optimization using natural language processing in Python to improve customer satisfaction, identify where agents and chatbots can improve their responses, and alert the appropriate stakeholders when customer sentiment reaches a set negative threshold.
• Mortgage churn model built with Python and Plotly Dash to help improve customer retention by predicting which customers are likely to default on their mortgage. Improved customer retention by up to 3%.
• Entity resolution using Python to automate client remediation within the bank's data lake, using API calls to the third-party provider DueDil to implement a robust data engineering solution. Presented the project to C-suite stakeholders. Technologies used include Python, Spark and Impala SQL.
• BBB reporting using SQL to create ETL scripts that automate the invoicing reports to the British Business Bank for COVID business loans. This replaced the slow manual reporting tools, processing 150,000 contract accruals daily and generating £28M revenue in 2020.
• Model experimentation using Python for a range of internal proof-of-concept models, using AWS for deployment.
• Automating data lake ETL, CI/CD and model builds in Docker images using Cloudera CDSW and AWS (Redshift, S3, SageMaker, ECR).
• Operating under the agile project management methodology within a team of data scientists.
• Credit risk using SAS to develop capital (economic and regulatory) models for the corporate bank.
• Technical presentations on topics such as CI/CD, Git for version control and AWS, delivered to employees within the bank.
Managing the development of the Suneeta London website www.suneetalondon.co.uk and the related CBD website www.suneetacbd.co.uk, whilst ensuring that the CBD website complied with legislation.
Attending a variety of events and acting on the company's behalf; the most notable collaboration was an invitation to the ASOS HQ.