Daniel Davies, Developer in London, United Kingdom

Daniel Davies

Verified Expert in Engineering

Data Engineer and Developer

Location
London, United Kingdom
Toptal Member Since
September 23, 2022

Daniel is a highly motivated data engineer who has used his skills to make an impact at a national level, from pipelines running national vaccination programs to research contributions with partner universities on the most pressing societal issues. He enjoys delving into the internals of extensive data systems, with an emphasis on Spark, and using his detailed knowledge to write code that delivers insights faster and more robustly. He is most excited by pipelines that change organizations or entire industries.

Portfolio

Palantir
PySpark, Apache Airflow, Spark, TypeScript, SQL, PostgreSQL, Data Pipelines...
Bank of America
React, TypeScript, Python
University of California, Irvine
C#, .NET, IIS SQL Server, IIS

Availability

Part-time

Preferred Environment

MacOS, Linux

The most amazing...

...project I've delivered is a pipeline that combined data sources into a life-events dataset used in a COVID-19 study of national significance.

Work Experience

Data Engineer

2020 - PRESENT
Palantir
  • Developed high-impact pipelines collaboratively, including a life-events dataset for UK care home residents that was used in a study of national significance, and pipelines used to distribute vaccines across the UK.
  • Maintained multiple pipelines independently and optimized them to save client resources. A recent accomplishment was cutting an end-to-end pipeline's run time from five hours to three by studying Spark internals and analyzing query plans.
  • Developed a training center application, in partnership with the central enablement team, that was rolled out to the foundation stacks across the globe. The application was in the process of a patent application.
  • Took part in the hiring process (conducting tens of interviews) and onboarded over five new employees to the Spark environment, then supported them through their first few months at the company.
Technologies: PySpark, Apache Airflow, Spark, TypeScript, SQL, PostgreSQL, Data Pipelines, ETL, Hadoop

Software Engineer

2019 - 2019
Bank of America
  • Navigated a complex existing (TypeScript-React) codebase with significant configuration to successfully add a new application for the equity-trading team. It involved iterating with stakeholders and mapping requirements into the application.
  • Rolled out an extension to the existing Python back-end services, leveraging BofA's proprietary trading platform. This involved working with a tick database (kdb+) and ensuring that the application could handle large volumes of time-series data.
  • Took ownership of demos of developed applications on calls with very senior stakeholders at the bank.
Technologies: React, TypeScript, Python

Software Engineer

2019 - 2019
University of California, Irvine
  • Architected and executed a project independently to deliver a green-field capability for UCI that would enable donors to find professors working in relevant fields. It was a full-stack project where I produced the front end, back end, and hosting.
  • Built a search algorithm that met the client's criteria and enabled faculty searches that were not previously possible. Optimized the initial iteration of the search algorithm, reducing its run time from 20 seconds to under two seconds.
  • Created a stunning UI that was walk-up usable and was commended by the client.
Technologies: C#, .NET, IIS SQL Server, IIS

Contributions to Apache Spark

https://github.com/Daniel-Davies/spark
Contributed to the Apache Spark platform, fixing some of the PySpark date APIs and making some of the parameters to the PySpark functions more generic. Outside of code contributions, I regularly report issues to the maintainers on Apache's Jira.

Data Pipelines for COVID-19 Care Home Study

https://www.thelancet.com/journals/lanhl/article/PIIS2666-7568(21)00093-3/fulltext
Built data pipelines that powered some of the most crucial COVID-19 research in the UK during the height of the pandemic. The research was eventually published in the Lancet medical journal and widely reported in the media. The project specifically involved combining multiple data sources (with different schemas, update frequencies, and stakeholders) and creating a 'life-events' dataset for people in care homes. The 'event' data asset was used in several reporting verticals, most notably antibody prevalence among care home residents and vaccine effectiveness in older people.

Frameworks

Spark, .NET, Hadoop

Libraries/APIs

PySpark, React

Tools

Apache Airflow

Storage

Data Pipelines, PostgreSQL, IIS SQL Server

Other

Programming, Data Engineering, Foundry, IIS

Languages

TypeScript, SQL, C#, Python, Scala

Paradigms

ETL

Platforms

MacOS, Linux

Education

2019 - 2020

Master's Degree in Computer Science

University of California - Irvine, CA

2016 - 2020

Master's Degree in Computer Science

University of Bristol - Bristol, England
