James McKay, Developer in Christchurch, Canterbury, New Zealand

James McKay

Verified Expert in Engineering

Machine Learning Developer

Location
Christchurch, Canterbury, New Zealand
Toptal Member Since
July 27, 2021

James draws on a strong background in computational methods and software development to find robust and concise solutions to business problems. He has experience working with everything from environmental data to international accounts. He originally trained as a physicist, receiving a Ph.D. from Imperial College London, and has published work in leading scientific journals. He has held roles as a software engineer and senior analyst, delivering high-quality insights from large and complex data sets.

Availability

Part-time

Preferred Environment

Ubuntu, Python, R, C++, Amazon Web Services (AWS), Git

The most amazing...

...thing I have developed recently was a highly successful data pipeline and visualization tool that went into production in less than three weeks.

Work Experience

Data Engineer

2021 - 2022
Contact Energy
  • Built data products from large internal data sets.
  • Worked with stakeholders across the business to gather requirements and build products to fit. The analysis work involved identifying issues with the data and determining how best to meet the requirements.
  • Used primarily PySpark and SQL, orchestrated by Airflow, to extract, transform, and load data sets within an enterprise environment.
Technologies: Spark SQL, Spark, Python, SQL, Apache Airflow

Senior Design Analyst

2019 - 2021
Statistics New Zealand
  • Developed a data pipeline and an output tool to get economic, health, and social data out to the public within weeks of the New Zealand COVID-19 lockdown. This solution was highly generic, scalable, and fit for purpose and is still in production.
  • Developed a framework for building a processing system with a large number of business rules. This framework enables analysts to build business logic in a robust way, with a test framework around all parts of the system.
  • Raised the technical capability of other staff through training and mentoring.
  • Redeveloped the Indicators Aotearoa website as a Shiny R application, which made the app easier to keep up to date, with a clean metadata-driven approach to managing the content and robust pipelines for the data it presents.
Technologies: R, Python, Analysis, Statistics, Data Visualization, Data Processing, Machine Learning, Software Development

Design Analyst

2019 - 2020
Statistics New Zealand
  • Produced a statistical model for estimating the expenditure of international visitors to New Zealand based on electronic card expenditure. This model had to be rushed into production when traditional survey methods were not possible in 2020.
  • Built a data processing pipeline for hundreds of millions of records from electronic card expenditure data. This involved data cleaning, fuzzy matching, aggregation, and visualization.
  • Developed a visualization tool for New Zealand's imports and exports in goods and services. This made a large and detailed data set easy for users to view, download, and understand.
Technologies: R, RStudio Shiny

Software Engineer

2018 - 2019
Verizon Connect
  • Developed a React application for customers to visualize and manage their data.
  • Maintained a large legacy codebase, fixing bugs and developing new features alongside dozens of other developers. This required working in an Agile environment, writing tests, and carefully managing changes.
  • Maintained and contributed to a C# application that provided search functionality to end users via a REST API.
Technologies: React, C#, JavaScript, TypeScript, Visual Studio, Agile, Software Development

Doctoral Candidate

2015 - 2018
Imperial College London
  • Independently led a study into the computational methods used to compute higher-order quantum corrections to dark matter particle masses. This resulted in two lead-author publications, conference presentations, and a sophisticated piece of software.
  • Published a statistical study of a particular class of dark matter models as the lead author. This involved developing and running computational physics code in production on some of the world's largest supercomputers.
  • Tutored undergraduate students, attended international conferences and meetings, spent working visits at a number of universities, and collaborated with other scientists around the world.
  • Made major contributions to a novel study of statistical sampling algorithms for high-energy physics. This involved running and managing production scans using a range of different algorithms and different dimensionalities to draw meaningful comparisons.
Technologies: Advanced Physics, C++, Python, Data Visualization, Bayesian Statistics, Statistical Methods, Computational Physics, Presentations, Technical Writing

The New Zealand River Guide

https://www.riverguide.co.nz/
The New Zealand RiverGuide was developed to provide live environmental data alongside recreational guides for activities in New Zealand freshwater. It also provides functionality for users to edit descriptions, post public notices, and log their activities. This was a personal project in collaboration with an environmental scientist with a vision to both share and collect data about freshwater recreation in New Zealand.

This project involved a React front-end application, a high-performance back-end service delivering data from over 1,000 live environment data sources, and a content management system for handling user data and content.

Estimating Visitor Expenditure in New Zealand

https://www.stats.govt.nz/methods/estimating-visitor-expenditure-in-new-zealand-during-the-june-2020-quarter
International visitor expenditure in New Zealand has previously been measured through a survey collection. This method was not available as a result of COVID-19 and low border crossings, so we needed to use an alternative method to measure this key macroeconomic statistic.

I developed this model to estimate expenditure based on historical survey data and electronic card expenditure data. Because card data was available only at the aggregate level, it was important to keep the model simple while still extracting as much information as possible.

COVID-19 Data Portal

https://www.stats.govt.nz/experimental/covid-19-data-portal
As New Zealand entered a lockdown to prevent the spread of COVID-19, there was a need for accessible, high-frequency, and up-to-date data. With so many data sets available and coming in fast, we wanted to produce something to get data out as quickly as possible.

I developed this Shiny R application and a data transformation pipeline to easily and quickly load and update data sets from a range of different sources and in a range of different formats. The simple and clean front end was driven entirely by configuration yet was flexible enough to accommodate various types of data.

This project was handed off late in 2020 and continues to be maintained. It now shows hundreds of different data sets on the same core architecture that I initially developed.

The Cosmological Rest Frame

https://academic.oup.com/mnras/article/457/3/3285/2588916
For my master's research, I extended work on the definition of a cosmic rest frame. Using a sample of 4,534 galaxies, I performed an optimization to find the reference frame that minimized the observed variation in the expansion of the local universe. This analysis ruled out the cosmic microwave background as the cosmological frame of rest and was published in the Monthly Notices of the Royal Astronomical Society.

Global Fits of Dark Matter Models

One key outcome of my research as a Ph.D. candidate was a study of the impact of experimental constraints on a particular class of dark matter models. These models aim to describe the nature of dark matter, and their parameters can be constrained by experimental data.

This work uses both frequentist and Bayesian statistical methods, along with sophisticated optimization algorithms, to make inferences on the viable parameter spaces for these models.

Languages

Python, R, Python 3, C++, JavaScript, TypeScript, Fortran, C#, SQL

Tools

LaTeX, Dplyr, MATLAB, Visual Studio, GIS, Git, Spark SQL, Apache Airflow

Other

Research, Advanced Physics, Optimization, Analysis, Technical Writing, Data Processing, Software Development, Machine Learning, Data Visualization, Differential Equations, Bayesian Inference & Modeling, Discrete Mathematics, Applied Mathematics, Statistics, User Experience (UX), Bayesian Statistics, Statistical Methods, Computational Physics, Presentations

Libraries/APIs

React, Pandas, Matplotlib, Scikit-learn

Platforms

Ubuntu, Visual Studio Code (VS Code), Amazon Web Services (AWS)

Frameworks

RStudio Shiny, Boost, Spark

Paradigms

Agile

Education

2015 - 2018

Ph.D. in Physics

Imperial College London - London, United Kingdom

2014 - 2015

Master's Degree in Physics

University of Canterbury - Christchurch, New Zealand

2010 - 2013

Bachelor's Degree with First Class Honors in Mathematical Physics

University of Canterbury - Christchurch, New Zealand
