Andrew Burnie, Developer in London, United Kingdom

Andrew Burnie

Verified Expert in Engineering

Data Scientist and Machine Learning Developer

Location
London, United Kingdom
Toptal Member Since
July 7, 2020

Andrew is an expert in extracting commercially valuable insights from data, with a specialty in text analysis. He has experience in the retail, insurance, and other financial sectors. Andrew earned a PhD in social media analysis, funded by the UK's national institute for data science and AI, and a master's degree in management and economics from the University of Cambridge. He enjoys using machine learning and statistics to help companies get the most out of their data.

Portfolio

The Alan Turing Institute
Predictive Modeling, Statistics, Statistical Analysis, Pandas, NumPy, Gensim...
ERS (A Lloyd's Syndicate and the UK's Largest Specialist Motor Insurer)
Regression, XGBoost, Predictive Modeling, Statistics, Statistical Analysis...
Hitachi Consulting
Customer Analytics, Regression, XGBoost, Predictive Modeling, Statistics...

Experience

Availability

Part-time

Preferred Environment

Windows, Linux, Python

The most amazing thing I've developed was a groundbreaking NLP algorithm to extract from social media the likeliest causes of major changes in the price of Bitcoin and Ethereum.

Work Experience

Doctoral Student (PhD Completed)

2017 - 2020
The Alan Turing Institute
  • Earned a PhD at The Alan Turing Institute, the UK's national institute for data science and artificial intelligence.
  • Created a new, nonparametric NLP algorithm (DDPWI) to extract from social media text the words associated with declining Bitcoin prices. Published in Royal Society Open Science.
  • Applied neural networks in a new Word2vec-based topic modeling algorithm to detect which topics discussed on social media were associated with phasic shifts in Bitcoin prices. Presented at ACM SIGIR 2019.
  • Showed the one-off effect of regulatory bans on Bitcoin and the recurring effects of rival innovations on the Ether price.
  • Demonstrated how nonparametric correlation networks could be applied to explore the associations between the prices of different financial assets. This work has been cited 15 times.
Technologies: Predictive Modeling, Statistics, Statistical Analysis, Pandas, NumPy, Gensim, SpaCy, Fintech, SciPy, Scikit-learn, Text Analytics, Causal Inference, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Topic Modeling, Neural Networks, Time Series, Windows, Linux, Keras, Python, Natural Language Toolkit (NLTK), Social Media APIs, Cryptocurrency, TensorFlow, Data Science, Deep Learning, Web Scraping, Artificial Intelligence (AI), Machine Learning
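
As a rough illustration of the nonparametric correlation networks mentioned above, the sketch below links two assets whenever the rank correlation of their returns is strong. It is a minimal, dependency-free example and not the published method: the choice of Spearman correlation, the 0.7 threshold, the function names, and the toy return series are all assumptions made for illustration.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_network(series, threshold=0.7):
    """Map each asset pair with |rho| >= threshold to its correlation."""
    edges = {}
    for a, b in combinations(sorted(series), 2):
        rho = spearman(series[a], series[b])
        if abs(rho) >= threshold:
            edges[(a, b)] = rho
    return edges


# Made-up daily returns for three hypothetical assets.
returns = {
    "asset_a": [0.01, -0.02, 0.03, 0.00, 0.05],
    "asset_b": [0.02, -0.01, 0.04, 0.01, 0.06],
    "asset_c": [0.03, 0.01, -0.02, 0.02, -0.04],
}
network = correlation_network(returns)
print(network)
```

Because the correlation is computed on ranks rather than raw values, the edges do not depend on returns being normally distributed or linearly related, which is the sense in which the network is nonparametric.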

Head of Data Science

2017 - 2017
ERS (A Lloyd's Syndicate and the UK's Largest Specialist Motor Insurer)
  • Set up the data science team at the UK's largest specialist motor insurer with accountability at the board level.
  • Enabled the fine-tuning of insurance premiums based on customized, specialized, nonparametric technologies rather than the standard linear models offered by Willis Towers Watson software.
  • Pioneered the application of Random Forests, XGBoost, and GLMs to generate prediction models, alongside feature selection approaches that included Random Forest techniques such as Boruta and statistical testing.
Technologies: Regression, XGBoost, Predictive Modeling, Statistics, Statistical Analysis, Pandas, NumPy, Insurance, Tidyverse, Fintech, SciPy, Scikit-learn, Random Forests, Linear Regression, Windows, Python, R, Data Science, Big Data, Artificial Intelligence (AI), Machine Learning, Boruta, Generalized Linear Model (GLM)

Data Scientist

2016 - 2017
Hitachi Consulting
  • Introduced regression approaches (linear, lasso, ridge, elastic net, and logistic) to feature selection to improve the customer experience by transforming data into actionable insights.
  • Identified potential drivers of sales revenue in a marketing project using machine learning feature selection approaches.
  • Developed data science project templates used in machine learning and customer analytics projects.
Technologies: Customer Analytics, Regression, XGBoost, Predictive Modeling, Statistics, Statistical Analysis, Pandas, NumPy, Marketing, Tidyverse, SciPy, Scikit-learn, Consulting, Text Analytics, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Random Forests, Time Series, Linear Regression, Windows, SQL, Python, R, Data Science, Web Scraping, Big Data, Artificial Intelligence (AI), Machine Learning, Azure

Data Scientist

2016 - 2016
Model Citizens Ltd
  • Managed a customer data analysis that identified opportunities and threats for a retailer.
  • Performed a data audit and analysis for an entertainment company.
  • Pioneered the use of clustering algorithms to classify customers into different types.
Technologies: Customer Analytics, Regression, XGBoost, Predictive Modeling, Statistics, Statistical Analysis, Pandas, NumPy, Marketing, Tidyverse, SciPy, Scikit-learn, Consulting, Random Forests, Windows, SQL, Python, R, Data Science, Web Scraping, Big Data, Artificial Intelligence (AI), Machine Learning

Postgraduate Student

2015 - 2016
Grenoble École de Management
  • Analyzed the financial and nonfinancial drivers of internet company valuations.
  • Determined that financial statements could explain only about one-third of the variation in the price-to-sales ratio.
  • Published the findings in the journal Bankers, Markets & Investors.
Technologies: Plotly, Regression, Statistics, Statistical Analysis, Time Series, Linear Regression, R, Data Science

Internship

2014 - 2014
DN Capital
  • Advised DN Capital on the threats and opportunities facing the European venture capital market.
  • Provided guidance to DN Capital on how best to position itself to mitigate threats and take advantage of opportunities.
  • Published a report that was awarded a First Class Degree by the University of Cambridge.
Technologies: Web Scraping, CB Insights, Pitch Books

Causal Inference Framework for Extracting Insights from Social Media Text

I constructed a causal inference framework, based on healthcare epidemiology principles, that identified the likeliest causes of major changes in the price of Bitcoin and Ethereum. This involved using Python, NLTK, and Gensim to code a pipeline that processed social media text and then extracted topics of interest. This was published in Frontiers in Blockchain.
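
The text-processing stage of such a pipeline can be sketched roughly as follows. This is a simplified, dependency-free stand-in and not the published framework: a regular expression replaces NLTK tokenization, a document-share comparison replaces Gensim topic extraction, and the stopword list, function names, and example posts are all made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; NLTK ships a fuller one).
STOPWORDS = {"the", "a", "is", "to", "of", "and", "in", "on", "after"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def document_share(posts):
    """Fraction of posts in which each word appears at least once."""
    counts = Counter()
    for post in posts:
        counts.update(set(tokenize(post)))
    return {word: count / len(posts) for word, count in counts.items()}


def distinctive_words(event_posts, baseline_posts, top_n=3):
    """Words whose share in event posts most exceeds their baseline share."""
    event = document_share(event_posts)
    baseline = document_share(baseline_posts)
    lift = {word: share - baseline.get(word, 0.0) for word, share in event.items()}
    # Sort by descending lift, breaking ties alphabetically.
    return sorted(lift, key=lambda word: (-lift[word], word))[:top_n]


# Hypothetical posts from around a price change versus a quiet period.
event_posts = [
    "Exchange hack rumours spread",
    "Regulators ban the exchange",
    "Ban on trading announced",
]
baseline_posts = [
    "Trading volume is steady",
    "New exchange wallet released",
]
top_words = distinctive_words(event_posts, baseline_posts)
print(top_words)
```

A word that is common in both groups, such as "exchange" here, scores low because the comparison rewards only words that are over-represented around the event.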

Languages

Python, JavaScript, R, SQL, Julia, Java

Libraries/APIs

Social Media APIs, SpaCy, Natural Language Toolkit (NLTK), Scikit-learn, SciPy, TensorFlow, NumPy, Pandas, REST APIs, Keras, Tidyverse, XGBoost

Tools

Gensim, Azure Machine Learning, Plotly, IBM Watson

Paradigms

Data Science

Storage

JSON

Other

Machine Learning, Neural Networks, Topic Modeling, Artificial Intelligence (AI), Natural Language Processing (NLP), Causal Inference, Asset Valuation, Cryptocurrency, Linear Regression, Random Forests, Statistical Analysis, Big Data, Text Analytics, Web Scraping, Fintech, Consulting, Time Series, Deep Learning, Algorithms, Sentiment Analysis, NLU, Chatbots, Microsoft Azure, Deep Neural Networks, APIs, Generative Pre-trained Transformers (GPT), Cloud Platforms, Document Processing, OCR, Custom BERT, Full-stack, Web App UI, Statistics, Predictive Modeling, Generalized Linear Model (GLM), Boruta, Regression, Customer Analytics, Pitch Books, Chatbot Conversation Design, IBM Watson Assistant

Frameworks

Selenium, Azure Bot Framework

Platforms

Azure, Linux, Windows, CB Insights

Industry Expertise

Marketing, Insurance

2017 - 2020

PhD in Computer Science

University College London (UCL) - London, England, United Kingdom

2018 - 2018

Master of Arts Degree in Management Studies and Economics

University of Cambridge - Cambridge, England, United Kingdom

2014 - 2016

Master of Science Degree in Finance

Grenoble École de Management - London, England, United Kingdom

2011 - 2014

Bachelor of Arts Degree in Management Studies and Economics

University of Cambridge - Cambridge, England, United Kingdom

JANUARY 2021 - PRESENT

Rasa Certified Chatbot Developer

Rasa

NOVEMBER 2020 - PRESENT

HackerRank Gold for Java

HackerRank

MAY 2020 - PRESENT

HackerRank Gold for Python

HackerRank

FEBRUARY 2018 - PRESENT

Sequence Models

deeplearning.ai | via Coursera

FEBRUARY 2018 - PRESENT

Neural Networks and Deep Learning

deeplearning.ai | via Coursera

FEBRUARY 2017 - PRESENT

Machine Learning Specialization

University of Washington | via Coursera
