Andrew Collier, Developer in Newbury, United Kingdom

Andrew Collier

Verified Expert in Engineering

Bio

Andrew picked up programming and data analysis skills while working as an experimental physicist. He now works as a data scientist. His tools of choice are R and Python, with a lot of SQL thrown in for good measure. Andrew also uses Docker extensively and has worked with both AWS and Azure. He has a particular passion for web scraping and is also an accomplished speaker and trainer.

Portfolio

Unrival Limited
Python, Unstructured Data Analysis, Web Crawlers, Data Analysis...
Fathom Data
Linux, SQL, Python, R, Web Scraping, Machine Learning, Bash, RStudio Shiny...
Toptal
Python, R, Git, Amazon Web Services (AWS), Docker, Web Scraping...

Experience

  • Linux - 15 years
  • Data Science - 10 years
  • Web Scraping - 10 years
  • Machine Learning - 10 years
  • Python - 10 years
  • R - 8 years
  • SQL - 6 years
  • RStudio Shiny - 5 years

Availability

Part-time

Preferred Environment

Bash, Linux, Git, Jupyter, Docker, Python, Amazon Web Services (AWS), SQL, R

The most amazing...

...system I've developed has been running autonomously in Antarctica for over a decade.

Work Experience

Web Crawling Specialist

2020 - PRESENT
Unrival Limited
  • Developed a web scraper for extracting data from a large social media platform for a B2B marketing product.
  • Generated automated reports in HTML and PDF using scraped data.
  • Used the Watson APIs to parse and analyze scraped data.
  • Used the Bing Maps API to geolocate locations in scraped data.
  • Developed a flexible web scraping framework to gather data from over 100 different companies' C-suite pages.
Technologies: Python, Unstructured Data Analysis, Web Crawlers, Data Analysis, Amazon Web Services (AWS), MySQL, SQL, Bing API, Large-scale Web Crawlers, APIs, Amazon S3 (AWS S3), Data Visualization, Pandas, Algorithms, Flask, JavaScript

Founder | Data Scientist

2017 - PRESENT
Fathom Data
  • Cleaned, prepared, and analyzed data in both R and Python.
  • Built machine learning and deep learning models in both R and Python. Many of the models were subsequently deployed behind APIs.
  • Managed a team of data scientists and coordinated and interfaced with clients.
  • Automated documentation. Used R Markdown to generate reports and presentations automatically.
  • Developed and maintained a number of packages for R and Python.
  • Prepared and delivered lectures and presentations, including training and speaking at conferences and workshops.
Technologies: Linux, SQL, Python, R, Web Scraping, Machine Learning, Bash, RStudio Shiny, Data Science, Automation, ArcGIS, Geospatial Data, Technical Writing

Freelance Data Scientist

2016 - PRESENT
Toptal
  • Built a robust web scraper for extracting data on persons and organizations from LinkedIn and Sales Navigator.
  • Constructed a PostgreSQL database for storing medical and drug data and implemented an ETL pipeline.
  • Used Python and spaCy to extract salient information from LinkedIn profiles and blog posts.
Technologies: Python, R, Git, Amazon Web Services (AWS), Docker, Web Scraping, Machine Learning, Bash, RStudio Shiny, Data Science, SQL, Automation, Amazon S3 (AWS S3)

Founder/Data Scientist

2008 - PRESENT
Exegetic Analytics
  • Conducted data analyses for clinical trials.
  • Developed a conformance analysis system for use in the printing industry.
  • Implemented a Kagi chart indicator in MQL4.
  • Analyzed the effects of news events on FOREX trading using data scraped from myfxbook.
  • Initiated Durban R User Group and Durban Data Science Meetup.
Technologies: Linux, SQL, Python, R, Web Scraping, Machine Learning, Bash, RStudio Shiny, Data Science

Python Engineer

2023 - 2024
HumanOS
  • Designed and implemented a database and set it up on Amazon RDS.
  • Created a Flask API to interface the database to desktop and mobile apps.
  • Integrated the API with a third-party (WeFitter) API to gather wearable data.
Technologies: Python, PostgreSQL, Flask, APIs, Amazon EC2, Amazon S3 (AWS S3), WebSockets, Amazon RDS

Python Data Analyst and Tech Writer | Loom Tutorial Screencasts

2022 - 2023
Domino Data Lab
  • Created videos and tutorial content for existing and new features.
  • Updated and maintained documentation. Added automation to the website build.
  • Provided feedback and bug reporting on new features.
Technologies: R, Python, Pandas, Technical Writing, JavaScript

R Engineer - Shiny App

2019 - 2022
BluePath Solutions LLC.
  • Developed multiple Shiny apps for interacting with data.
  • Developed a web crawler to extract pharmaceutical pricing data.
  • Designed and built a database using PostgreSQL; deployed on Amazon RDS.
Technologies: R, Data Science, Machine Learning, Amazon S3 (AWS S3), Data Visualization, Algorithms, Flask

Content Creator

2018 - 2019
DataCamp
  • Designed the content of an online course about machine learning with Spark.
  • Developed the course content, script, and associated material.
  • Created slides, recorded video and audio, and edited content.
  • Continued maintenance of the course and responded to issues raised by students.
Technologies: Spark, Python

Senior Data Scientist

2013 - 2017
Derivco
  • Coded a game recommendation engine.
  • Developed a game/player anomaly detection system.
  • Automated routine analyses.
  • Automated report generation.
  • Initiated Data Science Working Group.
Technologies: Linux, Microsoft Excel, SQL, Python, R, Web Scraping, Bash, RStudio Shiny, Data Science, Data Visualization, Pandas

Honorary Senior Lecturer

2004 - 2015
University of KwaZulu-Natal
  • Developed an autonomous observation system for experiments in Antarctica.
  • Applied machine learning techniques to lightning distributions.
  • Mentored students in R and data analysis.
  • Presented analytical results at numerous international conferences.
  • Published research results in international journals.
Technologies: Linux, MATLAB, Octave, R, Technical Writing

{emayili}

https://github.com/datawookie/emayili
An R package for sending emails.

The package has minimal dependencies and exposes a tidy API for writing and sending emails. It has detailed documentation and an extensive test suite.

The package has also been the subject of a number of blog posts and conference/meetup talks.

Trundler R Package

https://github.com/datawookie/trundler
An R wrapper for the Trundler API.

Trundler is a service that aggregates retail price data acquired via web scraping. The data are available via an API. This package provides a consistent set of functions for accessing the API from R.

Trundler Python Package

https://github.com/datawookie/trundlerpy
A Python wrapper for the Trundler API.

Trundler is a service that aggregates retail price data acquired via web scraping. The data are available via an API. This package provides a consistent set of functions for accessing the API from Python.

Scientific Advisor

Supervised two Ph.D. and numerous M.Sc. theses in the field of space physics.

2001 - 2006

Ph.D. Degree in Space Physics

Royal Institute of Technology - Stockholm, Sweden

1996 - 1998

M.Sc. Degree in Nuclear Physics

University of Potchefstroom - Potchefstroom, South Africa

1990 - 1993

B.Sc. (Hons) Degree in Physics & Mathematics

University of Natal - Durban, South Africa

JUNE 2006 - PRESENT

PhD

Royal Institute of Technology

Libraries/APIs

REST APIs, Beautiful Soup, Bing API, ArcGIS, Pandas

Tools

Microsoft Excel, Jupyter, Git, MATLAB

Languages

Python, SQL, Bash, R, Octave, C++, CSS, HTML, Sed, JavaScript

Platforms

Linux, RStudio, Docker, Amazon Web Services (AWS), Amazon EC2

Frameworks

Selenium, Scrapy, Flask, Django, RStudio Shiny, Spark

Paradigms

Automation

Storage

Amazon S3 (AWS S3), MongoDB, Neo4j, MySQL, PostgreSQL

Other

Machine Learning, Web Scraping, Task Automation, Regular Expressions, Visualization, Data Science, Statistics, Data Analysis, Artificial Intelligence (AI), Technology Consulting, Data Visualization, Technical Writing, Algorithms, Bayesian Statistics, Unstructured Data Analysis, Web Crawlers, Large-scale Web Crawlers, APIs, Geospatial Data, WebSockets, Amazon RDS
