Shane Keller

Verified Expert in Engineering

Machine Learning Developer

Location

San Francisco, CA, United States

Toptal Member Since

July 14, 2020

Shane is a machine learning engineer with skills in data science, data engineering, and cloud automation. He has a track record of developing big data applications and experience with all aspects of building production-grade machine learning systems, including big data collection, model development, model deployment, and infrastructure.

Machine Learning, Python 3, SQL, Pandas, TensorFlow, Keras, Anaconda, Amazon Web Services (AWS), Java 8, Terraform, Docker, Google Cloud Platform (GCP), Cassandra, Spark, MacOS, Dask

Portfolio

Level

Kubernetes, SQL, Python

scikit-learn

Scikit-learn

Temple Capital

Amazon Web Services (AWS), SQL, Python

Experience

SQL - 6 years
Python 3 - 6 years
TensorFlow - 4 years
Amazon Web Services (AWS) - 4 years
Data Science - 4 years
Pandas - 4 years
Scikit-learn - 4 years
Machine Learning - 3 years

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Google Cloud Platform (GCP), DataGrip, PyCharm, Jupyter, Linux, Unix, MacOS

The most amazing...

...thing I've done is use data science and machine learning to help build an automated trading system that earns millions of dollars.

Work Experience

Senior Data Scientist

2020 - PRESENT

Level

Joined Level as its second-most senior data scientist. Level is a fast-growing healthtech startup backed by First Round Capital and other elite VC firms.
Built machine learning models to predict insurance costs, detect fraud, and determine the quality of service providers. These models are key to Level's competitive advantage.
Performed exploratory data analysis on various data sets to deliver critical business insights and inform product development.

Technologies: Kubernetes, SQL, Python

Contributor

2020 - PRESENT

scikit-learn

Reviewed statistics papers on data sampling to find evidence supporting a new stratified sampling feature for regression data.
Contributed to docs on machine learning and data science best practices.
Advocated for new features such as p-value support in linear regression models.

Technologies: Scikit-learn

Senior Data Engineer

2018 - 2020

Temple Capital

Hired as employee #1 at a hedge fund backed by Pantera Capital and Bain Capital that used machine learning to find profitable trading opportunities.
Worked with a machine learning stack that used proprietary statistical features, XGBoost and Random Forest regression models, and Bayesian hyperparameter optimization to train profitable time series prediction models.
Used pandas and SQL to create dashboards that analyzed trading strategy performance and identified changes to our trading system, increasing trading profits by up to 3%.
Explored historical market data to find interesting patterns and trends that could support novel trading strategies.
Built the fund's data platform and data lake on AWS using Postgres (RDS), S3, Presto (Athena), Batch, Step Functions, and ECS. Our machine learning platform used it to ingest terabytes of data and discover new trading strategies.
Served as the sole engineer on the DevOps and cloud automation side, and set up a robust and stable system that needed very little maintenance and had almost zero downtime over 1.5 years. Deployed hundreds of machines with Docker and Terraform.

Technologies: Amazon Web Services (AWS), SQL, Python

Senior Software Engineer

2016 - 2018

Fitbit

Worked with the data science team to investigate and predict user behavior to increase Fitbit user retention.
Used pandas and SQL to investigate and debug system outages to ensure a seamless Fitbit user experience.
Served as a lead contributor to the Fitbit API Authorization Service, a distributed system that handles authentication and authorization using OAuth 2. The system receives 80,000 requests per second (RPS), serving all third-party apps as well as the Fitbit mobile and web apps.
Built the migration framework used to move the Auth Service database from MySQL to Cassandra, making the service more scalable with no API downtime. Designed the Cassandra schema.

Technologies: Java, SQL, Python

Experience

Outperforming Google Cloud AutoML Vision with Tensorflow

https://medium.com/@skeller88

Built a deep learning binary classifier that detects clouds in satellite images with 95% recall and 91% precision, outperforming Google Cloud AutoML by 3% and 9%, respectively. Technologies used were Keras/TensorFlow, scikit-learn, Dask, Docker, and Google Cloud Platform. Published in Towards Data Science.

Skills

Languages

Python 3, SQL, Java 8, Python, Java

Libraries/APIs

Pandas, Scikit-learn, TensorFlow, Keras

Paradigms

Data Science

Platforms

Anaconda, Amazon Web Services (AWS), Google Cloud Platform (GCP), Docker, MacOS, Unix, Linux, Kubernetes

Other

Machine Learning

Tools

Terraform, Jupyter, PyCharm, DataGrip

Storage

Cassandra

Frameworks

Spark

Education

2014 - 2014

Continuing Education in Data Mining, Computer Science

Stanford University - Palo Alto, CA

2006 - 2010

Bachelor's Degree in Neuroscience

University of Southern California - Los Angeles, CA

Certifications

JUNE 2014 - PRESENT

Software Engineering

Hack Reactor
