Shane Keller, Machine Learning Developer in San Francisco, CA, United States
Member since July 14, 2020
Shane is a machine learning engineer with skills in data science, data engineering, and cloud automation. He has a track record of developing big data applications and experience with all aspects of building production-grade machine learning systems, including big data collection, model development, model deployment, and infrastructure.

Location

San Francisco, CA, United States

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Google Cloud Platform (GCP), DataGrip, PyCharm, Jupyter, Linux, Unix, MacOS

The most amazing...

...thing I've done is use data science and machine learning to help build an automated trading system that earns millions of dollars.

Employment

  • Senior Data Scientist

    2020 - PRESENT
    Level
    • Joined Level as its second-most senior data scientist. Level is a fast-growing healthtech startup backed by First Round Capital and other elite VC firms.
    • Built machine learning models to predict insurance costs, detect fraud, and determine the quality of service providers. These models are key to Level's competitive advantage.
    • Performed exploratory data analysis on various data sets to deliver critical business insights and inform product development.
    Technologies: Kubernetes, SQL, Python
  • Contributor

    2020 - PRESENT
    scikit-learn
    • Read statistics papers on data sampling to find evidence for a new stratified sampling feature for regression data.
    • Contributed to docs on machine learning and data science best practices.
    • Advocated for new features such as p-value support in linear regression models.
    Technologies: Scikit-learn
  • Senior Data Engineer

    2018 - 2020
    Temple Capital
    • Hired as employee #1 at a hedge fund backed by Pantera Capital and Bain Capital that used machine learning to find profitable trading opportunities.
    • Worked with a machine learning stack that used proprietary statistical features, XGBoost and Random Forest regression models, and Bayesian hyperparameter optimization to train profitable time series prediction models.
    • Used pandas and SQL to create dashboards that analyzed trading strategy performance and identified changes to our trading system, increasing trading profits by up to 3%.
    • Explored historical market data to find interesting patterns and trends that could support novel trading strategies.
    • Built the fund's data platform and data lake on AWS using Postgres (RDS), S3, Presto (Athena), Batch, Step Functions, and ECS. The platform was used by our machine learning platform to ingest TBs of data and discover new trading strategies.
    • Served as the sole engineer on the DevOps and cloud automation side, and set up a robust and stable system that needed very little maintenance and had almost zero downtime over 1.5 years. Deployed hundreds of machines with Docker and Terraform.
    Technologies: Amazon Web Services (AWS), SQL, Python
  • Senior Software Engineer

    2016 - 2018
    Fitbit
    • Worked with the data science team to investigate and predict user behavior to increase Fitbit user retention.
    • Used pandas and SQL to investigate and debug system outages to ensure a seamless Fitbit user experience.
    • Served as a lead contributor to the Fitbit API Authorization Service, a distributed system that handles authentication and authorization using OAuth 2. The system receives 80,000 RPS, serving all third-party apps, Fitbit mobile, and Fitbit web apps.
    • Built the migration framework used to move the Auth Service database from MySQL to Cassandra, improving service scalability with no API downtime. Designed the Cassandra schema.
    Technologies: Java, SQL, Python

Experience

  • Outperforming Google Cloud AutoML Vision with Tensorflow
    https://medium.com/@skeller88

    Built a binary classifier that used deep learning to detect clouds in satellite images with 95% recall and 91% precision, and outperformed Google Cloud AutoML by 3% and 9%, respectively. Technologies used were Keras/TensorFlow, scikit-learn, Dask, Docker, and Google Cloud Platform. Published in Towards Data Science.

Skills

  • Languages

    Python 3, SQL, Java 8, Python, Java
  • Libraries/APIs

    Pandas, Scikit-learn, TensorFlow, Keras
  • Paradigms

    Data Science
  • Platforms

    Anaconda, Amazon Web Services (AWS), Google Cloud Platform (GCP), Docker, MacOS, Unix, Linux, Kubernetes
  • Other

    Machine Learning
  • Tools

    Terraform, Jupyter, PyCharm, DataGrip
  • Storage

    Cassandra
  • Frameworks

    Spark

Education

  • Continuing Education in Data Mining, Computer Science
    2014 - 2014
    Stanford University - Palo Alto, CA
  • Bachelor's Degree in Neuroscience
    2006 - 2010
    University of Southern California - Los Angeles, CA

Certifications

  • Software Engineering
    JUNE 2014 - PRESENT
    Hack Reactor
