Steve Thomas, Developer in San Francisco, CA, United States

Steve Thomas

Verified Expert in Engineering

Software Developer

Location
San Francisco, CA, United States
Toptal Member Since
July 20, 2020

Steve is a software engineer passionate about machine learning and automating data-intensive workflows. His goal is to free scientists and BI analysts from creating and maintaining complex cloud environments so they can focus on what they do best. He has experience collaborating with the interdisciplinary perception teams at some of the top autonomous vehicle companies and building tools to automate the model training lifecycle.

Portfolio

Engine ML
Amazon Web Services (AWS), InfluxDB, Elasticsearch, PyTorch, TensorFlow, Gradle...
Self-employed
Amazon Web Services (AWS), Docker, PyTorch, TensorFlow, C++, Python

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Linux, Kubernetes, Docker, IntelliJ IDEA, PyCharm

The most amazing...

...thing I've developed was my company's entire machine learning experiment-tracking product, which let users collaborate and visually compare their runs.

Work Experience

Software Engineer/Resident Deep Learning Expert

2018 - 2020
Engine ML
  • Designed and programmed our local freemium offering, which let users run deep learning experiments on their own hardware, persist all relevant logs and metrics, and compare the results in the Engine dashboard alongside their cloud jobs.
  • Led research showing how layer-wise optimizers (e.g., LAMB) can train object detectors with large batch sizes in a fraction of the time without performance degradation. The results are on our company blog at https://bit.ly/35gfM0P.
  • Designed and programmed an alerting service that analyzed each experiment's log output and GPU utilization to notify users when the experiment entered a terminal state or had potentially entered a race condition.
  • Designed and programmed a feature that pre-fetched training data from S3 buckets into an in-memory read-through cache built on Alluxio and Alluxio’s FUSE-based POSIX API, resulting in up to a 5x speedup when reading a remote file.
  • Built a cat detector that was trained live in five minutes on 64 GPUs at VentureBeat Transform 2019 using a TensorFlow implementation of RetinaNet. The demonstration by our CEO can be found at https://bit.ly/2YdMbnr.
Technologies: Amazon Web Services (AWS), InfluxDB, Elasticsearch, PyTorch, TensorFlow, Gradle, Kotlin, Python, Kubernetes, Docker
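
As a rough illustration of the read-through idea behind the pre-fetching feature above: the sketch below is a generic in-memory example, not Alluxio's API and not the Engine ML implementation. The names `ReadThroughCache`, `fetch`, and `prefetch` are hypothetical, and `fetch` stands in for any slow remote read such as an S3 object download.

```python
class ReadThroughCache:
    """Serve reads from memory, falling back to a slow fetch on a miss."""

    def __init__(self, fetch):
        # `fetch` is a hypothetical callable: key -> bytes (e.g. a remote read).
        self._fetch = fetch
        self._store = {}
        self.misses = 0

    def read(self, key):
        # On a miss, read through to the backing store and keep the result.
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._fetch(key)
        return self._store[key]

    def prefetch(self, keys):
        # Warm the cache ahead of time so later reads are served from memory.
        for key in keys:
            self.read(key)
```

After `prefetch` has warmed the cache, repeated `read` calls for the same keys never touch the backing store again, which is where a speedup on remote files would come from.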

Independent Machine Learning Researcher

2016 - 2018
Self-employed
  • Achieved the status of “Top Contender” in The Lyft Perception Challenge 2018, a semantic segmentation competition, using a tweaked version of Google’s DeepLabV3 with ResNet-152 as the backbone.
  • Designed and integrated perception, behavior planning, trajectory generation, and controller modules so Udacity’s driverless car could safely navigate a road with traffic lights (https://github.com/sathomas2/CarND-Capstone-Solution).
  • Taught myself the major developments in deep learning and the mathematical theory behind them by reading and replicating papers.
Technologies: Amazon Web Services (AWS), Docker, PyTorch, TensorFlow, C++, Python

Training Mask-RCNN 10x Faster with LAMB

I ran experiments showing that large-batch training does not have to hurt the performance of object detection neural networks. With the right optimizer and learning rate schedule, training on a large cluster of distributed GPUs cut training time from weeks to less than a day. I also ran all of my experiments on AWS spot instances, which drastically reduced computing costs.
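
For context, the core of a layer-wise optimizer such as LAMB is a per-layer trust ratio: each layer's step is rescaled by the ratio of its weight norm to its update norm. The sketch below is a simplified plain-Python illustration of that one idea, not the code used in these experiments. The names `l2_norm` and `lamb_layer_step` are hypothetical, and `update` stands in for the Adam-style direction (including weight decay) that LAMB computes for the layer.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lamb_layer_step(weights, update, lr):
    """Apply one trust-ratio-scaled step to a single layer's weights."""
    w_norm = l2_norm(weights)
    u_norm = l2_norm(update)
    # Fall back to a ratio of 1 when either norm is zero to avoid dividing by zero.
    trust_ratio = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    return [w - lr * trust_ratio * u for w, u in zip(weights, update)]
```

Because the step is normalized per layer, multiplying `update` by a constant leaves the resulting weights unchanged, so the effective step size depends on the layer's own weight norm rather than on the raw update magnitude.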

Train a Cat Detector Live on 64 GPUs in Less Than Five Minutes

https://www.youtube.com/watch?v=GKVbPFpEBHk&feature=youtu.be&t=6724
I designed and coded a cat detector that was trained live in five minutes on 64 GPUs at VentureBeat Transform 2019 using a TensorFlow implementation of RetinaNet. Our CEO's demonstration is available on YouTube at the link above. I was responsible for creating the model as well as the AWS cloud environment and Kubernetes cluster where it was trained.

Languages

Python, Kotlin, C++

Libraries/APIs

TensorFlow, PyTorch

Platforms

Docker, Kubernetes, Amazon Web Services (AWS), Linux

Other

Leadership, Communication, Teamwork, Deep Learning, Algorithms, Machine Learning, Object Detection, Computer Vision, Analytical Thinking, Robot Operating System (ROS), Localization, PID Controllers, Reinforcement Learning

Frameworks

Django

Tools

PyCharm, IntelliJ IDEA, Gradle

Storage

Elasticsearch, InfluxDB

Education

2012 - 2014

Master's Degree in Philosophy

New York University - New York, NY

2006 - 2010

Bachelor's Degree in English and Economics

Bowdoin College - Brunswick, ME

Certifications

APRIL 2018 - PRESENT

Self-Driving Car Engineer Nanodegree

Udacity

APRIL 2018 - PRESENT

Graph Search, Shortest Paths, and Data Structures

Coursera

APRIL 2018 - PRESENT

Divide and Conquer, Sorting and Searching, and Randomized Algorithms

Coursera

SEPTEMBER 2017 - PRESENT

Deep Learning Foundation Nanodegree

Udacity

FEBRUARY 2017 - PRESENT

Machine Learning

Coursera
