
Steve Thomas

Verified Expert in Engineering

Software Developer

Location

San Francisco, CA, United States

Toptal Member Since

July 20, 2020

Steve is a software engineer passionate about machine learning and automating data-intensive workflows. His goal is to free scientists and BI analysts from creating and maintaining complex cloud environments so they can focus on what they do best. He has experience collaborating with the interdisciplinary perception teams at some of the top autonomous vehicle companies and building tools to automate the model training lifecycle.

Deep Learning, Python, TensorFlow, PyTorch, Docker, Amazon Web Services (AWS), Kubernetes, Machine Learning, Computer Vision, Kotlin, Django, Algorithms, Robot Operating System (ROS), C++, PyCharm

Portfolio

Engine ML

Amazon Web Services (AWS), InfluxDB, Elasticsearch, PyTorch, TensorFlow, Gradle...

Self-employed

Amazon Web Services (AWS), Docker, PyTorch, TensorFlow, C++, Python

Experience

Amazon Web Services (AWS) - 4 years, TensorFlow - 4 years, PyTorch - 4 years, Docker - 4 years, Python - 4 years, Kotlin - 2 years, Kubernetes - 2 years, Django - 1 year

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Linux, Kubernetes, Docker, IntelliJ IDEA, PyCharm

The most amazing...

...thing I've developed was my company's entire machine learning experiment tracking product offering so users could collaborate and visually compare their runs.

Work Experience

Software Engineer/Resident Deep Learning Expert

2018 - 2020

Engine ML

Designed and programmed our local, freemium offering that allowed users to run deep learning experiments on their own hardware, persist all relevant logs and metrics, and compare the results in the engine dashboard alongside their cloud jobs.
Led research showing how layer-wise optimizers (e.g., LAMB) can train object detectors with large batch sizes in a fraction of the time without performance degradation. Results can be found on our company blog at https://bit.ly/35gfM0P.
Designed and programmed an alerting service that analyzed each experiment's log output and GPU utilization to notify users when the experiment entered a terminal state or had potentially entered a race condition.
Designed and programmed a feature to pre-fetch training data from S3 buckets, storing it in an in-memory read-through cache using Alluxio and Alluxio’s FUSE-based POSIX API, resulting in up to a 5x speedup when reading a remote file.
Built a cat detector that was trained live in five minutes on 64 GPUs at VentureBeat Transform 2019 using a TensorFlow implementation of RetinaNet. The demonstration by our CEO can be found at https://bit.ly/2YdMbnr.

Technologies: Amazon Web Services (AWS), InfluxDB, Elasticsearch, PyTorch, TensorFlow, Gradle, Kotlin, Python, Kubernetes, Docker

Independent Machine Learning Researcher

2016 - 2018

Self-employed

Achieved the status of “Top Contender” in The Lyft Perception Challenge 2018, a semantic segmentation competition, using a tweaked version of Google’s DeepLabV3 with ResNet-152 as the backbone.
Designed and integrated perception, behavior planning, trajectory generation, and controller modules so Udacity’s driverless car could safely navigate a road with traffic lights (https://github.com/sathomas2/CarND-Capstone-Solution).
Taught myself the major developments in deep learning and the mathematical theory behind them by reading and replicating papers.

Technologies: Amazon Web Services (AWS), Docker, PyTorch, TensorFlow, C++, Python

Experience

Training Mask-RCNN 10x Faster with LAMB

I ran experiments showing that large-batch training does not negatively impact the performance of object detection neural networks. I demonstrated that, with the right optimizer and learning rate schedule, training time can be reduced from weeks to less than a day on a large cluster of distributed GPUs. I ran all of these experiments on AWS spot instances, which drastically reduced computing costs.

Train a Cat Detector Live on 64 GPUs in Less Than Five Minutes

https://www.youtube.com/watch?v=GKVbPFpEBHk&feature=youtu.be&t=6724

I designed and coded a cat detector that was trained live in five minutes on 64 GPUs at VentureBeat Transform 2019 using a TensorFlow implementation of RetinaNet. The demonstration by our CEO can be found on YouTube. I was responsible for creating the model, as well as the AWS cloud environment and Kubernetes cluster where it was trained.

Skills

Languages

Python, Kotlin, C++

Libraries/APIs

TensorFlow, PyTorch

Platforms

Docker, Kubernetes, Amazon Web Services (AWS), Linux

Other

Leadership, Communication, Teamwork, Deep Learning, Algorithms, Machine Learning, Object Detection, Computer Vision, Analytical Thinking, Robot Operating System (ROS), Localization, PID Controllers, Reinforcement Learning

Frameworks

Django

Tools

PyCharm, IntelliJ IDEA, Gradle

Storage

Elasticsearch, InfluxDB

Education

2012 - 2014

Master's Degree in Philosophy

New York University - New York, NY

2006 - 2010

Bachelor's Degree in English and Economics

Bowdoin College - Brunswick, ME

Certifications

APRIL 2018 - PRESENT

Self-Driving Car Engineer Nanodegree

Udacity

APRIL 2018 - PRESENT

Graph Search, Shortest Paths, and Data Structures

Coursera

APRIL 2018 - PRESENT

Divide and Conquer, Sorting and Searching, and Randomized Algorithms

Coursera

SEPTEMBER 2017 - PRESENT

Deep Learning Foundation Nanodegree

Udacity

FEBRUARY 2017 - PRESENT

Machine Learning

Coursera
