Michael Henning

Verified Expert in Engineering

Performance Tuning Developer

Location

New York, NY, United States

Toptal Member Since

June 28, 2021

Michael's professional experience includes deep learning and computer vision at Facebook and a venture-funded startup. He enjoys freelancing because of the flexibility and variety it provides. Michael is looking forward to working with clients who can use his specialized skills in deep learning and performant systems, as well as clients who need more general back-end development expertise.

C++, Git, C, Deep Learning, Python, Python 3, PyTorch, Rust, Django, Google Cloud, Amazon Web Services (AWS), OpenGL, Linux, Visual Studio Code (VS Code), Java, CNN

Portfolio

Facebook

C++, Python, PyTorch

GrokStyle

C++, Python, PyTorch, Django, Amazon Web Services (AWS), Google Cloud

Google

C++, Image Processing

Experience

Python - 3 years
Performance Tuning - 3 years
C++ - 3 years
Git - 3 years
C - 2 years
PyTorch - 2 years
Convolutional Neural Networks (CNN) - 2 years
Image Processing - 1 year

Availability

Part-time

Preferred Environment

Linux, Visual Studio Code (VS Code), Git

The most amazing...

...improvement I've made to a production system cut search cost and latency by a factor of four at a startup where search was our core focus.

Work Experience

Computer Vision Software Engineer

2019 - 2021

Facebook

Debugged and configured a production image inference system in C++ that processes thousands of queries per second. Solved a problem that negatively impacted accuracy for all image models and proactively caught a subtle use-after-free bug.
Contributed research directions and code to GrokNet, a model for fine-grained image recognition across several business verticals, including developing a novel way to aggregate contrastive loss pairs across GPUs.
Co-developed SimSearchNet++, a model for near-duplicate image detection.

Technologies: C++, Python, PyTorch

Software Engineer

2018 - 2019

GrokStyle

Improved throughput of image loading by over eight times to support multi-GPU training.
Led migration of the core training and inference stack from Caffe to PyTorch.
Cut search cost and latency by a factor of four by finding and implementing optimizations in a core search routine, including replacing a BLAS-accelerated implementation with a custom C++ implementation.

Technologies: C++, Python, PyTorch, Django, Amazon Web Services (AWS), Google Cloud

Software Engineering Intern

2017

Google

Ported an image super-resolution algorithm to run on custom coprocessors.
Leveraged symmetries to fit large lookup tables into 16 KB of on-chip memory.
Wrote a fixed-point square root approximation that achieved a twofold speedup.

Technologies: C++, Image Processing

Experience

GrokNet

I was integral to the development of a model for fine-grained image recognition across several business verticals.

My contributions included writing a training system in Python that loaded images fast enough to keep GPU utilization high during distributed training runs while preserving developer flexibility; developing a novel way to aggregate contrastive loss pairs across GPUs, which improved model accuracy; and catching several discrepancies between test systems and production that affected the model's final accuracy.

SimSearchNet++

https://ai.facebook.com/blog/heres-how-were-using-ai-to-help-detect-misinformation

I contributed accuracy and speed improvements to a neural network used for duplicate image detection, including comparisons between a variety of architectures and quantization to int8. I was also involved in deploying this model to production traffic, where I proactively caught a subtle issue that caused early load testing results to underestimate resource usage by an order of magnitude. I also tracked down a subtle bug in which a fraction of a percent of production traffic received different results, which turned out to be caused by differences in CPU microarchitectures across our server cluster.

Skills

Languages

C++, C, Python, Rust, Python 3, Java, Go, OCaml

Tools

Git

Other

Memory Management, Performance Tuning, Deep Learning, Convolutional Neural Networks (CNN), Image Processing

Libraries/APIs

PyTorch, OpenGL

Frameworks

Django, Presto

Paradigms

Functional Programming

Platforms

Linux, Visual Studio Code (VS Code), Amazon Web Services (AWS)

Storage

Google Cloud, PostgreSQL

Education

2014 - 2018

Bachelor's Degree in Computer Science

Cornell University - Ithaca, NY, USA
