Michael Henning
Verified Expert in Engineering
Performance Tuning Developer
New York, NY, United States
Toptal member since June 28, 2021
Michael's professional experience includes deep learning and computer vision at Facebook and a venture-funded startup. He enjoys freelancing because of the flexibility and variety it provides. Michael is looking forward to working with clients who can use his specialized skills in deep learning and performant systems, as well as clients who need more general back-end development expertise.
Experience
- Python - 3 years
- Performance Tuning - 3 years
- C++ - 3 years
- Git - 3 years
- C - 2 years
- PyTorch - 2 years
- Convolutional Neural Networks (CNNs) - 2 years
- Image Processing - 1 year
Preferred Environment
Linux, Visual Studio Code (VS Code), Git
The most amazing...
...improvement I've made to a production system cut search cost and latency by a factor of four at a startup where the search was our core focus.
Work Experience
Computer Vision Software Engineer
- Debugged and configured a production image inference system in C++ that processes thousands of queries per second. Solved a problem that negatively impacted accuracy for all image models and proactively caught a subtle use-after-free bug.
- Contributed research directions and code to GrokNet, a model for fine-grained image recognition across several business verticals, including developing a novel way to aggregate contrastive loss pairs across GPUs.
- Co-developed SimSearchNet++, a model for near-duplicate image detection.
Software Engineer
GrokStyle
- Increased image-loading throughput more than eightfold to support multi-GPU training.
- Led migration of the core training and inference stack from Caffe to PyTorch.
- Cut search cost and latency by a factor of four by finding and implementing optimizations in a core search routine, including replacing a BLAS-accelerated implementation with a custom C++ implementation.
Software Engineering Intern
- Ported an image super-resolution algorithm to run on custom coprocessors.
- Leveraged symmetries to fit large lookup tables into 16 KB of on-chip memory.
- Wrote a fixed-point square root approximation that achieved a twofold speedup.
Experience
GrokNet
My contributions included: writing a Python training system that loaded images fast enough to keep GPU utilization high during distributed training runs without sacrificing developer flexibility; developing a novel way to aggregate contrastive loss pairs across GPUs, which improved model accuracy; and catching several discrepancies between test and production systems that affected the model's final accuracy.
SimSearchNet++
Education
Bachelor's Degree in Computer Science
Cornell University - Ithaca, NY, USA
Skills
Libraries/APIs
PyTorch, OpenGL
Tools
Git
Languages
C++, C, Python, Rust, Python 3, Java, Go, OCaml
Frameworks
Django, Presto
Paradigms
Functional Programming
Platforms
Linux, Visual Studio Code (VS Code), Amazon Web Services (AWS)
Storage
Google Cloud, PostgreSQL
Other
Memory Management, Performance Tuning, Deep Learning, Convolutional Neural Networks (CNNs), Image Processing