
William Guss

Verified Expert in Engineering

Artificial Intelligence Developer

Location
San Francisco, United States
Toptal Member Since
October 18, 2022

William is a computer scientist and mathematician who conducted doctoral research in deep learning theory and reinforcement learning at Carnegie Mellon University under Ruslan Salakhutdinov. He founded MineRL to develop general AI in Minecraft via human priors. In 2020, MineRL was acquired by OpenAI, where he served as a research scientist, codeveloping GitHub Copilot, Codex, and algorithms for imitation learning and alignment. Currently, William runs Lydian, an ML and AI consulting firm.

Portfolio

GhostWrite
Blitz, Node.js, JavaScript, React, Segment...
OpenAI
PyTorch, MPI, Deep Learning, Deep Reinforcement Learning...
Carnegie Mellon University
Deep Learning, Torch, React, Node.js, Express.js, Mathematica, Optimization...

Experience

Availability

Part-time

Preferred Environment

Visual Studio Code (VS Code), Python, JavaScript, C#, PyTorch, TensorFlow, Node.js, Generative Pre-trained Transformers (GPT)

The most amazing...

...tool I've developed is Copilot, a large language model capable of programming at a near-human level.

Work Experience

Chief Operating Officer

2022 - PRESENT
GhostWrite
  • Developed GhostWrite, an AI email assistant product, growing it from one to 2,000 users in two months with net-positive cash flow.
  • Managed a team of three engineers and two marketers across the product's development, rollout, roadmap, and business development.
  • Built the entire web and ML stack from zero to one: Heroku-based web services, a Delta Lake data management system, Segment analytics, and cluster ML deployment for high-throughput product delivery.
Technologies: Blitz, Node.js, JavaScript, React, Segment, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Reinforcement Learning, Web Extensions, Artificial Intelligence (AI)

Research Scientist

2020 - 2022
OpenAI
  • Developed methodologies for scaling large language models with human feedback, including expert iteration (ExIt) and reinforcement learning from human feedback (RLHF). Also uncovered data feedback effects and compute scaling laws for model alignment.
  • Codeveloped Codex, a large language model capable of programming at a human level.
  • Cocreated the first release of Copilot, the VS Code AI autocomplete extension which was acquired by GitHub.
  • Led the handover of MineRL, my acquired project for developing large-scale Minecraft AI models, and organized four official NeurIPS workshops and competitions.
  • Built new imitation learning algorithms based on distance-to-measure techniques from computational topology and applied them to procedural generation environments.
  • Conducted extensive testing of state-of-the-art imitation learning algorithms in complex environments.
Technologies: PyTorch, MPI, Deep Learning, Deep Reinforcement Learning, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Clustering, Computer Vision, Cluster, Kubernetes, Docker, GPU Computing, NVIDIA CUDA, Jupyter Notebook, Artificial Intelligence (AI)

Doctoral Researcher

2017 - 2021
Carnegie Mellon University
  • Developed the first method for application of computational topology to deep neural networks: https://arxiv.org/pdf/1802.04443.pdf.
  • Created the MineRL project, a large-scale effort to reproduce general human intelligence in open-world domains through internet-scale behavioral cloning. OpenAI later acquired this project.
  • Managed 10+ team members in Japan, the USA, England, India, and Germany, as well as several research interns across many projects. See the list of publications at: https://scholar.google.com/citations?view_op=list_works&hl=en&hl=en&user=5bB_sFcAAAAJ.
  • Created infinite dimensional extensions to deep neural networks: https://arxiv.org/pdf/1612.04799.pdf. I also proved the very first universal approximation theorem for these networks: https://arxiv.org/pdf/1910.01545.pdf.
Technologies: Deep Learning, Torch, React, Node.js, Express.js, Mathematica, Optimization, Computational Topology, Artificial Intelligence (AI)

Visiting Researcher

2019
Freie Universität Berlin
  • Acted as a visiting researcher at the university's Institute of Mathematics, working in the discrete topology and geometry group.
  • Developed a theoretical framework for analyzing deep neural networks using algebraic topology.
  • Built new theoretical foundations for analyzing neural hyperplane arrangements. These formations are central to neural codes and compression theories of neural network learning.
Technologies: Computational Topology, GPU Computing, Artificial Intelligence (AI)

Chief Technology Officer

2017 - 2019
InfoPlay
  • Worked for InfoPlay, a cryptocurrency hedge fund applying stochastic gradient methods to markets.
  • Developed long-term roadmaps for technology acquisition and strategized the development of proprietary methodologies.
  • Created a novel reinforcement learning approach for acting in multimodal, non-stationary environments, leveraging multiple asynchronous data sources.
  • Led and managed a small team to implement the technology roadmap.
  • Managed the development of a deep reinforcement learning infrastructure for online trading in financial markets.
Technologies: Deep Learning, Deep Reinforcement Learning, TensorFlow, PyTorch, Artificial Intelligence (AI)

Founder, Director of Research, and President

2015 - 2017
Machine Learning at Berkeley
  • Successfully launched and managed six research teams studying deep learning theory and applications.
  • Researched deep active learning, a bridge between deep learning and active learning, using policy and selection steps inspired by AlphaGo.
  • Acted as the project manager on OpenBrain, a massively asynchronous recurrent neurocomputational approach to artificial general intelligence.
  • Theorized and implemented a new ML algorithm that generalizes artificial neural networks.
  • Collaborated with the International Computer Science Institute, researching new layer functions for complex neural networks on Fourier spectrum data and developing reinforcement learning techniques to perform multiple-model car fleet driving.
  • Led one of twelve sponsored teams competing to pursue the development of conversational AI and received a $100,000 grant from Amazon for a year-long project (Alexa Prize).
  • Built a generative information retrieval model using neural Turing machines and inverse reinforcement learning.
  • Managed the organizational recruiting and retention process.
Technologies: Deep Learning, Deep Reinforcement Learning, Computer Vision, Management, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Artificial Intelligence (AI)

Machine Learning Engineer

2015 - 2017
Bonsai
  • Architected and implemented a new AI/ML back end for classification and deep reinforcement learning.
  • Designed and implemented HyperLearner, a generative hyperparameter suggestion back end for meta-machine-learning optimization using manifold embeddings.
  • Built a neural network descriptor language for wrapping a variety of deep neural network models.
  • Codeveloped the patent for a "searchable database of trained artificial intelligence objects that can be reused, reconfigured, and recomposed, into one or more subsequent artificial intelligence models" (patent number: US 10,586,173).
Technologies: Python, TensorFlow, Deep Reinforcement Learning, Deep Learning, Clustering, Artificial Intelligence (AI)

Copilot and Codex

We introduced Codex, a generative pre-trained transformer (GPT) language model fine-tuned on publicly available code from GitHub, and studied its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot.

On HumanEval, a new evaluation set we released to measure functional correctness for synthesizing programs from docstrings, our model solved 28.8% of the problems, while GPT-3 solved 0% and GPT-J solved 11.4%. Furthermore, we found that repeated sampling from the model was a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, we solved 70.2% of our problems with 100 samples per problem.
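The repeated-sampling numbers above are computed with the unbiased pass@k estimator introduced in the Codex paper: draw n samples per problem, count the c that pass the unit tests, and estimate the probability that at least one of k drawn samples is correct. A minimal sketch in Python:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c are
    correct, passes the unit tests."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 100 samples per problem, 35 of which pass the tests
print(round(pass_at_k(100, 35, 1), 3))  # → 0.35
```

Averaging `pass_at_k` over all problems gives the headline metric; computing the complement via binomial coefficients avoids the high variance of naively drawing k of the n generations.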

Careful investigation of our model revealed its limitations, including difficulty with docstrings describing long chains of operations and binding operations to variables.

Finally, we discussed the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics.

I codeveloped the first Copilot extension at OpenAI.

MineRL

http://www.minerl.io
Deep reinforcement learning has had many significant successes, including superhuman performance at Dota 2 and Go. However, several challenges remain before it can be applied in the real world, including sample efficiency, task specification, and exploration. We believe addressing these challenges will require an open-world environment and human data.

To spur research on open-world environments with human data, we released MineRL, a suite of environments within Minecraft, alongside a large-scale dataset of human gameplay within those environments.

Besides the challenges discussed above, these environments also highlight a variety of other research challenges, including open-world multi-agent interactions, long-term planning, vision, control, navigation, and explicit and implicit subtask hierarchies. We also released a flexible framework to define new Minecraft tasks.

GhostWrite

http://www.ghostwrite.rip
GhostWrite is an AI-based email automation solution built on large language models. Aimed at automating away the inbox, it lets users write emails instantly from just a few instructions. The application comprises a central web service and distributed browser extensions that augment the user's webmail client.

Research Paper: Characterizing the Capacity of Neural Networks using Algebraic Topology

https://arxiv.org/pdf/1802.04443.pdf
The learnability of different neural architectures can be characterized directly by computable measures of data complexity.

In this paper, we reframed the problem of architecture selection as understanding how data determines the most expressive and generalizable architectures suited to that data beyond inductive bias. After suggesting algebraic topology as a measure for data complexity, we showed that the power of a network to express the topological complexity of a dataset in its decision region is a strictly limiting factor in its ability to generalize. We then provided the first empirical characterization of the topological capacity of neural networks.

Our empirical analysis showed that neural networks exhibit topological phase transitions at every level of dataset complexity. This observation allowed us to connect existing theory to empirically driven conjectures on the choice of architectures for fully-connected neural networks.

Research Paper: Universal Approximation by Neural Networks

https://arxiv.org/pdf/1910.01545.pdf
In this line of work, I answered the open question of universal approximation of nonlinear operators F: X -> Y when X and Y are both infinite-dimensional.

We showed that for a large class of different infinite analogues of neural networks, any continuous map can be approximated arbitrarily closely with some mild topological conditions on X. Additionally, we provided the first lower-bound on the minimal number of input and output units required by a finite approximation to an infinite neural network to guarantee that it can uniformly approximate any nonlinear operator using samples from its inputs and outputs.
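Informally, the approximation guarantee has the following shape (a paraphrase under mild topological assumptions on X, not the paper's exact statement; G_θ denotes a finite-parameter operator network):

```latex
% Paraphrased statement: on a compact K \subseteq X, a continuous
% operator F can be approximated uniformly by some operator network G_\theta.
\forall \varepsilon > 0 \;\; \exists\, \theta \;:\;
\sup_{x \in K} \left\lVert F(x) - G_\theta(x) \right\rVert_Y < \varepsilon
```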
Education

2017 - 2022

PhD in Machine Learning

Carnegie Mellon University - Pittsburgh, PA, USA

2017 - 2019

Master's Degree in Machine Learning

Carnegie Mellon University - Pittsburgh, PA, USA

2015 - 2017

Bachelor's Degree in Pure Mathematics

University of California, Berkeley - Berkeley, CA, USA

Libraries/APIs

PyTorch, TensorFlow, Node.js, MPI, React

Tools

Cluster, Mathematica

Languages

Python, JavaScript, C#

Platforms

Jupyter Notebook, Docker, Kubernetes, NVIDIA CUDA

Paradigms

Management

Frameworks

Express.js

Other

Generative Pre-trained Transformers (GPT), Deep Learning, Deep Reinforcement Learning, Natural Language Processing (NLP), Clustering, Torch, Optimization, Computational Topology, GPU Computing, Computer Vision, Reinforcement Learning, Artificial Intelligence (AI), Blitz, Segment, Web Extensions, Web Development, Program Synthesis, Extensions, Mathematics, Programming, Machine Learning
