Verified Expert in Engineering
Artificial Intelligence Developer
William is a computer scientist and mathematician who conducted doctoral research in deep learning theory and reinforcement learning at Carnegie Mellon University under Ruslan Salakhutdinov. He founded MineRL to develop general AI in Minecraft via human priors. In 2020, MineRL was acquired by OpenAI, where he served as a research scientist, codeveloping GitHub Copilot, Codex, and algorithms for imitation learning and alignment. Currently, William runs Lydian, an ML and AI consulting firm.
The most amazing...
...tool I've developed is Copilot, a large language model capable of programming at a near-human level.
Chief Operating Officer
- Developed GhostWrite, an AI email assistant, growing it from one user to 2,000 in two months with net-positive cash flow.
- Managed a team of three engineers and two marketers through the product's development, rollout, roadmap planning, and business development.
- Built the entire web and ML stack from zero to one: Heroku-based web services, a Delta Lake data management system, Segment analytics, and cluster-based ML deployment for high-throughput product delivery.
- Developed methodologies for scaling large language models using human feedback, imitation learning via expert iteration (ExIt), and reinforcement learning from human feedback. I also uncovered data and compute scaling laws for model alignment.
- Codeveloped Codex, a large language model capable of programming at a human level.
- Cocreated the first release of Copilot, the VS Code AI autocomplete extension, which was acquired by GitHub.
- Led the handover of my acquired project, MineRL, which develops large-scale Minecraft AI models, and organized four official NeurIPS conference workshops and competitions.
- Built new imitation learning algorithms based on distance-to-measure techniques from computational topology and applied them to procedural generation environments.
- Conducted extensive testing of state-of-the-art imitation learning algorithms in complex environments.
Carnegie Mellon University
- Developed the first method for applying computational topology to deep neural networks: https://arxiv.org/pdf/1802.04443.pdf.
- Created the MineRL project, a large-scale effort to reproduce general human intelligence in open-world domains through internet-scale behavioral cloning. OpenAI later acquired this project.
- Managed 10+ team members in Japan, USA, England, India, and Germany, as well as several research interns across many projects. See the list of publications at: https://scholar.google.com/citations?view_op=list_works&hl=en&hl=en&user=5bB_sFcAAAAJ.
- Created infinite-dimensional extensions of deep neural networks: https://arxiv.org/pdf/1612.04799.pdf. I also proved the first universal approximation theorem for these networks: https://arxiv.org/pdf/1910.01545.pdf.
Freie Universität Berlin
- Acted as a visiting researcher at the university's Institute of Mathematics, working in the discrete topology and geometry group.
- Developed a theoretical framework for analyzing deep neural networks using algebraic topology.
- Built new theoretical foundations for analyzing neural hyperplane arrangements. These formations are central to neural codes and compression theories of neural network learning.
Chief Technology Officer
- Worked for InfoPlay, a cryptocurrency hedge fund applying stochastic gradients to markets.
- Developed long-term roadmaps for technology acquisition and strategized the development of proprietary methodologies.
- Created a novel reinforcement learning approach for acting in multi-modal, non-stationary environments, leveraging multiple asynchronous data sources.
- Led and managed a small team to implement the technology roadmap.
- Managed the development of a deep reinforcement learning infrastructure for online trading in financial markets.
Founder, Director of Research, and President
Machine Learning at Berkeley
- Successfully launched and managed six research teams studying deep learning theory and applications.
- Researched deep active learning, a bridge between deep learning and active learning, using policy and selection steps inspired by AlphaGo.
- Acted as the project manager on OpenBrain, a massively asynchronous recurrent neurocomputational approach to artificial general intelligence.
- Theorized and implemented a new ML algorithm that generalizes artificial neural networks.
- Collaborated with the International Computer Science Institute, researching new layer functions for complex neural networks on Fourier spectrum data and developing reinforcement learning techniques to perform multiple-model car fleet driving.
- Led one of twelve sponsored teams competing to develop conversational AI and received a $100,000 grant from Amazon for the year-long Alexa Prize project.
- Built a generative information retrieval model using neural Turing machines and inverse reinforcement learning.
- Managed the organizational recruiting and retention process.
Machine Learning Engineer
- Architected and implemented a new AI/ML back end for classification and deep reinforcement learning.
- Designed and implemented HyperLearner, a generative hyperparameter suggestion back end for meta-machine-learning optimization using manifold embeddings.
- Built a neural network descriptor language for wrapping a variety of deep neural network models.
- Codeveloped the patent for a "searchable database of trained artificial intelligence objects that can be reused, reconfigured, and recomposed, into one or more subsequent artificial intelligence models" (patent number: US 10,586,173).
Copilot and Codex: https://openai.com/blog/openai-codex/
On HumanEval, a new evaluation set we released to measure functional correctness for synthesizing programs from docstrings, our model solved 28.8% of the problems, while GPT-3 solved 0% and GPT-J solved 11.4%. Furthermore, we found that repeated sampling from the model was a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, we solved 70.2% of our problems with 100 samples per problem.
Careful investigation of our model revealed its limitations, including difficulty with docstrings describing long chains of operations and binding operations to variables.
Finally, we discussed the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics.
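The repeated-sampling numbers above are computed with an unbiased pass@k estimator: generate n samples per problem, count the c that pass the unit tests, and estimate the probability that at least one of k drawn samples would succeed. A minimal sketch of that estimator (the function name is illustrative):

```python
def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn from n generated samples (c of them correct) passes."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    # 1 - C(n-c, k) / C(n, k), expanded as a product for numerical stability
    prob_all_fail = 1.0
    for i in range(n - c + 1, n + 1):
        prob_all_fail *= 1.0 - k / i
    return 1.0 - prob_all_fail
```

With 100 samples per problem, pass@100 reduces to asking whether any sample passed at all, e.g. `pass_at_k(100, 1, 100)` returns 1.0.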
I codeveloped the first Copilot extension at OpenAI.
To spur research on open-world environments with human data, we released MineRL, a suite of environments within Minecraft, alongside a large-scale dataset of human gameplay within those environments.
Besides the challenges discussed above, these environments also highlight a variety of other research challenges, including open-world multi-agent interactions, long-term planning, vision, control, navigation, and explicit and implicit subtask hierarchies. We also released a flexible framework to define new Minecraft tasks.
Research Paper: Characterizing the Capacity of Neural Networks using Algebraic Topology: https://arxiv.org/pdf/1802.04443.pdf
In this paper, we reframed the problem of architecture selection as understanding how data determines the most expressive and generalizable architectures suited to that data beyond inductive bias. After suggesting algebraic topology as a measure for data complexity, we showed that the power of a network to express the topological complexity of a dataset in its decision region is a strictly limiting factor in its ability to generalize. We then provided the first empirical characterization of the topological capacity of neural networks.
Our empirical analysis showed that neural networks exhibit topological phase transitions at every level of dataset complexity. This observation allowed us to connect existing theory to empirically driven conjectures on the choice of architectures for fully-connected neural networks.
Research Paper: Universal Approximation by Neural Networks: https://arxiv.org/pdf/1910.01545.pdf
We showed that, for a large class of infinite analogues of neural networks, any continuous map can be approximated arbitrarily closely under mild topological conditions on the input space X. Additionally, we provided the first lower bound on the minimal number of input and output units a finite approximation to an infinite neural network requires in order to uniformly approximate any nonlinear operator using samples from its inputs and outputs.
PyTorch, TensorFlow, Node.js, MPI, React
Jupyter Notebook, Docker, Kubernetes, NVIDIA CUDA
Generative Pre-trained Transformers (GPT), Deep Learning, Deep Reinforcement Learning, Natural Language Processing (NLP), Clustering, Torch, Optimization, Computational Topology, GPU Computing, Computer Vision, Reinforcement Learning, Blitz, Segment, Web Extensions, Web Development, Program Synthesis, Mathematics, Programming, Machine Learning
PhD in Machine Learning
Carnegie Mellon University - Pittsburgh, PA, USA
Master's Degree in Machine Learning
Carnegie Mellon University - Pittsburgh, PA, USA
Bachelor's Degree in Pure Mathematics
University of California, Berkeley - Berkeley, CA, USA