William Guss

Artificial Intelligence Developer in San Francisco, CA, United States

Member since October 18, 2022
William is a computer scientist and mathematician who conducted doctoral research in deep learning theory and reinforcement learning at Carnegie Mellon University, advised by Ruslan Salakhutdinov. He founded MineRL to develop general AI in Minecraft using human priors. In 2020, MineRL was acquired by OpenAI, where he served as a research scientist, codeveloping GitHub Copilot, Codex, and algorithms for imitation learning and alignment. William currently runs Lydian, an ML and AI consulting firm.

Portfolio

  • GhostWrite
    Blitz, Node.js, JavaScript, React, Segment...
  • OpenAI
    PyTorch, MPI, Deep Learning, Deep Reinforcement Learning...
  • Carnegie Mellon University
    Deep Learning, Torch, React, Node.js, Express.js, Mathematica, Optimization...

Experience

Location

San Francisco, CA, United States

Availability

Part-time

Preferred Environment

Visual Studio Code, Python, JavaScript, C#, PyTorch, TensorFlow, Node.js, Generative Pre-trained Transformers (GPT)

The most amazing...

...tool I've developed is Copilot, a large language model capable of programming at a near-human level.

Employment

  • Chief Operating Officer

    2022 - PRESENT
    GhostWrite
    • Developed GhostWrite, an AI email assistant product, growing it from one user to 2,000 in two months with net-positive cash flow.
    • Managed a team of three engineers and two marketers across the product's development, rollout, roadmap, and business development.
    • Built the entire web and ML stack from zero to one: Heroku-based web services, a Delta Lake data management system, Segment analytics, and clustered ML deployment for high-throughput product delivery.
    Technologies: Blitz, Node.js, JavaScript, React, Segment, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Reinforcement Learning, Web Extensions
  • Research Scientist

    2020 - 2022
    OpenAI
    • Developed methodologies for scaling large language models using human feedback, expert iteration (ExIt), and reinforcement learning from human feedback. Also uncovered data-feedback effects and computed scaling laws for model alignment.
    • Codeveloped Codex, a large language model capable of programming at a human level.
    • Cocreated the first release of Copilot, the VS Code AI autocomplete extension which was acquired by GitHub.
    • Handled the handover of MineRL, my acquired project, developed large-scale Minecraft AI models, and organized four official NeurIPS conference workshops and competitions.
    • Built new imitation learning algorithms based on distance-to-measure techniques from computational topology and applied them to procedural generation environments.
    • Conducted extensive testing of state-of-the-art imitation learning algorithms in complex environments.
    Technologies: PyTorch, MPI, Deep Learning, Deep Reinforcement Learning, Natural Language Processing (NLP), Clustering, Computer Vision, Cluster, Kubernetes, Docker, GPU Computing, CUDA, Jupyter Notebook
  • Doctoral Researcher

    2017 - 2021
    Carnegie Mellon University
    • Developed the first method for applying computational topology to deep neural networks: https://arxiv.org/pdf/1802.04443.pdf.
    • Created the MineRL project, a large-scale effort to reproduce general human intelligence in open-world domains through internet-scale behavioral cloning. OpenAI later acquired this project.
    • Managed 10+ team members in Japan, USA, England, India, and Germany, as well as several research interns across many projects. See the list of publications at: https://scholar.google.com/citations?view_op=list_works&hl=en&hl=en&user=5bB_sFcAAAAJ.
    • Created infinite-dimensional extensions of deep neural networks (https://arxiv.org/pdf/1612.04799.pdf) and proved the first universal approximation theorem for these networks: https://arxiv.org/pdf/1910.01545.pdf.
    Technologies: Deep Learning, Torch, React, Node.js, Express.js, Mathematica, Optimization, Computational Topology
  • Visiting Researcher

    2019 - 2019
    Freie Universität Berlin
    • Acted as a visiting researcher at the university's Institute of Mathematics, working in the discrete topology and geometry group.
    • Developed a theoretical framework for analyzing deep neural networks using algebraic topology.
    • Built new theoretical foundations for analyzing neural hyperplane arrangements. These formations are central to neural codes and compression theories of neural network learning.
    Technologies: Computational Topology, GPU Computing
  • Chief Technology Officer

    2017 - 2019
    InfoPlay
    • Worked for InfoPlay, a cryptocurrency hedge fund applying stochastic gradients to markets.
    • Developed long-term roadmaps for technology acquisition and strategized the development of proprietary methodologies.
    • Created a novel reinforcement learning approach for acting in multi-modal, non-stationary environments, leveraging multiple asynchronous data sources.
    • Led and managed a small team to implement the technology roadmap.
    • Managed the development of a deep reinforcement learning infrastructure for online trading in financial markets.
    Technologies: Deep Learning, Deep Reinforcement Learning, TensorFlow, PyTorch
  • Founder, Director of Research, and President

    2015 - 2017
    Machine Learning at Berkeley
    • Successfully launched and managed six research teams studying deep learning theory and applications.
    • Researched deep active learning, a bridge between deep learning and active learning, using policy and selection steps inspired by AlphaGo.
    • Acted as the project manager on OpenBrain, a massively asynchronous recurrent neurocomputational approach to artificial general intelligence.
    • Theorized and implemented a new ML algorithm and generalized artificial neural networks.
    • Collaborated with the International Computer Science Institute, researching new layer functions for complex neural networks on Fourier spectrum data and developing reinforcement learning techniques to perform multiple-model car fleet driving.
    • Led one of twelve sponsored teams competing to develop conversational AI, receiving a $100,000 grant from Amazon for a year-long project (Alexa Prize).
    • Built a generative information retrieval model using neural Turing machines and inverse reinforcement learning.
    • Managed the organizational recruiting and retention process.
    Technologies: Deep Learning, Deep Reinforcement Learning, Computer Vision, Management, Natural Language Processing (NLP)
  • Machine Learning Engineer

    2015 - 2017
    Bonsai
    • Architected and implemented a new AI/ML back end for classification and deep reinforcement learning.
    • Designed and implemented HyperLearner, a generative hyperparameter suggestion back end for meta-machine-learning optimization using manifold embeddings.
    • Built a neural network descriptor language for wrapping a variety of deep neural network models.
    • Codeveloped the patent for a "searchable database of trained artificial intelligence objects that can be reused, reconfigured, and recomposed, into one or more subsequent artificial intelligence models" (patent number: US 10,586,173).
    Technologies: Python, TensorFlow, Deep Reinforcement Learning, Deep Learning, Clustering

Experience

  • Copilot and Codex
    https://openai.com/blog/openai-codex/

    We introduced Codex, a generative pre-trained transformer (GPT) language model fine-tuned on publicly available code from GitHub, and studied its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot.

    On HumanEval, a new evaluation set we released to measure functional correctness for synthesizing programs from docstrings, our model solved 28.8% of the problems, while GPT-3 solved 0% and GPT-J solved 11.4%. Furthermore, we found that repeated sampling from the model was a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, we solved 70.2% of our problems with 100 samples per problem.

    Careful investigation of our model revealed its limitations, including difficulty with docstrings describing long chains of operations and binding operations to variables.

    Finally, we discussed the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics.

    I codeveloped the first Copilot extension at OpenAI.
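    The repeated-sampling result above is conventionally summarized with the pass@k metric: the probability that at least one of k samples for a problem passes its unit tests. A minimal sketch of the unbiased pass@k estimator described in the Codex paper, where n samples are generated per problem and c of them pass (function name and signature here are illustrative):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k: the probability that at least one of
    k samples drawn without replacement from n generated programs, of
    which c pass the unit tests, is correct."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct sample
    # 1 - C(n-c, k) / C(n, k), computed as a numerically stable product
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))
```

    For example, with n=10 samples of which c=3 pass, `pass_at_k(10, 3, 1)` evaluates to 0.3, matching the intuition that pass@1 is the per-sample success rate.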

  • MineRL
    http://www.minerl.io

    Deep reinforcement learning has had many significant successes, including superhuman performance at Dota 2 and Go. However, several challenges remain before it can be applied in the real world, including sample efficiency, task specification, and exploration. We believe addressing these challenges will require an open-world environment and human data.

    To spur research on open-world environments with human data, we released MineRL, a suite of environments within Minecraft, alongside a large-scale dataset of human gameplay within those environments.

    Besides the challenges discussed above, these environments also highlight a variety of other research challenges, including open-world multi-agent interactions, long-term planning, vision, control, navigation, and explicit and implicit subtask hierarchies. We also released a flexible framework to define new Minecraft tasks.
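    The human gameplay dataset is intended for behavioral cloning: supervised learning of a policy from recorded observation-action pairs. A minimal sketch under simplifying assumptions (a linear softmax policy over a small discrete action set; all names and the toy data are illustrative, not the MineRL API):

```python
import numpy as np

def bc_update(W, obs, actions, lr=0.5):
    """One behavioral-cloning step: minimize cross-entropy between the
    softmax policy pi(a | o) = softmax(o @ W) and demonstrated actions."""
    logits = obs @ W
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    targets = np.eye(W.shape[1])[actions]          # one-hot demonstrations
    grad = obs.T @ (probs - targets) / len(obs)    # cross-entropy gradient
    return W - lr * grad

# Toy demonstrations: three observations, each labeled with the action taken.
obs = np.eye(3)
actions = np.array([0, 1, 2])
W = np.zeros((3, 3))
for _ in range(100):
    W = bc_update(W, obs, actions)
```

    After training, the policy assigns its highest probability to each demonstrated action; scaling this idea to internet-scale demonstrations is the core of the MineRL approach.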

  • GhostWrite
    http://www.ghostwrite.rip

    GhostWrite is an AI-based email automation solution built on large language models. Aimed at automating away the inbox, it enables users to write emails instantly from just a few instructions. The application comprises a central web service and distributed browser extensions that augment the user's webmail client.

  • Research Paper: Characterizing the Capacity of Neural Networks using Algebraic Topology
    https://arxiv.org/pdf/1802.04443.pdf

    The learnability of different neural architectures can be characterized directly by computable measures of data complexity.

    In this paper, we reframed the problem of architecture selection as understanding how data determines the most expressive and generalizable architectures suited to that data beyond inductive bias. After suggesting algebraic topology as a measure for data complexity, we showed that the power of a network to express the topological complexity of a dataset in its decision region is a strictly limiting factor in its ability to generalize. We then provided the first empirical characterization of the topological capacity of neural networks.

    Our empirical analysis showed that neural networks exhibit topological phase transitions at every level of dataset complexity. This observation allowed us to connect existing theory to empirically driven conjectures on the choice of architectures for fully-connected neural networks.

  • Research Paper: Universal Approximation by Neural Networks
    https://arxiv.org/pdf/1910.01545.pdf

    In this line of work, I answered the open question of universal approximation of nonlinear operators F: X -> Y when X and Y are both infinite-dimensional.

    We showed that for a large class of infinite analogues of neural networks, any continuous map can be approximated arbitrarily closely under mild topological conditions on X. Additionally, we provided the first lower bound on the minimal number of input and output units a finite approximation of an infinite neural network requires to uniformly approximate any nonlinear operator from samples of its inputs and outputs.
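    For contrast with the infinite-dimensional setting above, the classical finite-dimensional universal approximation statement (Cybenko/Hornik), which these results generalize, can be written as: for a continuous f on a compact set K and any tolerance, there is a single hidden layer achieving it.

```latex
% Classical universal approximation, finite-dimensional case:
% for continuous f on compact K \subset \mathbb{R}^n and any \varepsilon > 0,
% there exist N, c_i, w_i, b_i such that
\left| f(x) - \sum_{i=1}^{N} c_i \, \sigma\!\left( \langle w_i, x \rangle + b_i \right) \right| < \varepsilon
\quad \text{for all } x \in K.
```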

Skills

  • Languages

    Python, JavaScript, C#
  • Libraries/APIs

    PyTorch, TensorFlow, Node.js, MPI, React
  • Tools

    Cluster, Mathematica
  • Platforms

    Jupyter Notebook, Docker, Kubernetes, CUDA
  • Other

    Generative Pre-trained Transformers (GPT), Deep Learning, Deep Reinforcement Learning, Natural Language Processing (NLP), Clustering, Torch, Optimization, Computational Topology, GPU Computing, Computer Vision, Reinforcement Learning, Blitz, Segment, Web Extensions, Web Development, Program Synthesis, Extensions, Mathematics, Programming, Machine Learning
  • Paradigms

    Management
  • Frameworks

    Express.js

Education

  • PhD in Machine Learning
    2017 - 2022
    Carnegie Mellon University - Pittsburgh, PA, USA
  • Master's Degree in Machine Learning
    2017 - 2019
    Carnegie Mellon University - Pittsburgh, PA, USA
  • Bachelor's Degree in Pure Mathematics
    2015 - 2017
    University of California, Berkeley - Berkeley, CA, USA
