Cory Massaro, Software Developer in Kent, OH, United States

Member since April 13, 2020
Cory has recently worked on NLG and back-end engineering at the Wikimedia Foundation. Before this, he worked at Google on internationalization for Google Lens and internationalization and language modeling for the Speech and Keyboard team. Cory holds a master's degree in natural language processing and has built various information extraction systems as an intern in graduate school.


  • Self-Employed
    Keras, TensorFlow, Kubernetes, Docker, Flask, Python, JavaScript
  • Google
    Google Mock (GMock), gRPC, Python, C++
  • Self-Employed
    Redux, React, Docker, Python, C++






Preferred Environment

Vim Text Editor, Kubernetes, Docker, Node.js, React, TensorFlow, NLTK, Python, TypeScript, Amazon Web Services (AWS)

The most amazing...

...project I've contributed to was language modeling for GBoard, which achieved state-of-the-art performance in over 100 languages!


  • Full-stack Developer

    2019 - PRESENT
    • Contributed to pipelining software for distributed workflows. Implemented containerization so that these workflows could run in predictable environments on a fleet of AWS machines.
    • Created an online-only poetry journal.
    • Wrote legislython, an API for Congressional voting data.
    Technologies: Keras, TensorFlow, Kubernetes, Docker, Flask, Python, JavaScript
  • Software Engineer

    2018 - 2019
    • Led internationalization efforts for visual search in Google Lens. I expanded support for shopping and other verticals to Chinese, Korean, and Japanese languages via improved tokenization and part-of-speech detection.
    • Helped integrate geolocation data into visual search, allowing visual search to identify nearby locations more accurately.
    • Applied semantic information from ML models and static ontologies to improve information retrieval recall for visual search.
    • Strategized with other engineers on how to handle right-to-left scripts, e.g., Perso-Arabic, when overlaying translated text in Google Lens.
    • A/B tested changes to the retrieval model. Assessed user comparisons of outputs to determine whether a proposed algorithm change would positively affect quality.
    Technologies: Google Mock (GMock), gRPC, Python, C++
  • Software Developer

    2017 - 2018
    • Contributed to MiPandas, a genomics data visualization interface. Created visualizations in React/Redux and served data from a Flask back end. Dockerized the application.
    • Wrote a language detection system for hospitals using PocketSphinx. This system leveraged an acoustic model with support for the union of phonemes in 10 different languages, then classified the resulting phonemes.
    • Built a website for a Persian cultural center. I used WordPress to prototype a front end quickly, then added some additional functionality by wrapping a Flask back end.
    Technologies: Redux, React, Docker, Python, C++
  • Software Engineer

    2014 - 2017
    • Built an LSTM-based model to predict nonce compound words in German. This improved the orthographic accuracy of Google's state-of-the-art ASR system and paved the way for similar normalization in other languages.
    • Found new data sources and experimented with different preprocessing and sampling to improve automated speech recognition performance in over sixty languages, including a WER improvement of 33% in Hindi.
    • Enabled state-of-the-art language models in over one hundred languages for GBoard.
    • Spearheaded the incorporation of Google's Keyboard language model pipeline into the speech team's infrastructure. Wrote unit and integration tests to make the Keyboard pipeline more robust.
    • A/B tested new language models. Assessed user comparisons of outputs to determine whether a proposed algorithm change would positively affect quality.
    Technologies: Borg, TensorFlow, Python, C++


  • Legislython

    An API that consumes Congressional voting data. It transforms the underlying XML into Python objects for use in web and analytics applications. I have also wrapped the API in a simple Flask application to allow users to generate downloadable CSVs.
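
The XML-to-objects-to-CSV flow described above might look roughly like the sketch below. This is an illustration only, not Legislython's actual code: the profile does not give the real feed's schema, so the sample XML, the `VoteRecord` class, and the `parse_votes` and `votes_to_csv` helpers are all hypothetical. It uses only the Python standard library and leaves out the Flask wrapper.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict

# Hypothetical sample document; the real feed's element and attribute
# names are not given in this profile.
SAMPLE_XML = """<rollcall>
  <vote member="A. Example" party="D" state="OH">Yea</vote>
  <vote member="B. Sample" party="R" state="NC">Nay</vote>
</rollcall>"""


@dataclass
class VoteRecord:
    """One member's position on a single roll-call vote (assumed shape)."""
    member: str
    party: str
    state: str
    position: str


def parse_votes(xml_text):
    """Turn every <vote> element into a VoteRecord object."""
    root = ET.fromstring(xml_text)
    return [
        VoteRecord(
            member=vote.get("member", ""),
            party=vote.get("party", ""),
            state=vote.get("state", ""),
            position=(vote.text or "").strip(),
        )
        for vote in root.iter("vote")
    ]


def votes_to_csv(records):
    """Serialize the records to CSV text, e.g. for a download endpoint."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["member", "party", "state", "position"],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


if __name__ == "__main__":
    print(votes_to_csv(parse_votes(SAMPLE_XML)))
```

Keeping the parsed records as plain dataclass instances means the same objects can feed either an analytics script or a web handler that returns the CSV text as a file download.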


  • Languages

    Python 3, Python 2, C++, Bash, Bash Script, Python, SQL, C++11, C++14, JavaScript, TypeScript
  • Frameworks

    Flask, Django, gRPC, Google Mock (GMock), Redux
  • Libraries/APIs

    NumPy, Keras, Flask-RESTful, jQuery, SciPy, Pandas, NLTK, TensorFlow, React, Node.js, PyTorch, Scikit-learn
  • Paradigms

    REST, Data Science
  • Platforms

    Ubuntu, Ubuntu Linux, Amazon Web Services (AWS), Linux, Docker, Kubernetes
  • Storage

    PostgreSQL, SQLite, Databases, Elasticsearch
  • Other

    A/B Testing, Big Data, Deep Learning, Machine Learning, Natural Language Processing (NLP), Speech Recognition, Speech to Text, Entity Extraction, AWS, Full-stack, Data Analytics, Natural Language Understanding (NLU), Information Extraction, Information Retrieval, Entity-relationships Model (ERM), Clustering, Unsupervised Learning, Clustering Algorithms, K-means Clustering, Borg, Ajax, Data Analysis, SAP SD
  • Tools

    Vim Text Editor, JSX, VirtualBox


  • Master of Arts Degree in Computational Linguistics
    2012 - 2014
    Brandeis University - Waltham, MA
  • Master of Arts Degree in Comparative Literature
    2010 - 2012
    University of California, Santa Barbara - Santa Barbara, CA
  • Bachelor of Arts Degree in Creative Writing
    2006 - 2010
    Duke University - Durham, NC
