Cory Massaro, Software Developer in San Jose, CA, United States

Member since August 5, 2019
Cory worked at Google for four years. He last worked on a multimodal visual/linguistic information retrieval system. Before that, he worked on internationalization and language modeling for the Speech and Keyboard team. Cory holds a master's degree in natural language processing and built various information extraction systems as an intern during graduate school.

Portfolio

  • Self-Employed
    Keras, TensorFlow, Kubernetes, Docker, Flask, Python, JavaScript
  • Google
    Google Mock (GMock), gRPC, Python, C++
  • Self-employed
    Redux, React, Docker, Python, C++

Location

San Jose, CA, United States

Availability

Part-time

Preferred Environment

Bash, Linux, Vim Text Editor, C++, Kubernetes, Docker, Node.js, React, JavaScript, TensorFlow, NLTK, Keras, Python

The most amazing...

...project I've contributed to was language modeling for Gboard: state-of-the-art performance in over 100 languages!

Employment

  • Full-stack Developer

    2019 - PRESENT
    Self-Employed
    • Contributed to pipeline software for distributed workflows. Implemented containerization so that the workflows run in predictable environments on a fleet of AWS machines.
    • Created middlelost.com, an online-only poetry journal.
    • Wrote legislython, an API for Congressional voting data.
    Technologies: Keras, TensorFlow, Kubernetes, Docker, Flask, Python, JavaScript
  • Software Engineer

    2018 - 2019
    Google
    • Led internationalization efforts for visual search in Google Lens. Expanded support for shopping and other verticals to Chinese, Korean, and Japanese via improved tokenization and part-of-speech tagging.
    • Helped integrate geolocation data into visual search, allowing visual search to identify nearby locations more accurately.
    • Applied semantic information from ML models and static ontologies to improve information retrieval recall for visual search.
    • Strategized with other engineers on how to handle right-to-left scripts, e.g., Perso-Arabic, when overlaying translated text in Google Lens.
    • A/B tested changes to the retrieval model. Assessed user comparisons of outputs to determine whether a proposed algorithm change would positively affect quality.
    Technologies: Google Mock (GMock), gRPC, Python, C++
  • Software Developer

    2017 - 2018
    Self-employed
    • Contributed to MiPandas, a genomics data visualization interface. Created visualizations in React/Redux and served data from a Flask back end. Dockerized the application.
    • Wrote a language detection system for hospitals using PocketSphinx. The system ran an acoustic model covering the union of the phonemes of 10 different languages, then classified the recognized phoneme sequence by language.
    • Built a website for a Persian cultural center. Used WordPress to prototype a front end quickly, then added functionality with a Flask back end.
    Technologies: Redux, React, Docker, Python, C++
  • Software Engineer

    2014 - 2017
    Google
    • Built an LSTM-based model to predict nonce compound words in German. This improved the orthographic accuracy of Google's state-of-the-art ASR system and paved the way for similar normalization in other languages.
    • Found new data sources and experimented with different preprocessing and sampling to improve automatic speech recognition performance in over sixty languages, including a 33% word error rate (WER) improvement in Hindi.
    • Enabled state-of-the-art language models in over one hundred languages for Gboard.
    • Spearheaded the incorporation of Google's Keyboard language model pipeline into the speech team's infrastructure. Wrote unit and integration tests to make the Keyboard pipeline more robust.
    • A/B tested new language models. Assessed user comparisons of outputs to determine whether a proposed algorithm change would positively affect quality.
    Technologies: Borg, TensorFlow, Python, C++
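
The language detection bullet in the 2017 - 2018 entry describes a two-stage design: recognize phonemes from a shared inventory, then classify the phoneme sequence by language. The second stage might look roughly like the sketch below. The classifier shown (an add-one-smoothed unigram model), the function names, and the toy training data are all assumptions for illustration; the profile does not say which classifier was actually used, and the PocketSphinx recognition step is not shown.

```python
import math
from collections import Counter


def train(samples):
    """Count phoneme occurrences per language from (language, phonemes) pairs."""
    counts = {}
    for language, phonemes in samples:
        counts.setdefault(language, Counter()).update(phonemes)
    return counts


def classify(counts, phonemes):
    """Return the language whose smoothed unigram model scores the sequence highest."""
    # Shared vocabulary: every phoneme seen in training plus those in the input.
    vocabulary = set(phonemes)
    for counter in counts.values():
        vocabulary.update(counter)

    best_language, best_score = None, -math.inf
    for language, counter in sorted(counts.items()):
        total = sum(counter.values()) + len(vocabulary)  # add-one smoothing
        score = sum(math.log((counter[p] + 1) / total) for p in phonemes)
        if score > best_score:
            best_language, best_score = language, score
    return best_language


# Made-up phoneme sequences; real input would come from the recognizer.
TRAINING = [
    ("es", ["o", "l", "a", "rr", "o", "s", "a"]),
    ("en", ["h", "eh", "l", "ow", "th", "ih", "ng"]),
]

model = train(TRAINING)
print(classify(model, ["rr", "o", "s", "a"]))
```

A unigram model ignores phoneme order; an n-gram model over phoneme sequences would be a natural next step if order turns out to matter.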

Experience

  • Legislython (Development)
    https://github.com/flosincapite/legislython

    An API that consumes voting data from senate.gov. This API transforms the underlying XML into Python objects for use in web and analytics applications. I have also wrapped the API in a simple Flask application to allow users to generate downloadable CSVs.
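
The XML-to-objects-to-CSV flow described above could be sketched as follows. This is not legislython's actual interface: the `MemberVote` class, the `parse_votes` and `to_csv` helpers, and the simplified XML sample are hypothetical stand-ins, and the real senate.gov documents carry more fields than are shown here.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import astuple, dataclass, fields


@dataclass
class MemberVote:
    last_name: str
    party: str
    state: str
    vote_cast: str


# Hypothetical, simplified roll-call document with made-up members.
SAMPLE_XML = """
<roll_call_vote>
  <members>
    <member>
      <last_name>Doe</last_name>
      <party>D</party>
      <state>CA</state>
      <vote_cast>Yea</vote_cast>
    </member>
    <member>
      <last_name>Roe</last_name>
      <party>R</party>
      <state>TX</state>
      <vote_cast>Nay</vote_cast>
    </member>
  </members>
</roll_call_vote>
"""


def parse_votes(xml_text):
    """Turn each <member> element into a MemberVote object."""
    root = ET.fromstring(xml_text.strip())
    return [
        MemberVote(*(member.findtext(f.name, default="") for f in fields(MemberVote)))
        for member in root.iter("member")
    ]


def to_csv(votes):
    """Serialize the votes as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(f.name for f in fields(MemberVote))
    writer.writerows(astuple(v) for v in votes)
    return buffer.getvalue()


votes = parse_votes(SAMPLE_XML)
print(to_csv(votes))
```

In a Flask application, the string returned by `to_csv` could be sent back as a downloadable response; that wiring is omitted here.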

Skills

  • Languages

    Python 3, Python 2, C++, Bash, Python, SQL, C++11, C++14, JavaScript
  • Frameworks

    Flask, Django, gRPC, Google Mock (GMock), Redux
  • Libraries/APIs

    NumPy, Keras, Flask-RESTful, jQuery, SciPy, Pandas, NLTK, TensorFlow, React, Node.js, PyTorch, Scikit-learn
  • Paradigms

    REST, Data Science
  • Platforms

    Ubuntu, Ubuntu Linux, Amazon Web Services (AWS), Linux, Docker, Kubernetes
  • Storage

    PostgreSQL, SQLite, Databases, Elasticsearch
  • Other

    A/B Testing, Bash Scripting, Big Data, Deep Learning, Machine Learning, Natural Language Processing (NLP), Speech Recognition, Speech to Text, Entity Extraction, AWS, Full-stack, Data Analytics, Natural Language Understanding, Information Extraction, Information Retrieval, Entity Relationship Modeling, Clustering, Unsupervised Learning, Clustering Algorithms, K-means Clustering, BORG, Ajax, Data Analysis, SAP SD
  • Tools

    GIMP, Vim Text Editor, JSX, VirtualBox

Education

  • Master of Arts degree in Computational Linguistics
    2012 - 2014
    Brandeis University - Waltham, MA
  • Master of Arts degree in Comparative Literature
    2010 - 2012
    University of California, Santa Barbara - Santa Barbara, CA
  • Bachelor of Arts degree in Creative Writing
    2006 - 2010
    Duke University - Durham, NC
