Nicholas Roth, Big Data Developer in Austin, TX, United States

Member since January 25, 2020
Nicholas is an efficient deep learning engineer and data scientist with experience in data curation, classical machine learning algorithms, and statistical modeling. His initial focus is on data extraction. Once an algorithm (and intuition) reveals the most crucial areas, he shifts gears to get the "juice" out of that data before testing and deploying models in a client's infrastructure, delivering accurate, high-quality work.
Nicholas is now available for hire





Preferred Environment

PyTorch, NumPy, Pandas, Amazon Web Services (AWS), Linux, Jupyter, Vim

The most amazing...

...project I've worked on was at a startup, where I designed and built a system that trains generic predictive models for nontechnical customers.


  • Deep Learning Engineer

    2018 - PRESENT
    • Designed, implemented, and productized a system for training predictive models leveraging internal and external data.
    • Performed a data science role to assist customers with tasks outside of the predictive model system.
    • Helped design and build a system that functions as an automatic data scientist, surfacing insights similar to those a data scientist would provide, to maximize profits and minimize the need for humans in the loop.
    Technologies: Python, Amazon, Amazon Web Services (AWS), Linux, PyTorch
  • Research Assistant

    2012 - 2018
    Oracle Labs
    • Developed a deep property graph embedding based on Doc2vec for use with predictive deep and machine learning models on knowledge graphs, significantly improving our fraud detection offerings.
    • Wrote a high-performance distributed query engine for graph analysis (PGX.DIST), which was our first competitive offering of its kind.
    • Designed a measurement tool and series of experiments to improve performance of the Oracle database, allowing us to better understand the performance impact of certain memory management operations.
    • Wrote message-oriented middleware for large-scale enterprise computing, providing a demo for Oracle Coherence.
    • Wrote an HTML5-based Programmer’s Notebook back when Jupyter was new and still called IPython. This supported alternative languages like R, J (modern APL), and JavaScript.
    Technologies: TensorFlow, Linux, C++, C, Linux Kernel, Oracle RDBMS


  • Embeddings for Music (Development)

    I worked with a team to scrape music video playlists from YouTube and used the data to learn Doc2vec embeddings with my custom implementation of the algorithm in TensorFlow. To build a music recommender system from this, I took an existing “playlist vector” from a user’s listening history and sorted the songs in our dataset by cosine distance to that vector, returning the closest ones. Before leaving, I suggested that the next step might be to use content-based embeddings derived from the hidden layer of an LSTM trained on the videos’ soundtracks.
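
    The ranking step described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the original implementation: the function name `recommend`, the song ids, and the two-dimensional toy vectors are all hypothetical, and real Doc2vec embeddings would have many more dimensions. It also assumes no embedding is the zero vector.

```python
import numpy as np


def recommend(playlist_vec, song_vecs, song_ids, top_k=3):
    """Return the top_k song ids closest to playlist_vec by cosine distance.

    Hypothetical helper for illustration; assumes all vectors are non-zero.
    """
    song_vecs = np.asarray(song_vecs, dtype=float)
    playlist_vec = np.asarray(playlist_vec, dtype=float)

    # Cosine similarity = dot product divided by the product of the norms.
    song_norms = np.linalg.norm(song_vecs, axis=1)
    query_norm = np.linalg.norm(playlist_vec)
    similarity = (song_vecs @ playlist_vec) / (song_norms * query_norm)

    # Cosine distance is 1 - similarity; smaller means more similar.
    distance = 1.0 - similarity
    order = np.argsort(distance)[:top_k]
    return [song_ids[i] for i in order]


# Made-up embeddings, purely for demonstration.
song_ids = ["song_a", "song_b", "song_c", "song_d"]
song_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
playlist_vec = [2.0, 0.0]

top = recommend(playlist_vec, song_vecs, song_ids, top_k=2)
print(top)
```

    Because cosine distance ignores vector length, a playlist vector that points in the same direction as a song embedding ranks that song first regardless of magnitude.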


  • Languages

    C++, Python, C
  • Libraries/APIs

    PyTorch, TensorFlow
  • Other

    Machine Learning, Artificial Intelligence (AI), Big Data, Algorithms, Natural Language Processing (NLP)
  • Paradigms

    Data Science, Automation
  • Platforms

    Linux, Amazon Web Services (AWS)
