Head of Data Science/Quantitative Trading (2020 - PRESENT), THINKalpha
Technologies: Python 3, Amazon Web Services (AWS), spaCy, Natural Language Processing (NLP), Machine Learning, SQL, Data Visualization, Data Engineering, Statistical Analysis, Cloud
- Designed, built, and managed a quantitative trading engine that covers global equities, currencies, and cryptocurrencies. The system was used to build and optimize trading strategies that traded hundreds of millions in capital.
- Built the ETL pipelines, monitoring, and code for the quantitative database that housed all market data across supported assets, and integrated it with the systems for backtesting and live-trading agents.
- Created a natural-language-to-quant-formula translation engine that converts verbal descriptions of trading strategies into formulas that can be backtested or traded in ThinkAlpha's trading engine.
- Built, deployed, and optimized a variety of custom trading strategies for Avatar traders.
- Designed, constructed, and managed a series of high-Sharpe-ratio trading strategies.
Deep Learning Engineer (2019 - 2021), VoiceOps
Technologies: Document Processing, Custom BERT, Data Science, Deep Learning, Amazon Web Services (AWS), PyTorch, Torch, Natural Language Processing (NLP), Pandas, Machine Learning, Python, Keras, Fairseq, Artificial Intelligence (AI), SQL, Data Engineering, Statistical Analysis, Cloud
- Architected and built AWS-based infrastructure for large-scale machine learning at VoiceOps, an AI-driven coaching and training platform for call centers.
- Built DL models to support the transcription process, including scripts to pre-train, fine-tune, and fully integrate transformers (e.g., BERT and various Hugging Face transformers) into novel architectures that combined text and statistical data.
- Built a modified transformer to automatically score the quality of transcriptions and determine whether they should pass to the client (ROC-AUC = 0.90).
- Created a modified transformer that automated the detection of speakers based on text (ROC-AUC = 0.97).
- Automated the estimation of transcription time to replace a fixed-price system, saving 20–30% of total transcription costs.
- Improved automatic speech recognition (ASR) via seq2seq architectures.
Chief Technology Officer (2019 - 2021), Mobilads
Technologies: Data Science, Amazon Web Services (AWS), Pandas, Shapely, GeoPandas, Python, Data Visualization, Data Engineering, Statistical Analysis, Cloud
- Constructed a geospatial system that maps physical ad impressions based on vehicle and mobile GPS data. The Mobilads geospatial system was successfully built to operate worldwide and to scale to thousands of vehicles and billions of GPS points.
- Developed automated reporting systems for the clients of Mobilads to demonstrate the technology.
- Built up the company's IP portfolio by integrating census, geotracking, and social data to enrich what Mobilads knows about the people who see its vehicles. This ensures consistent industry-leading return on ad spend.
- Architected and led the development of Mobilads' app for autonomously managing tens of thousands of drivers.
Founder, CEO, and Principal Consultant (2016 - 2021), Relu Analytics
Technologies: Document Processing, Custom BERT, Web Scraping, Data Science, Deep Learning, Amazon Web Services (AWS), PyTorch, Torch, Natural Language Processing (NLP), Pandas, Machine Learning, TensorFlow, Keras, Scikit-learn, Python, Artificial Intelligence (AI), SQL, Data Visualization, Data Engineering, Statistical Analysis, Google Cloud Platform (GCP), Cloud
- Consulted as the senior data scientist at Step Energy Services. Built algorithms for optimizing the use of fixed equipment, including extended maintenance, failure prediction, forecasting, and budgeting, as well as cash flow prediction.
- Worked with the leadership team of Cinelytic as a data scientist to build scalable NLP pipelines. Provided code samples and walked the software engineering team through building and deploying deep learning models.
- Designed an end-to-end machine learning application using Google Cloud to serve as an API for the front-end team to deliver predictions via the company's UI. Consulted as the data scientist at Meditalente GmbH.
CEO | Previously Chief Data Scientist (2017 - 2019), Sigmai
Technologies: Document Processing, Custom BERT, Data Science, Deep Learning, Amazon Web Services (AWS), Torch, Natural Language Processing (NLP), Pandas, Machine Learning, R, TensorFlow, Keras, Python, Artificial Intelligence (AI), Data Engineering, Statistical Analysis, Cloud
- Led a team of 15 data scientists, linguists, software engineers, product managers, and sales professionals, culminating in Sigmai's acquisition by Commetric in 2018.
- Focused primarily on deep learning for text classification with Keras and TensorFlow and on integrating it within a rule-based NLP system.
- Developed an out-of-core (larger-than-memory) document clustering system to allow the clustering of billions of news articles.
- Built an NLP system that rivaled those of the best NLP companies in finance and led to data trials with some of the largest fund managers.
- Led the Newsful application (app.Newsful.io), which was shortlisted for the 2018 SIIA CODiE Award. The business operations were acquired by Commetric.
Data Scientist (2016 - 2018), Zalando
Technologies: Amazon Web Services (AWS), Data Science, Pandas, Machine Learning, Scikit-learn, R, Spark, Python, SQL, Data Engineering, Statistical Analysis, Cloud
- Developed analytical tools and ETL pipelines in Spark on AWS.
- Built predictive tools for targeting audiences for specific ad campaigns.
- Developed interactive data applications for product owners using Python and R Shiny to automate time-consuming analysis tasks, including customer journeys and return on ad spend.
- Developed a system to optimize ad placement within the search and recommendation engine, reducing revenue lost to poor placement by up to USD 0.5 million per month.
- Designed a system for determining the causal impact of multiple concurrent ad campaigns, including off-site, on-site, banner ads, and full-page ads, using regression and Bayesian time-series models.