Senior Machine Learning Scientist | 2019 - 2020 | System1 Biosciences
Technologies: SQL, Deep Learning, Signal Processing, Image Processing, Experimental Research, Experimental Design, Continuous Integration (CI), Docker, Git, Project Management, Data Visualization, Statistics, Presentations, Amazon Web Services (AWS), Machine Learning, Convolutional Neural Networks, Computer Vision, PyTorch, Scikit-learn, Pandas, NumPy, SciPy, Python
- Led the video microscopy data pipeline team with biology, robotics, software, and data science members. Deployed a 12-step processing DAG in AWS on 500+ videos (over 10 TB). Reduced the failure rate of QC-ed videos by 75% and increased frame rate 10x.
- Built and productionized CNN-based image segmentation for automated quantification of tissue protein expression. Deployed in AWS on over 1,000 scanned images (more than 1 PB).
- Demonstrated effects of lab protocols on tissue quality, used for patents and investor demos.
- Created an advanced analytics pipeline to measure and describe neuronal network activity. It was used to demonstrate the significant and distinct effects of three different neuromodulatory drugs and validate new lab protocols.
- Built an analytics pipeline to assay hierarchical effects of experimental variables. Created novel, statistically rigorous methods for demonstrating disease effects.
- Served as a technical lead for the neurodegenerative disease program. Planned and executed scientific roadmaps and delivered company and investor presentations while coordinating experimental designs, data pipelines, ML, and analytics.
Senior Data Scientist, Machine Learning | 2017 - 2019 | Intuit, Inc.
Technologies: A/B Testing, Git, Python, Pandas, Amazon Web Services (AWS), Docker, Technical Project Management, Keras, Deep Learning, Hadoop, PySpark, SQL, Natural Language Processing (NLP), SciPy, NumPy, Machine Learning
- Acted as a technical lead for QuickBooks Online's self-help recommendation algorithm, which required multi-team collaboration. Expanded its use to all customer segments and submitted multiple patents for its backend ML algorithms.
- Trained, productionized, and A/B tested the first real-time deep learning models (RNN and LSTM) in QuickBooks. Boosted customer engagement by 55%, reduced customer support call rates by 10%, and reduced direct annual costs by at least $900,000.
- Transformed data from millions of users and billions of clickstream events using distributed computing tools such as Spark to create embedded representations of online user activity and improve multiple existing ML services.
- Trained interns and led exploratory machine learning and NLP research for customer success. Projects included an API service to anonymize customer chat data and a predictive customer support call intent model.
Visiting Scientist | 2015 - 2017 | Oregon Health and Science University
Technologies: Scientific Computing, Linux, Experimental Research, 3D Image Processing, Signal Processing, Experimental Design, Factor Analysis, Python, Data Visualization, Statistics, Computer Vision, Graph Theory, Machine Learning
- Led two research projects on a six-member data team composed of graduate students, postdoctoral scientists, and research staff, resulting in three publications and multiple conference presentations.
- Built multilinear regression models explaining more than 60% of the variance in the correlational structure of fMRI time-series data, using anatomical and gene expression data as features.
- Trained students and research staff in structural and functional MRI, signal processing, and data analysis.
Graduate Student Researcher | 2012 - 2017 | UC Davis Center for Neuroscience
Technologies: Signal Processing, 3D Image Processing, Linux, Experimental Design, Experimental Research, Data Visualization, Statistics
- Developed data analysis strategies independently. Selected for a two-year Autism Speaks research fellowship award for this work.
- Produced results that were instrumental in securing a federal grant worth over $1.5 million.
- Published 12 peer-reviewed studies with over 700 citations, covering advanced statistical and computational techniques for processing multimodal brain MRI data and characterizing typical and atypical brain organization.