Senior Data Scientist | 2019 - PRESENT | Clarigent Health
Technologies: Python, SQL, Azure
- Improved on the published suicide ideation classification model, raising ROC-AUC from 0.76 to 0.92 under the same validation method.
- Implemented advanced NLP feature engineering, dynamic ("smart") pre-processing, and dimensionality reduction, along with concurrent hyperparameter search and feature selection using XGBoost regressors and classifiers.
- Expanded the scope of what the company had previously thought possible to predict.
- Played a key role in study design and in the app's user experience.
Data Scientist, Owner | 2018 - PRESENT | Data Science Consulting LLC
Technologies: Python, Scikit-learn, Keras, TensorFlow, Flask, SQL, Airflow
- Created a data-as-a-service solution for small and medium-sized businesses.
- Provided end-to-end automated solutions involving data acquisition, database setup and maintenance, exploratory analysis, dashboards and data visualizations, machine learning for predictive and unsupervised modeling, and web apps for any type of data (text, time series, tabular, etc.).
Lead Data Scientist, Owner | 2016 - PRESENT | Fantasy Outliers
Technologies: Python, Flask, R, D3.js, HTML, CSS
- Provided historical and predictive analysis for fantasy football.
- Beat ESPN's weekly projections in Weeks 6-16 of 2017.
- Identified several key underrated players in 2017 (Russell Wilson, Zach Ertz, Mark Ingram); quarterback projections beat expert consensus rankings.
- Explored what actually happened in competitive leagues with interactive visualizations (fantasyoutliers.com).
Data Science Researcher | 2016 - 2018 | Georgia Tech Research Institute
- Analyzed team cohesion in League of Legends matches. Implemented an automated data-collection pipeline in MongoDB holding >3 TB of League of Legends match data. Used PCA, K-Means clustering, network density, and other methods to develop non-skill-based features, from a psychological perspective, that discriminated between wins and losses. Trained a gradient boosting classifier to predict the game winner from historical, non-skill-based psychological dimensions across the team, with some success (AUC 0.58-0.68).
- Automated the acquisition, cleaning, merging, and visualization of various publicly available data breach sources, creating a more reliable and complete data source. Created an automated engine using web scraping and NLP to gather SEC filings and search them for language with a high probability of disclosing data breach costs.
- Built a compliance risk metric for government facilities using multiple auto-trained, aggregated XGBoost models to help prioritize government resources (NLP, NNMF). Built an automated cross-document named-entity analysis pipeline, using spaCy and Python, for count-based association analysis.
- Built software, inspired by continuous integration platforms, that builds, runs, and assesses the granular performance of a script across all function calls (Python). It links to a Git repository, runs on every commit, compares performance to the previous commit, and raises alerts if performance dips below user-defined thresholds. It visualizes performance history in a dashboard (Flask, SQLAlchemy).
Data Scientist Contractor | 2015 - 2018 | Self-employed (remote)
Technologies: Python, Flask, HTML, CSS, Machine learning, R, MongoDB, SQL
- Built automated information extraction engine for unstructured financial statements using a unique pipeline of tree-based ensemble classifiers. Enabled company to engage in more complex historical analyses.
- Created a Monte-Carlo-based pricing simulator that provides insight into both portfolio-wide and individual client pricing strategies with very little information about the customer. Simulated expected-profit distributions, combined with visualizations, helped the pricing team understand probabilistic expectations for a given customer, which led to better client relationships. Built an automated system that forecasted eligible assets, which led to higher profits.
- Implemented a first-of-its-kind program that analyzed signal rate data using a sequence of Random Forest classifiers and logic to attribute signal load to individual devices and analyze results. Continued work on capstone project through prototype completion.
Outbound Business Development + Operations | 2014 - 2015 | Connect First
Technologies: Excel, Phone
- Created foundational methodologies for a new lead generation department, which led to better sales and more internal funding for our department.
Composer, Founder | 2010 - 2015 | Tuneplant
Technologies: Music composition
- Developed project management and client relationship-building skills, maintaining a profitable repeat-customer business and a 5-star rating.
Business Development and Music Production | 2012 - 2014 | alcheh&hunt
Technologies: Music composition, Sales
- Grew contact list from ~100 to 900+ organically developed, active contacts in 12 months by generating introductory meetings with top-tier advertising agencies.
Senior Diagnostic Consultant / Database Analyst | 2005 - 2008 | The Nielsen Company
Technologies: Excel, SPSS
- Worked with VPs and C-level executives to create and implement a comprehensive quantitative and qualitative framework describing the consumer adoption process.
- Used Excel and SPSS to craft data-driven responses to inquiries about the historical database and to conduct research, which earned an internal recognition-of-achievement award.