Python 3 Developer in Cincinnati, OH, United States
Senior Data Scientist | 2019 - PRESENT | Clarigent Health
Technologies: Python, SQL, Azure
- Improved on the published suicide-ideation classification model, raising ROC-AUC from 0.76 to 0.88 as measured by leave-one-out validation.
- Built a model stress tester that assesses performance consistently across models from several perspectives, using a wide range of visualizations.
Data Scientist, Owner | 2018 - PRESENT | Data Science Consulting LLC
Technologies: Python, Scikit-learn, Keras, TensorFlow, Flask, SQL, Airflow
- Created a data-as-a-service solution for small and medium-size businesses.
- Provided end-to-end automated solutions covering data acquisition, database setup and maintenance, exploratory analysis, dashboards and data visualizations, machine learning for predictive and unsupervised modeling, and web apps for any type of data (text, time series, tabular, etc.).
Lead Data Scientist, Owner | 2016 - PRESENT | Fantasy Outliers
Technologies: Python, Flask, R, D3.js, HTML, CSS
- Provided historical and predictive analysis for fantasy football.
- Beat ESPN's weekly projections in Weeks 6-16 of 2017.
- Identified several key underrated players in 2017 (Russell Wilson, Zach Ertz, Mark Ingram); quarterback projections beat expert consensus rankings.
- Explored what actually happened in competitive leagues with interactive visualizations (fantasyoutliers.com).
Data Science Researcher | 2016 - 2018 | Georgia Tech Research Institute
- Analyzed team cohesion in League of Legends matches. Implemented an automated data-collection pipeline in MongoDB holding >3 TB of League of Legends match data. Used PCA, K-Means clustering, network density, and other methods to develop non-skill-based features, from a psychological perspective, that discriminated between wins and losses. Trained a Gradient Boosting Classifier to predict the game winner from historical, non-skill-based psychological dimensions across the team, with some success (AUC 0.58-0.68).
- Automated the acquisition, cleaning, merging, and visualization of various publicly available data breach sources, creating a more reliable and complete data source. Created an automated engine using web scraping and NLP to gather SEC filings and search them for language with a high probability of containing data breach cost disclosures.
- Built a compliance risk metric for government facilities using multiple auto-trained, aggregated XGBoost models to help prioritize government resources (NLP, NNMF). Built an automated, cross-document named entity analysis pipeline, using spaCy and Python, for count-based association analysis.
- Built software, inspired by Continuous Integration platforms, that builds, runs, and assesses the granular performance of a script across all function calls (Python). It links to a Git repository, runs on every commit, compares performance to the previous commit, and raises alerts if performance dips below user-defined thresholds. It visualizes performance history in a dashboard (Flask, SQLAlchemy).
Data Scientist Contractor | 2015 - 2018 | Self-employed (remote)
Technologies: Python, Flask, HTML, CSS, Machine learning, R, MongoDB, SQL
- Built an automated information extraction engine for unstructured financial statements using a unique pipeline of tree-based ensemble classifiers. Enabled the company to engage in more complex historical analyses.
- Created a Monte-Carlo-based pricing simulator that provides insight into both portfolio-wide and individual client pricing strategies with very little information about the customer. Simulated distributions of expected profit, combined with visualizations, helped the pricing team understand probabilistic expectations for a given customer, which led to better client relationships. Built an automated system that forecasted eligible assets, which led to higher profits.
- Implemented a first-of-its-kind program that analyzed signal rate data using a sequence of Random Forest Classifiers and logic to attribute signal load to individual devices and analyze the results. Continued work on the capstone project through prototype completion.
Outbound Business Development + Operations | 2014 - 2015 | Connect First
Technologies: Excel, Phone
- Created foundational methodologies for a new lead generation department, which led to better sales and more internal funding for our department.
Composer, Founder | 2010 - 2015 | Tuneplant
Technologies: Music composition
- Developed project management and relationship-building skills with clients, maintaining a profitable, repeat-customer business and a 5-star rating.
Business Development and Music Production | 2012 - 2014 | alcheh&hunt
Technologies: Music composition, Sales
- Grew contact list from ~100 to 900+ organically developed, active contacts in 12 months by generating introductory meetings with top-tier advertising agencies.
Senior Diagnostic Consultant / Database Analyst | 2005 - 2008 | The Nielsen Company
Technologies: Excel, SPSS
- Worked with VPs and C-level executives to create and implement a comprehensive quantitative and qualitative framework describing the consumer adoption process.
- Used Excel and SPSS to conduct research and craft data-driven responses to inquiries about the historical database, which earned an internal recognition-of-achievement award.
- Fantasy Football Predictive Models Beat ESPN, Tied Vegas (Development) | https://medium.com/fantasy-outliers
Last year, Fantasy Outliers' predictive models helped a disproportionate number of users win their leagues, spotted Free Agent pickups a week or two before others started talking about them, and gave good start/sit direction. For quarterbacks, yearly overall rankings were more accurate than ESPN's 72% of the time and directionally accurate 84% of the time. Weekly projections for quarterbacks who were likely starters were more accurate than ESPN's 57% of the time and directionally accurate 64% of the time. Projections for other positions were less accurate but still often beat ESPN's.
In 2018, we implemented a model that predicted NFL game winners using only information available Tuesday morning. It tied Vegas's predictions, which used information available up until kickoff.
Full write-ups include "How Artificial Intelligence (AI) beat ESPN in Fantasy Football" (https://medium.com/fantasy-outliers/how-artificial-intelligence-ai-beat-espn-in-fantasy-football-204f4c05e1c9) and "Can machine learning help improve your fantasy football draft?" (https://medium.com/fantasy-outliers/can-machine-learning-can-help-improve-your-fantasy-football-draft-4ceea1f1b2bd).
- Attributing Flowrate Signal to Devices Using Data Sensors (Development) | http://blog.galvanize.com/data-science-analyze-energy-efficiency/#.VrTPsd-rTBJ
For a capstone project at Galvanize, built a system that uses sensor data to analyze energy efficiency. The system can determine which devices or appliances are currently turned on and the resource demand attributable to each device, allowing for further usage optimization downstream.
Frameworks: Machine Learning, Flask
Libraries/APIs: Scikit-learn, Keras, XGBoost, D3.js, jQuery, TensorFlow
Paradigms: Data Science, Object-oriented Programming (OOP), Agile
Other: Algorithms, Data Mining, Data Visualization, Natural Language Processing (NLP), Deep Learning, Speech Analytics, Agile Data Science
- Master's degree in Music Composition | 2005 - 2007 | University of Louisville - Louisville, KY
- Bachelor's degree in Physics, Music, Psychology (minor) | 2000 - 2004 | Wake Forest University - Winston-Salem, NC
- Deep Learning Specialization | JANUARY 2019 - PRESENT | Coursera
- Data Analyst | APRIL 2016 - PRESENT | Udacity
- Data Science Immersive Bootcamp | SEPTEMBER 2015 - PRESENT | Galvanize