Chris Seal, Machine Learning Developer in Cincinnati, OH, United States

Member since August 13, 2018
Chris is an experienced data scientist with over 4 years' experience working independently and in the government subcontracting space with a leading data analytics firm. His well-rounded university education and work history include a magna cum laude undergraduate degree in physics, a master's in music composition, an advanced degree from Galvanize Data Science Immersive, and a deep learning specialization from deeplearning.ai.

Location

Cincinnati, OH, United States

Availability

Part-time

Preferred Environment

Linux, Sublime, Atom, iPython, GitHub

The most amazing...

...result of a project is that I beat ESPN's fantasy football projections and tied Vegas's game winners using raw, unadjusted machine learning

Employment

  • Senior Data Scientist

    2020 - PRESENT
    Homee
    • Led the data science effort for a startup with momentum looking to scale nationally.
    Technologies: Python, MySQL
  • Data Scientist, Owner

    2018 - PRESENT
    Data Science Consulting LLC
    • Provided end-to-end automated solutions involving data acquisition, database setup+maintenance, exploratory analysis, dashboards/data visualizations, machine learning for predictive and unsupervised modeling, and web apps for any type of data (text, time series, tabular, etc.).
    • Created a detailed plan of action that a global IT consulting company presented to multiple front-end and back-end engineers as instructions for integrating a new service into their existing platform. Built a substantive prototype in Python to demonstrate functionality to stakeholders. The end-to-end instruction manual included wireframes, data modeling, Monte Carlo simulations, performance testing, and motivation for the project.
    • Mapped out a new database system from an existing operational schema for analysts at a leading collections agency, which simplified their work and led to more robust analyses (SQL, Airflow).
    • Built a Flask app for a publicly traded healthcare company that optimizes efficiency and accuracy when preparing compliance reports. It incorporates human-in-the-loop report initialization, automated querying, task assignment, PDF generation, and more. The app is structured to scale with the company at minimal maintenance investment. Passed the first round of user acceptance testing with no additional requests.
    • Built a keyword extraction API for the US Government to assist in summarizing and searching a large number of documents using fast variations of many of the popular techniques and case-specific customization, callable with parameters adaptable to users' needs.
    • Converted an unpoliced, Excel-based database and reporting process for an investment firm to a scalable, verifiable, and flexible database schema. Created an automated PDF summary report with a range of visualizations visible to key stakeholders.
    • Built an automated report for a leading retail investment company that provided extensive data visualizations which gave insight into all aspects of the sales pipeline.
    • Conducted a literature review for a start-up e-learning platform that resulted in a prioritized data collection and modeling plan of action from launch onward.
    Technologies: Python, Scikit-learn, Keras, Tensorflow, Flask, SQL, Airflow, Pandas, Numpy
  • Lead Data Scientist, Owner

    2016 - PRESENT
    Fantasy Outliers
    • Scraped publicly available historical fantasy football data from thousands of leagues, and created an automated data scraping+wrangling process that obtained and merged NFL data from a variety of sources.
    • Beat ESPN's weekly projections in the only comparison made, which occurred during Weeks 6-16 of 2017 - https://medium.com/fantasy-outliers/how-artificial-intelligence-ai-beat-espn-in-fantasy-football-204f4c05e1c9.
    • Identified several key underrated players in 2017 (Russell Wilson, Zach Ertz, Mark Ingram), and produced quarterback projections that beat expert consensus rankings - https://medium.com/fantasy-outliers/can-machine-learning-can-help-improve-your-fantasy-football-draft-4ceea1f1b2bd.
    • Built an interactive website using HTML, CSS, D3.js, and JavaScript, with automated scripts in Python, that explores what actually happened in competitive leagues (fantasyoutliers.com).
    • Tied Vegas's up-to-kickoff game-winner projections using automated predictions based on data available Tuesday morning the week prior with no manual adjustments for injury, etc. - https://medium.com/fantasy-outliers/we-tied-vegas-in-our-first-attempt-at-predicting-nfl-game-winners-with-machine-learning-24a805ab3126.
    • Developed an automated lineup optimizer that won 50/50 ball at DraftKings >90% of the time in the last three weeks of 2018 (Daily Fantasy Sports).
    • Used an automated pipeline of preprocessing and predictive algorithms, along with automated feature selection, to iterate through both meta and model parameters and find the most predictive models for QB, WR, RB, TE, K, and D/ST for rookies, second-year players, and veterans.
    Technologies: Python, Flask, R, D3.js, HTML, CSS, Numpy, Pandas, Scikit-Learn, AWS
  • Senior Data Scientist

    2019 - 2019
    Clarigent Health
    • Improved status quo of published, patented suicide ideation classification model by ~10-15%, based on leave-one-out validation. The modeling approach performed better on the new dataset.
    • Expanded the scope of what the company thought was possible to predict by building successful models in areas it had not previously considered feasible.
    • Built pipeline from scratch that includes version-controlled, advanced NLP feature engineering, dynamic/"smart" pre-processing with dimensionality reduction, concurrent hyperparameter search and feature selection for both regressors and classifiers, model explainability, and insights across multiple models. The approach is dynamic, allowing users to align modeling approach with the dataset and project constraints.
    Technologies: Python, SQL, Azure, XGBoost, spaCy, NLP, Scikit-learn
  • Data Science Researcher

    2016 - 2018
    Georgia Tech Research Institute
    • Analyzed team cohesion in League of Legends matches. Implemented an automated data-collection pipeline in MongoDB with >3TB of League of Legends match data. Used PCA, K-Means clustering, network density, and other techniques to develop non-skill-based features from a psychological perspective that discriminated between wins and losses. Trained a Gradient Boosting Classifier to predict the game winner based on historical psychological dimensions across the team (non-skill-based) with some success (AUC 0.58-0.68).
    • Automated data acquisition, cleaning, merging, and visualizing various publicly available data breach sources, creating a more reliable and complete data source. Created an automated engine using web scraping and NLP to gather and search SEC filings for language containing a high probability of data breach cost disclosures.
    • Built a compliance risk metric for government facilities using multiple auto-trained and aggregated XGBoost models to help prioritize government resources (NLP, NNMF). Built an automated, cross-document named entity analysis pipeline, using spaCy and Python, for count-based association analysis.
    • Implemented software that takes log data and a system definition as input and outputs an interactive system visualization that changes dynamically as the user steps through time (mxGraph, JavaScript, Python, HTML/CSS). Used to understand complex, nested systems and debug issues within them.
    • Built software, inspired by continuous integration platforms, that builds, runs, and assesses granularized performance of a script across all function calls (Python). Links to a git repository and runs with every commit, comparing performance to the previous commit, and raises alerts if performance dips below user-defined thresholds. Visualizes performance history in a dashboard (Flask, SQLAlchemy).
    Technologies: Python, R, Flask, SQL, MongoDB, mxGraph, JavaScript, SpaCy, Scikit-Learn
  • Data Scientist Contractor

    2015 - 2018
    Self-employed (remote)
    • Built an automated information extraction engine for unstructured financial statements using a unique pipeline of tree-based ensemble classifiers. Enabled the company to engage in more complex historical analyses. Decreased data entry time and increased accuracy. Displayed results of classification models in an interactive website where users are pointed to areas of low confidence. The system started with a small data set and is built so that models can be retrained from scratch at the click of a button when new data has been validated (Python, Flask).
    • Created a Monte-Carlo-based pricing simulator that provides insight into both portfolio-wide and individual client pricing strategies with very little information about the customer. Simulated distributions of expected profit, combined with visualizations, helped the pricing team understand probabilistic expectations for a given customer, which led to better client relationships. Built an automated system forecasting eligible assets, which led to higher profits.
    • Implemented first-of-kind program that analyzed signal rate data using a sequence of Random Forest Classifiers and logic to attribute signal load to individual devices and analyze results. Continued work on capstone project through prototype completion.
    Technologies: Python, Flask, HTML, CSS, Machine learning, R, MongoDB, SQL
  • Outbound Business Development + Operations

    2014 - 2015
    Connect First
    • Created foundational methodologies for a new lead generation department, which led to better sales and more internal funding for our department.
    Technologies: Excel, Phone
  • Composer, Founder

    2010 - 2015
    Tuneplant
    • Developed project management and relationship-building skills with clients, maintaining a profitable, repeat-customer business and a 5-star rating.
    Technologies: Music composition
  • Business Development and Music Production

    2012 - 2014
    alcheh&hunt
    • Grew list from ~100 to 900+ organically developed, active contacts in 12 months through introductory meeting generation with top-tier advertising agencies.
    Technologies: Music composition, Sales
  • Senior Diagnostic Consultant / Database Analyst

    2005 - 2008
    The Nielsen Company
    • Worked with VPs and C-level executives to create and implement a comprehensive quantitative and qualitative framework describing the consumer adoption process.
    • Used Excel and SPSS to craft data-driven responses to inquiries regarding the historical database and to conduct research, which resulted in an internal recognition-of-achievement award.
    Technologies: Excel, SPSS

Experience

  • Fantasy Football Predictive Models Beat ESPN, Tied Vegas (Development)
    https://medium.com/fantasy-outliers

    Last year, Fantasy Outliers’ predictive models helped a disproportionate number of users win their leagues, spotted Free Agent pickups a week or two before others started talking about them, and gave good start/sit direction. For quarterbacks, yearly overall rankings were more accurate than ESPN’s 72% of the time and directionally accurate 84% of the time. For quarterbacks who were likely starters, weekly projections were more accurate than ESPN's 57% of the time and directionally accurate 64% of the time. Other positions were less accurate, but still often better than ESPN.

    In 2018, we implemented a model that predicted NFL game winners using only information available Tuesday morning; it ended up tying Vegas's predictions, which used information available up until kickoff.

    Full write-ups include "How Artificial Intelligence (AI) beat ESPN in Fantasy Football" (https://medium.com/fantasy-outliers/how-artificial-intelligence-ai-beat-espn-in-fantasy-football-204f4c05e1c9) and "Can machine learning help improve your fantasy football draft?" (https://medium.com/fantasy-outliers/can-machine-learning-can-help-improve-your-fantasy-football-draft-4ceea1f1b2bd).

  • Attributing Flowrate Signal to Devices Using Data Sensors (Development)
    http://blog.galvanize.com/data-science-analyze-energy-efficiency/#.VrTPsd-rTBJ

    For a capstone project at Galvanize, built a system that uses data from sensors to analyze energy efficiency. The system can determine what devices or appliances are currently turned on and the resource demands attributed to each device, allowing for further usage optimization downstream.
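
For readers wondering how a head-to-head comparison like the ESPN one in the fantasy football project could be scored, here is a minimal sketch. It is not the Fantasy Outliers code. The function names, the sample numbers, and in particular the definition of "directionally accurate" (the model moved away from the baseline toward the actual result) are assumptions made for illustration; the linked write-ups are the authority on the actual method.

```python
from typing import Sequence


def head_to_head_rate(model: Sequence[float],
                      baseline: Sequence[float],
                      actual: Sequence[float]) -> float:
    """Share of cases where the model's absolute error is strictly smaller
    than the baseline's absolute error."""
    if not (len(model) == len(baseline) == len(actual)) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    wins = sum(abs(m - a) < abs(b - a)
               for m, b, a in zip(model, baseline, actual))
    return wins / len(actual)


def directional_rate(model: Sequence[float],
                     baseline: Sequence[float],
                     actual: Sequence[float]) -> float:
    """Assumed definition: among cases where the model disagrees with the
    baseline, the share where the actual result fell on the model's side
    of the baseline."""
    hits = 0
    counted = 0
    for m, b, a in zip(model, baseline, actual):
        if m == b:
            continue  # no disagreement, so no direction to score
        counted += 1
        if (m - b) * (a - b) > 0:
            hits += 1
    return hits / counted if counted else 0.0


# Hypothetical weekly point projections for four players (made-up numbers).
model_points = [20.0, 15.0, 22.0, 10.0]
baseline_points = [18.0, 17.0, 20.0, 12.0]
actual_points = [21.0, 16.5, 19.0, 9.0]

better = head_to_head_rate(model_points, baseline_points, actual_points)
direction = directional_rate(model_points, baseline_points, actual_points)
print(f"more accurate than baseline: {better:.0%}")
print(f"directionally accurate: {direction:.0%}")
```

Under this assumed definition the directional rate can exceed the head-to-head rate, because a projection can move the right way relative to the baseline and still overshoot.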

Skills

  • Languages

    Python 2, Python 3, Python, SQL, JavaScript, HTML, R, CSS
  • Libraries/APIs

    Scikit-learn, XGBoost, Matplotlib, Pandas, NumPy, TensorFlow, Keras, D3.js, Spark Streaming, jQuery
  • Tools

    NLPP, MxGraph, Git, GitHub, Kafka Streams, Spark SQL
  • Paradigms

    Data Science, Object-oriented Programming (OOP), Agile, Anomaly Detection
  • Other

    Data Visualization, Machine Learning, Natural Language Processing (NLP), Speech Analytics, Algorithms, Data Mining, Data Analytics, Software Development, Deep Learning, Agile Data Science, Convolutional Neural Networks, Time Series Analysis, Sentiment Analysis, AWS, Data Scraping, faust
  • Frameworks

    Flask, Spark
  • Platforms

    Apache Kafka, Linux, Amazon Web Services (AWS), Windows
  • Storage

    MongoDB, NoSQL, PostgreSQL, AWS S3, Redshift, MySQL

Education

  • Master's degree in Music Composition
    2005 - 2007
    University of Louisville - Louisville, KY
  • Bachelor's degree in Physics, Music, Psychology (minor)
    2000 - 2004
    Wake Forest University - Winston-Salem, NC

Certifications

  • Data Streaming Nanodegree
    APRIL 2020 - PRESENT
    Udacity
  • Data Engineering Nanodegree
    MARCH 2020 - PRESENT
    Udacity
  • Deep Learning Specialization
    JANUARY 2019 - PRESENT
    Coursera
  • Data Analyst
    APRIL 2016 - PRESENT
    Udacity
  • Data Science Immersive Bootcamp
    SEPTEMBER 2015 - PRESENT
    Galvanize
