David Pepper

R Developer in New York, NY, United States

David is a senior data scientist with a proven track record of delivering innovative solutions and bottom-line results. He has expertise in AI, machine learning, game theory, graph databases, and feature engineering.
  • Mathematical Modeling, 20 years
  • R, 20 years
  • Modeling, 20 years
  • Artificial Intelligence (AI), 14 years
  • Data Science, 10 years
  • Machine Learning, 10 years
  • Python 3, 9 years
  • Natural Language Processing (NLP), 4 years



Preferred Environment

RStudio, Git, Google Colaboratory, TensorFlow

The most amazing...

...mathematical model I've ever built was to calculate optimal strategies for city block walking routes.


  • Data Science Consultant

    2017 - PRESENT
    New York Civil Liberties Union
    • Cleaned and analyzed bail data throughout New York State to find racial discrepancies in bail set for individuals accused of similar crimes.
    • Assisted counsel in a Section 2 voting rights case, reading expert reports and suggesting additional analysis.
    Technologies: R, R Markdown, Shiny, RStudio, Gradient Boosted Tree Modeling
  • Data Science Consultant

    2018 - 2018
    • Analyzed customer survey data to determine interest in various energy saving features for home improvement.
    Technologies: R, GitLab, RStudio, ggplot2, tidyverse
  • Senior Data Scientist

    2016 - 2018
    Socure, Inc.
    • Developed fraud detection/identity verification algorithms for a financial software company.
    • Created the code base for an automated data pipeline: input, cleaning, visualization, and model building.
    Technologies: R, AWS, Jenkins, Elastic Beanstalk, Python, Django
  • Chief Data Scientist

    2016 - 2018
    • Served as the lead data scientist for an API tool startup that makes it easy for organizations' major systems (marketing, finance, HR, payroll, etc.) to communicate with each other.
    • Developed a data flow system in which data from different databases is universally available.
    Technologies: R, PostgreSQL
  • Derivatives Quant

    2012 - 2014
    Treesdale Partners
    • Developed a four-dimensional model of the implied volatility surface (IVS) for foreign exchange markets.
    • Developed a smoothing method based on principal surfaces and a no-arbitrage constraint that predicts the evolution of the IVS over time.
    Technologies: Stata, Excel, Mathematica
  • Consulting Economist

    2009 - 2011
    US Treasury Department
    • Analyzed the history of financial regulation in the United States from 1950 to the present.
    • Created a series of databases containing all 213 financial regulatory laws passed since 1950.
    • Findings were used to brief US and European officials, including the Assistant Secretary for Domestic Policy in the Treasury Department and representatives of the European Central Bank.
    Technologies: Stata, Access
  • Senior Quantitative Analyst

    2004 - 2008
    • Performed statistical analysis of economic, political, social, and geographic causes of state failure and transitions to democracy.
    • Created models that are the most sophisticated predictors of regime instability currently available, with a two-year forecast accuracy rate of over 80%.
    • Findings were used by intelligence personnel, including CIA, Defense Department, and White House officials.
    Technologies: Stata, R, Mathematica


  • Languages

    R, Python 3, SQL, Scala
  • Frameworks

    Machine Learning
  • Tools

    Mathematica, Stata, Hidden Markov Model
  • Paradigms

    Data Science
  • Platforms

  • Other

    Statistics, Modeling, Artificial Intelligence (AI), Mathematical Modeling, Mathematics, Random Forests, Regression Models, Ridge Regression, Robust Regression, Logistic Regression, Random Forest Regression, Decision Tree Classification, Classification Algorithms, Time Series Analysis, Gradient Boosted Trees, Principal Component Analysis (PCA), Factor Analysis, Economics, Economic Analysis, International Development & Economics, Political Campaigning, Surveys, Survey Development & Analysis, Games, Discrete Mathematics, Text Classification, Natural Language Processing (NLP), K-means Clustering, Support Vector Machines (SVM), Markov Chain Monte Carlo (MCMC) Algorithms, Linear Discriminant Analysis (LDA), Directed Acyclic Graphs (DAG), Volatility
  • Libraries/APIs

    TensorFlow, TensorFlow Deep Learning Library (TFLearn)
  • Storage

    MySQL, NoSQL, PostgreSQL, Graph Databases


  • Ph.D. in Political Economics
    1987 - 1992
    Stanford University Graduate School of Business - Palo Alto, California
  • Bachelor of Arts degree in Applied Mathematics
    1983 - 1987
    Harvard University - Cambridge, Massachusetts
