Jake Hershey, Ph.D., Developer in Jacksonville, Oregon, United States

Jake Hershey, Ph.D.

Verified Expert in Engineering

Machine Learning Developer

Jacksonville, Oregon, United States
Toptal Member Since
June 20, 2018

Jake is an accomplished marketing statistician, data scientist, and database programmer with 17 years of experience building SQL Server databases, predictive models, machine learning solutions, and custom interactive web-based data visualizations. He is the founder of SurveyGraphics.com, a data visualization tool for the market research industry.






Preferred Environment

D3.js, JavaScript, Python, SQL, Windows

The most amazing...

...company I've founded is SurveyGraphics.com. Market research companies use it to post interactive survey results on the web.

Work Experience

Director of Business Intelligence

2011 - 2016
  • Built a set of logistic regression models (one per product) for email targeting that improved email response rates by 18%, while maintaining email volume, by improving the relevance of email messaging. I used R (GLM) for the analysis and implemented the solution in SQL Server.
  • Built regression and clustering models that identify likely fraud. This let us proactively cancel fraudulent orders and reduced our chargeback rate by 30%.
  • Created an automated reporting tool for landing page testing that makes registration flow optimization quick and accurate.
  • Built a set of machine learning classification algorithms (using Python Scikit-Learn) that identify the leads with the highest purchase likelihood for upgrade and cross-sell.
  • Developed a web-based data visualization tool, using D3 (JavaScript), that brings together our subscription billing, terminations/refunds, email response, Google AdWords, and Google Analytics data into one reporting application, with a rich set of graphing and filtering features.
  • Built a self-service email list creation and calendaring web application (using D3.js, JavaScript, PHP, and SQL Server) that makes it easy for our email team to create email lists with a drag-and-drop interface.
  • Created a website (the “Telesales Dispatcher”) that presents the highest quality leads to our telesales agents each day, based on statistical models of purchase likelihood that I developed.
Technologies: Excel VBA, Scikit-learn, Pandas, NumPy, D3.js, R, Python, Oracle, Microsoft SQL Server, Statistical Modeling, Machine Learning

Director of Research and Analytics

2007 - 2011
eMusic, Inc.
  • Built a suite of automated database reporting applications, using Excel (with VBA) as a client for SQL Server data, providing visibility into all of the marketing data and company key metrics, including signups, web conversion rates, email open and click rates, site usage, churn metrics, and geo-mapping.
  • Built a web scraping tool that retrieves song and album prices from Amazon and iTunes, so that we could price our music catalog strategically against competitors. This tool improved our overall prices by 15%.
  • Developed customer churn and upsell models via multivariate logistic regression, used to improve the targeting of our member communications and offers. Improved upsell rates by 13% while improving retention by 8%.
Technologies: Excel VBA, R, Microsoft SQL Server, Statistical Modeling, Machine Learning

Director of Direct Marketing and Analytics

2000 - 2007
EarthLink, Inc
  • Developed, implemented, and analyzed the direct marketing strategy for EarthLink’s dial-up products, including both EarthLink’s flagship dial-up internet brand and the PeoplePC value brand. Managed a small team of data scientists and an overall annual marketing budget of $60 million.
  • Optimized marketing spend across direct response TV, solo and shared mail, sponsorships, promotions, and field marketing. Generated more than 850,000 members through my channels in 2007, beating our 2007 plan by 10%.
  • Created the “FrontLine Strategizer”: a SQL Server database application (with Excel / VBA client) that builds aggregated monthly forecasts out of campaign-level inputs. Integrated with my real-time direct marketing response projections, this tool enabled me to react quickly to campaign performance and optimize budget allocation.
  • Managed a team of marketing managers and data analysts distributed across the San Francisco and Atlanta offices, building their direct marketing skills and helping them deliver subscribers on time and under budget. Also managed a stable of marketing and media buying agencies that assisted with all of our direct marketing efforts.
  • Built the “MACalyzer” – a SQL Server database application for direct response TV reporting that reduced the member acquisition costs in the television channel by 18%. This tool ties 400+ dedicated phone numbers to their associated advertising spend and identifies the resulting orders so that we can track the ROI and member acquisition cost for each airing of our commercial.
  • Developed a direct mail reporting and analysis engine in SQL Server with an Excel user interface. Wrote all the SQL code that loads mail recipients, matches them to mail respondents, and reports response rates via an OLAP cube with more than 20 demographic and marketing dimensions. This reporting and analysis tool was in use for at least seven years after I built it.
  • Built the “Churn Toaster”: an OLAP-style SQL Server application providing visibility into the churn rates during individual calendar months, allowing users to explore each month’s churn along a variety of dimensions (acquisition channel or partner, offer, customer tenure, payment method, and voluntary vs. involuntary churn).
Technologies: Excel VBA, SAS, Microsoft SQL Server, Statistical Modeling, Machine Learning
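
As an illustration of the phone-number matching described for the “MACalyzer” above, here is a minimal Python sketch. It is not the original SQL Server application: the field names, phone numbers, and spend figures are hypothetical, and it assumes each airing has exactly one dedicated phone number.

```python
from collections import Counter

# Hypothetical sample data: each airing has its own dedicated phone number.
airings = [
    {"airing_id": "A1", "phone": "800-555-0101", "spend": 1200.0},
    {"airing_id": "A2", "phone": "800-555-0102", "spend": 800.0},
    {"airing_id": "A3", "phone": "800-555-0103", "spend": 500.0},
]

# Each order records the phone number the customer dialed.
orders = [
    {"order_id": 1, "phone": "800-555-0101"},
    {"order_id": 2, "phone": "800-555-0101"},
    {"order_id": 3, "phone": "800-555-0102"},
    {"order_id": 4, "phone": "800-555-0101"},
]


def acquisition_cost_by_airing(airings, orders):
    """Attribute orders to airings by phone number and compute cost per order."""
    orders_per_phone = Counter(order["phone"] for order in orders)
    report = {}
    for airing in airings:
        n_orders = orders_per_phone.get(airing["phone"], 0)
        # Cost per order is undefined when an airing produced no orders.
        cost = airing["spend"] / n_orders if n_orders else None
        report[airing["airing_id"]] = {"orders": n_orders, "cost_per_order": cost}
    return report


report = acquisition_cost_by_airing(airings, orders)
for airing_id, row in report.items():
    print(airing_id, row)
```

Keeping the join key to a single dedicated number per airing is what makes the attribution unambiguous; shared numbers would need a time-window rule instead.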

Technical Analysis of Stock Price: The Head and Shoulders Pattern

Created a Python program that finds the "Head and Shoulders" pattern in stock price trends. Created a website that lets you adjust the parameters of the pattern (ratio of head to shoulders, trend angle, number of days in the pattern, etc.) and shows the gain or loss to your portfolio over the next 30 days, as though you had bought the matching stocks on the day after the end of the pattern.
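
The idea can be sketched in a few lines of Python. This is a simplified illustration, not the original program: the function names, the default thresholds, and the choice to treat the right-shoulder peak as the end of the pattern are all assumptions, and the trend-angle parameter is left out.

```python
def local_peaks(prices):
    """Indices whose price is strictly higher than both neighbours."""
    return [i for i in range(1, len(prices) - 1)
            if prices[i - 1] < prices[i] > prices[i + 1]]


def find_head_and_shoulders(prices, min_head_ratio=1.05,
                            shoulder_tolerance=0.03, max_days=60):
    """Return (left_shoulder, head, right_shoulder) index triples.

    A match is three consecutive local peaks where the two shoulders are
    at roughly the same level and the middle peak stands clearly above both.
    """
    matches = []
    peaks = local_peaks(prices)
    for left, head, right in zip(peaks, peaks[1:], peaks[2:]):
        if right - left > max_days:
            continue
        l, h, r = prices[left], prices[head], prices[right]
        shoulders_level = abs(l - r) / max(l, r) <= shoulder_tolerance
        head_stands_out = h >= min_head_ratio * max(l, r)
        if shoulders_level and head_stands_out:
            matches.append((left, head, right))
    return matches


def forward_return(prices, pattern_end, horizon=30):
    """Gain or loss from buying the day after the pattern ends and holding
    for `horizon` days. Returns None if the series is too short."""
    buy = pattern_end + 1
    sell = buy + horizon
    if sell >= len(prices):
        return None
    return prices[sell] / prices[buy] - 1.0


# Hypothetical price series containing one pattern.
sample = [10, 11, 12, 11, 10.5, 12, 14, 12, 10.5, 11, 12.1, 11, 10]
print(find_head_and_shoulders(sample))
```

Exposing the ratio, tolerance, and window length as parameters mirrors the adjustable settings described above, so different definitions of the pattern can be compared on the same price history.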

Topic Extraction from Tweets

Programmed a script that collects tweets matching 1,600 keywords every hour and identifies the "trending" topics on Twitter for those keywords. Created a website to display the top trending topics, using JavaScript and D3.js.
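
One simple way to define "trending" is to compare each keyword's count in the current hour with its average over earlier hours. The Python sketch below shows that approach only as an assumption about how such a script might work; the scoring rule, function names, and sample tweets are hypothetical, and the tweet collection step is omitted.

```python
from collections import Counter


def count_keywords(tweets, keywords):
    """Count how many tweets mention each keyword (case-insensitive substring)."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for keyword in keywords:
            if keyword in lowered:
                counts[keyword] += 1
    return counts


def trending(current, history, min_count=2, top_n=3):
    """Rank keywords by current-hour count relative to their historical average."""
    scores = {}
    for keyword, n in current.items():
        if n < min_count:
            continue  # ignore keywords with too few mentions to be meaningful
        baseline = (sum(hour.get(keyword, 0) for hour in history) / len(history)
                    if history else 0.0)
        scores[keyword] = n / (baseline + 1.0)  # +1 avoids division by zero
    return sorted(scores, key=lambda k: (-scores[k], k))[:top_n]


# Hypothetical data: two earlier hourly counts and the current hour's tweets.
keywords = ["election", "coffee", "rain"]
history = [Counter(coffee=5, election=1), Counter(coffee=5, election=1)]
tweets = [
    "Election results tonight", "election day!", "Big ELECTION news",
    "election watch", "coffee time", "more coffee", "coffee again",
    "need coffee", "coffee", "rain today",
]
current = count_keywords(tweets, keywords)
print(trending(current, history))
```

In this sample, "coffee" has more mentions than "election" but ranks lower because it is always popular; ranking against a baseline is what separates a trend from steady volume.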

Automated WordCloud and Data Visualization for Topic Themes

Wrote a script that extracts the keywords and themes from articles published on the Refinery29 website and presents them as a word cloud. Users can select the month, and the themes with the most user engagement for that month are presented. The larger the font, the more engagement the keyword received. A line graph at the bottom of the screen shows the seasonal engagement pattern for a word when you click on it.

Trending Celebrity Topics

Wrote a script that "follows" 800+ celebrities and stores their tweets and Facebook posts in a SQL Server table. It then uses NLP tools to extract the topics that are common across multiple celebrities. Finally, I created a website using JavaScript and D3.js to publish the trending celebrity topics, so it's easy to see what celebrities are talking about.
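
The "common across multiple celebrities" step can be illustrated by counting how many distinct authors mention each term. The sketch below is a deliberately simple stand-in for the NLP tools mentioned above, which are not named in the source; the stopword list, function name, and sample posts are hypothetical.

```python
import re
from collections import defaultdict

# A tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "in", "is", "my",
             "for", "on", "so", "at", "i", "with", "this"}


def shared_topics(posts, min_authors=2):
    """Return (term, author_count) pairs for terms used by several authors.

    `posts` is a list of (author, text) tuples. Repeated mentions by the
    same author count once, so one prolific account cannot create a topic.
    """
    authors_by_term = defaultdict(set)
    for author, text in posts:
        for term in re.findall(r"[a-z']+", text.lower()):
            if term not in STOPWORDS and len(term) > 2:
                authors_by_term[term].add(author)
    shared = {term: len(authors) for term, authors in authors_by_term.items()
              if len(authors) >= min_authors}
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical posts from three accounts.
posts = [
    ("celeb_a", "So excited for the Oscars tonight"),
    ("celeb_b", "Oscars red carpet with my team"),
    ("celeb_c", "New album out this Friday"),
    ("celeb_a", "Oscars Oscars Oscars"),
    ("celeb_c", "Watching the Oscars from the studio"),
    ("celeb_b", "Studio day, album soon"),
]
print(shared_topics(posts))
```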


I founded the company SurveyGraphics LLC and built the website SurveyGraphics.com in 2014. This is a service that lets market research companies upload raw survey data, and the site automatically converts the data into interactive graphs.

SQL Server Script | Calculate Logistic Regression Model Scores for Arbitrary Models

I worked on a SQL Server stored procedure that applies the logistic regression model parameters to a data table to calculate a likelihood score for each row and appends that score back to the input table.
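
The scoring step is the standard logistic formula: the score is 1 / (1 + exp(-(intercept + sum of coefficient × value))). The Python sketch below illustrates that calculation only; it is not the stored procedure itself, and the column names and coefficients are hypothetical. Passing the coefficients as a name-to-weight mapping is one way to support arbitrary models.

```python
import math


def score_rows(rows, intercept, coefficients):
    """Append a logistic regression score to each row.

    `coefficients` maps column names to fitted weights, so any model can be
    applied as long as the rows contain the columns it refers to.
    """
    scored = []
    for row in rows:
        linear = intercept + sum(weight * row[name]
                                 for name, weight in coefficients.items())
        probability = 1.0 / (1.0 + math.exp(-linear))
        scored.append({**row, "score": probability})
    return scored


# Hypothetical model parameters and input table.
coefficients = {"tenure_months": 0.1, "emails_opened": 0.5}
rows = [
    {"customer_id": 1, "tenure_months": 0, "emails_opened": 0},
    {"customer_id": 2, "tenure_months": 10, "emails_opened": 2},
]
scored = score_rows(rows, intercept=-2.0, coefficients=coefficients)
for row in scored:
    print(row["customer_id"], round(row["score"], 4))
```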

SQL Server Script That Loads a Text File, Without Knowing the Format in Advance

I created a fun SQL Server script to load a text file into a table without knowing the column names or data types in advance. It examines the top row of the text file to detect the column names, examines the rest of the data to detect the data types, creates the table with the correct field names and data types, and loads the data from the text file using "bulk insert."
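
The detection logic can be sketched in Python as follows. This is an illustration under assumptions, not the original T-SQL: it only generates a `CREATE TABLE` statement (the bulk load itself is omitted), it assumes a tab-delimited file, and the three-type hierarchy (INT, FLOAT, VARCHAR) and function names are hypothetical.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest SQL type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR(1)"  # nothing to inspect, so fall back to text
    for sql_type, cast in (("INT", int), ("FLOAT", float)):
        try:
            for value in non_empty:
                cast(value)
            return sql_type
        except ValueError:
            continue  # at least one value does not fit; try a wider type
    return f"VARCHAR({max(len(v) for v in non_empty)})"


def build_create_table(text, table_name, delimiter="\t"):
    """Read column names from the first row and infer types from the rest."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    columns = []
    for index, name in enumerate(header):
        values = [row[index] for row in data if index < len(row)]
        columns.append(f"[{name}] {infer_type(values)}")
    return f"CREATE TABLE [{table_name}] ({', '.join(columns)});"


# Hypothetical file contents.
sample_text = "id\tname\tprice\n1\tapple\t0.5\n2\tkiwi\t1\n"
ddl = build_create_table(sample_text, "fruit")
print(ddl)
```

Trying the narrowest type first and widening on failure means a single non-numeric value is enough to push a column to text, which is the safe default when the format is unknown.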

Tableau Revenue Dashboard (Screenshot)

I created a wide variety of reports in Tableau to help us track physician productivity, billing, and revenue. I've linked to a screenshot of a Tableau Revenue Dashboard I created.

Education

1994 - 1999

Ph.D. in Psychology, Quantitative Psychology

University of California, Riverside - Riverside, California

1988 - 1993

Bachelor's Degree in Psychology

University of California, Berkeley - Berkeley, California


Google Analytics API, Scikit-learn, Facebook API, Twitter API, Pandas, NumPy, D3.js, Facebook Ads API, Sailthru API, JW Player API, Instagram API, Pinterest API, LinkedIn API, Google Drive API, Tumblr API


Tableau, cURL Command Line Tool


JavaScript, Visual Basic for Applications (VBA), Python, SQL, T-SQL (Transact-SQL), R, SAS, Excel VBA, PHP 7, Google Apps Script


Data Science, Business Intelligence (BI)


MySQL, PostgreSQL, PL/SQL, Microsoft SQL Server, SQL Stored Procedures


Windows, Oracle, Apache Pig


Multivariate Statistical Modeling, SAS Macros, Web Scraping, Machine Learning, Base SAS, SAS Stats, Statistical Modeling
