Matthew de Marte, Developer in Yorktown Heights, NY, United States

Matthew de Marte

Verified Expert in Engineering

Data Scientist and Developer

Location
Yorktown Heights, NY, United States
Toptal Member Since
August 2, 2022

Matthew is a data scientist who has worked primarily in professional baseball, an incredibly fast-paced and demanding industry. He works in R and SQL databases, with particular strengths in machine learning and predictive modeling, to support better data-driven decisions. Matthew is a fantastic problem solver who pairs his data science skill set with the ability to communicate technical findings to non-technical audiences.

Portfolio

Vaulted Baseball
R, SQL, pgAdmin, PostgreSQL, Gradient Boosting, Neural Networks...
Lotte Giants
R, SQL, pgAdmin, PostgreSQL, RStudio Shiny, Tidyverse, Ggplot2, Plotly, Caret...
Activate Inc
Data Science, Data Analytics, Data Reporting, Python, R, SQL...

Experience

Availability

Part-time

Preferred Environment

R, SQL, RStudio Shiny, Git, Excel 365, Statistical Modeling, Gradient Boosted Trees, Neural Networks, Mixed-effects Models, Machine Learning

The most amazing...

...project I've completed is a player projection system for the Korean Baseball Organization that was 200% better than previous industry standards.

Work Experience

Co-founder and Lead Data Scientist

2022 - PRESENT
Vaulted Baseball
  • Managed a team of developers to build a web application that housed an entire analytical infrastructure, specifically for major league baseball players.
  • Created the entire back-end data processing pipeline, which included scraping, manipulating, aggregating, and creating data while managing the SQL database, as well as creating and saving model predictions daily.
  • Built machine learning algorithms that evaluate the value of a pitch and a batted ball to enhance player evaluation and to power real-time testing tools in our web application.
Technologies: R, SQL, pgAdmin, PostgreSQL, Gradient Boosting, Neural Networks, Linear Regression, Logistic Regression, Mixed-effects Models, Generalized Additive Mixed Model (GAMM), Unsupervised Learning, Supervised Machine Learning, Statistical Modeling, Creative Problem Solving, Git, Data Analysis, RStudio Shiny, Management, Data Science, Statistics, Mathematics, Data Analytics, Artificial Intelligence (AI), Code Review, Source Code Review, Interviewing, Technical Hiring, Data Reporting, Data Visualization, Data Mining, Web Scraping, Time Series Analysis, Predictive Modeling, Predictive Analytics

Consultant and Assistant Director, Research Development

2021 - PRESENT
Lotte Giants
  • Created a player projection system for all players in the Korean Baseball Organization. The system beat industry-standard forecasting tools by 200%, as measured by R-squared.
  • Designed a player projection system that forecasted performance in Korea for players who were playing professionally in the United States at the time.
  • Maintained a player personnel decision-making system rooted in our player projection system. This system has contributed to six organizational transactions worth over $5 million in surplus value.
  • Built a metric from a chain of six XGBoost models to evaluate individual pitches. This metric was the foundation of our pitcher projections and is the cornerstone of organizational pitching development and strategy.
  • Developed an entire in-game strategy system rooted in predictive modeling to enhance in-game decision-making. The system has been worth approximately ten wins since the 2021 season.
  • Built a pipeline that automated all statistical and machine learning models and projections to run daily and create new data reflected in the team's internal web application.
  • Implemented a project management system to help other data scientists track their workflow, standardize testing methodologies, and create more efficient statistical processes that enhance productivity.
  • Managed the data science schema, which comprised over 200 tables containing all relevant statistical information, from projections to metrics and core research.
  • Served as an assistant director, becoming the first foreign data scientist in the Korean Baseball Organization and, at the age of 25, the youngest person in a leadership position in the league.
Technologies: R, SQL, pgAdmin, PostgreSQL, RStudio Shiny, Tidyverse, Ggplot2, Plotly, Caret, XGBoost, Linear Regression, Logistic Regression, Mixed-effects Models, Generalized Additive Mixed Model (GAMM), Machine Learning, Creative Problem Solving, Data Analysis, Management, Statistical Modeling, Statistical Programming, Statistical Forecasting, Statistical Analysis, IT Project Management, Data Science, Statistics, Mathematics, Data Analytics, Artificial Intelligence (AI), Code Review, Source Code Review, Interviewing, Technical Hiring, Data Reporting, Data Visualization, Data Mining, Web Scraping, Time Series Analysis, Predictive Modeling, Predictive Analytics

Data Scientist

2022 - 2022
Activate Inc
  • Helped the client access a CSV file with 33 million rows of data they had previously been unable to access.
  • Analyzed and aggregated data into reports to help the client answer valuable questions for their customer.
  • Assisted the client in delivering time-sensitive analyses to their customer on a tight deadline.
Technologies: Data Science, Data Analytics, Data Reporting, Python, R, SQL, Predictive Analytics
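For context, a file of this size is often hard to open because typical tools try to load it into memory all at once. The profile does not say how the file was actually handled; the following is only a minimal Python sketch of one common approach, streaming the rows and aggregating as they pass. The function name `aggregate_csv` and the `region` and `amount` columns are hypothetical.

```python
import csv
import io
from collections import defaultdict


def aggregate_csv(lines, key_column, value_column):
    """Stream rows one at a time and sum value_column per key_column.

    Only the running totals are held in memory, never the whole file.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[key_column]] += float(row[value_column])
    return dict(totals)


# Tiny in-memory stand-in for a large file (made-up values).
# With a real file, pass open(path, newline="") instead.
sample = io.StringIO("region,amount\neast,10.5\nwest,4\neast,2.5\n")
totals = aggregate_csv(sample, "region", "amount")
print(totals)
```

Because `csv.DictReader` reads lazily from any iterable of lines, the same function works on a file handle of arbitrary size.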

Assistant, Quantitative Analysis

2019 - 2021
Los Angeles Angels
  • Created an optimization algorithm to update the infield and outfield positioning process for the 2021 season. The update was forecasted to create over $8 million in value for the organization through improved player performance.
  • Developed an entire statistical infrastructure to evaluate outfield defense in the minor leagues in 2019, and automated the infrastructure to deliver bi-weekly reports with visualizations to stakeholders.
  • Established multiple dashboards that executives use to assist in decision-making on multi-million dollar transactions.
  • Designed a team wins projection model using linear regression that produced a lower RMSE than any win projection system in the public baseball ecosystem.
  • Managed the entire data pipeline of reports for our major league coaching staff and players during parts of the 2019 season and the entire 2020 season.
Technologies: R, SQL, Excel 365, RStudio Shiny, Linear Regression, Logistic Regression, Mixed-effects Models, Data Analysis, XGBoost, Machine Learning, DataViz, Data Communication, Creative Problem Solving, Ggplot2, Plotly, Python, Data Science, Statistics, Mathematics, Data Analytics, Artificial Intelligence (AI), Code Review, Source Code Review, Interviewing, Data Reporting, Data Visualization, Data Mining, Web Scraping, Time Series Analysis, Predictive Modeling, Predictive Analytics
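For readers unfamiliar with the metric, RMSE (root mean squared error) measures the typical size of a projection's miss, so a lower value means a more accurate system. The sketch below is illustrative only and is not the model described above: it fits a one-variable least-squares line in Python, the choice of run differential as the predictor is an assumption, and all numbers are synthetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared miss."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Synthetic, made-up seasons: run differential (assumed predictor) and team wins.
run_diff = [-100, -50, 0, 50, 100]
wins = [70, 77, 81, 85, 92]

slope, intercept = fit_line(run_diff, wins)
model_rmse = rmse(wins, [intercept + slope * x for x in run_diff])

# Naive baseline for comparison: project every team at the average win total.
baseline_rmse = rmse(wins, [sum(wins) / len(wins)] * len(wins))
print(model_rmse < baseline_rmse)
```

Comparing two systems on the same set of seasons in this way is what makes a "lower RMSE" claim meaningful; in practice the comparison would be made on seasons the model was not fitted to.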

Vaulted Baseball Web Application

Created an entire analytics department that major league baseball players can purchase to enhance their careers. I was the sole data scientist on the team and managed a group of web developers to bring my vision to life. In addition to managing the developers, I created the project's necessary statistical and machine learning algorithms. I also automated the entire data processing system that runs daily and managed the SQL database, which made me responsible for building our entire back-end data infrastructure and core research.

Languages

R, SQL, Python

Frameworks

RStudio Shiny

Libraries/APIs

Tidyverse, Ggplot2, Caret, XGBoost

Tools

pgAdmin, Plotly, DataViz, Git

Paradigms

Data Science, Management

Storage

PostgreSQL

Other

Excel 365, Statistical Modeling, Gradient Boosted Trees, Neural Networks, Mixed-effects Models, Machine Learning, Data Analysis, Creative Problem Solving, Gradient Boosting, Linear Regression, Logistic Regression, Generalized Additive Mixed Model (GAMM), Supervised Machine Learning, Statistical Programming, Statistical Forecasting, Statistical Analysis, Data Communication, Random Forests, Statistics, Mathematics, Data Analytics, Artificial Intelligence (AI), Code Review, Source Code Review, Interviewing, Technical Hiring, Data Reporting, Data Visualization, Data Mining, Time Series Analysis, Predictive Modeling, Predictive Analytics, Unsupervised Learning, IT Project Management, Web Scraping, Marketing Analytics

Platforms

Amazon Web Services (AWS)

2014 - 2018

Bachelor's Degree in Business Analytics

Babson College - Wellesley, MA
