Matthew de Marte
Verified Expert in Engineering
Data Scientist and Developer
Matthew is a data scientist who has worked primarily in professional baseball, an incredibly fast-paced and demanding industry. He works in R and SQL, with strengths in machine learning and predictive modeling that support better data-driven decisions. Matthew is a strong problem solver who can communicate technical findings to non-technical audiences.
R, SQL, RStudio Shiny, Git, Excel 365, Statistical Modeling, Gradient Boosted Trees, Neural Networks, Mixed-effects Models, Machine Learning
The most amazing...
...project I've completed is a player projection system for the Korean Baseball Organization that outperformed previous industry-standard forecasting tools by 200%, as measured by R-squared.
Co-founder and Lead Data Scientist
- Managed a team of developers to build a web application that housed an entire analytical infrastructure, specifically for major league baseball players.
- Created the entire back-end data processing pipeline, which included scraping, manipulating, aggregating, and creating data while managing the SQL database, as well as creating and saving model predictions daily.
- Built machine learning models that estimate the value of each pitch and batted ball, both to enhance player evaluation and to support the real-time testing tools on our web application.
Consultant and Assistant Director, Research Development
- Created a player projection system for all players in the Korean Baseball Organization. The system outperformed industry-standard forecasting tools by 200%, as measured by R-squared.
- Designed a player projection system that forecasted performance in Korea for players who were currently playing professionally in the United States.
- Maintained a system for player personnel decision-making rooted in our player projection system. This system has contributed to six of the organization's transactions, worth over $5 million in surplus value.
- Built a metric from six chained XGBoost models to evaluate individual pitches. This metric was the foundation of our pitcher projections and is the cornerstone of organizational pitching development and strategy.
- Developed an entire in-game strategy system rooted in predictive modeling to enhance in-game decision-making. The system has been worth approximately ten wins since the 2021 season.
- Built a pipeline that automated all statistical and machine learning models and projections to run daily and create new data reflected in the team's internal web application.
- Implemented a project management system to help other data scientists track their workflow, standardize testing methodologies, and create more efficient statistical processes that enhance productivity.
- Managed a data science schema of over 200 tables containing all relevant statistical information, from projections to metrics and core research.
- Served as an assistant director and the first foreign data scientist in the Korean Baseball Organization and, at the age of 25, became the youngest person in a leadership position in the league.
- Helped the client access a CSV file with 33 million rows of data they had previously been unable to access.
- Analyzed and aggregated data into reports to help the client answer valuable questions for their customer.
- Assisted the client in delivering time-sensitive analyses to their customer on a tight deadline.
Assistant, Quantitative Analysis
Los Angeles Angels
- Created an optimization algorithm to update the infield and outfield positioning process for the 2021 season, forecasted to create over $8 million in value for the organization through improved player performance.
- Developed an entire statistical infrastructure to evaluate outfield defense in the minor leagues in 2019 and automated it to deliver bi-weekly reports with visualizations to stakeholders.
- Built multiple dashboards that executives use to inform decisions on multimillion-dollar transactions.
- Designed a team win projection model using linear regression that produced a lower RMSE than any win projection system in the public baseball ecosystem.
- Managed the entire data pipeline of reports for our major league coaching staff and players during parts of the 2019 season and the entire 2020 season.
Vaulted Baseball Web Application
R, SQL, Python
Tidyverse, Ggplot2, Caret, XGBoost
pgAdmin, Plotly, DataViz, Git
Data Science, Management
Excel 365, Statistical Modeling, Gradient Boosted Trees, Neural Networks, Mixed-effects Models, Machine Learning, Data Analysis, Creative Problem Solving, Gradient Boosting, Linear Regression, Logistic Regression, Generalized Additive Mixed Model (GAMM), Supervised Machine Learning, Statistical Programming, Statistical Forecasting, Statistical Analysis, Data Communication, Random Forests, Statistics, Mathematics, Data Analytics, Artificial Intelligence (AI), Code Review, Source Code Review, Interviewing, Technical Hiring, Data Reporting, Data Visualization, Data Mining, Time Series Analysis, Predictive Modeling, Predictive Analytics, Unsupervised Learning, IT Project Management, Web Scraping, Marketing Analytics
Amazon Web Services (AWS)
Bachelor's Degree in Business Analytics
Babson College - Wellesley, MA