Enrique Balp-Straffon, Developer in Valle de Bravo, Mexico

Enrique Balp-Straffon

Verified Expert in Engineering

Data Scientist and Developer

Location
Valle de Bravo, Mexico
Toptal Member Since
March 25, 2020

Enrique is a data scientist with an academic background in physics and neuroscience. Over the years, he has participated in and led teams on several machine learning projects with both startups and corporations in the financial, healthcare, logistics, marketing, and energy sectors. Enrique has in-depth experience in different areas of artificial intelligence, including computer vision, natural language processing, and financial risk modeling.

Portfolio

Wesper
OpenAI, OpenAI GPT-4 API, OpenAI GPT-3 API, LangChain, LlamaIndex...
UMBA
Machine Learning, Risk Management, Data Engineering, SQL, Data Queries...
SYNX.AI
Amazon Web Services (AWS), TensorFlow, Keras, SciPy, Pandas, XGBoost...

Availability

Part-time

Preferred Environment

Docker, TensorFlow, Scikit-learn, NumPy, Pandas, Python

The most amazing...

...project I've made is an auto-ML platform for the automatic searching of the best data preprocessing strategies, hyper-parameters, and deep neural architectures.

Work Experience

Senior Data Scientist

2022 - PRESENT
Wesper
  • Developed a sleep expert chatbot with LLMs and OpenAI APIs, creating a RAG vector DB-based pipeline with expert knowledge.
  • Created ML algorithms to extract precise insights regarding sleep health from body sensor data, including a sleep-wake model and a respiratory event detection model.
  • Achieved state-of-the-art results compared to other devices in the market and submitted them to the FDA.
Technologies: OpenAI, OpenAI GPT-4 API, OpenAI GPT-3 API, LangChain, LlamaIndex, Language Models, Prompt Engineering, ChatGPT, Retrieval-augmented Generation (RAG)

Data Science Lead

2021 - 2022
UMBA
  • Led a team of data scientists and data engineers in charge of the lending decisions of the company.
  • Took responsibility for the risk management models, which used different machine learning techniques to predict the probability of default, and for the operational back-end Python API.
  • Contributed to a loan portfolio that grew 15% per month on average while the default rate was kept constant during the company's data acquisition phase.
Technologies: Machine Learning, Risk Management, Data Engineering, SQL, Data Queries, Data Science, Financial Modeling, Statistics, Mathematics, Amazon SageMaker, NoSQL, MongoDB, Amazon Aurora, Artificial Intelligence (AI)

Senior Data Scientist | CTO | Co-founder

2017 - 2019
SYNX.AI
  • Created a tool for the automatic searching of the best combinations of preprocessing strategies, neural architectures, and hyperparameters using parallel training in several GPUs.
  • Trained a computer vision model with deep convolutional neural networks for the diagnosis of diabetic retinopathy.
  • Used machine learning to develop several financial risk models for different Mexican fintechs and banks, helping them to reduce the load on human analysts, adjust their risk strategies, and optimize their portfolios.
  • Developed a machine-learning churn model for a major Mexican payment processing company.
  • Developed a tool to monitor and optimize aircraft fuel spending for a major international airline based in Abu Dhabi.
  • Created a tool for online advertising budget optimization based on reinforcement learning (contextual bandits).
  • Acted as a sales engineer, talking with potential clients to understand their business needs and translate them into technical data science specifications.
  • Led a team of six data scientists, mentoring them and supervising their progress on different projects, and making sure everyone was engaged, learning, and delivering the right results for our clients.
  • Created a demand forecasting model for a large food company using the Prophet library.
Technologies: Amazon Web Services (AWS), TensorFlow, Keras, SciPy, Pandas, XGBoost, Scikit-learn, Flask, Docker, Python, Machine Learning, SQL, Spark SQL, Data Queries, Data Science, Financial Modeling, Statistics, Mathematics, Computer Vision, NoSQL, Facial Recognition, Convolutional Neural Networks (CNN), Object Detection, Artificial Intelligence (AI)

Senior Data Scientist

2016 - 2017
Wizeline
  • Designed and oversaw the creation of a tool for the understanding and prediction of oil price differentials arising from the interaction between production volumes, refinery demand and transportation costs, using geographic and financial data.
  • Served as the main technical contact with the client, an oil trading company based in Colorado.
  • Led a team of four data scientists and one engineer.
Technologies: QGIS, NetworkX, Scikit-learn, SciPy, Pandas, Plotly, Machine Learning, SQL, Microsoft Excel, Data Queries, Web Scraping, Data Science, Financial Modeling, Statistics, Mathematics, NoSQL

Data Scientist in Computer Vision

2016 - 2016
Makeup on Us
  • Developed and optimized face and facial landmark detectors, as well as a color synthesizer using transformations in different color spaces as key components of a makeup augmented reality system.
  • Presented our technology in a talk at the Microsoft Reactor Center in San Francisco.
  • Developed prototypes for other facial computer vision systems such as emotion recognition and face identification using deep convolutional neural networks.
Technologies: TensorFlow, NumPy, Dlib, OpenCV, OpenGL, Python, Deep Learning, Data Queries, Data Science, Mathematics, Computer Vision, Facial Recognition, Convolutional Neural Networks (CNN), Object Detection, Artificial Intelligence (AI)

Data Scientist in Financial Analysis

2015 - 2016
Kueski
  • Created a graph database using Neo4j that encoded several relationships among customers (phone, Facebook friends, addresses, and more) in order to create network features to feed a fraud detection machine learning model.
  • Designed and trained a machine learning model to estimate the probability of fraud (identity theft). The model included features from the Neo4j graph database and allowed a 50% reduction in the volume of applications human analysts had to review.
  • Participated in the financial analysis of the company's portfolio, creating metrics and insights into the evolution of cohorts, profitability, and so on.
Technologies: Neo4j, Scikit-learn, Pandas, Python, Machine Learning, MongoDB, Artificial Intelligence (AI)
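
The idea of turning shared customer attributes into network features can be sketched in a few lines. This is a minimal illustration in plain Python, not the Neo4j/Cypher implementation described above: the record layout, the feature names (`degree`, `component_size`, `fraud_in_component`), and the sample data are all hypothetical.

```python
from collections import defaultdict


def network_features(records, known_fraud):
    """Build simple graph features from shared attributes (hypothetical schema).

    records: {customer_id: {"phone": ..., "address": ...}}
    known_fraud: set of customer ids already labeled as fraud
    """
    # Group customers by each (attribute kind, value) pair they share.
    by_value = defaultdict(set)
    for cust, attrs in records.items():
        for kind, value in attrs.items():
            by_value[(kind, value)].add(cust)

    # Two customers are neighbors if they share at least one attribute value.
    neighbors = {cust: set() for cust in records}
    for group in by_value.values():
        for cust in group:
            neighbors[cust] |= group - {cust}

    # Find connected components with an iterative depth-first search.
    component = {}
    for start in records:
        if start in component:
            continue
        members, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(neighbors[node] - members)
        for member in members:
            component[member] = members

    return {
        cust: {
            "degree": len(neighbors[cust]),
            "component_size": len(component[cust]),
            # Other known fraud cases in the same component (self excluded).
            "fraud_in_component": len((component[cust] & known_fraud) - {cust}),
        }
        for cust in records
    }


# Made-up records for illustration only.
records = {
    "c1": {"phone": "555-0101", "address": "A"},
    "c2": {"phone": "555-0101", "address": "B"},
    "c3": {"phone": "555-0199", "address": "B"},
    "c4": {"phone": "555-0142", "address": "C"},
}
features = network_features(records, known_fraud={"c3"})
for cust, feats in features.items():
    print(cust, feats)
```

Features like these could then be joined to the other application features before training a classifier; a graph database becomes useful when the relationships are too numerous or too deep to rebuild in memory for each application.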

Data Scientist in Marketing

2014 - 2015
Linio
  • Optimized eCommerce marketing by developing a model to monitor and calibrate TV advertising campaigns in Latin America, using precise information about spot timings, costs, and channels and measuring their impact on online visits.
  • Used genetic algorithms to find the best possible configuration of agents in a customer service call center, taking into consideration the historical hourly volume of calls and parameters such as desired occupancy and operational costs.
  • Created a product recommender system based on visit and transaction data using Apache Spark.
Technologies: Spark, Python, R

Assistant Researcher in Neuroscience

2007 - 2008
University of Wisconsin
  • Applied methods from complex dynamical systems theory, such as synchronization and recurrence analysis, to electroencephalographic data.
  • Applied an independent component analysis for sensor data cleaning.
  • Explored the consequences of using different information-theoretical methodologies such as mutual information in the understanding of chaotic systems.
Technologies: MATLAB

Deep Learning for the Diagnosis of Diabetic Retinopathy

An automatic screening/pre-diagnostic system for diabetic retinopathy using an ensemble of deep neural networks followed by a random forest classifier.
The model was trained to explore a vast space of convolutional neural network architectures inspired by Inception and ResNet (residual neural network) using best practices such as transfer learning, data augmentation, regularization, dropout, ensembles, and parallel training in several GPUs.
The system was designed to screen patients and reduce the workload of ophthalmologists. The model had a sensitivity of 95% (true positive rate) and a specificity of 65% (true negative rate). This meant that, among the negative cases, the ophthalmologists only had to manually review the 35% that the model flagged as false positives, resulting in a much more efficient use of their time.
The system has not yet reached the stage of commercial distribution due to funding and regulatory issues.
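
To make the sensitivity and specificity figures above concrete, the following sketch splits a cohort into the four confusion-matrix cells. The cohort size (1,000 patients) and the 10% prevalence are assumptions chosen for illustration; they are not figures from the project, and the function name is hypothetical.

```python
def screening_workload(n_patients, prevalence, sensitivity, specificity):
    """Split a hypothetical cohort into the four confusion-matrix cells."""
    diseased = n_patients * prevalence
    healthy = n_patients - diseased
    true_pos = diseased * sensitivity   # diseased patients the model flags
    false_neg = diseased - true_pos     # diseased patients the model misses
    true_neg = healthy * specificity    # healthy patients the model clears
    false_pos = healthy - true_neg      # healthy patients flagged anyway
    flagged = true_pos + false_pos      # cases sent to manual review
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
        "flagged_for_review": flagged,
        "review_fraction": flagged / n_patients,
    }


# Assumed cohort: 1,000 patients with 10% prevalence (illustrative only).
result = screening_workload(1000, 0.10, 0.95, 0.65)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed numbers, roughly 410 of the 1,000 patients would be flagged for manual review, and about 5 diseased patients would be missed. The share that needs review depends strongly on the prevalence in the screened population.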

Modeling Geographical Differences in the Price of Oil

I led a team of data scientists and engineers in the creation of a tool to model oil price differences between locations in the USA. Based on historical data on production from thousands of wells, demand from refineries, and transportation costs by pipeline and train, we used network analysis (NetworkX), geographic information systems (GeoPandas), and linear optimization to understand and visualize the price equilibrium of the system. The project was for a midstream company in Colorado and was used to inform traders' decisions.
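
One simple way to see how transportation costs produce location price differences is a netback calculation: the price a well can realize is the best refinery price minus the cheapest transport cost to reach it. The sketch below is a deliberately simplified stand-in written in plain Python (Dijkstra's algorithm, no capacities or demand limits), not the NetworkX and linear optimization model used in the project. All node names, costs, and prices are made up.

```python
import heapq


def cheapest_transport_costs(edges, source):
    """Dijkstra: cheapest cost per barrel from source to each reachable node."""
    graph = {}
    for origin, destination, cost in edges:
        graph.setdefault(origin, []).append((destination, cost))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, []):
            candidate = dist + cost
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(queue, (candidate, nxt))
    return best


def netback_price(edges, well, refinery_prices):
    """Best achievable price at the well: refinery price minus transport cost."""
    costs = cheapest_transport_costs(edges, well)
    options = {
        refinery: price - costs[refinery]
        for refinery, price in refinery_prices.items()
        if refinery in costs
    }
    best_refinery = max(options, key=options.get)
    return best_refinery, options[best_refinery]


# Hypothetical network: (origin, destination, cost in USD per barrel).
edges = [
    ("well_A", "hub_1", 2.0),      # pipeline
    ("well_A", "hub_2", 5.0),      # rail
    ("hub_1", "refinery_X", 3.0),
    ("hub_1", "refinery_Y", 6.0),
    ("hub_2", "refinery_Y", 1.0),
]
refinery_prices = {"refinery_X": 70.0, "refinery_Y": 72.0}

refinery, price = netback_price(edges, "well_A", refinery_prices)
print(f"Best outlet for well_A: {refinery}, netback {price:.2f} USD per barrel")
```

A fuller model would add pipeline capacities and refinery demand, which turns the problem into a minimum-cost flow or linear program rather than a set of independent shortest paths.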

Languages

Python, SQL, R, Cypher

Paradigms

Data Science

Other

Machine Learning, Data Mining, Data Queries, Financial Modeling, Mathematics, Artificial Intelligence (AI), Computer Vision, Natural Language Processing (NLP), Financial Data Analytics, Deep Learning, Unsupervised Learning, Credit Risk, Statistics, Facial Recognition, Convolutional Neural Networks (CNN), Object Detection, GPT, Generative Pre-trained Transformers (GPT), Data Analytics, Data Reporting, QGIS, Financial Markets, GeoPandas, Recommendation Systems, Risk Management, Data Engineering, Decentralized Finance (DeFi), Ethereum Smart Contracts, Web Scraping, OpenAI, OpenAI GPT-4 API, OpenAI GPT-3 API, LangChain, Neuroscience, Physics, Computer Science, Language Models, Prompt Engineering, ChatGPT, Retrieval-augmented Generation (RAG)

Libraries/APIs

Pandas, Scikit-learn, TensorFlow, Keras, NumPy, OpenGL, SciPy, NetworkX, XGBoost, OpenCV, Dlib, PySpark

Tools

Microsoft Excel, Spark SQL, Amazon SageMaker, MATLAB, Plotly, GIS, BigQuery, Mathematica

Storage

NoSQL, Neo4j, MongoDB, MySQL, Amazon S3 (AWS S3), Graph Databases, Amazon Aurora

Frameworks

Flask, Spark, LlamaIndex

Platforms

Amazon Web Services (AWS), Docker

2006 - 2008

Master's Degree in Physics

Institute of Physics, UNAM - Mexico City, Mexico

2007 - 2007

Participated in a Research Stay in Neuroscientific Data Analysis

University of Wisconsin - Madison, WI, USA

2002 - 2006

Bachelor's Degree in Physics

National University of Mexico - Mexico City, Mexico

2005 - 2005

Participated in the Santander Scholarship Exchange Program in Physics

University of Madrid - Madrid, Spain

MAY 2020 - MAY 2023

AWS Cloud Practitioner

Amazon Web Services (AWS)

APRIL 2020 - APRIL 2023

Certified TensorFlow Developer

TensorFlow Certificate Program

JANUARY 2014 - PRESENT

CFA Level I

CFA Institute
