Enrique Balp-Straffon, Data Scientist and Developer in Valle de Bravo, Mexico
Member since January 16, 2020
Enrique is a data scientist with an academic background in physics and neuroscience. Over the years, he has participated in and led teams on machine learning projects for both startups and corporations in the financial, healthcare, logistics, marketing, and energy sectors. He has in-depth experience in several areas of artificial intelligence, including computer vision, natural language processing, and financial risk modeling.


    Python, Docker, Flask, AWS, Scikit-learn, XGBoost, Pandas, SciPy, Keras...
  • Wizeline
    Plotly, Pandas, SciPy, Scikit-learn, NetworkX, QGIS
  • Makeup on Us
    Python, OpenGL, OpenCV, Dlib, NumPy, TensorFlow



Valle de Bravo, Mexico



Preferred Environment

Python, Pandas, NumPy, Scikit-learn, TensorFlow, Docker

The most amazing...

...project I've made is an AutoML platform that automatically searches for the best data preprocessing strategies, hyperparameters, and deep neural architectures.


  • Senior Data Scientist | CTO | Co-founder

    2017 - 2019
    • Created a tool that automatically searches for the best combinations of preprocessing strategies, neural architectures, and hyperparameters, training candidates in parallel on several GPUs.
    • Trained a computer vision model with deep convolutional neural networks for the diagnosis of diabetic retinopathy.
    • Used machine learning to develop several financial risk models for different Mexican fintechs and banks, helping them to reduce the load on human analysts, adjust their risk strategies, and optimize their portfolios.
    • Developed a machine-learning churn model for a major Mexican payment processing company.
    • Developed a tool to monitor and optimize aircraft fuel spending for a major international airline based in Abu Dhabi.
    • Created a tool for online advertising budget optimization based on reinforcement learning (contextual bandits).
    • Acted as a sales engineer, talking with potential clients to understand their business needs and translate them into technical data science specifications.
    • Led a team of six data scientists, mentoring them and supervising their progress on different projects, and making sure everyone was engaged, learning, and delivering the right results for our clients.
    • Created a demand forecasting model for a large food company using the Prophet library.
    Technologies: Python, Docker, Flask, AWS, Scikit-learn, XGBoost, Pandas, SciPy, Keras, TensorFlow
  • Senior Data Scientist

    2016 - 2017
    • Designed and oversaw the creation of a tool to understand and predict oil price differentials arising from the interaction between production volumes, refinery demand, and transportation costs, using geographic and financial data.
    • Served as the main technical contact for the client, an oil trading company based in Colorado.
    • Led a team of four data scientists and one engineer.
    Technologies: Plotly, Pandas, SciPy, Scikit-learn, NetworkX, QGIS
  • Data Scientist in Computer Vision

    2016 - 2016
    Makeup on Us
    • Developed and optimized face and facial landmark detectors, as well as a color synthesizer using transformations in different color spaces as key components of a makeup augmented reality system.
    • Presented our technology in a talk at the Microsoft Reactor Center in San Francisco.
    • Developed prototypes for other facial computer vision systems such as emotion recognition and face identification using deep convolutional neural networks.
    Technologies: Python, OpenGL, OpenCV, Dlib, NumPy, TensorFlow
  • Data Scientist in Financial Analysis

    2015 - 2016
    • Created a Neo4j graph database encoding several relationships among customers (phone, Facebook friends, addresses, and more) in order to create network features for a fraud detection machine learning model.
    • Designed and trained a machine learning model to estimate the probability of fraud (identity theft) using features from the Neo4j graph database; it allowed a 50% reduction in the volume of applications human analysts had to review.
    • Participated in the financial analysis of the company's portfolio, creating metrics and insights into the evolution of cohorts, profitability, and so on.
    Technologies: Python, Pandas, Scikit-learn, Neo4j
  • Data Scientist in Marketing

    2014 - 2015
    • Optimized eCommerce marketing by developing a model to monitor and calibrate TV advertising campaigns in Latin America, using precise information about spot timings, costs, and channels, and measuring their impact on online visits.
    • Used genetic algorithms to find the best possible configuration of agents in a customer service call center, taking into consideration the historical hourly volume of calls and parameters such as desired occupancy and operational costs.
    • Created a product recommender system based on visit and transaction data using Apache Spark.
    Technologies: R, Python, Spark
  • Assistant Researcher in Neuroscience

    2007 - 2008
    University of Wisconsin
    • Applied methods from complex dynamical systems theory, such as synchronization and recurrence analysis, to electroencephalographic data.
    • Applied independent component analysis (ICA) for sensor data cleaning.
    • Explored the consequences of using different information-theoretical methodologies such as mutual information in the understanding of chaotic systems.
    Technologies: Matlab


  • Deep Learning for the Diagnosis of Diabetic Retinopathy (Development)

    An automatic screening/pre-diagnostic system for diabetic retinopathy using an ensemble of deep neural networks followed by a random forest classifier.
    A vast space of convolutional neural network architectures inspired by Inception and ResNet (residual neural network) was explored using best practices such as transfer learning, data augmentation, regularization, dropout, ensembles, and parallel training on several GPUs.
    The system was designed to screen patients and reduce the workload of ophthalmologists. The model had a sensitivity (true positive rate) of 95% and a specificity (true negative rate) of 65%. This meant that the ophthalmologists only had to review the cases flagged by the model, which included just 35% of the healthy patients, resulting in much more efficient use of their time.
    The system has not yet reached the stage of commercial distribution due to funding and regulatory issues.
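
    The screening arithmetic described above can be illustrated with a minimal sketch. The 95% sensitivity and 65% specificity come from the description; the cohort size, the 10% prevalence, and the function name `screening_workload` are hypothetical values chosen only for illustration, not figures from the project.

```python
def screening_workload(n_patients, prevalence, sensitivity, specificity):
    """Estimate how many cases a screening model flags for manual review."""
    diseased = n_patients * prevalence
    healthy = n_patients - diseased
    true_positives = diseased * sensitivity        # diseased patients correctly flagged
    false_negatives = diseased - true_positives    # diseased patients the model misses
    false_positives = healthy * (1 - specificity)  # healthy patients flagged anyway
    flagged = true_positives + false_positives
    return {
        "flagged_for_review": flagged,
        "review_fraction": flagged / n_patients,
        "missed_cases": false_negatives,
    }


# Hypothetical cohort: 1,000 patients with 10% prevalence (assumed, not from the source).
result = screening_workload(1000, 0.10, sensitivity=0.95, specificity=0.65)
print(result)
```

    Under these assumed numbers, roughly 410 of the 1,000 patients would be flagged for review and about 5 diseased patients would be missed. The actual savings depend on the real prevalence in the screened population.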

  • Modeling Geographical Differences in the Price of Oil (Development)

    I led a team of data scientists and engineers in the creation of a tool to model oil price differences between locations in the USA. Based on historical data on production from thousands of wells, demand from refineries, and transportation costs by pipeline and rail, we used network analysis (NetworkX), geographic information systems (GeoPandas), and linear optimization to understand and visualize the price equilibrium of the system. The project was for a midstream company in Colorado and was used to inform traders' decisions.


  • Paradigms

    Data Science
  • Other

    Machine Learning, Data Mining, Computer Vision, Natural Language Processing (NLP), Financial Data Analytics, Deep Learning, Unsupervised Learning, Credit Risk, Statistics, Data Analytics, Data Reporting, AWS, Financial Markets, GeoPandas, Recommendation Systems
  • Languages

    Python, R, SQL, Cypher
  • Libraries/APIs

    Pandas, Scikit-learn, TensorFlow, Keras, NumPy, OpenCV, Dlib, PySpark
  • Frameworks

  • Tools

    GIS, BigQuery
  • Platforms

  • Storage

    Neo4j, MongoDB, MySQL, AWS S3, Graph Databases


  • Master's degree in Physics
    2006 - 2008
    Institute of Physics, UNAM - Mexico City, Mexico
  • Participated in a research stay in Neuroscientific Data Analysis
    2007 - 2007
    University of Wisconsin - Madison, WI, USA
  • Bachelor's degree in Physics
    2002 - 2006
    National University of Mexico - Mexico City, Mexico
  • Participated in the Santander Scholarship Exchange Program in Physics
    2005 - 2005
    University of Madrid - Madrid, Spain


  • AWS Cloud Practitioner
    MAY 2020 - MAY 2023
    Amazon Web Services (AWS)
  • Certified TensorFlow Developer
    APRIL 2020 - APRIL 2023
    TensorFlow Certificate Program
  • CFA Level I
    CFA Institute
