Juan Luis Ruiz - Tagle, Developer in Madrid, Spain

Juan Luis Ruiz - Tagle

Data Scientist and Developer

Madrid, Spain
Toptal Member Since
September 29, 2022

Juan Luis is a data scientist with expertise in spatial analytics and optimization. He has a background in computer science and four years of professional experience in spatial data science, finance, and advertising technology. He combines his deep knowledge of machine learning with software engineering best practices to build robust and reliable ML solutions. Juan Luis has strong analytical skills and approaches problems from a business perspective, prioritizing the client's needs.








Preferred Environment

MacOS, Google Cloud, BigQuery, Git, Slack, Python

The most amazing...

...system I've developed is a set of spatial ML algorithms in SQL, which run at scale on cloud data warehouses like Google BigQuery.

Work Experience

2022 - 2022

Data Analytics Lead Instructor

IESE Business School
  • Taught a two-week intensive course on Python and Data Analytics to 60+ MiM students at IESE Business School.
  • Accommodated students with different Python skill levels, making sure beginners had a solid understanding of the fundamentals while providing more advanced students with extra material.
  • Evaluated the students based on the effort they made to get the most out of the course, regardless of their initial Python skills.
  • Coordinated with two teaching assistants who helped me with the classes and with another lead instructor who taught a second classroom.
Technologies: University Teaching
2020 - 2022

Data Scientist

CARTO
  • Implemented spatial statistics and ML algorithms in SQL to run them at scale on cloud data warehouses.
  • Developed spatial models for estimating accumulated litter in cities at a granular level.
  • Built optimization solutions for vehicle routing and territory management, connected to Google BigQuery as remote functions.
  • Designed spatial indexes for clients, which combined target demographics, POI presence density, and mobility data.
  • Identified trends in retail hotspot areas during the pandemic using human mobility data (origin-destination matrices), POI data, and time series analysis.
  • Created ETL processes with Apache Airflow to ingest spatial data from several sources into CARTO's platform on a recurring schedule.
Technologies: Pandas, NumPy, GIS, GeoPandas, Optimization, PySAL, SQL, Docker, Spark, Apache Airflow, Databricks, TensorFlow, Data Science, Spatial Reasoning, Data Analytics, REST APIs, Big Data, JavaScript, Data Visualization, Apache Spark, Predictive Analytics, Data Analysis, Analytics, eCommerce, Marketplaces, Data Management, Data Governance, Azure, Keras, Scikit-learn, Databases, Data Modeling, Database Administration (DBA), Decision Trees, Snowflake, Regression, Data Scientist, Recommendation Systems, Data Engineering, Google Cloud, BigQuery, Git, ETL Development, Vehicle Routing, Geographic Information Systems
2019 - 2020

Data Scientist

ETS Asset Management Factory
  • Applied state-of-the-art techniques to make more accurate predictions of financial markets' behavior, contributing to the financial advisory firm's primary purpose of making stock market investment recommendations driven by data science.
  • Developed a RESTful API that serves synthetic stock series created by generative adversarial networks on demand.
  • Put into production a novel deep learning portfolio investment strategy and deployed it to internal servers to automate portfolio recommendations.
Technologies: Time Series Analysis, Generative Adversarial Networks (GANs), APIs, Flask, Jenkins, Data Analytics, REST APIs, Data Analysis, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Analytics, Keras, Scikit-learn, Finance, Data Modeling, Decision Trees, Regression, Data Scientist, Google Cloud, Git
2019 - 2019

Data Analyst

  • Developed a funnel for the company's video advertising campaigns, which provided insight into how the business was progressing.
  • Built ETL processes that aggregated data periodically from ads stored in a MongoDB database and displayed the current state of the ad flow in a dashboard.
  • Assisted the CEO in preparing the company's next funding round by analyzing revenue and client loyalty.
Technologies: MongoDB, SQL, Python, Google Sheets, Data Analytics, Data Analysis, Analytics, eCommerce, Business Analysis


Local MX Refinement | ML tool for Out of Home Advertising Campaign Optimization

While working at CARTO, I took full ownership of the ML models and optimization algorithms behind Local MX, a tool built for Havas Media Group that predicts coverage and impressions at a very granular level and selects the billboards that maximize those metrics.

The client wanted to measure the impressions (number of visits) and coverage (number of distinct visitors) that each of their billboards in Spain received weekly. They also wanted this information segmented by several categorical variables: type of day, hourly range, age, gender, and income level. To that end, our models were trained on data from several sources (telco, SDK data, sociodemographic, POI, etc.). An optimization algorithm then ranked the billboards best suited to the target campaign.

I joined this project at the calibration stage, in which I:

• Tweaked the ML models and algorithms to align with client expectations
• Automated background processes such as telco data ingestion and the automatic enablement of new billboards in the tool
• Extended the tool to the Canary Islands by computing SDK routes in this region with OSRM
• Handled the communication with the client for all technical matters
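
The profile does not say which optimization method Local MX used. As an illustration only, the idea of ranking billboards by coverage (distinct visitors) can be sketched with a greedy maximum-coverage heuristic. The function name `rank_billboards` and the toy input of explicit visitor IDs are assumptions made for this sketch; the real tool worked with model predictions rather than raw visitor sets.

```python
def rank_billboards(visitors_by_billboard: dict[str, set[str]], k: int) -> list[str]:
    """Greedily rank up to k billboards by the new distinct visitors each one adds.

    Hypothetical illustration: `visitors_by_billboard` maps a billboard ID to the
    set of visitor IDs it reaches. This is not the original Local MX algorithm.
    """
    covered: set[str] = set()               # visitors reached so far
    ranking: list[str] = []                 # billboards in selection order
    remaining = dict(visitors_by_billboard)  # shallow copy, input is not mutated

    while remaining and len(ranking) < k:
        # Pick the billboard that adds the most not-yet-covered visitors.
        best = max(remaining, key=lambda b: len(remaining[b] - covered))
        if not remaining[best] - covered:
            break  # no remaining billboard adds any new visitor
        covered |= remaining.pop(best)
        ranking.append(best)
    return ranking
```

With `{"A": {"v1", "v2", "v3"}, "B": {"v3", "v4"}, "C": {"v1", "v2"}}` and `k=2`, the sketch returns `["A", "B"]`: billboard C reaches only visitors that A already covers, so it is never selected.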

Sales KPI Calculation Automation for an International Beverage Company

During my time at CARTO, I had the privilege of working with one of the leading beverage companies in the world. They had a vast amount of sales data to analyze in order to calculate various KPIs related to their fleet, stock, clients, and other business areas. At the time, the data processing and calculations were done purely in Excel, which was time-consuming and prone to errors.

My team and I launched a Spark cluster in Databricks to automate the KPI calculations. This allowed us to leverage distributed computing and process the massive amounts of data the client was working with. I worked closely with their team to understand their specific requirements and then implemented the Spark-based solution that automated the calculations, eliminating the need for manual intervention and saving many hours of work.


TweetWars is a web app that enables users to compare two Twitter accounts by analyzing their latest 200 tweets and displaying insightful results in an interactive dashboard.

The tweets of both accounts are analyzed using NLP techniques, including sentiment and emotion prediction, topic modeling, and tweeting behavior statistics. These results are presented in a dashboard and sent to the paying user.

Despite its complexity, the system is fully autonomous and requires minimal maintenance on my part. It comprises multiple integrated microservices that handle payment processing, tweet fetching, sentiment inference, dashboard generation, email communication, and other tasks.

Black Friday Analysis

I carried out a thorough spatial analysis of the effects of the pandemic on retail stores during Black Friday in four cities across the US using SafeGraph's human mobility data. I compared data for 2019, 2020, and 2021 to understand how foot traffic evolved and presented my results in a webinar and in an article published on SafeGraph's blog.

Spatial Data Science Conference 2022

I participated as a speaker in the Spatial Data Science Conference held in May 2022 at the Royal Geographical Society in London. This conference is among the most renowned events for geographic information systems and spatial data.

I presented the CARTO Analytics Toolbox, an SQL library for spatial analysis and modeling on cloud data warehouses.

Scraper App for Official State Documents in PDF

A script that scrapes the BOE (the official gazette published daily by the Spanish government), which is distributed in PDF format. It extracts relevant information about newly registered brands, including the registrant's name, telephone number, company website, etc. It also generates an Excel file containing all the scraped data in a structured way.

Personal Blog

I write about data science, geographic information systems, and math. Some articles have also been published on CARTO's blog and in Towards Data Science and Cantor's Paradise, two prominent publications on the Medium platform.

Some examples:
• Generating fake data with pandas, very quickly
• What to expect when throwing dice and adding them up
• Scraping Google Search (without getting caught)
• Can neural networks predict the stock market just by reading

Scraping Orchestra

I created a scraping master-slave system based on Google App Engine. The main problem with scraping is that sites can block your IP if they detect suspicious behavior. As a solution, a local process orchestrates a scraper deployed in Google App Engine. The main idea is to start scraping and redeploy the scraper to get a new IP whenever the current one gets blocked.
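
The control loop described above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: `fetch` and `redeploy` are hypothetical callables standing in for "ask the deployed scraper for a page" and "redeploy the scraper so it gets a new IP", and `max_redeploys` is an assumed safety limit.

```python
from typing import Callable, Iterable, Optional


def scrape_with_redeploy(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[str]],
    redeploy: Callable[[], None],
    max_redeploys: int = 5,
) -> dict[str, str]:
    """Scrape every URL, redeploying the remote scraper whenever it is blocked.

    `fetch` (hypothetical) returns the page content, or None when the site has
    blocked the scraper's current IP. `redeploy` (hypothetical) redeploys the
    scraper so that it comes back with a new IP.
    """
    results: dict[str, str] = {}
    redeploys = 0
    for url in urls:
        page = fetch(url)
        while page is None:  # blocked: rotate the IP and retry the same URL
            if redeploys >= max_redeploys:
                raise RuntimeError(f"still blocked after {max_redeploys} redeploys")
            redeploy()
            redeploys += 1
            page = fetch(url)
        results[url] = page
    return results
```

Keeping the orchestration in a local process means the deployed scraper can stay stateless: the list of pending URLs and the results survive each redeploy.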

Svenska Scraper

I built a web scraper during college to help me study the Swedish language. Given a list of words in Spanish, it gathers their translations and example sentences. The software also generates exercises to practice.



Python, SQL, R, JavaScript, Snowflake


Pandas, Scikit-learn, NumPy, REST APIs, Keras, TensorFlow, Stripe, PyTorch


BigQuery, GIS, Git, Apache Airflow, Google Sheets, Slack, Jenkins, Celery, Google Analytics


Data Science, Agile Software Development


Google Cloud, Databases, MongoDB, Google Cloud Storage, Database Administration (DBA)


Artificial Intelligence (AI), Natural Language Processing (NLP), Machine Learning, Deep Learning, Spatial Analysis, Data Analytics, Big Data, Data Visualization, Data Analysis, Analytics, Data Management, Data Modeling, Data Scientist, Data Engineering, Geographic Information Systems, Optimization, PySAL, APIs, Predictive Analytics, ETL Development, Business Analysis, API Integration, Spatial Reasoning, GPT, Generative Pre-trained Transformers (GPT), Data Governance, Finance, Decision Trees, Regression, Recommendation Systems, Vehicle Routing, Algorithms, Data Structures, Time Series, Computer Vision, GeoPandas, Time Series Analysis, Generative Adversarial Networks (GANs), Presentations, Communication, Web Scraping, Technical Writing, Excel 365, Scraping, OCR, eCommerce, Marketplaces, University Teaching, OpenAI, Business to Business (B2B), Business to Consumer (B2C), Cloud Tasks, BERT, Sentiment Analysis, Google Cloud Functions, Azure Databricks, OpenAI GPT-3 API


Docker, Databricks, Google App Engine, Amazon Web Services (AWS), Azure


Spark, Flask, Apache Spark, Bootstrap


2020 - 2020

Master's Degree in Data Science

Universidad Politécnica de Madrid - Madrid, Spain

2015 - 2018

Bachelor's Degree in Computer Science

KTH Royal Institute of Technology - Stockholm, Sweden