
Juan Luis Ruiz - Tagle
Data Scientist and Developer
Juan Luis is a data scientist with expertise in spatial analytics and optimization. He has a background in computer science and four years of professional experience in spatial data science, finance, and advertising technology. He combines deep knowledge of machine learning with software engineering best practices to build robust and reliable ML solutions. Juan Luis has strong analytical skills and approaches problems from a business perspective, prioritizing the client's needs.
Portfolio
Experience
Python - 4 years
Machine Learning - 4 years
SQL - 4 years
Apache Airflow - 3 years
BigQuery - 3 years
Spatial Reasoning - 2 years
Optimization - 2 years
Docker - 2 years
Availability
Preferred Environment
MacOS, Google Cloud, BigQuery, Git, Slack, Python
The most amazing...
...system I've developed is a set of spatial ML algorithms in SQL, which run at scale on cloud data warehouses like Google BigQuery.
Work Experience
Data Analytics Lead Instructor
IESE Business School
- Taught a two-week intensive course on Python and Data Analytics to 60+ MiM students at IESE Business School.
- Accommodated students with different levels of Python experience, making sure beginners gained a solid understanding of the fundamentals while providing the more advanced students with extra material.
- Evaluated the students by measuring the effort they made to get the most out of the course, regardless of their initial Python skills.
- Coordinated with two teaching assistants who helped me with the classes and with another lead instructor who taught a second classroom.
Data Scientist
CARTO
- Implemented spatial statistics and ML algorithms in SQL to run them at scale on cloud data warehouses.
- Developed spatial models for estimating accumulated litter in cities at a granular level.
- Built optimization solutions for vehicle routing and territory management, connected to Google BigQuery as remote functions.
- Designed spatial indexes for clients, which combined target demographics, POI presence density, and mobility data.
- Identified trends in retail hotspot areas during the pandemic using human mobility data (origin-destination matrices), POI data, and time series analysis.
- Created ETL processes with Apache Airflow to recurrently ingest spatial data from several data sources into CARTO's platform.
Data Scientist
ETS Asset Management Factory
- Applied state-of-the-art techniques to make more accurate predictions of financial markets' behavior, contributing to the financial advisory firm's primary purpose of making stock market investment recommendations driven by data science.
- Developed a RESTful API that serves synthetic stock series created by generative adversarial networks on demand.
- Put into production a novel deep learning portfolio investment strategy and deployed it to internal servers to automate portfolio recommendations.
Data Analyst
Seedtag
- Developed a funnel for the company's video advertising campaigns, which helped the team gain insight into whether the business was progressing adequately.
- Built ETL processes that aggregated data periodically from ads stored in a MongoDB database and displayed the current state of the ad flow in a dashboard.
- Assisted the CEO in preparing the company's next funding round by analyzing revenue and client loyalty.
Experience
Local MX Refinement | ML tool for Out of Home Advertising Campaign Optimization
https://carto.com/blog/carto-havas-media-big-data-ai-world-madrid/
The client wanted to measure the impressions (number of visits) and coverage (number of distinct visitors) that each of their billboards in Spain received weekly. They also wanted this information segmented by several categorical variables: type of day, hourly range, age, gender, and income level. For this, our models were trained on data from several sources (telco, SDK data, sociodemographic, POI, etc.). An optimization algorithm then ranked the billboards by how well they suited the target campaign.
I got involved in this project at a calibration stage, in which I:
• Tweaked the ML models and algorithms to align with client expectations
• Automated background processes such as telco data ingestion and the enablement of new billboards in the tool
• Extended the tool's coverage to the Canary Islands by computing SDK routes for this region with OSRM
• Handled the communication with the client for all technical matters
Sales KPI Calculation Automation for an International Beverage Company
My team and I launched a Spark cluster in Databricks to automate the KPI calculations. This allowed us to leverage distributed computing and process the massive amounts of data the client was working with. I worked closely with the client's team to understand their specific requirements. I then implemented the Spark-based solution that automated the calculations, eliminating the need for manual intervention and saving many work hours.
TweetWars
http://tweetwars.wtf
The tweets of both accounts are analyzed using NLP techniques, including sentiment and emotion prediction, topic modeling, and tweeting behavior statistics. These results are presented in a dashboard and sent to the paying user.
Despite its complexity, the system is fully autonomous and requires minimal maintenance on my part. It consists of multiple integrated microservices that handle payment processing, tweet fetching, sentiment inference, dashboard generation, email communication, and other tasks.
Black Friday Analysis
https://www.safegraph.com/blog/2021-black-friday
Spatial Data Science Conference 2022
https://www.youtube.com/watch?v=6kNqsQY_e90
I presented the CARTO Analytics Toolbox, an SQL library for spatial analysis and modeling in cloud data warehouses.
Scraper App for Official State Documents in PDF
Personal Blog
http://juanluis.me
Some examples:
• Generating fake data with pandas, very quickly
https://towardsdatascience.com/generating-fake-data-with-pandas-very-quickly-b99467d4c618
• What to expect when throwing dice and adding them up
https://www.cantorsparadise.com/what-to-expect-when-throwing-dice-and-adding-them-up-5231f3831d7
• Scraping Google Search (without getting caught)
https://juanluisrto.medium.com/scraping-google-search-without-getting-caught-e43bb91b363e
• Can neural networks predict the stock market just by reading newspapers?
https://quantdare.com/can-neural-networks-predict-the-stock-market-just-by-reading-newspapers/
Scraping Orchestra
https://github.com/juanluisrto/Scraping-orchestra
Svenska Scraper
https://github.com/juanluisrto/SvenskaScraper
Skills
Languages
Python, SQL, R, JavaScript, Snowflake
Libraries/APIs
Pandas, Scikit-learn, NumPy, REST APIs, Keras, TensorFlow, Stripe, PyTorch
Tools
BigQuery, GIS, Git, Apache Airflow, Google Sheets, Slack, Jenkins, Celery, Google Analytics
Paradigms
Data Science, Agile Software Development
Storage
Google Cloud, Databases, MongoDB, Google Cloud Storage, Database Administration (DBA)
Other
Artificial Intelligence (AI), Natural Language Processing (NLP), Machine Learning, Deep Learning, Spatial Analysis, Data Analytics, Big Data, Data Visualization, Data Analysis, Analytics, Data Management, Data Modeling, Data Scientist, Data Engineering, Geographic Information Systems, Optimization, PySAL, APIs, Predictive Analytics, ETL Development, Business Analysis, API Integration, Spatial Reasoning, GPT, Generative Pre-trained Transformers (GPT), Data Governance, Finance, Decision Trees, Regression, Recommendation Systems, Vehicle Routing, Algorithms, Data Structures, Time Series, Computer Vision, GeoPandas, Time Series Analysis, Generative Adversarial Networks (GANs), Presentations, Communication, Web Scraping, Technical Writing, Excel 365, Scraping, OCR, eCommerce, Marketplaces, University Teaching, OpenAI, Business to Business (B2B), Business to Consumer (B2C), Cloud Tasks, BERT, Sentiment Analysis, Google Cloud Functions, Azure Databricks, OpenAI GPT-3 API
Platforms
Docker, Databricks, Google App Engine, Amazon Web Services (AWS), Azure
Frameworks
Spark, Flask, Apache Spark, Bootstrap
Education
Master's Degree in Data Science
Universidad Politécnica de Madrid - Madrid, Spain
Bachelor's Degree in Computer Science
KTH Royal Institute of Technology - Stockholm, Sweden