Carlos del Cacho, Developer in Madrid, Spain

Carlos del Cacho

Verified Expert in Engineering

Machine Learning Developer

Location
Madrid, Spain
Toptal Member Since
November 2, 2019

With 15+ years of experience, Carlos has deep technical knowledge within the AI space. He is a rare breed, with expertise in data engineering, data science, and general software architecture. A techie with a business heart, he has gained wisdom in online business models, retention strategies, churn analysis, LTV computation, and sales forecasting within a variety of industries.

Portfolio

Databricks
Amazon Web Services (AWS), Spark
Paradigma Digital
Amazon Web Services (AWS), Scikit-learn, Python
Stratio Big Data
Scikit-learn, Python, Hue, Impala, Apache Hive, Kudu, HDFS, Spark, Cloudera

Experience

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Elasticsearch, Apache Lucene, Apache Spark, Scikit-learn, Java, R, Python

The most amazing...

...accomplishment thus far in life is keeping the curiosity of a toddler to continue learning after all these years.

Work Experience

Solutions Architect

2020 - PRESENT
Databricks
  • Designed solutions for customers on top of the Databricks platform.
  • Provided technical support to the sales team and delivered presentations and workshops to customers to help close business.
  • Assisted with deployments of the platform in a multi-cloud environment (AWS/Azure).
Technologies: Amazon Web Services (AWS), Spark

Big Data and AI Tech Lead

2019 - 2020
Paradigma Digital
  • Consulted across several industries, including banking (consumer finance), media, and utilities.
  • Provided project management within the data science space, leading teams of up to five individual contributors.
  • Hired and screened technical staff, including data scientists, data engineers, big data architects, and data governance consultants.
Technologies: Amazon Web Services (AWS), Scikit-learn, Python

Data Science Manager

2016 - 2019
Stratio Big Data
  • Segmented products by sales level using unsupervised machine learning for personalized treatment when building predictive models.
  • Overhauled the architecture to improve the scalability of the system, which performed 20 million daily regressions for a demand forecasting problem on a distributed cluster in less than 8 hours.
  • Led a data science team of five engineers within a demand forecasting project in retail.
  • Developed new business in analytics (presales), holding a 30% conversion rate from proposal to contract in data science projects over the presales tenure.
  • Assisted with deals in the retail, media, education, banking, marketing, and utilities sectors.
Technologies: Scikit-learn, Python, Hue, Impala, Apache Hive, Kudu, HDFS, Spark, Cloudera

Chief Data Scientist

2014 - 2016
Jobandtalent
  • Designed and implemented a matching algorithm for job offer recommendations based on conditional relevance models by analyzing career paths in resume databases.
  • Doubled the conversion rate of the recommender pipeline with the algorithm, resulting in an equivalent increase in the number of monthly job applications.
  • Prototyped a system for improving customer segmentation in email marketing in order to increase sales of online courses, with the idea of targeting users at the micro-level as opposed to targeting them by area of expertise.
  • Created a distributed system on top of Elastic MapReduce for tuning machine learning hyperparameters, accelerating run time of grid searches by 50x in an information retrieval application.
  • Proactively wrote a tool to automate dashboard generation and trained the BI team on its usage, saving them hours of routine work each week.
  • Delivered lectures to co-workers on Information Retrieval concepts.
  • Authored feature description specifications for several modules, defining the long-term vision of the recommendation algorithm.
  • Screened machine learning engineers during hiring processes, conducting knowledge assessments within his area of expertise.
Technologies: Linear Discriminant Analysis (LDA), Topic Modeling, Amazon S3 (AWS S3), Redshift, Cloud Storage, BigQuery, R, PHP, Java, Redis, Apache Lucene, Hadoop

Sentiment Analysis for Political Forecasting

Crawled social media (Twitter) and filtered posts by specific keywords to retrieve and classify sentiment regarding the Peace Plebiscite that took place in Colombia in October 2016. Classification was performed with recurrent neural networks using deeplearning4j.

Recommender System for Olympic Channel

https://www.olympicchannel.com
Developed a personalization module based on Elasticsearch and text classification through linear-kernel SVMs that attained 93% accuracy in tagging video metadata content automatically. Led the development team and managed customer engagement. The system is built on AWS cloud technology to withstand peak traffic loads of over 150 million page views per month for the Tokyo 2020 games.

Credit Risk Scoring for a Financial Institution

https://www.bancosantander.es
Developed a hierarchical logistic model tree for assessing loan risk that departed from the bank's methodology and incorporated non-traditional signals from public datasets, such as events in the Companies House Records (changes in management, refinancing, bankruptcies, etc.).

Demand Forecasting for an F500 Retailer

Created a distributed system to predict daily unit product sales in over a thousand stores across Spain, performing 20 million regressions on a daily basis, improving accuracy, and delivering cost efficiencies in the supply chain by reducing excess inventory while avoiding stock-outs. Technologies: Apache Spark, Python, and scikit-learn.

Inventory Storage for an Order Management System

Developed internal software with custom ranking functions over Apache Solr to dynamically check article inventory in over a thousand stores with response times in milliseconds, prioritizing available locations by distance in search engine results.

Languages

R, Python, SQL, Java, PHP, C++, JavaScript

Frameworks

Hadoop, Apache Spark, LightGBM

Libraries/APIs

Apache Lucene, SpaCy, PySpark, Scikit-learn, NumPy, SciPy, XGBoost, CatBoost

Other

Machine Learning, Algorithms, Artificial Intelligence (AI), Web Scraping, Text Mining, Recommendation Systems, Presales, Statistics, Big Data, Scalability, Predictive Analytics, Association Rule Learning, Information Retrieval, Gradient Boosted Trees, Topic Modeling, Natural Language Processing (NLP), Neural Networks, Linear Regression, Generative Pre-trained Transformers (GPT), Economics, Financial Markets, Cloud Storage, Email Marketing, Online Marketing, Business Strategy, Data Mining, Churn Analysis, Acquisition Analysis, Marketing Attribution, Bayesian Statistics, Genetic Algorithms, A/B Testing, Deep Learning, Google Ads, Support Vector Machines (SVM), Linear Discriminant Analysis (LDA), Logistic Regression, Random Forests, Naive Bayes, Text Classification, Sentiment Analysis, Kubernetes Operations (kOps), Model Validation, Lambda Functions

Tools

Solr, Cloudera, Impala, Hue, BigQuery, Weka, Google Analytics, Git, Kudu

Paradigms

Scrum, Conversion Rate Optimization (CRO), Linear Programming, Business Intelligence (BI), MapReduce, Anomaly Detection, Data Science, ETL

Storage

Elasticsearch, PostgreSQL, MySQL, MongoDB, Apache Hive, Redis, HDFS, NoSQL, Data Validation, Redshift, Amazon S3 (AWS S3)

Platforms

Amazon Web Services (AWS), RapidMiner, AWS Lambda, Amazon EC2, Kubernetes, H2O Deep Learning Platform, Docker

Industry Expertise

Retail & Wholesale

Education

2013 - 2015

Master's Degree in Artificial Intelligence

Universidad Autónoma de Madrid - Madrid, Spain

2009 - 2010

Progress towards a Bachelor of Arts Degree in Economics

UNED - Madrid, Spain

1999 - 2003

Master's Degree in Computer Science

Universidad Autónoma de Madrid - Madrid, Spain

Certifications

JANUARY 2021 - JANUARY 2024

AWS Solutions Architect Associate

Amazon Web Services

AUGUST 2019 - PRESENT

Google Cloud Platform Fundamentals: Core Infrastructure

Google via Coursera

AUGUST 2019 - PRESENT

Essential Cloud Infrastructure: Foundation

Google via Coursera

JUNE 2018 - PRESENT

Convolutional Neural Networks

deeplearning.ai via Coursera

MAY 2018 - PRESENT

Structuring Machine Learning Projects

deeplearning.ai via Coursera

APRIL 2018 - PRESENT

Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

deeplearning.ai via Coursera

MARCH 2018 - PRESENT

Neural Networks and Deep Learning

deeplearning.ai via Coursera

MARCH 2018 - PRESENT

Excel Skills for Business: Advanced

Macquarie University via Coursera

FEBRUARY 2018 - PRESENT

Excel Skills for Business: Intermediate II

Macquarie University via Coursera

FEBRUARY 2018 - PRESENT

Excel Skills for Business: Intermediate I

Macquarie University via Coursera

FEBRUARY 2018 - PRESENT

Excel Skills for Business: Essentials

Macquarie University via Coursera

NOVEMBER 2017 - PRESENT

Cost and Economics in Pricing Strategy

University of Virginia via Coursera

JUNE 2017 - PRESENT

Trading Algorithms

Indian School of Business via Coursera

JULY 2016 - PRESENT

Google Analytics Certification

Google

JULY 2016 - PRESENT

Google AdWords Certification

Google

MAY 2016 - PRESENT

Foundations of Business Strategy

University of Virginia via Coursera

MAY 2016 - PRESENT

Entrepreneurship 1: Developing the Opportunity

University of Pennsylvania via Coursera

MAY 2016 - PRESENT

Advanced Business Strategy

University of Virginia via Coursera

APRIL 2016 - PRESENT

Operations Analytics

University of Pennsylvania via Coursera

JANUARY 2016 - PRESENT

Customer Analytics

University of Pennsylvania via Coursera

AUGUST 2015 - PRESENT

Data Visualization

University of Illinois via Coursera

JULY 2015 - PRESENT

Text Mining and Analytics

University of Illinois via Coursera

JULY 2015 - PRESENT

Introduction to Big Data with Apache Spark

Databricks via edX

JUNE 2015 - PRESENT

Cluster Analysis in Data Mining

University of Illinois via Coursera

APRIL 2015 - PRESENT

Text Retrieval and Search Engines

University of Illinois via Coursera

MARCH 2015 - PRESENT

Pattern Discovery in Data Mining

University of Illinois via Coursera

NOVEMBER 2014 - PRESENT

Data Analysis and Statistical Inference

Duke University via Coursera

JUNE 2014 - PRESENT

Model Thinking

University of Michigan via Coursera

APRIL 2013 - PRESENT

Cloudera Certified Developer for Apache Hadoop

Cloudera

FEBRUARY 2013 - PRESENT

Computing for Data Analysis

Johns Hopkins University via Coursera

OCTOBER 2012 - PRESENT

Networked Life (Social Network Analysis)

University of Pennsylvania via Coursera

MAY 2012 - PRESENT

Game Theory

Stanford University via Coursera
