Carlos Miguel Pereira, Developer in Cambridge, United Kingdom

Carlos Miguel Pereira

Verified Expert in Engineering

Data Science and Machine Learning Developer

Location
Cambridge, United Kingdom
Toptal Member Since
October 15, 2021

Carlos has a solid track record in designing and implementing data-driven solutions, including advanced predictive modeling and optimization algorithms. His academic and professional background has also given him extensive knowledge of software development, data engineering, data visualization, and project management. In addition to a Ph.D., Carlos has completed two postgraduate courses, a general management program, and numerous certifications and technical courses.

Portfolio

Optimizely
Management, Leadership, Data Science, Machine Learning, Data Engineering, Cloud...
EDIT. Disruptive Digital Education
Data Science, Machine Learning, Python, Statistics, Tutoring...
CKDelta
Data Science, Geospatial Data, Geospatial Analytics, Python, Databricks, Azure...

Availability

Full-time

Preferred Environment

Git, Visual Studio Code (VS Code), MacOS

The most amazing...

...project I have implemented was a customer churn predictive model with potential savings of €1 million per year.

Work Experience

Senior Manager, Data Science

2022 - PRESENT
Optimizely
  • Led a data science team of junior and senior data scientists and data engineers.
  • Built roadmaps and prioritized work for the data science team.
  • Mentored and coached the team to grow both technically and professionally.
Technologies: Management, Leadership, Data Science, Machine Learning, Data Engineering, Cloud, Recommendation Systems, A/B Testing, Experimental Design, Artificial Intelligence (AI), Large Language Models (LLMs), Generative AI, Databases

Invited Instructor

2021 - 2022
EDIT. Disruptive Digital Education
  • Taught data science and deep learning modules in the data science and business analytics program.
  • Defined the course content for the data science and deep learning modules.
  • Created slides and material for the data science and deep learning modules.
  • Assessed the students of the data science and deep learning modules.
Technologies: Data Science, Machine Learning, Python, Statistics, Tutoring, Instructor-led Training (ILT)

Data Science Consultant

2021 - 2022
CKDelta
  • Took responsibility for end-to-end architecture, implementation, and delivery of one MVP to an energy solutions company and another to an electricity company.
  • Leveraged mobility and geospatial data to predict the best locations for EV charger installation across the UK.
  • Estimated electricity network headroom geospatially.
Technologies: Data Science, Geospatial Data, Geospatial Analytics, Python, Databricks, Azure, Machine Learning, SQL, Pandas, PySpark, Azure DevOps, Tableau, Confluence, ETL, Git, Seaborn, Jira, LightGBM, Programming, Predictive Modeling, Startups, Electric Vehicles, Large-Scale Computing, Statistics

Data Science Manager

2021 - 2021
Metyis
  • Delivered a data strategy and roadmap of use cases for a large Benelux group.
  • Defined candidate qualifications and conducted interviews.
  • Built internal best practices and company awareness.
Technologies: Data Science, Machine Learning, Roadmaps, Management, Consulting, Data Strategy

Data Lead

2020 - 2021
Barkyn
  • Defined roadmap of advanced analytics and reporting use cases.
  • Developed churn and next-order predictive models and deployed them in containerized web applications.
  • Built ETL pipelines using AWS services and created dashboard reports.
  • Performed deep-dive business analyses, such as LTV and repetition patterns.
Technologies: Data Science, Python, Machine Learning, SQL, Jira, Pandas, Scikit-learn, ETL, Git, Seaborn, Docker, Confluence, LightGBM, Modeling, Monitoring, Jupyter Notebook, Tableau, Amazon Web Services (AWS), PyCharm, NGINX, Flask, AWS Lambda, Redshift, AWS Elastic Beanstalk, Roadmaps, Programming, Web Scraping, Predictive Modeling, DBeaver, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Matplotlib, eCommerce, Startups, Text Mining, XGBoost, Statistics, A/B Testing, Large-Scale Computing, Big Data, Data Strategy

Lead Data Scientist

2018 - 2020
NOS Communications
  • Created a model to predict churn and dunning to prioritize welcome calls.
  • Developed a model to predict equipment defects to avoid inefficient replacements.
  • Created a model to predict technical recurrences to reduce customer attrition.
  • Built a model to predict customer satisfaction, NPS, and voice of the customer (VoC) scores.
  • Created a model of sentiment analysis for customer surveys.
  • Developed several web scrapers covering websites, forums, Facebook, Reddit, and LinkedIn.
  • Led and mentored more than ten people. Some projects included predicting the recurrence of telecom problems, developing intelligent call routing, predicting billing service requests and errors, and building chatbots for robotic process automation.
  • Defined roadmap of analytical use cases for the department.
  • Managed collaboration with universities and co-supervised MSc students.
  • Defined cases and conducted interviews for the trainee program.
Technologies: Data Science, Optimization, Machine Learning, SQL, Mentorship, Jira, Pandas, Scikit-learn, ETL, Git, Seaborn, Deep Learning, Team Management, TensorFlow, Keras, PySpark, Hadoop, Docker, Confluence, LightGBM, Modeling, Monitoring, Microsoft Power BI, Apache Hive, Oracle, Azure, XGBoost, Roadmaps, Web Scraping, Programming, Predictive Modeling, DBeaver, Zeppelin, Jupyter Notebook, PyCharm, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Matplotlib, Telecommunications, Text Mining, Statistics, A/B Testing, Large-Scale Computing, Big Data, Data Strategy, Artificial Intelligence (AI)

Team Lead Machine Learning Engineer

2018 - 2018
Center for Computer Graphics
  • Led and mentored the machine learning team, including Ph.D. students.
  • Developed a business analytics system for the chemical sector of a large multinational company.
  • Applied machine learning algorithms for the textile sector of a large national company.
  • Contributed to the initial phase of the development of a scalable and automated machine learning system for the security sector of a large national company.
  • Developed a recommendation system for the retail sector of a national company.
  • Interacted directly with clients and other stakeholders.
  • Led technical writing of multi-million euro funding proposals.
  • Contributed to the creation of the Collaborative Laboratory ProChild CoLAB against Poverty and Social Exclusion.
Technologies: Optimization, Data Science, Python, Machine Learning, SQL, Matplotlib, Algorithms, Technical Writing, RStudio, Jira, Pandas, Scikit-learn, ETL, Git, Seaborn, Deep Learning, Team Management, TensorFlow, Keras, PySpark, Hadoop, Jupyter Notebook, AutoML, Programming, Predictive Modeling, R, GPT, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Research, Development, Consulting, Large-Scale Computing, Big Data, Artificial Intelligence (AI)

Research Associate

2017 - 2018
Institute of Telecommunications
  • Proposed novel system architectures for connecting e-health devices directly to the internet.
  • Created models and optimizations for e-health devices connecting directly to the internet.
  • Wrote research papers and co-supervised MSc students.
Technologies: Optimization, Data Science, Python, Algorithms, Mentorship, MATLAB, Wireless, Linux, Subversion (SVN), Telecommunications, Software Development, Technical Writing, Distributed Systems, JSON, Software Architecture, IoT Protocols, Programming, Development, Research

Research Fellow

2012 - 2017
Institute of Telecommunications
  • Designed energy- and latency-aware scheduling schemes to improve battery life on smartphones.
  • Optimized and created a packet transmission scheduling model.
  • Created a smartphone battery consumption model taking into consideration the different transmission technologies.
  • Designed and benchmarked IoT middleware platforms and mobile gateway applications for a telecommunications company.
  • Applied distributed computing and meta-heuristics to explore resource efficiency gains for multi-homed wireless devices.
  • Characterized and created a model of radio frequency interference in vehicular networks.
  • Wrote research papers, was involved in international collaborations, and co-supervised MSc students.
Technologies: Data Science, Python, Machine Learning, SQL, Matplotlib, Algorithms, Mentorship, MATLAB, Wireless, Linux, Subversion (SVN), Telecommunications, Software Development, Linear Programming, Bash, Technical Writing, RStudio, R, Distributed Systems, JSON, Software Architecture, IoT Protocols, JetBrains, PyCharm, Java, IntelliJ IDEA, Software Design, Grid Computing, Android, Metaheuristics, Programming, Operations Research, Predictive Modeling, Development, Research, Constraint Programming

Research Assistant

2010 - 2012
Institute of Telecommunications
  • Developed resource allocation and decision algorithms leveraging advanced coding techniques.
  • Characterized and created a model of packet collisions in vehicular ad-hoc networks.
  • Wrote research papers and was involved in international collaborations.
Technologies: Optimization, Python, Algorithms, MATLAB, Wireless, Linux, Software Development, Linear Programming, Bash, C, C++, Programming, Development, Research, Telecommunications

Customer Churn and Payment Default Prediction

Created and deployed a model to predict when customers would churn or enter payment default, used to prioritize the order of the calls. Customers with a higher probability of churning or entering payment default were ranked higher and contacted before the rest.

Predict Equipment Defects

Created and deployed a model to predict equipment defects to prevent incorrect and costly replacements. For each maintenance operation at a customer's home, the model recommended whether or not to replace the equipment.

Predict Customers' Next Order

Created and deployed a model for recommending the best moment for each customer's next order. After each purchase, the model took the customer's profile into account and output the predicted time of the next purchase, which enabled retention actions or direct marketing.

Sentiment Analysis and Topic Classification of Customer Surveys

Created models for sentiment analysis and topic classification using natural language processing techniques and a large set of available customer surveys. For each sentence, the models predicted the polarity (such as love or hate) as well as topics and categories (such as price, quality of experience, and entertainment).

Predict Customer Satisfaction

Created a model to predict customer satisfaction based on NPS and VoC scores. Large volumes of data, corresponding to several customer dimensions, were incorporated into the model to make the predictions as accurate as possible.

Predictive Model of Technical Recurrences

Created a model to predict the recurrence of problems after installation or maintenance procedures, helping to reduce customer attrition. For each new installation or maintenance operation, the model flagged those with a higher probability of recurrence, such as customer complaints.

Languages

Python, SQL, R, Java, Bash, C, C++

Frameworks

LightGBM, Hadoop, Flask

Libraries/APIs

Matplotlib, Pandas, Scikit-learn, XGBoost, TensorFlow, Keras, PySpark

Tools

Tableau, MATLAB, PyCharm, Jira, Git, Seaborn, Confluence, Microsoft Power BI, GitHub, Subversion (SVN), JetBrains, IntelliJ IDEA, AutoML, NGINX

Paradigms

Data Science, Business Intelligence (BI), Management, ETL, Linear Programming, Azure DevOps, Constraint Programming

Platforms

Windows, Linux, RStudio, Oracle, Azure, AWS Lambda, Jupyter Notebook, Android, Docker, Amazon Web Services (AWS), AWS Elastic Beanstalk, Zeppelin, Databricks, Visual Studio Code (VS Code), MacOS

Other

Optimization, Data Analysis, Programming, Data Analytics, Algorithms, Machine Learning, Mentorship, Team Management, Monitoring, Predictive Modeling, Research, Large-Scale Computing, Development, Modeling, R&D, Technical Writing, Distributed Systems, IoT Protocols, Deep Learning, Roadmaps, Metaheuristics, Web Scraping, Big Data, Data Engineering, Data Visualization, Dashboards, Natural Language Processing (NLP), eCommerce, Consulting, Text Mining, Statistics, Data Strategy, GPT, Generative Pre-trained Transformers (GPT), Scrum Master, Wireless, Software Development, Software Architecture, Software Design, Grid Computing, Operations Research, Startups, A/B Testing, Leadership, Geospatial Data, Geospatial Analytics, Electric Vehicles, Cloud, Recommendation Systems, Experimental Design, Tutoring, Instructor-led Training (ILT), Artificial Intelligence (AI), Machine Learning Operations (MLOps), Azure Databricks, Computer Vision, Business, Certified Scrum Product Owner (CSPO), Generative AI, Large Language Models (LLMs)

Industry Expertise

Telecommunications

Storage

Databases, Apache Hive, DBeaver, JSON, Redshift, Google Cloud

2023 - 2023

MBA Essentials in Management

The London School of Economics and Political Science - London, UK

2022 - 2022

Oxford Executive Leadership Programme in Leadership

Saïd Business School, University of Oxford - Oxford, UK

2020 - 2020

General Management Program in Management

Católica Lisbon School of Business and Economics - Lisbon, Portugal

2018 - 2019

Postgraduate Programme in Business Intelligence and Analytics

Porto Business School - Porto, Portugal

2012 - 2017

Ph.D. in Electrical and Computer Engineering

Faculty of Engineering of the University of Porto - Porto, Portugal

JANUARY 2023 - PRESENT

Azure AI Fundamentals

Microsoft

JANUARY 2023 - PRESENT

Azure Data Scientist Associate

Microsoft

FEBRUARY 2022 - PRESENT

Certified Scrum Product Owner

Scrum Alliance

MAY 2021 - PRESENT

Deep Learning Specialization

Coursera

FEBRUARY 2021 - PRESENT

Tableau Desktop Specialist

Tableau

DECEMBER 2020 - PRESENT

Data Visualization with Tableau Specialization

Coursera

JUNE 2020 - PRESENT

Data Engineering, Big Data, and Machine Learning on GCP Specialization

Coursera

MAY 2020 - PRESENT

Certified Scrum Master

Scrum Alliance

MAY 2019 - PRESENT

Complete Guide to TensorFlow for Deep Learning with Python

Udemy

NOVEMBER 2018 - PRESENT

Spark and Python for Big Data with PySpark

Udemy

JANUARY 2013 - PRESENT

Data Analysis

Coursera

JANUARY 2013 - PRESENT

Machine Learning

Coursera
