Carlos Miguel Pereira, Data Science and Machine Learning Developer in Cambridge, United Kingdom

Member since October 15, 2021
Carlos has a solid track record in designing and implementing data-driven solutions, including advanced predictive modeling and optimization algorithms. His academic and professional background has also given him extensive knowledge of software development, data engineering, data visualization, and project management. In addition to a Ph.D., Carlos has completed two postgraduate courses, a general management program, and numerous certifications and technical courses.

Portfolio

  • Optimizely
    Management, Leadership, Data Science, Machine Learning, Data Engineering...
  • EDIT. Disruptive Digital Education
    Data Science, Machine Learning, Python, Statistics, Tutoring...
  • CKDelta
    Data Science, Geospatial Data, Geospatial Analytics, Python, Databricks...

Location

Cambridge, United Kingdom

Availability

Part-time

Preferred Environment

PyCharm, Jupyter Notebook, Windows, Linux, Git

The most amazing...

...project I have implemented was a customer churn predictive model with potential savings of €1 million per year.

Employment

  • Senior Manager, Data Science

    2022 - PRESENT
    Optimizely
    • Led a data science team of junior and senior data scientists and data engineers.
    • Built roadmaps and prioritized work for the data science team.
    • Mentored and coached the team to grow both technically and professionally.
    Technologies: Management, Leadership, Data Science, Machine Learning, Data Engineering, Cloud, Recommendation Systems, A/B Testing, Experimental Design, Artificial Intelligence (AI)
  • Invited Instructor

    2021 - 2022
    EDIT. Disruptive Digital Education
    • Taught data science and deep learning modules in the data science and business analytics program.
    • Defined the course content for the data science and deep learning modules.
    • Created slides and material for the data science and deep learning modules.
    • Assessed the students of the data science and deep learning modules.
    Technologies: Data Science, Machine Learning, Python, Statistics, Tutoring, Instructor-led Training (ILT)
  • Data Science Consultant

    2021 - 2022
    CKDelta
    • Took responsibility for end-to-end architecture, implementation, and delivery of one MVP to an energy solutions company and another to an electricity company.
    • Leveraged mobility and geospatial data to predict the best locations for EV charger installation across the UK.
    • Estimated electricity network headroom geospatially.
    Technologies: Data Science, Geospatial Data, Geospatial Analytics, Python, Databricks, Azure, Machine Learning, SQL, Pandas, PySpark, Azure DevOps, Tableau, Confluence, ETL, Git, Seaborn, Jira, LightGBM, Programming, Predictive Modeling, Startups, Electric Vehicles, Large-Scale Computing, Statistics
  • Data Science Manager

    2021 - 2021
    Metyis
    • Delivered a data strategy and roadmap of use cases for a large Benelux group.
    • Defined candidate qualifications and conducted interviews.
    • Built internal best practices and company awareness.
    Technologies: Data Science, Machine Learning, Roadmaps, Management, Consulting, Data Strategy
  • Data Lead

    2020 - 2021
    Barkyn
    • Defined a roadmap of advanced analytics and reporting use cases.
    • Developed churn and next-order predictive models and deployed them in containerized web applications.
    • Built ETL pipelines using AWS services and created dashboard reports.
    • Performed deep-dive business analyses, such as lifetime value (LTV) and repetition patterns.
    Technologies: Data Science, Python, Machine Learning, SQL, Jira, Pandas, Scikit-learn, ETL, Git, Seaborn, Docker, Confluence, LightGBM, Monitoring, Modeling, Jupyter Notebook, Tableau, Amazon Web Services (AWS), PyCharm, NGINX, Flask, AWS Lambda, Redshift, AWS Elastic Beanstalk, Roadmaps, Programming, Web Scraping, Predictive Modeling, DBeaver, Natural Language Processing (NLP), Matplotlib, eCommerce, Startups, Text Mining, XGBoost, Statistics, A/B Testing, Large-Scale Computing, Big Data, Data Strategy
  • Lead Data Scientist

    2018 - 2020
    NOS Communications
    • Created a model to predict churn and dunning to prioritize welcome calls.
    • Developed a model to predict equipment defects to avoid inefficient replacements.
    • Created a model to predict technical recurrences to reduce customer attrition.
    • Built a model to predict customer satisfaction, NPS, and voice of the customer (VoC) scores.
    • Created a model of sentiment analysis for customer surveys.
    • Developed several web scrapers covering websites, forums, Facebook, Reddit, and LinkedIn.
    • Led and mentored more than ten people. Projects included predicting the recurrence of telecom problems, developing intelligent call routing, predicting billing service requests and errors, and building chatbots for robotic process automation.
    • Defined a roadmap of analytical use cases for the department.
    • Managed collaboration with universities and co-supervised MSc students.
    • Defined cases and conducted interviews for the trainee program.
    Technologies: Data Science, Optimization, Machine Learning, SQL, Mentorship, Jira, Pandas, Scikit-learn, ETL, Git, Seaborn, Deep Learning, Team Management, TensorFlow, Keras, PySpark, Hadoop, Docker, Confluence, LightGBM, Modeling, Monitoring, Microsoft Power BI, Apache Hive, Oracle, Azure, XGBoost, Roadmaps, Web Scraping, Programming, Predictive Modeling, DBeaver, Zeppelin, Jupyter Notebook, PyCharm, Natural Language Processing (NLP), Matplotlib, Telecommunications, Text Mining, Statistics, A/B Testing, Large-Scale Computing, Big Data, Data Strategy, Artificial Intelligence (AI)
  • Team Lead Machine Learning Engineer

    2018 - 2018
    Center for Computer Graphics
    • Led and mentored the machine learning team, including Ph.D. students.
    • Developed a business analytics system for the chemical sector of a large multinational company.
    • Applied machine learning algorithms for the textile sector of a large national company.
    • Participated in the initial phase of developing a scalable and automated machine learning system for the security sector of a large national company.
    • Developed a recommendation system for the retail sector of a national company.
    • Interacted directly with clients and other stakeholders.
    • Led technical writing of multi-million euro funding proposals.
    • Participated in the creation of the Collaborative Laboratory ProChild CoLAB against Poverty and Social Exclusion.
    Technologies: Optimization, Data Science, Python, Machine Learning, SQL, Matplotlib, Algorithms, Technical Writing, RStudio, Jira, Pandas, Scikit-learn, ETL, Git, Seaborn, Deep Learning, Team Management, TensorFlow, Keras, PySpark, Hadoop, Jupyter Notebook, AutoML, Programming, Predictive Modeling, R, Natural Language Processing (NLP), Research, Development, Consulting, Large-Scale Computing, Big Data, Artificial Intelligence (AI)
  • Research Associate

    2017 - 2018
    Institute of Telecommunications
    • Proposed novel system architectures for connecting e-health devices directly to the internet.
    • Created models and optimizations for e-health devices connecting directly to the internet.
    • Wrote research papers and co-supervised MSc students.
    Technologies: Optimization, Data Science, Python, Algorithms, Mentorship, MATLAB, Wireless, Linux, Subversion (SVN), Telecommunications, Software Development, Technical Writing, Distributed Systems, JSON, Software Architecture, IoT Protocols, Programming, Development, Research
  • Research Fellow

    2012 - 2017
    Institute of Telecommunications
    • Designed energy- and latency-aware scheduling schemes to improve battery life on smartphones.
    • Optimized and created a packet transmission scheduling model.
    • Created a smartphone battery consumption model taking into consideration the different transmission technologies.
    • Designed and benchmarked IoT middleware platforms and mobile gateway applications for a telecommunications company.
    • Applied distributed computing and meta-heuristics to explore resource efficiency gains for multi-homed wireless devices.
    • Characterized and created a model of radio frequency interference in vehicular networks.
    • Wrote research papers, was involved in international collaborations, and co-supervised MSc students.
    Technologies: Data Science, Python, Machine Learning, SQL, Matplotlib, Algorithms, Mentorship, MATLAB, Wireless, Linux, Subversion (SVN), Telecommunications, Software Development, Linear Programming, Bash, Technical Writing, RStudio, R, Distributed Systems, JSON, Software Architecture, IoT Protocols, JetBrains, PyCharm, Java, IntelliJ IDEA, Software Design, Grid Computing, Android, Metaheuristics, Programming, Operations Research, Predictive Modeling, Development, Research
  • Research Assistant

    2010 - 2012
    Institute of Telecommunications
    • Developed resource allocation and decision algorithms leveraging advanced coding techniques.
    • Characterized and created a model of packet collisions in vehicular ad-hoc networks.
    • Wrote research papers and was involved in international collaborations.
    Technologies: Optimization, Python, Algorithms, MATLAB, Wireless, Linux, Software Development, Linear Programming, Bash, C, C++, Programming, Research, Development, Telecommunications

Experience

  • Customer Churn and Payment Default Prediction

    Created and deployed a model for predicting when customers would churn or enter payment default, used to prioritize the order in which customers were called. Customers with a higher probability of churning or entering payment default were ranked higher and contacted before the rest.

  • Predict Equipment Defects

    Created and deployed a model to predict equipment defects to prevent incorrect and costly replacements. For each maintenance operation at a customer's home, the model recommended whether or not to replace the equipment.

  • Predict Customers' Next Order

    Created and deployed a model for recommending the best moment for each customer's next order. After each purchase, the model used the customer's profile to output the predicted time of the next purchase, which enabled retention actions or direct marketing.

  • Sentiment Analysis and Topic Classification of Customer Surveys

    Created models for sentiment analysis and topic classification, leveraging natural language processing techniques and a large number of available customer surveys. The models predicted the polarity of sentences (such as love and hate) as well as their topics and categories (such as price, quality of experience, and entertainment).

  • Predict Customer Satisfaction

    Created a model to predict customer satisfaction based on NPS and VoC scores. Large volumes of data, corresponding to several customer dimensions, were incorporated into the model to make the predictions as accurate as possible.

  • Predictive Model of Technical Recurrences

    Created a model to predict the recurrence of problems after installation or maintenance procedures, helping to reduce customer attrition. For each new installation or maintenance operation, the model flagged those with a higher probability of recurrence, such as customer complaints.

Skills

  • Languages

    Python, SQL, R, Java, Bash, C, C++
  • Frameworks

    LightGBM, Hadoop, Flask
  • Libraries/APIs

    Matplotlib, Pandas, Scikit-learn, XGBoost, TensorFlow, Keras, PySpark
  • Tools

    Tableau, MATLAB, PyCharm, Jira, Git, Seaborn, Confluence, Microsoft Power BI, GitHub, R Studio, Subversion (SVN), JetBrains, IntelliJ IDEA, AutoML, NGINX
  • Paradigms

    Data Science, Business Intelligence (BI), Management, ETL, Linear Programming, Azure DevOps
  • Platforms

    Windows, Linux, RStudio, Oracle, Azure, AWS Lambda, Jupyter Notebook, Android, Docker, Amazon Web Services (AWS), AWS Elastic Beanstalk, Zeppelin, Databricks
  • Other

    Optimization, Data Analysis, Programming, Data Analytics, Algorithms, Machine Learning, Mentorship, Team Management, Monitoring, Predictive Modeling, Research, Large-Scale Computing, Development, Modeling, R&D, Technical Writing, Distributed Systems, IoT Protocols, Deep Learning, Roadmaps, Metaheuristics, Web Scraping, Big Data, Data Engineering, Data Visualization, Dashboards, Natural Language Processing (NLP), eCommerce, Consulting, Text Mining, Statistics, Data Strategy, Scrum Master, Wireless, Software Development, Software Architecture, Software Design, Grid Computing, Operations Research, Startups, A/B Testing, Leadership, Geospatial Data, Geospatial Analytics, Electric Vehicles, Cloud, Recommendation Systems, Experimental Design, Tutoring, Instructor-led Training (ILT), Artificial Intelligence (AI), Machine Learning Operations (MLOps), Azure Databricks, Computer Vision
  • Industry Expertise

    Telecommunications
  • Storage

    Databases, Apache Hive, DBeaver, JSON, Redshift, Google Cloud

Education

  • Oxford Executive Leadership Programme in Leadership
    2022 - 2022
    Saïd Business School, University of Oxford - Oxford, UK
  • General Management Program in Management
    2020 - 2020
    Católica Lisbon School of Business and Economics - Lisbon, Portugal
  • Postgraduate Programme in Business Intelligence and Analytics
    2018 - 2019
    Porto Business School - Porto, Portugal
  • Ph.D. in Electrical and Computer Engineering
    2012 - 2017
    Faculty of Engineering of the University of Porto - Porto, Portugal

Certifications

  • Azure AI Fundamentals
    JANUARY 2023 - PRESENT
    Microsoft
  • Azure Data Scientist Associate
    JANUARY 2023 - PRESENT
    Microsoft
  • Deep Learning Specialization
    MAY 2021 - PRESENT
    Coursera
  • Tableau Desktop Specialist
    FEBRUARY 2021 - PRESENT
    Tableau
  • Data Visualization with Tableau Specialization
    DECEMBER 2020 - PRESENT
    Coursera
  • Data Engineering, Big Data, and Machine Learning on GCP Specialization
    JUNE 2020 - PRESENT
    Coursera
  • Certified Scrum Master
    MAY 2020 - PRESENT
    Scrum Alliance
  • Complete Guide to TensorFlow for Deep Learning with Python
    MAY 2019 - PRESENT
    Udemy
  • Spark and Python for Big Data with PySpark
    NOVEMBER 2018 - PRESENT
    Udemy
  • Data Analysis
    JANUARY 2013 - PRESENT
    Coursera
  • Machine Learning
    JANUARY 2013 - PRESENT
    Coursera
