João Rafael, Developer in Porto, Portugal

Bio

João is an applied data science specialist who bridges the gaps between business requirements, engineering constraints, and machine learning research. He leads the development of data science projects and has deployed products in multiple industries, including telco and fintech. João has developed and implemented novel machine learning algorithms for research institutions and custom solutions for commercial clients.

Portfolio

Upper Delta
Amazon Web Services (AWS), Data Science, Machine Learning...
Powercall
Artificial Intelligence (AI), Management, Call Centers, Customer Experience...
Feedzai
Amazon Web Services (AWS), Machine Learning, Fraud Prevention...

Experience

  • Python - 16 years
  • Data Science - 12 years
  • Artificial Intelligence (AI) - 12 years
  • Machine Learning - 10 years
  • Distributed Computing - 9 years
  • Software Project Management - 8 years
  • Deep Learning - 4 years
  • PyTorch - 4 years

Preferred Environment

Python, Ubuntu, Jupyter Notebook, XGBoost, PySpark, PyTorch, Scikit-learn, Python 3, Crypto, APIs

The most amazing...

...project I've worked on is a credit card fraud detection service for one of the largest payment processors in the US, handling over $1 billion in payments daily.

Work Experience

Founder

2018 - PRESENT
Upper Delta
  • Founded Upper Delta, a specialized data science and machine learning consultancy.
  • Developed large projects in the telco industry, including product recommendation systems, churn prediction models, call center optimization products, and quality-of-service degradation prediction models.
  • Oversaw the work conducted by employee and client teams, ensuring on-time delivery and visibility of project status.
  • Supervised the research conducted for several master's theses in collaboration with universities in Portugal.
  • Mentored 20+ data scientists and software engineers, providing them with professional growth opportunities through one-on-one sessions, reading groups, and workshops.
  • Demystified the role of data science and machine learning for C-level executives in our clients' organizations.
Technologies: Amazon Web Services (AWS), Data Science, Machine Learning, Deep Reinforcement Learning, Deep Learning, Apache Spark, PyTorch, Python, Artificial Intelligence (AI), Software Project Management, Options, Equity Market Data, Financial Markets, Equity, Reinforcement Learning, Quantitative Finance, Trading, APIs, Document Parsing, PDF, OpenAI, OpenAI API, Linux, Spark, Continuous Integration (CI), Bayesian Statistics, Bayesian Inference & Modeling, Amazon SageMaker, Prompt Engineering, Generative Artificial Intelligence (GenAI), Recommendation Systems, Customer Lifetime Value (CLV), Machine Learning Operations (MLOps), Neural Networks

Co-founder

2020 - 2024
Powercall
  • Co-founded Powercall, a company that delivers a call center optimization product. The product uses AI to identify the best hour to contact each customer, improving call center operations with respect to answer rates, sales per client, and client reach.
  • Led the technical implementation of the entire product, including infrastructure, ETL pipelines, machine learning models, and dashboards.
  • Engaged with clients to showcase the product, set up pilot programs, discuss integration solutions, and finalize pricing options.
Technologies: Artificial Intelligence (AI), Management, Call Centers, Customer Experience, IT Consulting, Amazon Web Services (AWS), Software Project Management, Spark

Senior Software Engineer

2014 - 2017
Feedzai
  • Implemented a suite of data science tools that became part of the core product. Researched and implemented enhancements to state-of-the-art (SOTA) machine learning algorithms, which improved results for all clients.
  • Played a key role in the delivery team that implemented fraud detection solutions for large banks, payment processors, and merchants, including First Data and JIO Wallet.
  • Served as the tech lead for a multimillion-dollar project, defining the solution's architecture. Coordinated and communicated requirements, progress, and deadlines across the client's technical staff and internal product and research teams.
  • Supervised the work of my team from a technical perspective, ensuring high-quality code and documentation.
  • Conducted regular one-on-one meetings with every team member to assess performance, future goals, culture match, and potential actions to improve their satisfaction within the team and the company.
Technologies: Amazon Web Services (AWS), Machine Learning, Fraud Prevention, Distributed Systems, Scala, Java, Artificial Intelligence (AI), Data Science, Software Project Management, Linux, Spark, Continuous Integration (CI), Neural Networks

Researcher

2011 - 2013
University of Coimbra
  • Designed and implemented a novel programming language for parallel, event-driven programming with deadlock-free semantics.
  • Implemented a framework for automatic parallelization of existing Java applications by detecting data dependencies at a granular level and scheduling execution with a work-stealing algorithm.
  • Co-authored two scientific papers for the International Journal of Parallel Programming and the Euro-Par Conference on parallel and distributed computing.
Technologies: High-performance Computing (HPC), Parallel Programming, Distributed Systems, Java, Linux, Algorithms

Projects

Fraud Detection System

A real-time credit card fraud detection system for one of the largest payment processors in the US. The system offered the client an API for scoring transactions, returning a fraud score together with automatic explanations for that score. This was a multi-datacenter, multi-tenant project with hard latency requirements and mandatory high availability.
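As a rough illustration of what a score with automatic explanations can look like, the sketch below uses a small logistic model and reports each feature's contribution to the score. The model type, feature names, weights, and bias are all hypothetical and chosen only for illustration; the source does not describe the model used in the actual system.

```python
import math

# Hypothetical weights and bias; a real system would learn these from labeled data.
WEIGHTS = {"amount_zscore": 1.2, "new_merchant": 0.8, "foreign_country": 1.5}
BIAS = -3.0


def score_transaction(features):
    """Return a fraud probability and the features ranked by contribution size."""
    # Each feature's contribution to the logit is its weight times its value.
    contributions = {
        name: WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS
    }
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # The explanation lists the features that moved the score the most first.
    explanation = sorted(
        contributions.items(), key=lambda item: abs(item[1]), reverse=True
    )
    return probability, explanation


# Hypothetical transaction: unusually large amount, new merchant, foreign country.
transaction = {"amount_zscore": 2.5, "new_merchant": 1.0, "foreign_country": 1.0}
probability, explanation = score_transaction(transaction)
print(round(probability, 3))
print([name for name, _ in explanation])
```

A linear model is used here only because its per-feature contributions are easy to read; other models would need a different explanation technique.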

In addition to defining the solution's architecture, I coordinated and communicated requirements, progress, and deadlines across the client's technical staff, the company's project managers, and internal product engineering and research teams. I also ensured high-quality code and documentation and coached each team member through one-on-one meetings.

Product Recommendation System

A product recommendation system for a large telecom. This product was used to define the upsell marketing strategy for existing customers and would recommend specific upgrades and bundles for each client, such as voice cards, mobile data, and premium TV channels.

I developed this system from scratch. A content-based approach was used to incorporate information from multiple domains, including product usage, billing information, previous customer interactions, and demographics, as well as product characteristics and price points. The recommendations were measured against the previous strategy in A/B tests, and a statistically significant increase in average revenue per user (ARPU) was achieved.
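To make the content-based idea concrete, here is a minimal sketch that ranks products by cosine similarity between a customer feature vector and product feature vectors. The feature layout, the catalog, and the example customer are hypothetical and are not taken from the actual system, which combined many more data domains.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(customer_profile, catalog, owned=(), top_n=2):
    """Rank products the customer does not own by similarity to their profile."""
    scored = [
        (cosine_similarity(customer_profile, features), name)
        for name, features in catalog.items()
        if name not in owned
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_n]]


# Hypothetical feature order: [voice usage, mobile data usage, TV usage]
CATALOG = {
    "voice_bundle": [1.0, 0.1, 0.0],
    "mobile_data_upgrade": [0.1, 1.0, 0.1],
    "premium_tv_channels": [0.0, 0.1, 1.0],
}

heavy_data_user = [0.2, 0.9, 0.3]
print(recommend(heavy_data_user, CATALOG, top_n=2))
```

In this toy example, the customer with heavy mobile data usage is matched to the mobile data upgrade first.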

Call Center Optimization

A service that optimizes call centers' outbound operations (e.g., marketing campaigns) by identifying the best time to call each client. The system uses sociodemographic information together with the history of past call attempts to predict the best time to contact each client. Additionally, a global optimization process is used to define the best order in which to call clients to maximize the operation's KPI performance (e.g., client reach or right-party answer rates).
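The two steps described above can be sketched roughly as follows: estimate a smoothed answer rate per client and hour from past call attempts, then assign clients to hours under a capacity limit. The record format, the smoothing prior, and the greedy assignment are assumptions made for illustration. The greedy step is a simple stand-in for the global optimization process, and the sketch ignores the sociodemographic features mentioned above.

```python
from collections import defaultdict


def answer_rates(call_history, prior_answers=1, prior_calls=2):
    """Smoothed answer rate per (client, hour) from (client, hour, answered) records."""
    answered = defaultdict(int)
    attempts = defaultdict(int)
    for client, hour, was_answered in call_history:
        attempts[(client, hour)] += 1
        answered[(client, hour)] += int(was_answered)
    # Additive smoothing keeps rates away from 0 and 1 when there are few attempts.
    return {
        key: (answered[key] + prior_answers) / (attempts[key] + prior_calls)
        for key in attempts
    }


def schedule_calls(rates, capacity_per_hour):
    """Greedy schedule: take the highest-rate (client, hour) pairs first,
    one slot per client, without exceeding each hour's capacity."""
    remaining = dict(capacity_per_hour)
    schedule = {}
    ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)
    for (client, hour), _rate in ranked:
        if client in schedule or remaining.get(hour, 0) <= 0:
            continue
        schedule[client] = hour
        remaining[hour] -= 1
    return schedule


# Hypothetical call history: (client, hour of day, whether the call was answered).
history = [
    ("ana", 10, True), ("ana", 10, True), ("ana", 18, False),
    ("rui", 10, True), ("rui", 18, True), ("rui", 18, False),
]
rates = answer_rates(history)
schedule = schedule_calls(rates, capacity_per_hour={10: 1, 18: 1})
print(schedule)
```

With only one slot available at 10:00, the client with the higher estimated answer rate at that hour gets it, and the other client is moved to 18:00.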

The system communicates with multiple partner companies and accesses several back-end systems and databases to collect the necessary information. A dashboard was created to monitor both the system and the final business metrics.

I led the development of the process from proof-of-concept to production, ensuring correct development processes and code quality by means of unit and integration tests, code linters, CI/CD, and code reviews.

This project drove a 15% increase in client reach for the selected marketing campaigns. It was showcased as a case study for the data science community and presented to an audience of 80+ data scientists, industry players, and C-level executives.

Quality of Service Degradation Prediction

A system to detect and predict degradation in the quality of service of cable, fiber internet, and TV signal delivery for a telecom servicing over two million devices. This system monitors low-level technical metrics such as transmission signal-to-noise ratio, path loss, and error rates, as well as network-stack information such as IP availability, network topology, throughput, and latency. It also processes semi-structured information obtained from real-time logs of the management infrastructure and the devices themselves.

The model combines anomaly detection and predictive algorithms to identify which clients are facing or will face network issues. Due to the large amount of data collected, PySpark was used to process the information in a cluster. Specific code was developed in Java for extra optimizations.
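As an illustration of the anomaly detection side only, the sketch below flags readings that deviate strongly from a rolling window of recent values. It is a single-machine example in plain Python, not the PySpark pipeline described above, and the window size, threshold, and sample readings are hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` readings by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if readings[i] != mu:
                anomalies.append(i)
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical signal-to-noise ratio readings (dB) for a single device.
snr_db = [36.1, 35.9, 36.0, 36.2, 35.8, 36.0, 29.5, 36.1, 35.9]
print(flag_anomalies(snr_db))
```

Only the sudden drop at index 6 is flagged here; the readings that follow it are compared against a window that already contains the drop, so they fall within the threshold.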

Throughout this project, several distinct patterns were discovered in the data and relayed to the company's engineering team so that the underlying issues could be fixed. Additionally, a survey was conducted among the clients who were most likely to be facing problems, and 98% confirmed our findings.

Rooftop Obstacle Detection for Solar Panel Company

A computer vision project that used satellite images and official records of building footprints to detect viable locations for solar panel placement on rooftops. I completed the initial milestone of the project in two weeks, and the model outperformed the existing labels produced by individual human annotators.

Sales Forecast for a Beverage Company

A sales forecast model and data exploration tool for a beverage producer and distributor. The model would forecast sales of hundreds of individual SKUs for the next three months at each point of sale, including coffee shops, groceries, and supermarkets.

I implemented the project, which included discussions with the client's data and product teams to understand and resolve data issues. Datasets were enriched with external data sources for weather, demographics, and events. The final deliverable included a dashboard where the client could visualize the data geographically and uncover patterns across locations and time spans.

Data Lake Implementation in AWS

I was responsible for the design, implementation, optimization, and maintenance of the client's data lake solution, which supported the business's data needs across product, operations, analytics, and AI functions.

The data lake supported both bulk and streaming data ingestion from various sources, including data brokers, product usage, SaaS services, operational logs, and APIs. I ensured data cleanup and transformation processes were in place to meet both operational and analytical requirements effectively.

Education

2025 - 2025

Essential Molecular Biology - 'Hands On' Laboratory Course in Molecular Biology

University of Porto - Porto, Portugal

2010 - 2013

Master's Degree in Computer Science

University of Coimbra - Coimbra, Portugal

2008 - 2010

Bachelor's Degree in Computer Science

University of Coimbra - Coimbra, Portugal

Certifications

JULY 2015 - PRESENT

Certified DataStax Architect

DataStax

Skills

Libraries/APIs

Scikit-learn, XGBoost, Pandas, OpenAI API, PyTorch, PySpark

Tools

Amazon SageMaker, Amazon Athena, Apache Iceberg, RabbitMQ, Syslog, Apache Airflow

Languages

Python, Java, Scala, JavaScript, R, Rust, Python 3, SQL, C++, XML

Paradigms

Distributed Computing, High-performance Computing (HPC), Parallel Programming, Continuous Integration (CI), Anomaly Detection, Management

Platforms

Amazon Web Services (AWS), Linux, Ubuntu, Docker, Databricks, Jupyter Notebook, Oracle, Google Cloud Platform (GCP), AWS Lambda

Frameworks

Apache Spark, Spark

Storage

Amazon S3 (AWS S3), PostgreSQL, Distributed Databases

Industry Expertise

Project Management

Other

Machine Learning, Data Science, Fraud Prevention, Recommendation Systems, Artificial Intelligence (AI), Data Engineering, Algorithms, APIs, Document Parsing, OpenAI, Generative Artificial Intelligence (GenAI), Machine Learning Operations (MLOps), Neural Networks, Deep Learning, Software Project Management, Apache Cassandra, Distributed Systems, Mathematics, Computer Vision, Large Language Models (LLMs), Options, Reinforcement Learning, Crypto, Prompt Engineering, Customer Lifetime Value (CLV), Compilers, Deep Reinforcement Learning, Predictive Analytics, Optimization, Predictive Modeling, Network Topology, FTTH, Software Development, Computer Graphics, Digital Electronics, Digital Signal Processing, Call Centers, Customer Experience, IT Consulting, Forecasting, Web Dashboards, Bayesian Statistics, Bayesian Inference & Modeling, Amazon RDS, Amazon Managed Workflows for Apache Airflow (MWAA), Equity Market Data, Financial Markets, Equity, Quantitative Finance, Trading, Molecular Biology, DNA Sequencing, Plasmid Engineering, Transfection, PDF
