Maxuel Reis, Developer in São Paulo - State of São Paulo, Brazil

Maxuel Reis

Verified Expert in Engineering

Data Scientist/Engineer and Developer

São Paulo - State of São Paulo, Brazil

Toptal member since January 19, 2024

Bio

Maxuel is a data scientist and engineer with proven experience delivering end-to-end data solutions and a track record of quickly adapting to new technologies. He is skilled in batch and real-time machine learning, cloud computing (AWS, GCP, and Azure), big data (Hive and Spark), and data streaming. He combines data science and engineering expertise for effective data management and business-aligned strategies. With an acting background, Maxuel brings unique communication abilities that enhance teamwork and management.

Portfolio

Oliver Wyman - Data Science
Python, Data Modeling, Data Science, Business Analysis, Databricks, PySpark...
Oliver Wyman (via Toptal)
Data Science, Data Modeling, Business Analysis, PySpark, Financial Services...
InHire
SaaS, ClickHouse, SQL, Data Visualization, Streaming Data, Docker, Data...

Experience

  • SQL - 14 years
  • Python - 7 years
  • PySpark - 6 years
  • Credit Risk - 5 years
  • Amazon RDS - 4 years
  • AWS Glue - 4 years
  • Amazon DynamoDB - 4 years
  • AWS Lambda - 4 years

Availability

Full-time

Preferred Environment

SQL, Python, PySpark, Azure Databricks, Big Data, Machine Learning, Machine Learning Operations (MLOps), Data Science, Data Engineering, Amazon Web Services (AWS)

The most amazing...

...thing I've designed and implemented is a real-time recommender system for videos using NLP and computer vision for the mobile app called NOW Local News.

Work Experience

Databricks Migration Specialist

2024 - 2025
Oliver Wyman - Data Science
  • Conducted discovery and analysis of the Azure data pipeline and machine learning models to be migrated to GCP.
  • Converted and deployed the daily data pipeline on GCP using Spark jobs on Databricks.
  • Migrated and deployed machine learning models on GCP using MLflow on Databricks.
  • Validated data migration by comparing results between GCP and Azure to ensure consistency and accuracy.
  • Created comprehensive Confluence documentation detailing the new cloud data pipeline architecture.
Technologies: Python, Data Modeling, Data Science, Business Analysis, Databricks, PySpark, Financial Services, Data, Debugging, BigQuery, Google BigQuery, Azure Databricks, Delta Lake, Data Migration, Delta Live Tables (DLT), YAML, Unity Catalog, API Integration, Data Cleansing, ETL Tools, Bitbucket
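
The validation step above can be sketched as comparing aggregate summaries of the same table computed on each platform. This is a hypothetical illustration in plain Python (in production, the summaries would come from Spark queries on Azure Databricks and on Databricks on GCP; the column names here are made up):

```python
# Hypothetical sketch: validate a migrated table by comparing simple
# aggregates (row count plus per-column sums) computed on each platform.

def summarize(rows, numeric_cols):
    """Build a comparable summary: row count plus a sum per numeric column."""
    summary = {"row_count": len(rows)}
    for col in numeric_cols:
        summary[f"sum_{col}"] = round(sum(r[col] for r in rows), 6)
    return summary

def diff_summaries(source, target):
    """Return the metrics that disagree between the two environments."""
    return {k: (source[k], target.get(k)) for k in source if source[k] != target.get(k)}

azure_rows = [{"amount": 10.5}, {"amount": 4.5}]
gcp_rows = [{"amount": 10.5}, {"amount": 4.5}]
mismatches = diff_summaries(summarize(azure_rows, ["amount"]),
                            summarize(gcp_rows, ["amount"]))
```

Comparing summaries rather than full row sets keeps the check cheap on billion-row tables; full row-level diffs can then be reserved for tables whose summaries disagree.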

Expert Data Engineer

2024 - 2025
Oliver Wyman (via Toptal)
  • Created a daily data pipeline using Azure Databricks, Databricks jobs, PySpark, and Delta Lake to generate input datasets for the daily projection model.
  • Migrated the daily data pipeline from Azure Databricks to Databricks on Google Cloud.
  • Optimized the daily data pipeline to reduce the total execution time.
Technologies: Data Science, Data Modeling, Business Analysis, PySpark, Financial Services, Data Engineering, ETL, Data Pipelines, Databricks, Azure Databricks, Google Cloud Platform (GCP), Azure, Python, SQL, Delta Lake, Spark, Data Migration, Microsoft Excel, Azure Data Lake, APIs, Data Lake Design, Big Data, Pandas, Query Optimization, Serverless, Git, NumPy, SciPy, Data, Debugging, Delta Live Tables (DLT), YAML, Unity Catalog, Functional Programming, Data Cleansing, ETL Tools, Bitbucket

ClickHouse Migration Specialist

2024 - 2024
InHire
  • Analyzed and understood complex SQL queries built on the Rockset database.
  • Converted and optimized Rockset SQL queries to run efficiently on ClickHouse.
  • Updated hundreds of dashboards in Explo to integrate with the new ClickHouse database.
  • Validated data migration by comparing results between ClickHouse and Rockset to ensure accuracy and consistency.
Technologies: SaaS, ClickHouse, SQL, Data Visualization, Streaming Data, Docker, Data, Infrastructure as Code (IaC), Debugging, Reporting, Retool, Data Migration, Reports, Data Cleansing, ETL Tools, GitHub

Consultant | Data Pipelines & BI

2024 - 2024
Profitable Media
  • Designed the data architecture to read real-time data from MariaDB using change data capture (CDC) and store it in ClickHouse.
  • Implemented and deployed a real-time data pipeline using Fivetran to capture data from MariaDB and ingest it into ClickHouse.
  • Created materialized views in ClickHouse using SQL to improve data visualization performance.
  • Analyzed business requirements and defined key metrics to be tracked on dashboards.
  • Built interactive dashboards in Metabase Cloud for data visualization.
  • Documented all dashboard calculations with clear explanations for transparency and future reference.
Technologies: ClickHouse, Fivetran, Metabase, SQL, Data Visualization, Data Pipelines, Realtime, CDC, DevOps, Data Engineering, Data Architecture, Streaming Data, Documentation, Data, Infrastructure as Code (IaC), Debugging, Reporting, Troubleshooting, Data Migration, Data Structures, Reports, Data Integration, Database Management, Data Cleansing, ETL Tools, GitHub
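
The CDC semantics behind the pipeline above can be sketched as applying an ordered stream of insert/update/delete events from a source database (MariaDB in this project) to a target store (ClickHouse there; a plain dict here). Fivetran handled this in production; this only illustrates the event model, with hypothetical field names:

```python
# Hedged sketch of change data capture (CDC): replay ordered change
# events against a target keyed store to keep it in sync with the source.

def apply_cdc_events(target, events):
    """Apply CDC events in order; each event has an op, a key, and a row."""
    for event in events:
        if event["op"] in ("insert", "update"):
            target[event["key"]] = event["row"]
        elif event["op"] == "delete":
            target.pop(event["key"], None)
    return target

state = apply_cdc_events({}, [
    {"op": "insert", "key": 1, "row": {"status": "new"}},
    {"op": "update", "key": 1, "row": {"status": "paid"}},
    {"op": "insert", "key": 2, "row": {"status": "new"}},
    {"op": "delete", "key": 2, "row": None},
])
```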

Expert Data Scientist (via Toptal)

2024 - 2024
Oliver Wyman
  • Conducted studies using historical data to identify and define the segmentation of groups with similar chargeback behavior.
  • Developed a churn model to predict the churn rate for payment method usage among customers with large accounts.
  • Optimized cluster utilization when running Apache Spark jobs.
  • Deployed PySpark jobs into production using EMR and Control-M.
  • Collaborated on setting up an Amazon EMR cluster to run jobs on very large datasets.
Technologies: Python, Data Analysis, Microsoft Power BI, PySpark, Dashboards, Amazon Web Services (AWS), Models, Data Engineering, Spark, EMR, Amazon EMR Studio, Amazon Elastic MapReduce (EMR), ETL, Control-M, Data Lakes, GitLab, GitLab CI/CD, AWS DevOps, DevOps, Big Data, Modeling, Data Science, Data Migration, Data Visualization, Data Analytics, Microsoft Excel, Predictive Analytics, Data Warehousing, Data Lake Design, Data Architecture, Big Data Architecture, Distributed Systems, Pandas, Data Pipelines, Amazon Athena, Query Optimization, Serverless, Git, NumPy, SciPy, Data, Debugging, Unity Catalog, Reports, Apache, Data Cleansing, ETL Tools
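
The segmentation study above can be illustrated, in heavily simplified form, by bucketing accounts on their historical chargeback rates at chosen thresholds. This is not the project's actual method (which ran in PySpark on EMR over far larger data); the thresholds and account IDs are made up:

```python
# Illustrative sketch: segment accounts into groups of similar
# chargeback behavior by thresholding their historical chargeback rates.

def segment_by_chargeback_rate(accounts, low=0.01, high=0.05):
    """Assign each account to a 'low', 'medium', or 'high' risk segment."""
    segments = {"low": [], "medium": [], "high": []}
    for account_id, rate in accounts.items():
        if rate < low:
            segments["low"].append(account_id)
        elif rate < high:
            segments["medium"].append(account_id)
        else:
            segments["high"].append(account_id)
    return segments

segments = segment_by_chargeback_rate({"a": 0.002, "b": 0.03, "c": 0.12})
```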

Senior Data Scientist | Senior Data Engineer

2022 - 2024
Clevertech
  • Created and implemented a real-time video recommender system using natural language processing (NLP) and computer vision for the NOW Local News app.
  • Designed and implemented the analytics infrastructure and dashboards to track the app's performance and the recommender system.
  • Developed an AI algorithm to identify the team and league for sports videos.
  • Constructed data pipelines using Azure Data Factory and Azure Databricks.
  • Implemented a named-entity recognition (NER) model utilizing pretrained transformer models such as BERT, RoBERTa, and T5.
Technologies: Amazon RDS, AWS Lambda, AWS Glue, Amazon Athena, Streaming, Apache Superset, Google Analytics, TensorFlow, PyTorch, Amazon SageMaker, PySpark, Hugging Face, Amazon Transcribe, Amazon Comprehend, AWS Cloud Development Kit (CDK), Terraform, Lambda Architecture, Amazon DynamoDB, Redis Cache, Amazon Neptune, Amazon OpenSearch, A/B Testing, Azure Databricks, Azure Data Factory (ADF), JavaScript, TypeScript, Node.js, AWS Step Functions, Business Analysis, Data Modeling, Data Science, Azure, ETL, Computer Vision, Artificial Intelligence (AI), Deep Learning, Python, Databricks, Data Migration, Data Visualization, Data Analytics, Microsoft Excel, Azure Data Lake, Data Governance, Predictive Analytics, Prompt Engineering, OpenAI, Dashboards, APIs, Data Warehousing, PostgreSQL, Data Lake Design, Generative Artificial Intelligence (GenAI), Scikit-learn, Spark ML, Data Architecture, Big Data, Data Lakehouse, Microsoft Copilot, Distributed Systems, Pandas, Web Scraping, CSV Export, Data Pipelines, API Databases, DevOps, REST APIs, GraphQL, Query Optimization, Amazon Web Services (AWS), Serverless, BERT, Text-to-text Transfer Transformer (T5), RoBERTa, Transformers, IT Security, Back-end, Flask, Amazon EC2, Git, NumPy, SciPy, Docker, AWS Lake Formation, Data, Amazon Elastic Container Registry (ECR), Infrastructure as Code (IaC), Debugging, Reporting, Troubleshooting, Looker, Delta Lake, Azure Storage, YAML, Kubernetes, Data Structures, API Integration, Reports, Functional Programming, FastAPI, ChatGPT, Data Integration, Database Management, Apache, Apache Kafka, Data Cleansing, ETL Tools, GitHub
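
The team-and-league detection above can be sketched, at a toy level, as matching entity mentions in a video transcript against a gazetteer of known teams. The production system used pretrained transformer NER models (BERT, RoBERTa, T5) via Hugging Face rather than this lookup, and the gazetteer entries below are hypothetical samples:

```python
# Toy sketch of the entity-matching idea: scan a transcript for known
# team names and return the leagues they imply.

TEAM_GAZETTEER = {  # hypothetical sample entries
    "real madrid": "La Liga",
    "arsenal": "Premier League",
    "flamengo": "Serie A (Brazil)",
}

def detect_teams(transcript):
    """Return (team, league) pairs whose names appear in the transcript."""
    text = transcript.lower()
    return [(team, league) for team, league in TEAM_GAZETTEER.items()
            if team in text]

hits = detect_teams("Highlights: Arsenal beat the visitors 2-0 at home.")
```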

Senior Data Scientist

2021 - 2022
Intera
  • Designed and developed the lakehouse architecture for collecting data, streaming it from DynamoDB with Kinesis and Kafka, and storing it in Amazon Redshift and S3.
  • Transformed and cleaned the streaming data from Redshift and S3 using Python, PySpark, and AWS Glue jobs.
  • Delivered real-time indicators with Athena and Metabase.
  • Deployed machine learning models using Python, Lambda, and API Gateway, delivering real-time predictions on streaming from the company's web application.
Technologies: Redshift, Amazon S3 (AWS S3), Amazon DynamoDB, Kafka Streams, Python, PySpark, AWS Glue, Amazon Athena, Metabase, Microsoft Power BI, Natural Language Processing (NLP), Machine Learning, Amazon API Gateway, Amazon SageMaker, AWS Lambda, Amazon Elastic MapReduce (EMR), JavaScript, TypeScript, Node.js, Business Analysis, Data Modeling, Data Science, ETL, Artificial Intelligence (AI), Spark, Data Migration, Data Visualization, Data Analytics, Mixpanel, Microsoft Excel, Data Governance, Predictive Analytics, Dashboards, APIs, Data Warehousing, PostgreSQL, Data Lake Design, Deep Learning, Scikit-learn, Data Architecture, Big Data, Data Lakehouse, Architecture, DAX, Distributed Systems, Pandas, Data Pipelines, API Databases, DevOps, REST APIs, Query Optimization, Amazon Web Services (AWS), Serverless, MariaDB, IT Security, Back-end, Technical Leadership, Product Management, Flask, Amazon EC2, Git, NumPy, SciPy, Docker, AWS Lake Formation, Data, Amazon Elastic Container Registry (ECR), Amazon CloudWatch, Infrastructure as Code (IaC), Debugging, Reporting, Troubleshooting, YAML, Data Structures, Data Scraping, API Integration, Reports, Data Integration, Database Management, Zapier, Apache, Apache Kafka, Data Cleansing, ETL Tools, GitHub
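
The real-time prediction deployment above can be sketched as a Lambda-style handler behind API Gateway: parse the JSON request body, score it, and return a JSON response. The scoring function and field names below are hypothetical stand-ins; the real deployment loaded a trained model:

```python
# Minimal sketch of serving a model behind AWS Lambda + API Gateway.
import json

def score(features):
    """Hypothetical linear scorer standing in for the trained model."""
    return 0.3 * features["tenure_months"] + 0.7 * features["activity"]

def handler(event, context=None):
    """Lambda-style entry point: parse the JSON body, return a prediction."""
    features = json.loads(event["body"])
    return {"statusCode": 200,
            "body": json.dumps({"prediction": score(features)})}

response = handler({"body": json.dumps({"tenure_months": 10, "activity": 1.0})})
```

Keeping the scorer a pure function of the parsed features makes it easy to unit test outside the Lambda runtime.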

Senior Data Scientist

2019 - 2020
BV Financeira
  • Implemented continuous machine learning practices that reduced the lead time for developing, monitoring, and updating deployed models.
  • Performed data collection, processing, and analysis with SAS, Python, R, Databricks, Hadoop, SQL, PySpark, H2O Driverless AI, and KNIME.
  • Developed predictive models using statistical and machine learning techniques with structured and unstructured (text) data.
  • Designed experiments to answer causal inference questions.
Technologies: Machine Learning, SAS, Python, R, H2O AutoML, Azure Databricks, Hadoop, SQL, PySpark, KNIME, A/B Testing, Data Engineering, Data Scientist, Financial Services, Business Analysis, Data Modeling, Data Science, ETL, Artificial Intelligence (AI), Spark, Databricks, Data Visualization, Business Intelligence (BI), Data Analytics, Microsoft Excel, Azure, Predictive Analytics, Dashboards, Google Cloud Platform (GCP), Scikit-learn, Spark ML, Data Stewardship, Pandas, Git, NumPy, SciPy, Data, Amazon CloudWatch, Debugging, Reporting, Delta Live Tables (DLT), Unity Catalog, Data Scraping, Reports, Apache, Data Cleansing, ETL Tools, Bitbucket

Data Engineer

2017 - 2019
Banco Itaú
  • Generated daily KPIs of CRM results using SAS, Python, Hadoop, Splunk, Alteryx, and Adobe tools.
  • Developed a data mart of all customer interactions with the bank by collecting streaming data with Splunk Stream, enabling better-targeted CRM communication.
  • Created dashboards using Tableau with metrics for mobile app tracking.
Technologies: SAS, Python, Hadoop, Splunk, Alteryx, Adobe Analytics, SQL, Tableau, Automation, Financial Services, Data Modeling, ETL, Microsoft Excel, Trading, Scikit-learn, Query Optimization, Bash, Payment APIs, NumPy, Data, Debugging, Reports, Data Cleansing, ETL Tools

Credit Risk Data Analyst

2012 - 2017
Volkswagen Financial Services Brasil
  • Calculated loan loss provisions as per Basel II and IFRS 9, based on parameters—such as exposure at default (EAD), probability of default (PD), and loss given default (LGD)—estimated by models developed with SAS decision tree and logistic regression.
  • Implemented the retail portfolio's EAD, PD, LGD, and provision forecasting process. The entire process was also developed in SAS Enterprise Guide and mainly based on linear regression models.
  • Created presentations, ad-hoc reports, and provisioning stress tests using SAS Enterprise Guide.
Technologies: Credit Risk, Credit Scores, Credit Ratings, A/B Testing, Basel III, IFRS 9, IFRS Financial Reporting, Finance, Tax Accounting, SAS, SQL, Data Engineering, Portfolio Analysis, Risk Modeling, Automation, Financial Services, Business Analysis, Data Modeling, ETL, Data Visualization, Business Intelligence (BI), Data Analytics, Microsoft Excel, Spanish, Data, Debugging, Reporting, Reports, Data Cleansing, ETL Tools
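
The provisioning calculation above rests on the Basel-style expected-loss formula EL = PD × LGD × EAD, summed over exposures. A worked sketch with illustrative figures (not from the actual portfolio):

```python
# Worked sketch of Basel-style expected loss: EL = PD * LGD * EAD,
# summed over the exposures in a portfolio.

def expected_loss(exposures):
    """Sum PD * LGD * EAD over a list of (pd, lgd, ead) exposures."""
    return sum(pd * lgd * ead for pd, lgd, ead in exposures)

portfolio = [
    (0.02, 0.45, 10_000.0),  # probability of default, loss given default, exposure at default
    (0.10, 0.60, 5_000.0),
]
provision = expected_loss(portfolio)  # 90 + 300 = 390
```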

Projects

NOW Local Breaking News App

https://apps.apple.com/us/app/now-local-news/id6443724414/
An app for local news by local people. I was the data engineer and scientist who designed and implemented the app's real-time recommender system for videos using natural language processing (NLP) and computer vision. I also created and implemented the analytics infrastructure and dashboards to track the recommender system and the app's performance.

InHire ATS

https://www.inhire.com.br/
An applicant tracking system (ATS) that offers solutions for all stages of the recruitment process. I was the data architect who designed and developed the data visualization architecture for this SaaS product.

Financial Reserve Model

We developed and implemented a financial reserve calculation model for a credit card payment company to offset losses from chargebacks. This project involved handling massive databases with billions of records, requiring extensive expertise in Spark and SQL.

ML Algorithm for Stadium App

https://play.google.com/store/apps/details?id=com.stadium&hl=en_US&gl=US
Developed an algorithm that identifies the team and league associated with a video with 93% accuracy by analyzing the transcribed text and extracted images. I used Amazon Transcribe to extract the transcript from the video and a Hugging Face model to identify named entities. Additionally, I utilized a pre-trained deep learning model to identify colors in the video and compare them with the colors of team uniforms in the league.
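
The color-comparison step above can be sketched as matching a dominant color extracted from video frames to the nearest team uniform color by squared Euclidean distance in RGB space. The palette below is a hypothetical sample, and the real system combined this signal with NER over the transcript:

```python
# Hedged sketch: nearest-neighbor match of a dominant frame color to
# team uniform colors, by squared Euclidean distance in RGB space.

UNIFORM_COLORS = {  # hypothetical (team, primary RGB) entries
    "Team Red": (200, 30, 40),
    "Team Blue": (20, 40, 180),
}

def nearest_team(dominant_rgb):
    """Return the team whose uniform color is closest to the dominant color."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(UNIFORM_COLORS, key=lambda t: dist2(UNIFORM_COLORS[t], dominant_rgb))

match = nearest_team((190, 25, 50))
```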

Automated Crypto Trading Bots

I developed and deployed automated trading bots to execute cryptocurrency trades on Coinbase and Binance, integrating Python-based API calls to manage orders and retrieve real-time market data. By designing and testing day trading strategies, I leveraged technical indicators and market trends to optimize trade execution. Throughout this project, I gained deep insights into the cryptocurrency market, algorithmic trading, and day trading principles, strengthening my expertise in financial data processing and trading automation.
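
One of the technical-indicator signals such a bot could use is a simple moving-average crossover on closing prices. The window sizes and prices below are made up for illustration; the real bots called the Coinbase and Binance APIs for live data and order management:

```python
# Illustrative sketch of a moving-average crossover day-trading signal.

def sma(prices, window):
    """Simple moving average of the last `window` prices."""
    return sum(prices[-window:]) / window

def crossover_signal(prices, fast=3, slow=5):
    """'buy' when the fast SMA is above the slow SMA, 'sell' when below."""
    if len(prices) < slow:
        return "hold"
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    if fast_ma > slow_ma:
        return "buy"
    if fast_ma < slow_ma:
        return "sell"
    return "hold"

signal = crossover_signal([100, 101, 102, 105, 108])
```

A real bot would add order sizing, fees, and slippage handling on top of the raw signal before placing any trade.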

Education

2016 - 2020

Professional Technical Course in Theater: Performing Arts Interpretation

Teatro Escola Macunaíma - São Paulo, Brazil

2015 - 2015

Specialization in Statistics Topics: Statistics and Probability

Universidade de São Paulo - São Paulo, Brazil

2013 - 2014

Master's Degree in Data Analysis and Data Mining

Fundação Instituto de Administração - São Paulo, Brazil

2008 - 2011

Bachelor's Degree in Computer Science

Faculdade de Tecnologia do Estado de São Paulo - São Paulo, Brazil

Certifications

NOVEMBER 2014 - PRESENT

SAS Certified Base Programmer for SAS 9

SAS

Skills

Libraries/APIs

PySpark, Scikit-learn, Pandas, REST APIs, NumPy, TensorFlow, PyTorch, Spark ML, SciPy, Node.js, Hugging Face Transformers, Python API

Tools

AWS Glue, AWS Cloud Development Kit (CDK), H2O AutoML, Git, Microsoft Excel, Apache, Bitbucket, GitHub, Amazon Athena, Amazon SageMaker, Amazon Transcribe, Terraform, Microsoft Power BI, Amazon Elastic MapReduce (EMR), Splunk, Redash, AWS Step Functions, Amazon Elastic Container Registry (ECR), Amazon CloudWatch, Looker, ChatGPT, IBM SPSS, Google Analytics, Amazon OpenSearch, Kafka Streams, Adobe Analytics, Tableau, GitLab, GitLab CI/CD, Control-M, Confluence, Named-entity Recognition (NER), Microsoft Copilot, BigQuery, Retool, Zapier

Languages

SQL, Python, SAS, R, YAML, JavaScript, TypeScript, GraphQL, Bash, Java

Frameworks

Spark, Hadoop, Data Lakehouse, Flask, Delta Live Tables (DLT), Realtime

Paradigms

Business Intelligence (BI), ETL, Lambda Architecture, Automation, DevOps, Functional Programming, REST

Platforms

AWS Lambda, Amazon Web Services (AWS), Databricks, Docker, KNIME, Azure, Amazon EC2, Apache Kafka, Alteryx, Google Cloud Platform (GCP), Mixpanel, Kubernetes

Storage

Amazon DynamoDB, Databases, Amazon S3 (AWS S3), Data Pipelines, PostgreSQL, Data Lake Design, Redis Cache, API Databases, MariaDB, Data Integration, Database Management, Redshift, Data Lakes, ClickHouse, Azure Storage

Other

Azure Databricks, Amazon RDS, Data Analysis, Statistical Modeling, Finance, Machine Learning, Data Science, Data Engineering, Data Visualization, Programming, Statistics, Clustering, Credit Scores, A/B Testing, Presentations, Public Speaking, Streaming, Apache Superset, Metabase, Data Scientist, Credit Risk, Risk Modeling, Dashboards, Big Data, Business Analysis, Data Modeling, Artificial Intelligence (AI), Data Migration, Data Analytics, Data Warehousing, Query Optimization, Serverless, Data, Infrastructure as Code (IaC), Debugging, Reporting, Data Structures, API Integration, Reports, Data Cleansing, ETL Tools, EMR, Economics, Data Warehouse Design, Software Engineering, Neural Networks, Acting, Voice Acting, Teamwork, Hugging Face, Amazon Comprehend, Azure Data Factory (ADF), Natural Language Processing (NLP), Amazon API Gateway, Credit Ratings, Basel III, IFRS 9, IFRS Financial Reporting, Portfolio Analysis, Rockset, Financial Services, Delta Lake, Computer Vision, Deep Learning, Azure Data Lake, Data Governance, Predictive Analytics, Prompt Engineering, APIs, Data Architecture, Architecture, Distributed Systems, IT Security, Back-end, Spanish, AWS Lake Formation, IT Support, Troubleshooting, Unity Catalog, Data Scraping, FastAPI, Probability Notions and Stochastic Processes, Web Marketing, Sampling, Regression, Information Security, Networks, Time Series Analysis, Genetic Algorithms, Amazon Neptune, Tax Accounting, Models, Modeling, CI/CD Pipelines, Documentation, Business Rules, Production, Amazon EMR Studio, AWS DevOps, Machine Learning Operations (MLOps), Communication, Creativity, OpenAI, Trading, Generative Artificial Intelligence (GenAI), Big Data Architecture, Data Stewardship, DAX, Web Scraping, CSV Export, BERT, Text-to-text Transfer Transformer (T5), RoBERTa, Transformers, Technical Leadership, Product Management, Payment APIs, Fivetran, CDC, Streaming Data, SaaS, Trading Bots, Crypto, Bitcoin, BitMEX, Google BigQuery
