
Maxuel Reis
Verified Expert in Engineering
Data Scientist/Engineer and Developer
São Paulo - State of São Paulo, Brazil
Toptal member since January 19, 2024
Maxuel is a data architect and DevOps engineer who designs analytical data platforms on PostgreSQL/Aurora, builds multi-account AWS landing zones with Terraform, and ships GitHub Actions CI/CD for infrastructure and databases. He also brings deep data science and engineering experience across AWS, GCP, and Azure, with Spark, Hive, and batch and real-time ML. Maxuel's acting background sharpens his communication and teamwork.
Experience
- SQL - 14 years
- Python - 7 years
- PySpark - 6 years
- Credit Risk - 5 years
- Amazon RDS - 4 years
- AWS Glue - 4 years
- Amazon DynamoDB - 4 years
- AWS Lambda - 4 years
Preferred Environment
SQL, Python, PySpark, Azure Databricks, Big Data, Machine Learning, Machine Learning Operations (MLOps), Data Science, Data Engineering, Amazon Web Services (AWS)
The most amazing...
...thing I've designed and implemented is a real-time recommender system for videos using NLP and computer vision for the mobile app called NOW Local News.
Work Experience
Data Architect
Sema Technologies, Inc
- Designed the Code Scan analytical data platform on PostgreSQL (Aurora RDS), with a two-tier model: an Ingestion database for raw repository, commit, blame, CVE, and Salesforce data, and an Analysis database for aggregated risk metrics.
- Authored hundreds of SQL migrations (deploy, verify, revert) using Sqitch in the database repository, including stored procedures for CVE risk quartiles, file-summary percentiles, security-warning scoring, and code-duplication metrics.
- Built end-to-end CI/CD for database changes in GitHub Actions: schema test workflows (deploy, verify, revert, re-deploy), automated apply/rollback per environment (dev, stage, prod), one-time data migrations, and RDS-snapshot-aware deploys.
- Owned the Terraform repository, building a multi-account AWS landing zone with organizations, OUs, and SCPs, plus modular IaC (modules + live/dev/stage/prod) for VPC, subnets, NAT, route tables, security groups, NACLs, endpoints, and flow logs.
- Provisioned core AWS data and compute infrastructure as code: Aurora PostgreSQL, ElastiCache, ECR, ECS clusters and services, Lambda, ALBs, Auto Scaling Groups, Cloud Map, Secrets Manager, SSM, ACM, DynamoDB state locking, and S3 backends.
- Implemented secure access and observability: AWS Client VPN with TLS cert generation, IAM roles with GitHub OIDC for keyless CI deploys, SAML SSO, CloudWatch dashboards/alarms/logs, SNS data-protection policies, and Chatbot to Slack alerting.
- Built GitHub Actions CI/CD for Terraform (PR fmt/validate/plan, gated apply per env, release-chain sync, automated dev to stage to prod promotion PRs), plus a self-hosted runner module and OpenSSL RSA-OAEP encrypted PAT promotion flow.
- Contributed across the Code Scan service mesh: Python ETL for commits (SQLAlchemy + scikit-learn/CatBoost), CVE ingestion (packageurl, CPE, OWASP Dependency-Check), Salesforce gRPC/Avro integration, and Retool KPI apps and workflows.
Databricks Migration Specialist
Oliver Wyman - Data Science
- Conducted discovery and analysis of the Azure data pipeline and machine learning models to be migrated to GCP.
- Converted and deployed the daily data pipeline on GCP using Spark jobs on Databricks.
- Migrated and deployed machine learning models on GCP using MLflow on Databricks.
- Validated data migration by comparing results between GCP and Azure to ensure consistency and accuracy.
- Created comprehensive Confluence documentation detailing the new cloud data pipeline architecture.
Expert Data Engineer
Oliver Wyman (via Toptal)
- Created a daily data pipeline using Azure Databricks, Databricks jobs, PySpark, and Delta Lake to generate input datasets for the daily projection model.
- Migrated the daily data pipeline from Azure Databricks to Databricks on Google Cloud.
- Optimized the daily data pipeline to reduce the total execution time.
ClickHouse Migration Specialist
InHire
- Analyzed complex SQL queries built on the Rockset database.
- Converted and optimized Rockset SQL queries to run efficiently on ClickHouse.
- Updated hundreds of dashboards in Explo to integrate with the new ClickHouse database.
- Validated data migration by comparing results between ClickHouse and Rockset to ensure accuracy and consistency.
Consultant | Data Pipelines & BI
Profitable Media
- Designed the data architecture to read real-time data from MariaDB using change data capture (CDC) and store it in ClickHouse.
- Implemented and deployed a real-time data pipeline using Fivetran to capture data from MariaDB and ingest it into ClickHouse.
- Created materialized views in ClickHouse using SQL to improve data visualization performance.
- Analyzed business requirements and defined key metrics to be tracked on dashboards.
- Built interactive dashboards in Metabase Cloud for data visualization.
- Documented all dashboard calculations with clear explanations for transparency and future reference.
Expert Data Scientist (via Toptal)
Oliver Wyman
- Conducted studies using historical data to identify and define the segmentation of groups with similar chargeback behavior.
- Developed a churn model to predict the churn rate for payment method usage among customers with large accounts.
- Optimized cluster utilization when running Apache Spark jobs.
- Deployed PySpark jobs into production using EMR and Control-M.
- Collaborated on setting up an Amazon EMR cluster to run jobs on very large datasets.
Senior Data Scientist | Senior Data Engineer
Clevertech
- Created and implemented a real-time video recommender system using natural language processing (NLP) and computer vision for the NOW Local News app.
- Designed and implemented the analytics infrastructure and dashboards to track the app's performance and the recommender system.
- Developed an AI algorithm to identify the team and league for sports videos.
- Constructed data pipelines using Azure Data Factory and Azure Databricks.
- Implemented a named-entity recognition (NER) model utilizing pretrained transformer models such as BERT, RoBERTa, and T5.
Senior Data Scientist
Intera
- Designed and developed the lakehouse architecture for collecting data, streaming it from DynamoDB with Kinesis and Kafka, and storing it in Amazon Redshift and S3.
- Transformed and cleaned the streaming data from Redshift and S3 using Python, PySpark, and AWS Glue jobs.
- Delivered real-time indicators with Athena and Metabase.
- Deployed machine learning models using Python, Lambda, and API Gateway, delivering real-time predictions on data streamed from the company's web application.
Senior Data Scientist
BV Financeira
- Implemented innovations to reduce lead time in developing, monitoring, and updating the models deployed with continuous machine learning.
- Performed data collection, processing, and analysis with SAS, Python, R, Databricks, Hadoop, SQL, PySpark, H2O Driverless AI, and KNIME.
- Developed predictive models using statistical and machine learning techniques with structured and unstructured (text) data.
- Designed experiments to answer causal inference questions.
Data Engineer
Banco Itaú
- Generated daily KPIs of CRM results using SAS, Python, Hadoop, Splunk, Alteryx, and Adobe tools.
- Developed a data mart covering all customer interactions with the bank, collecting streaming data with Splunk Stream to support the best CRM communication.
- Created dashboards using Tableau with metrics for mobile app tracking.
Credit Risk Data Analyst
Volkswagen Financial Services Brasil
- Calculated loan loss provisions per Basel II and IFRS 9 from parameters such as exposure at default (EAD), probability of default (PD), and loss given default (LGD), estimated by decision tree and logistic regression models developed in SAS.
- Implemented the retail portfolio's EAD, PD, LGD, and provision forecasting process, developed entirely in SAS Enterprise Guide and based mainly on linear regression models.
- Created presentations, ad hoc reports, and provisioning stress tests using SAS Enterprise Guide.
Experience
NOW Local Breaking News App
https://apps.apple.com/us/app/now-local-news/id6443724414/
InHire ATS
https://www.inhire.com.br/
Financial Reserve Model
ML Algorithm for Stadium App
Automated Crypto Trading Bots
Education
Professional Technical Course in Theater: Performing Arts Interpretation
Teatro Escola Macunaíma - São Paulo, Brazil
Specialization in Statistics Topics: Statistics and Probability
Universidade de São Paulo - São Paulo, Brazil
Master's Degree in Data Analysis and Data Mining
Fundação Instituto de Administração - São Paulo, Brazil
Bachelor's Degree in Computer Science
Faculdade de Tecnologia do Estado de São Paulo - São Paulo, Brazil
Certifications
SAS Certified Base Programmer for SAS 9
SAS
Skills
Libraries/APIs
PySpark, Scikit-learn, Pandas, REST APIs, NumPy, TensorFlow, PyTorch, Spark ML, SciPy, Node.js, Hugging Face Transformers, Python API, SQLAlchemy, Pydantic, Joblib, OpenSSL
Tools
AWS Glue, AWS Cloud Development Kit (CDK), H2O AutoML, Git, Microsoft Excel, Apache, Bitbucket, GitHub, Amazon Athena, Amazon SageMaker, Amazon Transcribe, Terraform, Microsoft Power BI, Amazon Elastic MapReduce (EMR), Splunk, Redash, AWS Step Functions, Amazon Elastic Container Registry (ECR), Amazon CloudWatch, Looker, ChatGPT, IBM SPSS, Google Analytics, Amazon OpenSearch, Kafka Streams, Adobe Analytics, Tableau, GitLab, GitLab CI/CD, Control-M, Confluence, Named-entity Recognition (NER), Microsoft Copilot, BigQuery, Retool, Zapier, Sqitch, Amazon Elastic Container Service (ECS), AWS CodeBuild, Amazon ElastiCache, Docker Compose, Pytest, Logging
Languages
SQL, Python, SAS, R, YAML, JavaScript, TypeScript, GraphQL, Bash, Java, Bash Script
Frameworks
Spark, Hadoop, Data Lakehouse, Flask, Delta Live Tables (DLT)
Paradigms
Business Intelligence (BI), ETL, Lambda Architecture, Automation, DevOps, Functional Programming, REST
Platforms
AWS Lambda, Amazon Web Services (AWS), Databricks, Docker, KNIME, Azure, Amazon EC2, Apache Kafka, Alteryx, Google Cloud Platform (GCP), Mixpanel, Kubernetes, AWS ALB, Salesforce
Storage
Amazon DynamoDB, Databases, Amazon S3 (AWS S3), Data Pipelines, Data Lakes, PostgreSQL, Data Lake Design, Redis Cache, API Databases, MariaDB, Data Integration, Database Management, Redshift, ClickHouse, Azure Storage, Database Modeling, SQL Stored Procedures
Other
Azure Databricks, Amazon RDS, Data Analysis, Statistical Modeling, Finance, Machine Learning, Data Science, Data Engineering, Data Visualization, Programming, Statistics, Clustering, Credit Scores, A/B Testing, Presentations, Public Speaking, Streaming, Apache Superset, Metabase, Data Scientist, Credit Risk, Risk Modeling, Dashboards, Big Data, Business Analysis, Data Modeling, Artificial Intelligence (AI), Data Migration, Data Analytics, Data Warehousing, Data Architecture, Query Optimization, Serverless, Data, Infrastructure as Code (IaC), Debugging, Reporting, Data Structures, API Integration, Reports, Data Cleansing, ETL Tools, Data Strategy, Data Processing, EMR, Economics, Data Warehouse Design, Software Engineering, Neural Networks, Acting, Voice Acting, Teamwork, Hugging Face, Amazon Comprehend, Azure Data Factory (ADF), Natural Language Processing (NLP), Amazon API Gateway, Credit Ratings, Basel III, IFRS 9, IFRS Financial Reporting, Portfolio Analysis, Rockset, Financial Services, Delta Lake, Computer Vision, Deep Learning, Azure Data Lake, Data Governance, Predictive Analytics, Prompt Engineering, APIs, Architecture, Distributed Systems, IT Security, Back-end, Spanish, AWS Lake Formation, IT Support, Troubleshooting, Unity Catalog, Data Scraping, FastAPI, Customer Relationship Management (CRM), Accounting Software, CRM Configuration, CRM Design, GraphDB, Probability Notions and Stochastic Processes, Web Marketing, Sampling, Regression, Information Security, Networks, Time Series Analysis, Genetic Algorithms, Amazon Neptune, Tax Accounting, Models, Modeling, CI/CD Pipelines, Documentation, Business Rules, Production, Amazon EMR Studio, AWS DevOps, Machine Learning Operations (MLOps), Communication, Creativity, OpenAI, Trading, Generative Artificial Intelligence (GenAI), Big Data Architecture, Data Stewardship, DAX, Web Scraping, CSV Export, BERT, Text-to-text Transfer Transformer (T5), RoBERTa, Transformers, Technical Leadership, Product Management, Payment APIs, Fivetran, Real-time Data, CDC, Streaming Data, SaaS, Trading Bots, Crypto, Bitcoin, BitMEX, Google BigQuery, ECS, Database Schema Design, Poetry, AWS IAM Identity Center, AWS Secrets Manager, GitHub Actions, Slackbot, OWASP, Platform Engineering, Release Management