
Maxuel Reis
Verified Expert in Engineering
Data Scientist/Engineer and Developer
São Paulo - State of São Paulo, Brazil
Toptal member since January 19, 2024
Maxuel is a data scientist and engineer with proven end-to-end data solution experience who quickly adapts to new technologies. He is skilled in batch and real-time machine learning, cloud computing (AWS, GCP, and Azure), big data (Hive and Spark), and data streams. He expertly combines data science and engineering for effective data management and business-aligned strategies. With an acting background, Maxuel has unique communication abilities that enhance teamwork and management.
Experience
- SQL - 14 years
- Python - 7 years
- PySpark - 6 years
- Credit Risk - 5 years
- Amazon RDS - 4 years
- AWS Glue - 4 years
- Amazon DynamoDB - 4 years
- AWS Lambda - 4 years
Preferred Environment
SQL, Python, PySpark, Azure Databricks, Big Data, Machine Learning, Machine Learning Operations (MLOps), Data Science, Data Engineering, Amazon Web Services (AWS)
The most amazing thing I've designed and implemented is a real-time recommender system for videos using NLP and computer vision for the mobile app called NOW Local News.
Work Experience
Databricks Migration Specialist
Oliver Wyman - Data Science
- Conducted discovery and analysis of the Azure data pipeline and machine learning models to be migrated to GCP.
- Converted and deployed the daily data pipeline on GCP using Spark jobs on Databricks.
- Migrated and deployed machine learning models on GCP using MLflow on Databricks.
- Validated data migration by comparing results between GCP and Azure to ensure consistency and accuracy.
- Created comprehensive Confluence documentation detailing the new cloud data pipeline architecture.
Expert Data Engineer
Oliver Wyman (via Toptal)
- Created a daily data pipeline using Azure Databricks, Databricks jobs, PySpark, and Delta Lake to generate input datasets for the daily projection model.
- Migrated the daily data pipeline from Azure Databricks to Databricks on Google Cloud.
- Optimized the daily data pipeline to reduce the total execution time.
ClickHouse Migration Specialist
InHire
- Analyzed and understood complex SQL queries built on the Rockset database.
- Converted and optimized Rockset SQL queries to run efficiently on ClickHouse.
- Updated hundreds of dashboards in Explo to integrate with the new ClickHouse database.
- Validated data migration by comparing results between ClickHouse and Rockset to ensure accuracy and consistency.
Consultant | Data Pipelines & BI
Profitable Media
- Designed the data architecture to read real-time data from MariaDB using change data capture (CDC) and store it in ClickHouse.
- Implemented and deployed a real-time data pipeline using Fivetran to capture data from MariaDB and ingest it into ClickHouse.
- Created materialized views in ClickHouse using SQL to improve data visualization performance.
- Analyzed business requirements and defined key metrics to be tracked on dashboards.
- Built interactive dashboards in Metabase Cloud for data visualization.
- Documented all dashboard calculations with clear explanations for transparency and future reference.
Expert Data Scientist (via Toptal)
Oliver Wyman
- Conducted studies using historical data to identify and define the segmentation of groups with similar chargeback behavior.
- Developed a churn model to predict the churn rate for payment method usage among customers with large accounts.
- Optimized cluster utilization when running Apache Spark jobs.
- Deployed PySpark jobs into production using EMR and Control-M.
- Collaborated on setting up an Amazon EMR cluster to run jobs on very large datasets.
Senior Data Scientist | Senior Data Engineer
Clevertech
- Created and implemented a real-time video recommender system using natural language processing (NLP) and computer vision for the NOW Local News app.
- Designed and implemented the analytics infrastructure and dashboards to track the app's performance and the recommender system.
- Developed an AI algorithm to identify the team and league for sports videos.
- Constructed data pipelines using Azure Data Factory and Azure Databricks.
- Implemented a named-entity recognition (NER) model utilizing pretrained transformer models such as BERT, RoBERTa, and T5.
Senior Data Scientist
Intera
- Designed and developed the lakehouse architecture for collecting data, streaming it from DynamoDB with Kinesis and Kafka, and storing it in Amazon Redshift and S3.
- Transformed and cleaned the streaming data from Redshift and S3 using Python, PySpark, and AWS Glue jobs.
- Delivered real-time indicators with Athena and Metabase.
- Deployed machine learning models using Python, Lambda, and API Gateway, delivering real-time predictions on streaming data from the company's web application.
Senior Data Scientist
BV Financeira
- Implemented innovations to reduce lead time in developing, monitoring, and updating the models deployed with continuous machine learning.
- Performed data collection, processing, and analysis with SAS, Python, R, Databricks, Hadoop, SQL, PySpark, H2O Driverless AI, and KNIME.
- Developed predictive models using statistical and machine learning techniques with structured and unstructured (text) data.
- Designed experiments to answer causal inference questions.
Data Engineer
Banco Itaú
- Generated daily KPIs of CRM results using SAS, Python, Hadoop, Splunk, Alteryx, and Adobe tools.
- Developed a data mart covering all customer interactions with the bank by collecting streaming data with Splunk Stream to improve CRM communication.
- Created dashboards using Tableau with metrics for mobile app tracking.
Credit Risk Data Analyst
Volkswagen Financial Services Brasil
- Calculated loan loss provisions per Basel II and IFRS 9 based on parameters such as exposure at default (EAD), probability of default (PD), and loss given default (LGD), estimated by models developed with SAS decision trees and logistic regression.
- Implemented the retail portfolio's EAD, PD, LGD, and provision forecasting process. The entire process was developed in SAS Enterprise Guide and based mainly on linear regression models.
- Created presentations, ad-hoc reports, and provisioning stress tests using SAS Enterprise Guide.
Experience
NOW Local Breaking News App
https://apps.apple.com/us/app/now-local-news/id6443724414/
InHire ATS
https://www.inhire.com.br/
Financial Reserve Model
ML Algorithm for Stadium App
https://play.google.com/store/apps/details?id=com.stadium&hl=en_US&gl=US
Automated Crypto Trading Bots
Education
Professional Technical Course in Theater: Performing Arts Interpretation
Teatro Escola Macunaíma - São Paulo, Brazil
Specialization in Statistics Topics: Statistics and Probability
Universidade de São Paulo - São Paulo, Brazil
Master's Degree in Data Analysis and Data Mining
Fundação Instituto de Administração - São Paulo, Brazil
Bachelor's Degree in Computer Science
Faculdade de Tecnologia do Estado de São Paulo - São Paulo, Brazil
Certifications
SAS Certified Base Programmer for SAS 9
SAS
Skills
Libraries/APIs
PySpark, Scikit-learn, Pandas, REST APIs, NumPy, TensorFlow, PyTorch, Spark ML, SciPy, Node.js, Hugging Face Transformers, Python API
Tools
AWS Glue, AWS Cloud Development Kit (CDK), H2O AutoML, Git, Microsoft Excel, Apache, Bitbucket, GitHub, Amazon Athena, Amazon SageMaker, Amazon Transcribe, Terraform, Microsoft Power BI, Amazon Elastic MapReduce (EMR), Splunk, Redash, AWS Step Functions, Amazon Elastic Container Registry (ECR), Amazon CloudWatch, Looker, ChatGPT, IBM SPSS, Google Analytics, Amazon OpenSearch, Kafka Streams, Adobe Analytics, Tableau, GitLab, GitLab CI/CD, Control-M, Confluence, Named-entity Recognition (NER), Microsoft Copilot, BigQuery, Retool, Zapier
Languages
SQL, Python, SAS, R, YAML, JavaScript, TypeScript, GraphQL, Bash, Java
Frameworks
Spark, Hadoop, Data Lakehouse, Flask, Delta Live Tables (DLT), Realtime
Paradigms
Business Intelligence (BI), ETL, Lambda Architecture, Automation, DevOps, Functional Programming, REST
Platforms
AWS Lambda, Amazon Web Services (AWS), Databricks, Docker, KNIME, Azure, Amazon EC2, Apache Kafka, Alteryx, Google Cloud Platform (GCP), Mixpanel, Kubernetes
Storage
Amazon DynamoDB, Databases, Amazon S3 (AWS S3), Data Pipelines, PostgreSQL, Data Lake Design, Redis Cache, API Databases, MariaDB, Data Integration, Database Management, Redshift, Data Lakes, ClickHouse, Azure Storage
Other
Azure Databricks, Amazon RDS, Data Analysis, Statistical Modeling, Finance, Machine Learning, Data Science, Data Engineering, Data Visualization, Programming, Statistics, Clustering, Credit Scores, A/B Testing, Presentations, Public Speaking, Streaming, Apache Superset, Metabase, Data Scientist, Credit Risk, Risk Modeling, Dashboards, Big Data, Business Analysis, Data Modeling, Artificial Intelligence (AI), Data Migration, Data Analytics, Data Warehousing, Query Optimization, Serverless, Data, Infrastructure as Code (IaC), Debugging, Reporting, Data Structures, API Integration, Reports, Data Cleansing, ETL Tools, EMR, Economics, Data Warehouse Design, Software Engineering, Neural Networks, Acting, Voice Acting, Teamwork, Hugging Face, Amazon Comprehend, Azure Data Factory (ADF), Natural Language Processing (NLP), Amazon API Gateway, Credit Ratings, Basel III, IFRS 9, IFRS Financial Reporting, Portfolio Analysis, Rockset, Financial Services, Delta Lake, Computer Vision, Deep Learning, Azure Data Lake, Data Governance, Predictive Analytics, Prompt Engineering, APIs, Data Architecture, Architecture, Distributed Systems, IT Security, Back-end, Spanish, AWS Lake Formation, IT Support, Troubleshooting, Unity Catalog, Data Scraping, FastAPI, Probability Notions and Stochastic Processes, Web Marketing, Sampling, Regression, Information Security, Networks, Time Series Analysis, Genetic Algorithms, Amazon Neptune, Tax Accounting, Models, Modeling, CI/CD Pipelines, Documentation, Business Rules, Production, Amazon EMR Studio, AWS DevOps, Machine Learning Operations (MLOps), Communication, Creativity, OpenAI, Trading, Generative Artificial Intelligence (GenAI), Big Data Architecture, Data Stewardship, DAX, Web Scraping, CSV Export, BERT, Text-to-text Transfer Transformer (T5), RoBERTa, Transformers, Technical Leadership, Product Management, Payment APIs, Fivetran, CDC, Streaming Data, SaaS, Trading Bots, Crypto, Bitcoin, BitMEX, Google BigQuery