
Arthur Flores Duarte
Verified Expert in Engineering
Software Developer
Florianópolis - State of Santa Catarina, Brazil
Toptal member since October 17, 2019
Arthur is a data/analytics engineer with 12+ years in data and 20 years in IT. He designs scalable warehouses, automates pipelines, designs data models, and turns multi-source data into clear dashboards across AWS and GCP. Skilled with dbt, Python, Prefect, Snowflake, and BigQuery, he recently cut BigQuery costs by 50% and built AI agents that detect and fix pipeline failures automatically. Arthur is committed, adaptable, and easy to work with.
Experience
- SQL - 10 years
- Data Engineering - 5 years
- Data Build Tool (dbt) - 5 years
- Data Visualization - 5 years
- Python - 4 years
- BigQuery - 3 years
- Snowflake - 3 years
- Claude Code - 1 year
Preferred Environment
SQL, Amazon Web Services (AWS), Snowflake, Python, Data Build Tool (dbt), BigQuery, Google Cloud Platform (GCP), Looker
The most amazing...
...: I implemented Cursor AI agents that detect failures in Prefect and dbt pipelines and use Claude to diagnose and fix them, open a PR, and send notifications in Slack.
Work Experience
Senior Data Engineer | AI Engineer
Journey Rewards
- Orchestrated data pipelines with Prefect and Python, ingesting data from multiple APIs into the BigQuery data warehouse on GCP.
- Designed dbt models and data tests (SQL, Jinja) powering incremental pipelines, back-end apps, and reporting marts.
- Reduced BigQuery costs by 50% by applying clustering, partitioning, and incremental materialization strategies.
- Built AI agents that detect failures in Prefect and dbt pipelines, using Claude to diagnose and fix them automatically, opening GitHub PRs, and posting the cause, fix, and next steps to Slack.
- Used Claude Code as an AI development assistant in VS Code, boosting delivery speed and code quality.
- Delivered self-serve KPI dashboards in Hex for non-technical stakeholders of a hospitality startup.
- Maintained CI/CD workflows using GitHub and AI agents for code review.
Data Scientist
Martie
- Defined and implemented dashboards tracking key business metrics to monitor daily performance.
- Built multiple dashboards in Supernova using SQL, translating user requirements into actionable insights.
- Developed inventory and revenue forecasts using Prophet (a Python library) in a Deepnote notebook, automating data storage in Snowflake.
- Integrated multiple APIs into Snowflake through Python for seamless data ingestion.
Data Analytics Engineer
Parade
- Developed and maintained Looker dashboards, models, and schedules.
- Built and maintained dbt data models using dbt Cloud.
- Built Looker automation, integrating data with Airtable and Slack.
- Troubleshot Stitch and Fivetran data ingestion from various sources: Facebook Ads, Google Ads, TikTok Ads, Shopify, Klaviyo, Airtable, PostgreSQL, and Google Analytics.
- Managed all data operations within the company, including data engineering, data analysis, analytics engineering, and experimental AI projects.
- Migrated the data warehouse from Amazon Redshift to Google BigQuery, improving scalability.
- Coded Python scripts for data extraction and ran them automatically on GCP Cloud Run, including Shopify Reports (web scraping), Klaviyo Templates (API), and Attentive (webhook).
- Migrated data reports and dashboards from Looker to Power BI.
Snowflake Data Engineer
Appex Group, Inc.
- Designed and built a new data architecture for the company in collaboration with the new data team, helping automate data ingestion, modeling, and visualization.
- Implemented dbt (data build tool) for data modeling and trained data analysts on how to use it properly.
- Set up Fivetran connectors from several data sources to a Snowflake data warehouse.
- Implemented Airflow DAGs (MWAA) to run and test dbt models and for custom data ingestions using API calls.
- Built controls for data quality checks in the extracted sources using dbt tests and Airflow.
Data Engineer
SimplyWise
- Designed and built a data architecture to help the company make data-driven decisions.
- Integrated multiple data sources such as Apple Search, Google Ads, MySQL, and Amplitude into a Redshift data warehouse.
- Configured data integration services like Fivetran and Stitch to collect different data sources.
- Developed a Python data pipeline to read and parse JSON files, upload them to Amazon S3, and load them into Redshift tables.
- Installed and configured dbt (data build tool) to perform data transformation through ELT data modeling.
- Built dashboards and reports using Tableau Online.
Analytics Engineer
205 Data Lab
- Worked 100% remotely for a US-based company, providing services for San Francisco Bay data customers as an analytics engineer.
- Created automated custom data reports using Python and Excel VBA scripts.
- Extracted data for reports from Presto DB and Snowflake using complex SQL scripts.
- Transformed and modeled raw data through ELT processes using dbt (data build tool).
- Developed Python scripts for Prefect Cloud to automate and orchestrate data reports.
- Integrated data between Snowflake and Salesforce to generate reports using Bulk API.
Data Engineer
Projeto 22
- Worked as a part-time data engineer, supporting the data squad on a new data lake project for WebMotors, a vehicle sales company.
- Supported the data engineering team by provisioning AWS resources: EC2, S3, VPC, RDS using Aurora PostgreSQL, and Redshift.
- Developed CloudFormation templates to automate AWS resource provisioning.
- Deployed an Apache NiFi cluster for data ingestion using ZooKeeper and NiFi Registry.
- Integrated AWS DMS (Database Migration Service), mapping tables from SQL Server and Aurora MySQL to S3 buckets as Parquet files.
- Employed Apache Airflow and an EMR cluster to support complex data processing.
- Built CloudWatch alarms for server and application monitoring.
- Used Lambda and Boto3 to automatically stop/start EC2 and RDS instances according to schedule, reducing costs.
- Developed Python scripts for several purposes, such as database stress tests.
Data Analyst
Spin
- Analyzed data using complex SQL queries on Google BigQuery.
- Presented data analytics reports for managers using dashboards from Mode Analytics.
- Defined business metrics to support and determine company OKRs.
- Provided monitoring and decision support reports for different areas such as growth, product, and operations.
- Worked on geospatial analysis for scooters and generated map charts using Python and Jupyter Notebook.
- Supported A/B tests to analyze the adoption of product features.
Solution Integration Architect
Optiva Inc.
- Worked 100% remotely as part of a global professional services team (English and Spanish speaking teams).
- Integrated and configured DCRM and portals for telecom customers in different countries.
- Wrote system integration tests, user acceptance tests, test scenarios, configuration handbooks, user training, and production rollout documents.
- Analyzed data from different CRM environments, exporting and comparing data into Excel using formulas such as VLOOKUP to troubleshoot missing configuration parameters.
Data Analyst | Technical Leader
Wedo Technologies
- Worked with large volumes of data. Found revenue leakages and trends between telecom systems, applying data analytics techniques.
- Integrated several telecom systems into the RAID ETL tool, reading different data sources (Oracle, SQL Server, Excel, CSV, ASN.1) containing telecom events and customer data.
- Performed Oracle DB and SQL performance tuning, wrote procedures in PL/SQL, and wrote Python and shell scripts.
- Provided system analysis and solution design documentation, scope and architecture definition, and technical proposals for sales.
- Designed user reports, dashboards, and KPIs to support data-driven decisions and identify revenue leakages.
- Oversaw unit tests, integration tests, UAT, and production rollout. Completed technical and functional training for customers and teams.
- Provided technical leadership and project management in different projects.
- Worked on various projects for telecommunications customers in different countries, such as Brazil, Chile, and Peru.
Developer
Seventh
- Contributed to Delphi programming (MVC, object-oriented) for video surveillance software.
- Integrated different kinds of devices such as IP cameras and video encoders.
- Led network protocol integration using CGI, SOAP, HTTP, TCP, and RTSP.
- Reverse-engineered protocols using Wireshark. Decoded video and audio using VLC libraries.
Trainee | System Analyst
Alliance Consultoria
- Developed software in the Uniface language (Compuware).
- Integrated databases including Oracle, SQL Server, and DB2.
- Contributed to development using Agile methodologies.
- Participated in a CMMI Level 2 implementation project.
Experience
Data Analyst for a Toptal Client
The data was extracted using SQL and Google BigQuery geospatial libraries. The dashboard was built using Mode, Python, and Jupyter Notebook.
Data Engineer for Fintech Company
I configured data integration services like Fivetran and Stitch to collect different data sources and developed a data pipeline using Python to read and parse JSON files, upload them to AWS S3, and load them into Redshift tables. I also installed and configured dbt (data build tool) to perform data transformation (ELT data modeling) and designed the dashboards and reports using Tableau Online.
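As a rough illustration only, the parse-and-load step of a pipeline like this could be sketched in Python as follows. The function names, table, bucket, and IAM role are hypothetical, and the S3 upload and the execution of the statement against Redshift are left out. This is a sketch under those assumptions, not the code used in the project.

```python
import json


def to_json_lines(raw_text):
    """Parse a JSON document (one object or an array of objects) and
    return newline-delimited JSON, one record per line."""
    parsed = json.loads(raw_text)
    records = parsed if isinstance(parsed, list) else [parsed]
    # sort_keys gives a stable key order, which keeps staged files diffable
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


def build_copy_statement(table, s3_uri, iam_role):
    """Build a Redshift COPY statement for JSON data staged in S3.

    All three arguments are placeholders supplied by the caller.
    """
    return (
        f"COPY {table} FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS JSON 'auto';"
    )


if __name__ == "__main__":
    body = to_json_lines('[{"b": 2, "a": 1}, {"a": 3}]')
    print(body)
    # The staged body would then be uploaded to S3 (not shown here)
    # before running the COPY statement against the warehouse.
    print(build_copy_statement(
        "events",                                   # hypothetical table
        "s3://example-bucket/events.json",          # hypothetical bucket/key
        "arn:aws:iam::123456789012:role/example",   # hypothetical role
    ))
```

Keeping the parsing and the statement-building as pure functions makes both easy to test without cloud credentials.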
New Data Architecture
In the first stage, we made the data available for data analysts as soon as possible using Fivetran and storing it in Snowflake.
In the second stage, we built custom connectors using Airflow and implemented more organized data modeling with dbt and SQL.
Education
Master of Business Administration (MBA) in Project Management
Fundação Getulio Vargas (FGV) - Florianópolis, SC, Brazil
Bachelor's Degree in Computer Engineering
Universidade Metodista de São Paulo - São Bernardo do Campo, SP, Brazil
Skills
Libraries/APIs
Tray.io, Pandas, NumPy
Tools
Microsoft Power BI, BigQuery, Stitch Data, Apache Airflow, Prefect, Microsoft Excel, Looker, dbt Cloud, AWS IAM, Amazon CloudWatch, Git, Google Analytics, GitHub, Microsoft Dynamics CRM, Shell, Amazon Virtual Private Cloud (VPC), Jupyter, Tableau, Claude, Claude Code
Languages
Snowflake, Python, SQL, Delphi, Excel VBA, Uniface
Paradigms
Dimensional Modeling, ETL, Business Intelligence (BI), Database Design
Storage
Oracle SQL, PL/SQL, Databases, Relational Databases, ANSI SQL, Redshift, Data Pipelines, PostgreSQL, Oracle RDBMS, Amazon S3 (AWS S3), Microsoft SQL Server, Oracle PL/SQL, JSON, Google Cloud, SQL Server 2012, MySQL
Frameworks
Jinja
Platforms
Oracle, Amazon Web Services (AWS), Amazon EC2, Shopify, AWS Lambda, Google Cloud Platform (GCP), Salesforce, Unix, Windows, Jupyter Notebook, Klaviyo, Airbyte, Hex
Other
Data Engineering, Data Build Tool (dbt), Dashboards, Data Visualization, Data Analysis, Data Analytics, Analytics, Visualization, Google BigQuery, Data Modeling, Fivetran, Data Architecture, Business Intelligence (BI) Platforms, Data Warehousing, ETL Pipelines, Database Schema Design, Performance Tuning, Relational Database Services (RDS), Telecom Business Support Systems (BSS), Revenue Assurance, Key Performance Indicators (KPIs), BI Reports, Mode Analytics, ELT, ETL Tools, IT Project Management, Data Warehouse Design, Orchestration, Star Schema, Customer Relationship Management (CRM), APIs, API Integration, Web Scraping, Data Orchestration, Amazon Redshift, Data Analytics (Marketing), Marketing Analytics, TCP/IP, Network Protocols, Amplitude, Shell Scripting, Dynamics CRM 365, Software Engineering, Query Optimization, Airtable, Data Vault 2.0, Data Vaults, Machine Learning, Forecasting, Metrics, Data Science, AlloyDB, Cursor AI