
Eduardo Bartolomeu

Verified Expert in Engineering

Software Developer

Location
Recife - State of Pernambuco, Brazil
Toptal Member Since
January 26, 2024

Eduardo is a senior data engineer with over 12 years of experience in the data field. He has worked as an Oracle and SQL Server database administrator and a PL/SQL and T-SQL (Transact-SQL) developer. With experience in the financial, retail, health, and education industries, Eduardo has most recently worked as a data engineer, specializing in AWS and Google Cloud Platform (GCP) environments, creating data pipelines, data lakes, ETLs, and data warehouses.

Portfolio

DataArt
Snowflake, SQL, Azure Data Factory, Azure Logic Apps, Azure SQL, ADF...
2am.tech
Liquibase, Snowflake, MySQL, PostgreSQL, Microsoft SQL Server...
Truelogic Software
Terraform, AWS Glue, Redshift, Amazon Athena, AWS CodeBuild, AWS CodePipeline...

Experience

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), SQL, Python, PySpark, Google Cloud Platform (GCP), Apache Airflow, ETL, Data Modeling, Big Data, Data Lake Design

The most amazing...

...feature I've created for data science teams predicts hospitalizations in healthcare plans, helping to save lives.

Work Experience

Senior Data Engineer

2023 - PRESENT
DataArt
  • Developed stored procedures on Snowflake to consolidate data assets consumed by dashboards.
  • Built Azure Data Factory pipelines to extract, transform, and load Excel business files into Snowflake.
  • Created Logic Apps to get email attachments and save them in Blob Storage.
  • Created Azure Functions triggered when a file arrives in the storage container, transforming it and loading a new CSV to be consumed as a Snowflake stage (sketched below).
Technologies: Snowflake, SQL, Azure Data Factory, Azure Logic Apps, Azure SQL, ADF, Azure Data Lake, Azure SQL Data Warehouse, Azure Functions, Azure Virtual Machines, Python, Blob Storage
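
The last bullet can be sketched in a few lines. This is a minimal, hypothetical version using the Azure Functions Python v2 programming model; the container names, the Excel layout, and the pandas dependency are assumptions, not the client's actual setup:

    # Hypothetical containers: Logic Apps drops email attachments in "incoming";
    # the function writes a cleaned CSV to "staged", which backs a Snowflake stage.
    import io

    import azure.functions as func
    import pandas as pd  # assumed dependency for the Excel-to-CSV conversion

    app = func.FunctionApp()

    @app.blob_trigger(arg_name="blob", path="incoming/{name}",
                      connection="AzureWebJobsStorage")
    @app.blob_output(arg_name="staged", path="staged/{name}.csv",
                     connection="AzureWebJobsStorage")
    def stage_csv(blob: func.InputStream, staged: func.Out[str]) -> None:
        # read the Excel attachment and normalize headers before staging
        df = pd.read_excel(io.BytesIO(blob.read()))
        df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
        staged.set(df.to_csv(index=False))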

Senior Data Engineer and SQL Developer

2023 - 2024
2am.tech
  • Translated procedures from T-SQL (Transact-SQL) to SnowSQL.
  • Created tables, stages, pipes, streams, procedures, and functions in the Snowflake data lake, ingesting data from SQL Server, PostgreSQL, and MySQL (sketched below).
  • Maintained and tested the SQL scripts using Liquibase.
Technologies: Liquibase, Snowflake, MySQL, PostgreSQL, Microsoft SQL Server, T-SQL (Transact-SQL), Amazon Web Services (AWS), Amazon S3 (AWS S3), AWS Database Migration Service (DMS), SnowSQL, SQL, Data Engineering, Data Lakes, Git
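
A hedged sketch of the Snowflake object chain from the bullets above, issued through the official snowflake-connector-python; the database, stage URL, and connection details are illustrative assumptions:

    # Illustrative names only: RAW database, example S3 bucket, ETL_WH warehouse.
    import snowflake.connector

    DDL = [
        # external stage over the bucket where the source extracts land
        """CREATE STAGE IF NOT EXISTS raw.public.orders_stage
             URL = 's3://example-bucket/orders/'
             FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)""",
        # pipe that auto-ingests new files arriving on the stage
        """CREATE PIPE IF NOT EXISTS raw.public.orders_pipe AUTO_INGEST = TRUE AS
             COPY INTO raw.public.orders FROM @raw.public.orders_stage""",
        # stream so downstream procedures only see changed rows
        """CREATE STREAM IF NOT EXISTS raw.public.orders_changes
             ON TABLE raw.public.orders""",
    ]

    conn = snowflake.connector.connect(account="xy12345", user="etl_user",
                                       password="***", warehouse="ETL_WH")
    try:
        for statement in DDL:
            conn.cursor().execute(statement)
    finally:
        conn.close()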

Senior Data Engineer

2022 - 2023
Truelogic Software
  • Created AWS Glue jobs using PySpark to transform data between the data lake zones (sketched after this entry).
  • Performed dimensional modeling for data warehouses stored on Redshift.
  • Documented processes using Confluence linked to Jira tickets.
  • Built data pipelines from scratch, moving data from source databases to the data lake and Redshift.
Technologies: Terraform, AWS Glue, Redshift, Amazon Athena, AWS CodeBuild, AWS CodePipeline, Amazon RDS, MySQL, PostgreSQL, SQLAlchemy, Python, PySpark, AWS Database Migration Service (DMS), AWS Lambda, Amazon API Gateway, SQL, Data Engineering, Data Lakes, Databases, Git
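
A minimal sketch of a Glue job of the kind described in the first bullet; the catalog database, table, column, and bucket names are illustrative assumptions:

    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # read the raw zone through the Glue Data Catalog (populated by a crawler)
    raw = glue_context.create_dynamic_frame.from_catalog(
        database="datalake_raw", table_name="orders"
    )

    # plain Spark transformations on the way to the trusted zone
    trusted = (
        raw.toDF()
        .dropna(subset=["order_id"])
        .withColumnRenamed("ts", "order_ts")
    )

    # write partitioned Parquet into the trusted-zone bucket
    trusted.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-datalake/trusted/orders/"
    )
    job.commit()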

Senior Data Engineer

2019 - 2022
Neurotech
  • Created data pipelines using Composer, Apache Airflow, and BigQuery, building datasets used by data science teams to predict hospitalizations and identify people with chronic diseases (see the DAG sketch after this entry).
  • Imported database files from the Brazilian public healthcare system to our data lake on AWS using EMR clusters and PySpark.
  • Oversaw other data engineers' tasks, helping them meet the company's expectations.
  • Improved the performance of PySpark jobs running on EMR clusters.
Technologies: Amazon Web Services (AWS), Amazon Elastic MapReduce (EMR), PySpark, Python, Amazon Athena, AWS Glue, Metabase, Google Data Studio, Google BigQuery, Apache Airflow, Google Cloud Dataproc, Google Compute Engine (GCE), Amazon EC2, Amazon S3 (AWS S3), Google Cloud Composer, Google Cloud Platform (GCP), Jupyter Notebook, ETL, ELT, Data Pipelines, SQL, SQL Performance, Performance Tuning, Databases, Data Engineering, Data Lakes, Git
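
A hedged sketch of the kind of Composer/Airflow DAG described above, assuming the Google provider package; the dataset, table, and feature logic are illustrative, not the actual pipeline:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import (
        BigQueryInsertJobOperator,
    )

    with DAG(
        dag_id="hospitalization_features",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        # rebuild the feature table the data science team reads from
        build_features = BigQueryInsertJobOperator(
            task_id="build_features",
            configuration={
                "query": {
                    "query": """
                        CREATE OR REPLACE TABLE analytics.hospitalization_features AS
                        SELECT beneficiary_id,
                               COUNT(*) AS visits_last_year,
                               COUNTIF(is_emergency) AS emergency_visits
                        FROM raw.claims
                        GROUP BY beneficiary_id
                    """,
                    "useLegacySql": False,
                }
            },
        )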

Senior Database Administrator

2018 - 2019
Nyx Soluções
  • Installed Oracle and SQL Server database environments from scratch.
  • Tracked environment health using monitoring tools.
  • Improved the query performance for several clients, particularly in the retail industry.
  • Created monthly environment health status reports for clients to monitor KPIs, including disk space, tablespace usage, heavy queries, and processor usage (one KPI query is sketched below).
Technologies: Linux, Windows Server, Oracle Database, Microsoft SQL Server, PL/SQL, T-SQL (Transact-SQL), SQL, PL/SQL Tuning, Performance Tuning, Oracle Database Tuning
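
One KPI from those reports, tablespace usage, can be sketched with standard Oracle DBA views; the connection details are placeholders, and python-oracledb stands in for whatever client was actually used at the time:

    import oracledb

    TABLESPACE_USAGE = """
        SELECT df.tablespace_name,
               ROUND(df.bytes / 1024 / 1024) AS size_mb,
               ROUND((df.bytes - NVL(fs.bytes, 0)) / 1024 / 1024) AS used_mb
          FROM (SELECT tablespace_name, SUM(bytes) AS bytes
                  FROM dba_data_files GROUP BY tablespace_name) df
          LEFT JOIN (SELECT tablespace_name, SUM(bytes) AS bytes
                       FROM dba_free_space GROUP BY tablespace_name) fs
            ON df.tablespace_name = fs.tablespace_name
         ORDER BY df.tablespace_name
    """

    # placeholder credentials; requires SELECT on the DBA_* views
    with oracledb.connect(user="monitor", password="***", dsn="prod-db/ORCL") as conn:
        for name, size_mb, used_mb in conn.cursor().execute(TABLESPACE_USAGE):
            print(f"{name}: {used_mb}/{size_mb} MB used")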

Projects

Snowflake Data Lake

http://www.emsmc.com
A data lake storing data from several company systems in Snowflake. We leveraged ELT methods, landing all the data in Snowflake first and performing the necessary transformations there; both the raw zone and the trusted and refined zones were kept in Snowflake.
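
A hedged sketch of that ELT pattern: files are copied into the raw zone untouched, and the transformation runs inside Snowflake to populate the trusted zone. All object names are illustrative assumptions:

    # Both statements run inside Snowflake, so the data never leaves the platform.
    import snowflake.connector

    ELT_STEPS = [
        # land files from the stage into the raw zone untouched
        "COPY INTO raw.sales.orders_raw FROM @raw.sales.orders_stage",
        # transform in place to populate the trusted zone: cast, dedupe, clean
        """CREATE OR REPLACE TABLE trusted.sales.orders AS
           SELECT DISTINCT
                  TRY_TO_NUMBER(order_id)      AS order_id,
                  TRY_TO_TIMESTAMP(created_at) AS created_at,
                  UPPER(TRIM(status))          AS status
             FROM raw.sales.orders_raw""",
    ]

    conn = snowflake.connector.connect(account="xy12345", user="elt_user",
                                       password="***", warehouse="ELT_WH")
    try:
        for step in ELT_STEPS:
            conn.cursor().execute(step)
    finally:
        conn.close()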

Data Lake and Data Warehouse

https://www.vectorsolutions.com/
An education company acquired smaller ones, prompting the need for a centralized data lake to manage data from the various company systems. We built that data lake, along with data pipelines and data warehouses, from scratch on the AWS stack. We used AWS Database Migration Service (DMS) to migrate data from Amazon RDS MySQL and PostgreSQL databases, and employed AWS Glue jobs, workflows, and crawlers for ETL processes across the data lake layers. To automate CI/CD, we implemented Lambda functions, invoked through API Gateway and triggered by Git actions (sketched below). Terraform was used for infrastructure as code (IaC).
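
The CI/CD trigger can be sketched as a small Lambda handler fronted by API Gateway that a Git webhook calls to start the pipeline; the pipeline name and branch convention are illustrative assumptions:

    import json

    import boto3

    codepipeline = boto3.client("codepipeline")

    def handler(event, context):
        # API Gateway proxy integration delivers the webhook payload as a string
        payload = json.loads(event.get("body") or "{}")
        # only react to pushes on the main branch (assumed convention)
        if payload.get("ref") == "refs/heads/main":
            response = codepipeline.start_pipeline_execution(name="datalake-deploy")
            return {"statusCode": 200,
                    "body": json.dumps(response["pipelineExecutionId"])}
        return {"statusCode": 204, "body": ""}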

Cancer Identifier

http://portal.sulamericaseguros.com.br
A major healthcare company in Brazil possessed historical data of beneficiaries but lacked access to exam results. Collaborating with data engineers and a business team, including doctors and nurses, we developed an algorithm that assesses whether a beneficiary has a low, medium, or high probability of having cancer. This project enabled the company to undertake proactive measures for these beneficiaries, enhancing their health journey.

Initially, we migrated data sheets to BigQuery, focusing on the most common procedures for beneficiaries with cancer. Subsequently, we developed a Python algorithm to export the results for storage as both CSV files and tables. Everything was orchestrated using Composer and Apache Airflow.
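
A minimal sketch of that export step, using the official google-cloud-bigquery client; the project, dataset, table, and bucket names are assumptions:

    from google.cloud import bigquery

    client = bigquery.Client()

    # persist the scored results as a table (names are placeholders)
    job_config = bigquery.QueryJobConfig(
        destination="my-project.analytics.cancer_risk_latest",
        write_disposition="WRITE_TRUNCATE",
    )
    client.query(
        "SELECT beneficiary_id, cancer_risk_band FROM analytics.cancer_scores",
        job_config=job_config,
    ).result()

    # export the same table to Cloud Storage as CSV for downstream consumers
    client.extract_table(
        "my-project.analytics.cancer_risk_latest",
        "gs://example-bucket/exports/cancer_risk_*.csv",
    ).result()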
Education

2017 - 2018

Master of Business Administration (MBA) in Business Intelligence

Institute of Management in Information Technology (IGTI) - Belo Horizonte, Brazil

2008 - 2014

Bachelor's Degree in Information Systems

Faculdade Estácio do Recife - Recife, Brazil

Certifications

AUGUST 2023 - AUGUST 2026

AWS Certified Cloud Practitioner

Amazon Web Services

JULY 2018 - PRESENT

Splunk Core Certified Power User

Splunk

JUNE 2017 - PRESENT

ITIL Foundation Certificate in IT Service Management

Axelos

JUNE 2017 - PRESENT

Oracle Database 11g Administrator Certified Professional

Oracle

Skills

Libraries/APIs

PySpark, Liquibase, SQLAlchemy

Tools

AWS Glue, BigQuery, Amazon Elastic MapReduce (EMR), Apache Airflow, Splunk, Terraform, Amazon Athena, AWS CodeBuild, Google Cloud Dataproc, Google Compute Engine (GCE), Google Cloud Composer, Git, Composer, Azure Logic Apps

Languages

SQL, T-SQL (Transact-SQL), Python, Snowflake

Storage

Databases, PL/SQL, Data Lake Design, Oracle 11g, Oracle Database Tuning, MySQL, PostgreSQL, Microsoft SQL Server, Amazon S3 (AWS S3), Database Administration (DBA), Redshift, Data Pipelines, Google Cloud Storage, SQL Performance, Data Lakes, Azure SQL

Frameworks

ADF

Paradigms

ETL, Business Intelligence (BI), ITIL

Platforms

AWS Lambda, Jupyter Notebook, Oracle Database, Amazon Web Services (AWS), Google Cloud Platform (GCP), Amazon EC2, Linux, Windows Server, Azure SQL Data Warehouse, Azure Functions

Other

Google BigQuery, Data Modeling, Big Data, Oracle Performance Tuning, Amazon RDS, ELT, Software Development, IT Project Management, Product Management, Data Warehousing, AWS CodePipeline, AWS Database Migration Service (DMS), Amazon API Gateway, Metabase, Google Data Studio, Relational Database Service (RDS), CSV Import, CSV Export, Information Systems, SnowSQL, Lambda Functions, API Gateways, PL/SQL Tuning, Performance Tuning, Data Engineering, Azure Data Factory, Azure Data Lake, Azure Virtual Machines, Blob Storage
