Brenda Alexsandra Januário, Developer in São Paulo - State of São Paulo, Brazil

Brenda Alexsandra Januário

Verified Expert in Engineering

Data Engineer and Developer

São Paulo - State of São Paulo, Brazil

Toptal member since December 20, 2022

Bio

Brenda is an experienced data engineer focused on building quality data pipelines so that organizations can make decisions with the best and most accurate information available. Her love for innovative technologies and her know-how in applying them to data drive her professional career as she delivers customer-centric solutions. Brenda enjoys being part of the field and having the privilege of working with professionals worldwide.

Portfolio

Traive Finance
Big Data, Amazon Web Services (AWS), Databricks, PySpark, DevOps, Terraform...
Banco Itaú
Amazon Web Services (AWS), SQL, Python 3, Tableau, Amazon QuickSight...
Banco Itaú
Kotlin, REST APIs, gRPC, GitLab, NoSQL, JavaScript, Docker...

Experience

  • Amazon Web Services (AWS) - 3 years
  • GitLab - 3 years
  • SQL - 3 years
  • Data Modeling - 2 years
  • Data Quality - 2 years
  • PySpark - 2 years
  • Databricks - 1 year
  • Big Data - 1 year

Availability

Part-time

Preferred Environment

Databricks, PySpark, SQL, Amazon Web Services (AWS), Git, Python 3

The most amazing...

...thing I've done is work alone on the data engineering team after a series of layoffs while still maintaining the quality and quantity of deliverables.

Work Experience

Data Engineer

2021 - PRESENT
Traive Finance
  • Coordinated the migration of PostgreSQL Amazon RDS data and AWS Lambda and AWS Batch data pipelines to Databricks. Implemented the medallion architecture in the data lakehouse (a bronze-to-silver step is sketched below the technology list). Documented the architecture guidelines used and the motivation for choosing them.
  • Developed batch and streaming data pipelines for big data. Extracted data from public and paid APIs and via web scraping, then transformed and loaded it with PySpark. Documented the methodologies used to interpolate missing values and handle outliers.
  • Implemented testing and data quality using dbt and Great Expectations. Documented the tests. Applied optimization techniques to the tables.
  • Coordinated the separation of the data lakehouse into multiple environments. Created the IaC using Terraform CDKTF in Python, created the Docker image with the necessary resources, and registered it in the repository. Implemented CI/CD using GitLab.
  • Led the implementation of data observability and data quality, including planning, architecture design, and development. Automated data metrics creation through data profiling with PySpark and Bigeye. Created the CI/CD pipeline in GitLab for deployment.
  • Created dashboards to monitor spending on memory and computing resources (internal). Performed exploratory data analysis and built dashboards on platform usage and insights for other teams (external).
Technologies: Big Data, Amazon Web Services (AWS), Databricks, PySpark, DevOps, Terraform, GitLab CI/CD, GitLab, Data Quality, Data Build Tool (dbt), Data Quality Analysis, Test Data, Data Pipelines, SQL, Infrastructure as Code (IaC), Data Engineering, Python, Microsoft SQL Server, Azure Databricks, Web Scraping, Data Warehousing
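
As a rough illustration of the bronze-to-silver step mentioned above, the following PySpark sketch shows how raw ingested records might be typed, deduplicated, and cleaned before landing in the silver layer; the table and column names are hypothetical, not Traive's actual schema.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("medallion-demo").getOrCreate()

    # Bronze: raw API payloads landed as-is (hypothetical table name).
    bronze = spark.read.table("bronze.api_prices")

    # Silver: typed, deduplicated, and cleaned records ready for consumers.
    silver = (
        bronze
        .withColumn("price", F.col("price").cast("double"))
        .withColumn("event_date", F.to_date("event_ts"))
        .dropDuplicates(["asset_id", "event_date"])
        .filter(F.col("price").isNotNull())
    )

    # Delta is the default table format on Databricks.
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.daily_prices")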

Data Engineer

2020 - 2021
Banco Itaú
  • Managed and maintained on-premises data pipelines for Itaú's real estate credit SQL database. Modeled, oversaw, and documented the tables.
  • Created on-premises data pipelines using Python and SSIS. Built analytical dashboards in Tableau to identify opportunities for Itaú's real estate credit area.
  • Developed data pipelines for Itaú real estate credit API requests using PySpark in AWS Glue (sketched below) and created views in Amazon Athena to build dashboards in Amazon QuickSight.
Technologies: Amazon Web Services (AWS), SQL, Python 3, Tableau, Amazon QuickSight, Sybase PowerDesigner, SQL Server Integration Services (SSIS), Docker, Data Engineering, Python, Microsoft SQL Server, AWS Glue, Amazon Athena, Web Scraping
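
A minimal sketch of the kind of AWS Glue PySpark job described above, reading raw API responses from S3 and writing partitioned Parquet that Athena views (and, in turn, QuickSight dashboards) can query; the bucket paths and column names are assumptions for illustration.

    import sys
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    spark = glue_context.spark_session
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Raw JSON responses from the real estate credit API (hypothetical path).
    raw = spark.read.json("s3://example-raw-bucket/real-estate/api-responses/")

    # Flatten and write partitioned Parquet for Athena to query.
    (raw.select("proposal_id", "status", "amount", "created_at")
        .write.mode("overwrite")
        .partitionBy("status")
        .parquet("s3://example-curated-bucket/real-estate/proposals/"))

    job.commit()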

Software Developer

2019 - 2020
Banco Itaú
  • Developed a self-service web application (back end and front end) for managing the service and communication channels of Itaú's internal users, using the Django framework in Python, JavaScript, and CSS.
  • Built a load balancer management application for Itaú's internal users using Snow Software and JavaScript.
  • Created the REST API for tracking real estate financing proposals on the Itaú website in Kotlin with the gRPC framework, integrated with the Amazon DynamoDB NoSQL database.
Technologies: Kotlin, REST APIs, gRPC, GitLab, NoSQL, JavaScript, Docker, Amazon Web Services (AWS), Django, Python 3, CSS, SQL, Python, Microsoft SQL Server

Experience

A Pipeline of Agricultural Data

A pipeline of agricultural data to be consumed by machine learning models and dashboards. The historical and daily data were extracted from APIs and spreadsheets, and the raw extracted data was stored in a data lake on an AWS S3 bucket. From the data lake, the data was ingested as raw as possible into the bronze layer of the data lakehouse in Databricks. The bronze-layer data was then manipulated to adjust types and formats and to interpolate missing values and handle outliers before being inserted into the silver layer, clean and consistent, ready to be consumed by business analysts, data scientists, and MLOps.

The data was tested at every stage using the open-source data quality tool Great Expectations, and Sentry was enabled to send notifications in case of failure.
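
As an illustration only, the check below shows the style of validation Great Expectations supports on a Spark DataFrame, using the classic SparkDFDataset API (newer releases expose a different, fluent API); the table, columns, and thresholds are hypothetical.

    from pyspark.sql import SparkSession
    from great_expectations.dataset import SparkDFDataset

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical silver-layer table from the agricultural pipeline.
    silver_df = spark.read.table("silver.daily_yield")
    gdf = SparkDFDataset(silver_df)

    gdf.expect_column_values_to_not_be_null("farm_id")
    gdf.expect_column_values_to_be_between("yield_kg_ha", min_value=0, max_value=20000)

    results = gdf.validate()
    if not results["success"]:
        # In the real pipeline, a failure at this point would trigger a Sentry alert.
        raise ValueError("Data quality checks failed for silver.daily_yield")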

A Data Lakehouse and Visualizations for B2B eCommerce

https://github.com/brendajanuario/pipeline-bigdata-pyspark
A data lakehouse and visualizations for B2B eCommerce: a public, personal data engineering study project in my GitHub portfolio. For this data lakehouse, I generated random data for a transactional B2B eCommerce base and an analytical base with weblog data.

I developed ETL pipelines, created data tests, and wrote the infrastructure as code (IaC) using Terraform to build the data lakehouse. To work with big data, I generated more than 700 million rows, and the final structured data was used to create visualizations in tabular and dashboard formats.
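
The snippet below sketches how synthetic transactional rows can be generated with PySpark in the spirit of this study project; the schema, row count, and output path are illustrative rather than the ones used in the repository.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("synthetic-ecommerce").getOrCreate()

    # One row per order id; scale the range up toward the project's 700M+ rows.
    orders = (
        spark.range(0, 1_000_000)
        .withColumnRenamed("id", "order_id")
        .withColumn("buyer_id", (F.rand(seed=42) * 50_000).cast("long"))
        .withColumn("amount", F.round(F.rand(seed=7) * 10_000, 2))
        .withColumn("order_date", F.expr("date_sub(current_date(), cast(rand() * 365 as int))"))
    )

    # Land the raw synthetic data in the bronze layer of the lakehouse.
    orders.write.mode("overwrite").parquet("s3a://example-lakehouse/bronze/orders/")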

Debt Aggregator for Credit Analysts

Contributed to the development of Traive's Debt Enrichment API, which consolidates debt data from multiple sources, such as SCR, CERC, B3, BNDES, and income tax reports, providing a comprehensive view of farmers' debt profiles.

Credit risk assessments often rely on outdated and incomplete data, primarily from income tax reports. By integrating various API endpoints, our solution normalizes, cleans, deduplicates, and classifies short- and long-term obligations in real time. This enables credit analysts to make more accurate risk assessments, reducing manual effort and misclassification risks.

Delivered via Kafka, this enriched data allows our customers to make faster, data-driven decisions while improving the performance of Traive's models. The solution also helps lower costs by eliminating the need for third-party data providers, streamlining the credit decision process, and enhancing overall user trust in our platform.
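
For illustration, a downstream consumer of these Kafka events might look like the sketch below (kafka-python client); the topic name, broker address, and payload fields are assumptions rather than Traive's actual contract.

    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "debt-enrichment",                      # hypothetical topic name
        bootstrap_servers=["localhost:9092"],   # hypothetical broker
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="earliest",
    )

    for message in consumer:
        event = message.value
        # Each event would carry the consolidated debt profile for one farmer.
        print(event.get("farmer_id"), event.get("total_short_term_debt"))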

Education

2017 - 2021

Bachelor's Degree in Information Systems

Universidade Estadual De Campinas (Unicamp) - Limeira, Brazil

Certifications

FEBRUARY 2024 - PRESENT

Databricks Data Engineer Associate

Databricks

JULY 2022 - PRESENT

Cloud Data Engineer - Bootcamp

XP Education (formerly IGTI)

OCTOBER 2021 - OCTOBER 2024

AWS Certified Cloud Practitioner

Amazon Web Services

Skills

Libraries/APIs

PySpark, REST APIs

Tools

GitLab, GitLab CI/CD, Tableau, AWS Glue, Amazon QuickSight, Sybase PowerDesigner, Terraform, Sentry, Git, Amazon Athena, Amazon Simple Queue Service (SQS)

Languages

SQL, Python 3, Python, Kotlin, JavaScript, CSS

Paradigms

ETL, Scrum, DevOps

Platforms

Databricks, Amazon Web Services (AWS), Docker, Kubernetes, Apache Kafka, AWS Lambda

Storage

Data Pipelines, Microsoft SQL Server, NoSQL, SQL Server Integration Services (SSIS)

Frameworks

gRPC, Django, Spark

Other

Azure Databricks, Web Scraping, Big Data, Data Quality, Data Modeling, Stream Processing, Data Engineering, Data Warehousing, Dashboard Development, Data Build Tool (dbt), Data Quality Analysis, Test Data, Streaming, Infrastructure as Code (IaC), Cloud, Events
