
Brenda Alexsandra Januário
Verified Expert in Engineering
Data Engineer and Developer
São Paulo - State of São Paulo, Brazil
Toptal member since December 20, 2022
Brenda is an experienced data engineer focused on building quality pipelines so that organizations can make decisions with the best and most accurate information. Her love of innovative technologies and her know-how in applying them to data drive her career of delivering customer-centric solutions. Brenda enjoys being part of the field and having the privilege of working with professionals worldwide.
Experience
- Amazon Web Services (AWS) - 3 years
- GitLab - 3 years
- SQL - 3 years
- Data Modeling - 2 years
- Data Quality - 2 years
- PySpark - 2 years
- Databricks - 1 year
- Big Data - 1 year
Preferred Environment
Databricks, PySpark, SQL, Amazon Web Services (AWS), Git, Python 3
The most amazing...
...thing I've done is work alone on the data engineering team after a series of layoffs while still maintaining the quality and quantity of deliverables.
Work Experience
Data Engineer
Traive Finance
- Coordinated the migration of PostgreSQL Amazon RDS data and of AWS Lambda and AWS Batch data pipelines to Databricks. Implemented the medallion architecture in the data lakehouse and documented the architecture guidelines and the motivation for choosing it.
- Developed batch and streaming big data pipelines. Extracted data from public and paid APIs and via web scraping, then transformed and loaded it with PySpark; a refinement step of this kind is sketched after this list. Documented the methodologies used to interpolate missing values and handle outliers.
- Implemented testing and data quality checks using dbt and Great Expectations, documented the tests, and applied table-optimization techniques.
- Coordinated the separation of the data lakehouse into multiple environments. Created the IaC using Terraform CDKTF in Python, built the Docker image with the necessary resources and published it to the registry, and implemented CI/CD using GitLab.
- Led the implementation of data observability and data quality, from planning and architecture design through development. Automated the creation of data metrics through data profiling with PySpark and Bigeye, and built the GitLab CI/CD for deployment.
- Created internal dashboards to monitor spending on memory and compute resources, and performed exploratory data analysis with dashboards on platform usage and insights for external teams.
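As a minimal illustration of the bronze-to-silver refinement step mentioned above, here is a PySpark sketch; the table names, columns, and paths are assumptions for illustration, not details from the Traive project:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-demo").getOrCreate()

# Bronze: raw API and scraping extracts landed as-is (Delta format assumed;
# requires the delta-spark package to be configured on the cluster).
bronze = spark.read.format("delta").load("/lakehouse/bronze/market_prices")

# Silver: typed, deduplicated, nulls dropped before interpolation downstream.
silver = (
    bronze
    .withColumn("price", F.col("price").cast("double"))
    .dropDuplicates(["ticker", "quote_date"])
    .na.drop(subset=["price"])
)

silver.write.format("delta").mode("overwrite").save("/lakehouse/silver/market_prices")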
Data Engineer
Banco Itaú
- Managed and maintained on-premises data pipelines for Itaú's real estate credit SQL database. Modeled, oversaw, and documented the tables.
- Built on-premises data pipelines using Python and SSIS, and created analytical dashboards in Tableau to identify opportunities for Itaú's real estate credit area.
- Developed data pipelines for Itaú real estate credit API requests using PySpark in AWS Glue, and created views in Amazon Athena to feed dashboards in Amazon QuickSight; a job skeleton of this kind is sketched below.
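A hedged skeleton of such an AWS Glue PySpark job, writing partitioned Parquet that an Athena view can sit on; the S3 paths, field names, and flattening logic are illustrative assumptions:

import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Flatten staged API responses into a tabular shape Athena can query.
raw = spark.read.json("s3://example-raw-bucket/credit-api/")
flat = raw.select(
    F.col("proposal.id").alias("proposal_id"),
    F.col("proposal.status").alias("status"),
    F.to_date("proposal.created_at").alias("created_dt"),
)

# Partitioned Parquet output; an Athena view over this path feeds QuickSight.
flat.write.mode("append").partitionBy("created_dt").parquet(
    "s3://example-curated-bucket/credit-proposals/"
)
job.commit()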
Software Developer
Banco Itaú
- Developed a self-service web application (back end and front end) for managing the service and communication channels of Itaú's internal users, using the Django framework in Python, JavaScript, and CSS.
- Built a load balancer management application for Itaú's internal users using Snow Software and JavaScript.
- Created the REST API for tracking real estate financing proposals on the Itaú website in Kotlin with the gRPC framework, integrated with the Amazon DynamoDB NoSQL database.
Experience
A Pipeline of Agriculture Data
The data was tested at every stage using the open-source data quality tool Great Expectations, and Sentry was enabled to send notifications in case of failure.
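A minimal sketch of such a per-stage quality gate, using Great Expectations' classic Pandas API with a Sentry alert on failure; the DSN, file name, and column name are placeholders, not details from this project:

import great_expectations as ge
import pandas as pd
import sentry_sdk

sentry_sdk.init(dsn="https://examplePublicKey@o0.ingest.sentry.io/0")

# Wrap the stage output so expectations can run directly against it.
batch = ge.from_pandas(pd.read_parquet("stage_output.parquet"))
result = batch.expect_column_values_to_not_be_null("crop_yield")

if not result.success:
    # Surface the failed expectation so the team is notified through Sentry.
    sentry_sdk.capture_message(f"Data quality check failed: {result}")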
A Data Lakehouse and Visualizations for B2B eCommerce
https://github.com/brendajanuario/pipeline-bigdata-pyspark
I developed ETL pipelines, created data tests, and wrote the infrastructure as code (IaC) in Terraform to build the data lakehouse. To work at big data scale, I generated more than 700 million rows, and the final structured data was used to create visualizations in tabular and dashboard formats.
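A rough sketch of how hundreds of millions of synthetic rows can be generated in parallel with PySpark for a load of this size; the schema and output path are assumptions, not taken from the repository above:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("synthetic-orders").getOrCreate()

# 700M rows generated in parallel; each id seeds the derived columns.
orders = (
    spark.range(700_000_000)
    .withColumn("customer_id", F.col("id") % 1_000_000)
    .withColumn("amount", F.round(F.rand() * 500, 2))
    .withColumn("order_ts", F.current_timestamp())
)

orders.write.mode("overwrite").parquet("s3a://example-bucket/bronze/orders/")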
Debt Aggregator for Credit Analysts
Credit risk assessments often rely on outdated and incomplete data, primarily from income tax reports. By integrating various API endpoints, our solution normalizes, cleans, deduplicates, and classifies short- and long-term obligations in real time. This enables credit analysts to make more accurate risk assessments, reducing manual effort and misclassification risks.
Delivered via Kafka, this enriched data allows our customers to make faster, data-driven decisions while improving the performance of Traive's models. The solution also lowers costs by eliminating the need for third-party data providers, streamlining the credit decision process and enhancing overall user trust in our platform.
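A minimal sketch of the Kafka delivery step, assuming a JSON-encoded record and the confluent-kafka Python client; the broker address, topic, and record fields are illustrative:

import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

# One normalized, deduplicated, classified obligation record.
record = {
    "borrower_id": "abc-123",
    "obligation_type": "short_term",
    "amount": 15000.00,
    "source": "endpoint_x",
}

producer.produce(
    "enriched-debts",
    key=record["borrower_id"],
    value=json.dumps(record).encode("utf-8"),
)
producer.flush()  # block until delivery completes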
Education
Bachelor's Degree in Information Systems
Universidade Estadual De Campinas (Unicamp) - Limeira, Brazil
Certifications
Databricks Data Engineer Associate
Databricks
Cloud Data Engineer - Bootcamp
XP Education (formerly IGTI)
AWS Certified Cloud Practitioner
Amazon Web Services
Skills
Libraries/APIs
PySpark, REST APIs
Tools
GitLab, GitLab CI/CD, Tableau, AWS Glue, Amazon QuickSight, Sybase PowerDesigner, Terraform, Sentry, Git, Amazon Athena, Amazon Simple Queue Service (SQS)
Languages
SQL, Python 3, Kotlin, JavaScript, CSS
Paradigms
ETL, Scrum, DevOps
Platforms
Databricks, Amazon Web Services (AWS), Docker, Kubernetes, Apache Kafka, AWS Lambda
Storage
Data Pipelines, Microsoft SQL Server, NoSQL, SQL Server Integration Services (SSIS)
Frameworks
gRPC, Django, Spark
Other
Azure Databricks, Web Scraping, Big Data, Data Quality, Data Modeling, Stream Processing, Data Engineering, Data Warehousing, Dashboard Development, Data Build Tool (dbt), Data Quality Analysis, Test Data, Streaming, Infrastructure as Code (IaC), Cloud, Events