
Bruno Ruaro de Souza
Verified Expert in Engineering
Data Engineer and Developer
Porto Alegre - State of Rio Grande do Sul, Brazil
Toptal member since August 23, 2024
Bruno has over 15 years of experience with data, building robust, reliable, and scalable data pipelines for big industries and IT consulting companies. He specializes in SQL, Python, PySpark, Airflow, and cloud environments such as GCP, Azure, and AWS. Bruno is a motivated engineer excited to take on his next challenge.
Portfolio
Experience
- SQL - 15 years
- Google Cloud Platform (GCP) - 7 years
- Python - 7 years
- PySpark - 7 years
- Data Engineering - 7 years
- Data Modeling - 7 years
- Data Pipelines - 7 years
- Apache Airflow - 3 years
Availability
Preferred Environment
Python, Google Cloud Platform (GCP), SQL
The most amazing...
...achievement was reducing, from 90 minutes to 5 seconds, the processing time of a data pipeline that ranked logistics routes on a graph for oil derivatives distribution.
Work Experience
Tech Lead
Hvar
- Led a project to adapt all the analytics data pipelines of a financial company that provides credit for customers of a world-class car manufacturer in Brazil.
- Spearheaded the migration of a world-class car manufacturer's eCommerce application from AWS to GCP in Brazil.
- Coordinated a data migration for a big CRM company from Microsoft SQL Server, running on AWS, to PostgreSQL, running on GCP, moving the data from the Middle East to Brazil within the planned schedule and under critical circumstances.
- Assessed and planned the structure of a new analytics department, using GCP services and an event-oriented architecture to deliver near real-time data insights.
Staff Data Engineer
Raizen
- Reduced the processing time for generating oil derivative transport route graphs from 90 minutes to just five seconds, while significantly reducing the computational resources used.
- Served as the technical reference on a project to optimize the two-month planning of the oil derivatives trade and mentored one mid-level data engineer and one intern.
- Served as the technical reference on a project to optimize the one-week planning of the oil derivatives trade and mentored one senior data engineer, two mid-level data engineers, and one intern.
- Maintained a data quality service that uses the Great Expectations framework, an application for manual data uploads built with Node.js and Vue, and a data pipeline for ingesting manual data.
Senior Data Engineer
CMPC
- Designed a data model and a data ingestion pipeline for more than 250 industrial process variables to run nearly in real time, inserting data into BigQuery and Firestore every minute.
- Developed data pipelines using Google Cloud Dataflow, Apache Beam, and other GCP services.
- Created ETL processes to transfer data from OSIsoft PI System to GCP.
- Created a cost forecast spreadsheet, estimating cloud expenditures regarding storage, CPU, memory, and network.
Data Engineer
HVAR Consulting
- Created deployment scripts for all the components of a product that extracts KPIs from audio recorded in call centers, with the aim of developing strategies and making relevant business decisions to improve profit.
- Developed data pipelines for a product that does audio analytics using GCP services.
- Improved the orchestration of Taka and services data pipelines of an audio processing product used to generate business metrics.
Middle Business Analyst
Senai - PR
- Gathered information about research groups and patents on nanotechnology for a computer manufacturer, which generated tests and business opportunities by improving its products.
- Interviewed leaders from Brazilian high-tech and pioneering applied research laboratories, companies, and organizations, gathering information for marketing research for Senai's to-be innovation institutes.
- Created spreadsheets and criteria to prioritize the matching between innovative laboratories, companies and organizations, and Senai's to-be innovation institutes.
Engineering Intern
Enercons Renewable Energy Consulting
- Created a spreadsheet to transform azimuths and distances of land boundaries into coordinates, automating the calculation and the drawing of property maps for wind energy prospecting.
- Optimized turbine configurations for hydroelectric plants using the Excel Solver add-in.
- Carried out comparative studies of different types of dam gates, evaluating their technical and environmental viability.
Experience
Leis Anotadas
Carta Farol (Lighthouse)
The client was CMPC, a pulp and paper company, and the users were the continuous improvement team, which monitors the app on a big screen in a control room at the factory, acting when the probability of a variable going out of range is significant. The data implementation was done in only two months.
This application significantly reduced the amount of industrial maintenance and, consequently, manufacturing costs.
Data Integration for SOS-RS
The users were people affected by the floods, shelter seekers, volunteers, donors, and the government. The app greatly helped coordinate resources during an unexpected natural disaster.
Education
Bachelor's Degree in Electrical Engineering
Federal University of Paraná - Curitiba, Paraná, Brazil
Certifications
Astronomer Certification for Apache Airflow Fundamentals
Astronomer
Skills
Libraries/APIs
PySpark, Node.js, Google Maps, Google Sheets API
Tools
BigQuery, Apache Airflow, Apache Beam, NGINX, Google Compute Engine (GCE), Google Sheets, Looker, Microsoft Excel, Microsoft PowerPoint, Excel 2016, AutoCAD, Excel 2010, Apache, AWS Glue, DataViz, Office 2010
Languages
Python, SQL, JavaScript, HTML, CSS, Excel VBA
Paradigms
ETL, Event-driven Architecture, Microservices, Microservices Architecture, REST
Storage
Data Pipelines, API Databases, MongoDB, Cloud Firestore, Database Modeling, Amazon S3 (AWS S3), Data Lakes, Data Lake Design, Azure Storage, Google Cloud Storage
Platforms
Google Cloud Platform (GCP), Docker, Amazon Web Services (AWS), Kubernetes, Databricks, Apache Kafka, AWS Lambda, Azure Synapse, Azure Data Lake Storage, Azure
Frameworks
Spark, Data Lakehouse
Other
Data Engineering, Data Modeling, Data Quality, Data Management, Distributed Systems, Leadership, Data Architecture, Microsoft Azure, Directed Acyclic Graphs (DAG), Orchestration, Scheduling, Programming, Technical Leadership, Data, Big Data, Data Orchestration, Excel Add-ins, Software, Medical Equipment, Digital Signal Processing, Data Processing, Neural Networks, Statistics, Machine Learning, ETL Tools, Data Cleaning, Data Cleansing, Pub/Sub, Google Pub/Sub, Big Data Architecture, Data Visualization, Data Warehousing, Data Warehouse Design, Data Catalog Implementation, Content Management Systems (CMS), Web Scraping, APIs, RESTful Services, Web Services, RESTful Microservices, Azure Data Lake