
Alexandre França de Magalhães
Verified Expert in Engineering
Data Architect and Developer
Salvador - State of Bahia, Brazil
Toptal member since February 4, 2022
Alexandre is a senior data engineer with over eight years of professional experience. He specializes in designing and building data lakes and warehouses and in processing data with whatever tools are available, such as Spark, database-native SQL, and Pandas. Alexandre is familiar with the Azure and AWS stacks but is open to working with other clouds.
Experience
- SQL - 8 years
- PySpark - 8 years
- Data Warehousing - 8 years
- Data Engineering - 8 years
- Data Modeling - 8 years
- Data Pipelines - 8 years
- Python - 8 years
- Databricks - 5 years
Preferred Environment
Spark, SQL, Azure, Databricks, Python, Amazon Web Services (AWS), Apache Airflow, Azure Data Factory (ADF)
The most amazing...
...project I've developed was a data lake architecture from scratch with cloud, on-premise, and API data sources.
Work Experience
Data Architect | Lead Engineer
PepsiCo
- Developed PepsiCo's global Media Data Hub from scratch to centralize customer and performance data across various media campaigns in a single corporate environment. The sources were a mix of APIs and cloud storage.
- Handled PySpark code optimization to enhance performance and standardization.
- Built simple machine learning models on Databricks with MLflow to track performance, metrics, and artifacts.
Lead Data Engineer
BCG - Gamma
- Developed data pipelines with data scientists to productionize experiments, data extractions, data modeling, data cleaning, and quality checking on multiple cloud environments.
- Managed large datasets using Spark as a processing tool.
- Wrote SQL to query, analyze, and manipulate data on many platforms, such as Spark, Hive, and relational data sources.
Senior Data Engineer
Via Varejo
- Refactored the fraud analysis pipeline to improve performance ahead of Black Friday 2021, keeping execution times constant under increased batch data loads.
- Worked on the development of the company's fraud data marts.
- Developed various pipelines to meet data ingestion and processing needs.
Senior Data Engineer
Radix
- Developed generic ingestion data pipelines for relational data sources, accelerating the process of new ingestions using simple configuration files.
- Developed a Delta Lake architecture from scratch for secure and efficient data processing.
- Worked on developing and maintaining a corporate data warehouse on the Azure Synapse platform.
Senior Data Engineer
Bridgestone
- Supported and enhanced corporate data lakes built on Azure Cloud Services with on-premise data sources, such as SQL Server, Oracle, and Kafka streams for sensor data.
- Developed SSIS packages for data pipelines with SQL, PL/SQL, and T-SQL.
- Managed the third-party team in charge of on-site software and data support demands.
Software Developer
Chemtech
- Developed data extractions and features to help data science teams train and validate machine learning models built on top of Python, Pandas, and Scikit-learn technologies for various projects in the company.
- Developed data pipelines for multiple client companies to serve data lake and warehouse architectures.
- Tracked and developed user stories using Jira as a reporting tool.
Software Developer
Braskem
- Developed SQL scripts for data ETL in a corporate data warehouse.
- Created complex queries for production reports serving business analyst needs.
- Developed C# back-end applications for manufacturing execution systems.
Experience
Media Data Hub for Global Food and Beverage Company
Data Lakehouse for an Educational Company
Data Lake for Rubber and Tire Industry
Refactoring of Fraud Detection Pipeline for Retail Company
Education
Bachelor's Degree in Engineering
Federal University of Bahia - Salvador, Brazil
Certifications
Certified Data Engineer Associate
Databricks
Skills
Libraries/APIs
PySpark, Pandas, REST APIs
Tools
Spark SQL, Apache Airflow, Synapse, Hue, Amazon Elastic MapReduce (EMR), AWS Glue
Languages
SQL, Python, T-SQL (Transact-SQL), Batch, C#, Snowflake
Frameworks
Spark, Hadoop, Data Lakehouse
Paradigms
ETL, Automation, Samba
Platforms
Azure, Databricks, Oracle, Azure Synapse, Azure Event Hubs, Apache Kafka, Amazon Web Services (AWS), Docker, YouTube, Google Cloud Platform (GCP)
Storage
Data Pipelines, SQL Server 2016, Oracle PL/SQL, MongoDB, Data Lake Design, Data Lakes, HDFS, Amazon S3 (AWS S3), JSON, Azure Blobs, PostgreSQL, Apache Hive, Google Cloud Storage
Other
Azure Data Factory (ADF), Azure Data Lake, Data Engineering, Data Modeling, ETL Tools, Data Warehousing, Data Management, Data Cleaning, Microsoft Azure, Streaming, Parquet, CSV, Delta Lake, CI/CD Pipelines, Data Extraction, Advertising, Media, Over-the-top Content (OTT), Roku, Dynamic Data, APIs, MLflow, Machine Learning, Data Curation, Azure Databricks, Data Governance, Cloud Storage, Big Data, Google BigQuery, Amazon Redshift