
Renato Pedroso Neto
Verified Expert in Engineering
Data Engineer and Developer
São Paulo - State of São Paulo, Brazil
Toptal member since April 14, 2022
Renato has 13+ years of experience in big data projects. He has worked for Tier 1 tech companies, consulting firms, and financial institutions. Renato has migrated petabytes of data to on-premises and cloud data lake environments, architected entire lakehouses, implemented machine learning models that provided intelligent suggestions to clients, and managed multicultural data teams that delivered data projects to top-tier banks in Brazil. He holds a master's degree in big data.
Experience
- Python - 8 years
- Big Data - 8 years
- SQL - 8 years
- Spark - 8 years
- Data Engineering - 8 years
- Data Lakes - 8 years
- Data Science - 6 years
- Databricks - 1 year
Preferred Environment
Spark, Databricks, Python, Amazon Web Services (AWS), Google Cloud Platform (GCP), Machine Learning, Big Data, Amazon Elastic MapReduce (EMR), SQL, Amazon RDS
The most amazing...
...project was a Brazilian open banking data ingestion platform that used machine learning to guarantee quality and provide a reliable data source for financial institutions.
Work Experience
Solutions Architect
Tier 1 Big Tech Company
- Doubled customer usage by analyzing their data and suggesting improvements.
- Performed stress tests on a Spark environment, generating and hashing 1 trillion lines in 28 minutes (sketched after this list).
- Acquired AWS Solutions Architect and Spark Developer certifications.
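A minimal PySpark sketch of this style of stress test, assuming synthetic rows are generated with spark.range and hashed with SHA-256; the row count, partition count, and column names are illustrative, not the original benchmark code.

```python
# Sketch of a Spark stress test: generate synthetic rows and hash them.
# Row count, partition count, and column names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("hash-stress-test").getOrCreate()

num_rows = 1_000_000_000_000  # 1 trillion rows
num_partitions = 200_000       # tune to the cluster's executor count

df = (
    spark.range(0, num_rows, numPartitions=num_partitions)
    # Derive a synthetic payload column from the sequential id.
    .withColumn("payload", F.concat(F.lit("row-"), F.col("id").cast("string")))
    # Hash every payload; SHA-256 keeps the CPU busy across all executors.
    .withColumn("digest", F.sha2(F.col("payload"), 256))
)

# Aggregating over the digest forces every row to be generated and hashed.
df.agg(F.min("digest"), F.max("digest")).show()
```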
Data Engineer
Comniscient Technologies LLC dba Comlinkdata
- Developed new metrics for a telecom market data and insights platform, using Spark to help the client understand customer behavior.
- Helped build and evolve a product that benchmarks network operators' competitiveness within a country.
- Implemented new Airflow DAGs that transform telecom data using Spark (sketched after this list).
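A minimal sketch of an Airflow DAG that submits a Spark transform job, assuming the SparkSubmitOperator from the Apache Spark provider; the DAG ID, script path, and connection ID are hypothetical placeholders, not the production DAGs.

```python
# Sketch of an Airflow DAG that submits a daily Spark transform job.
# DAG id, file path, and connection id are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="telecom_daily_transform",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Submits a PySpark script that reshapes the raw telecom feed into curated tables.
    transform = SparkSubmitOperator(
        task_id="transform_telecom_data",
        application="/opt/jobs/transform_telecom.py",
        conn_id="spark_default",
        application_args=["--run-date", "{{ ds }}"],
    )
```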
Data Engineer
An Online Freelance Agency
- Worked with a client to architect, build, and support data pipelines from an on-premises to a cloud environment (sketched after this list).
- Rearchitected the client's data pipeline in the cloud, reducing the total cost of ownership (TCO) by 40%.
- Provided consulting on Python code, including general guidance and best practices.
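A minimal sketch of one on-premises-to-cloud pipeline step under assumed details: a table is read in parallel over JDBC and landed as partitioned Parquet in S3. The JDBC URL, table, columns, and bucket are hypothetical.

```python
# Sketch of a lift-and-shift pipeline step: read a table from an on-premises
# database over JDBC and land it as partitioned Parquet in S3.
# URL, table, columns, and bucket are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("onprem-to-s3").getOrCreate()

source = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://onprem-db:5432/sales")
    .option("dbtable", "public.orders")
    .option("user", "etl_user")
    .option("password", "***")            # in practice, pulled from a secrets manager
    .option("numPartitions", 16)          # parallel JDBC reads
    .option("partitionColumn", "order_id")
    .option("lowerBound", 1)
    .option("upperBound", 100_000_000)
    .load()
)

# Partitioned Parquet in S3 keeps downstream queries cheap, part of the TCO reduction.
source.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://client-data-lake/raw/orders/"
)
```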
Lead Data Engineer | Architect | Scientist
Capco
- Standardized the data practice and shipped it as an official Capco product.
- Owned all data projects for Capco's consultancy and Innovation Labs.
- Led the development of open banking data ingestion and standardization, delivered directly to financial institutions (sketched after this list).
- Created and fine-tuned a natural language model for financial institutions.
- Developed a market data pipeline for Capco's client prospecting.
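A minimal sketch of an open banking ingestion and standardization step, assuming raw JSON payloads are conformed to a fixed schema and filtered by simple quality rules; field names, paths, and thresholds are hypothetical, and the actual project also used a machine learning model to score data quality.

```python
# Sketch of an open banking ingestion step: enforce a standard schema on raw
# JSON payloads and keep only records that pass basic quality checks.
# Field names, paths, and thresholds are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("open-banking-ingestion").getOrCreate()

schema = StructType([
    StructField("institution_id", StringType(), nullable=False),
    StructField("product_type", StringType(), nullable=True),
    StructField("interest_rate", DoubleType(), nullable=True),
    StructField("reference_date", StringType(), nullable=True),
])

raw = spark.read.schema(schema).json("s3://open-banking-raw/daily/")

# Rule-based quality gate; in the original project an ML model also scored
# record quality before publication to financial institutions.
clean = raw.filter(
    F.col("institution_id").isNotNull()
    & F.col("interest_rate").between(0.0, 1.0)
)

clean.write.mode("append").partitionBy("reference_date").parquet(
    "s3://open-banking-curated/products/"
)
```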
Big Data Systems Engineer
Banco Itaú
- Migrated 10 PB of data from a mainframe to a Hadoop environment, creating reliable data pipelines (sketched after this list).
- Delivered 99.99% data availability in a Hadoop distributed file system (HDFS) environment.
- Created a central hub of information for the whole bank.
- Institutionalized parallel processing using Spark, delivering fast results to business areas.
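A minimal sketch of one mainframe-offload step, assuming fixed-width extract files are parsed into typed columns with Spark and stored as a Hive table in the Hadoop environment; field offsets, names, and the target table are hypothetical.

```python
# Sketch of a mainframe-offload step: parse a fixed-width extract file into
# typed columns and store it as a Hive table in the Hadoop environment.
# Field offsets, names, and the target table are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder.appName("mainframe-offload")
    .enableHiveSupport()
    .getOrCreate()
)

lines = spark.read.text("hdfs:///landing/mainframe/accounts/")

accounts = lines.select(
    F.trim(F.substring("value", 1, 10)).alias("account_id"),
    F.trim(F.substring("value", 11, 40)).alias("customer_name"),
    # Amount assumed to be stored as integer cents in the fixed-width layout.
    (F.substring("value", 51, 13).cast("long") / 100).alias("balance"),
)

# Writing as a managed Hive table makes the data a shared hub for the bank.
accounts.write.mode("overwrite").format("parquet").saveAsTable("core.accounts")
```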
Experience
Open Banking Data Ingestion
Financial Data Web Scraping
Beacon Data Analysis
Monolith Decomposition
Sentiment Analysis for Financial Institutions
Mainframe to Big Data Environment Engineering
Education
Specialization in Data Science
Johns Hopkins University | via Coursera - Sao Paulo, Brazil
Master's Degree in Big Data
Faculdade de Informática e Administração Paulista (FIAP) - Sao Paulo, Brazil
Bachelor's Degree in Computer Science
Mackenzie University - Sao Paulo, Brazil
Certifications
Databricks Certified Machine Learning Professional
Databricks
Databricks Certified Data Engineer Professional
Databricks
Databricks Certified Associate Developer for Apache Spark 3.0
Databricks
AWS Certified Solutions Architect Associate
AWS
Machine Learning Engineer
Udacity
Data Science Specialization
Coursera
Getting and Cleaning Data
Coursera
Dell EMC Data Science Associate (EMCDSA)
Dell EMC
Linux Professional Institute 101 (LPIC-1)
Linux Professional Institute
Skills
Libraries/APIs
Spark Streaming, PySpark, Pandas, Scikit-learn, NumPy, Beautiful Soup, Selenium WebDriver
Tools
Git, Apache Airflow, Amazon Elastic MapReduce (EMR), Redash, BigQuery, Amazon Simple Queue Service (SQS), Amazon Transcribe, Amazon QuickSight, Amazon Athena, AWS Glue, Apache Maven
Languages
Python, SQL, COBOL, XPath, Scala, Snowflake
Frameworks
Spark, Apache Spark, Hadoop, Flask, Selenium, Scrapy
Paradigms
ETL, Business Intelligence (BI), Logic Programming
Platforms
Databricks, Amazon Web Services (AWS), Linux, Amazon EC2, Google Cloud Platform (GCP), Apache Kafka
Storage
Databases, Apache Hive, Data Pipelines, Redshift, Data Lakes, Amazon S3 (AWS S3), NoSQL, MySQL, Google Cloud Datastore, PostgreSQL, MongoDB, Redis
Other
Machine Learning, Big Data, Data Engineering, Data Science, Data Warehousing, Data, Data Analysis, Data Analytics, ELT, Systems Analysis, Cloud, Stream Processing, Scraping, Data Scraping, Web Scraping, Predictive Modeling, Amazon RDS, Operating Systems, IT Systems Architecture, Neural Networks, Statistics, Deep Learning, Data Modeling, Mainframe, Data Architecture, Prototyping, People Management, Client Relationship Management, Delta Lake, Google Cloud Functions, Pub/Sub, Vertex, Apache Superset, Clustering, Reporting, Natural Language Processing (NLP), APIs, Message Queues, Generative Pre-trained Transformers (GPT), Processing & Threading